DETAILS
Open Source Data Streaming Meetup!

Join us on March 17th (Tuesday) from 5:30-8:30 PM at the Snowflake Menlo Park Office.
Connect with fellow community members, share insights, & dive into the latest developments in the world of data streaming around Apache Kafka, Apache Flink, & more!
Note on Parking: There is free parking available at the event.

Agenda
5:30 PM - 6:15 PM: Doors Open & Networking
6:15 PM - 8:00 PM: Welcome Remarks & Presentations!
8:00 PM - 8:30 PM: More Networking

Sessions

Deep Dive into Flink's Disaggregated State Management, Vasia Kalavri, Boston University
If you've operated Flink jobs with large state, you've probably hit some familiar pain points: long recovery times, unexpected CPU spikes from RocksDB compactions, or running out of local disk space at the worst possible time. This talk explores how Apache Flink 2.x fundamentally changes the game by separating compute from storage, enabling faster scaling & recovery. We'll dive into the internals of disaggregated state management, discuss why naively combining a remote state backend with Flink's synchronous execution model is a bad idea, & explain how to make the runtime asynchronous while ensuring Flink's out-of-order execution semantics & fault-tolerance guarantees are preserved.

Under the Hood: The Evolution of Snowflake's Streaming Ingest (V1 to V2), Tyler Jones, Snowflake
Snowflake's streaming ingest has gone through a significant architectural evolution. In this talk, we'll take you inside the journey from Snowpipe Streaming V1 to its high-performance successor, V2: how they work under the hood, what changed, & why.
We'll start with the V1 architecture, walk through how it maps to Kafka's topic/partition model via Kafka Connect, & explain how we achieve exactly-once semantics at scale. We'll share real-world lessons from operating V1 in production, particularly its impact on downstream query performance, & how those learnings drove the design of V2.
From there, we'll dig into the V2 + Kafka Connect integration: the new challenges that came with the architecture shift, including data validation at ingest time, handling schema evolution across heterogeneous topics, & the trade-offs we made along the way. If you're running Kafka pipelines that land data in an analytical store, this talk is a look at the hard problems behind making that fast, correct, & reliable.