Apache Iceberg Meetup | SF Tech Events - GarysGuide

Apache Iceberg Meetup

With Roni Burd (Dir. Product Engg, Amazon AWS), Yingjun Wu (Founder, RisingWave Labs), Snehal Chennuru (Engg Mgr, Netflix), Kevin Wang (Enggr, Eventual), Jack Ye (Sr S/w Enggr, Amazon AWS Open Data), Bryan Keller & Tim Jiang (S/w Enggrs, Netflix).

	Venue, To Be Announced, San Francisco
	Nov 04 (Mon) @ 05:00 PM
	FREE

DETAILS

Ice{berg} over Drinks

We're partnering with AWS, Snowflake, & the Apache Iceberg Community to co-host the next Bay Area Apache Iceberg Community Meetup!

Connect with fellow enthusiasts, share insights, & dive into the latest developments in the Apache Iceberg ecosystem! Whether you're a seasoned pro or new to Apache Iceberg, this meetup is the perfect place to exchange ideas & spark innovation.

Agenda
5:00p - 6:00p: Doors Open & Networking
6:00p - 7:45p: Welcome Remarks & Presentations!
7:45p - 8:30p: More Networking

About Daft
Daft is an open source framework that powers ETL, analytics, & ML/AI at scale. Its familiar Dataframe API is built to outperform Spark in performance & ease of use.
Join Distributed Data Community Slack
Check out Daft Engineering Blog
Follow Daft on LinkedIn & Twitter
Subscribe to Daft YouTube
We're hiring, join our team

About AWS
Apache Iceberg is an open-source table format that simplifies table management while improving performance. AWS analytics services such as Amazon EMR, AWS Glue, Amazon Athena, & Amazon Redshift include native support for Apache Iceberg, so you can easily build transactional data lakes on top of Amazon Simple Storage Service (Amazon S3) on AWS.
Additional Resources & Information:
Workshop: Running Apache Iceberg on AWS
Blogs: Apache Iceberg on AWS
AWS Prescriptive Guidance: Using Apache Iceberg on AWS
Subscribe to AWS Events & AWS Developers
We're hiring, join our team

Presentations

Lessons From Building Iceberg Capabilities In Daft, A Distributed Query Engine
In this talk, we will share our experience building distributed Iceberg operations in Daft. We will walk through how we adapted PyIceberg for distributed workloads, including how we built features such as partitioned writes into Daft. We will also discuss the challenges we faced using existing Python/Rust Iceberg tooling & the workarounds we implemented. Finally, we will talk about what it means for an Iceberg library to provide useful abstractions while giving the query engine proper control over execution, & which API interfaces we propose to enable that.
Kevin Wang is a founding engineer at Eventual & a primary contributor to the Daft open-source project. Prior to Eventual, he completed an undergraduate degree at UC Berkeley, where he did research in AI & LLM systems, & worked in quantitative finance at Arrowstreet & Akuna.

Accelerate Your Iceberg Workloads on S3
This talk discusses the recent improvements the Amazon S3 team has made to Iceberg FileIO & LocationProvider to improve the Iceberg user experience on S3. These include better retry & fault-tolerant execution (#10433 & #11052), a better hashing scheme to reduce throttling (#11112), & integration with S3 Data Acceleration Toolkit & AWS CRT client to improve read performance.
Jack Ye is a Sr. Software Engineer at AWS Open Data Analytics. His team focuses on the integration of open source storage layer solutions including Iceberg, Hudi, Delta, Parquet, Avro, etc. with AWS analytics products. Jack is also a PMC member of the Iceberg project.
Roni Burd is Dir of Product Engineering at AWS, & builds platform & developer tools. Roni brings 15+ years of experience working on query engines, storage engines, & compute platforms for database systems & ML processing.

How We Implemented the Iceberg Connector in Rust!
In this talk, we will discuss how we implemented the Iceberg connector in Rust, replacing the original Java-wrapped version to address performance bottlenecks in serialization & memory usage. By following the Apache Iceberg specification, we built a native Rust connector that supports Iceberg's advanced features, such as multi-catalog compatibility & streaming updates. We've contributed this new version to the apache/iceberg-rust repository, & will share insights into the architectural improvements & best practices for leveraging Iceberg in streaming environments.
Yingjun Wu is the founder of RisingWave Labs, a database company developing RisingWave, a distributed SQL database for stream processing. Before running the company, Yingjun was a software engineer on the Redshift team at Amazon Web Services, & a researcher in the Database group at IBM Almaden Research Center. He has been working in the field of stream processing & database systems for over a decade.

Iceberg at Netflix
Netflix's Iceberg past, present, & future (with a call-out to the community on where they see the technology challenges). Netflix will briefly cover its journey from Hive to Iceberg, its current systems for catalog, compaction, & replication, & the improvements it is making.
Snehal Chennuru is an engineering manager for the Big Data Warehouse team at Netflix, with over a decade of experience building distributed systems at Netflix, Skyhigh Networks, & Clearwell Systems.
Bryan Keller is a software engineer on the Big Data Warehouse team at Netflix, with over a decade of experience building big data systems. He is also an early Iceberg advocate & Iceberg committer.
Tim Jiang is a software engineer on the Big Data Warehouse team at Netflix. Over the past few years, he has focused on strengthening data security for Iceberg & query engines.

Lakekeeper: Rust-based Iceberg Catalog
The Rust ecosystem in the data space is evolving quickly. With Lakekeeper, we are filling the gap of a Rust-native, modular Iceberg REST Catalog designed for decentralized deployments.
Christian Thiel is the CTO of HANSETAG GmbH & a data enthusiast building the future of Data Collaboration with Iceberg.