With Owen Xiao (Co-founder, VeloDB), Songqiao Su (Staff Software Engineer, StarTree.AI), Raghav Yadav (Staff Software Engineer, StarTree.AI).
Adobe Founders Tower, 333 W San Fernando St, San Jose
Mon, Oct 27, 2025 @ 6:00 PM
FREE
 
 
DETAILS

Welcome to another edition of South Bay Systems! This time it's a double feature: first, Songqiao Su & Raghav Yadav will talk about optimizing Apache Pinot for real-time analytics; then Owen Xiao will talk about the VARIANT type & semi-structured data in Apache Doris.

Agenda
6:00 PM: Doors open, food & socializing

6:30 PM - 7:00 PM: Apache Pinot Talk

7:00 PM - 7:30 PM: Apache Doris Talk

7:30 PM onward: Community socializing!

Food & beverages will be provided, courtesy of our hosts, Adobe.

Low-Latency Serving on Cloud Object Stores with Apache Pinot
In this talk, we present the evolution of Apache Pinot's architecture: first from tightly coupled storage & compute, to decoupled cloud storage, & now toward native support for Parquet as a first-class segment format. We will discuss key technical innovations such as the implementation of a Parquet-compatible forward index reader, which enables all of Pinot's indexing strategies to operate directly on Parquet files. Additional optimizations include index pinning, Parquet page-level selective reads, page prefetching for efficient I/O parallelism, & page caching. Together, these enhancements allow Pinot's indexing & query execution framework to deliver sub-second performance directly on Parquet data, going far beyond conventional metadata-based pruning approaches.
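
To make the contrast in the abstract concrete, here is a minimal sketch of why index-driven page selection can read less data than min/max metadata pruning. It is an illustration only, not Pinot's implementation: plain Python lists stand in for Parquet pages, and every name (`PAGE_SIZE`, `prune_by_min_max`, `pages_from_index`, and so on) is hypothetical.

```python
from collections import defaultdict

PAGE_SIZE = 4  # rows per page; an arbitrary value chosen for this sketch


def build_pages(values, page_size=PAGE_SIZE):
    """Split a column into fixed-size pages (stand-ins for Parquet data pages)."""
    return [values[i:i + page_size] for i in range(0, len(values), page_size)]


def prune_by_min_max(pages, target):
    """Metadata-based pruning: keep every page whose [min, max] range covers target."""
    return [i for i, page in enumerate(pages) if min(page) <= target <= max(page)]


def build_inverted_index(values):
    """Map each value to the row ids where it occurs."""
    index = defaultdict(list)
    for row_id, value in enumerate(values):
        index[value].append(row_id)
    return index


def pages_from_index(index, target, page_size=PAGE_SIZE):
    """Index-driven selection: read only the pages that hold a matching row."""
    return sorted({row_id // page_size for row_id in index.get(target, [])})


# Unsorted data gives every page a wide min/max range, so range pruning cannot skip anything.
values = [1, 9, 3, 7, 2, 8, 5, 6, 1, 9, 4, 5, 0, 9, 2, 3]
pages = build_pages(values)

minmax_pages = prune_by_min_max(pages, 5)
index_pages = pages_from_index(build_inverted_index(values), 5)

print(minmax_pages)  # [0, 1, 2, 3]
print(index_pages)   # [1, 2]
```

In this toy data set, min/max statistics cannot rule out any page for the lookup `value == 5`, while the inverted index narrows the read to the two pages that actually contain a match.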

Speaker Bio
Songqiao Su is a Staff Software Engineer at StarTree.AI, working on building tiered storage & improving compute-storage decoupling in Apache Pinot & StarTree Cloud. His work focuses on large-scale, high-performance distributed systems. Before joining StarTree, he worked on network & RPC infrastructure at Facebook & Databricks.

Raghav Yadav is a Staff Software Engineer at StarTree.AI, working on building a low-latency serving layer on Iceberg in Apache Pinot & StarTree Cloud. His expertise spans distributed databases & large-scale systems, with experience in cloud-scale data infrastructure at Microsoft Azure, real-time streaming databases as a founding engineer at Grainite, & now real-time OLAP analytics at StarTree.

The Evolution of Semi-Structured Data Analytics: From Text, JSON to VARIANT
Abstract
Semi-structured data, such as JSON, is gaining widespread adoption due to its flexibility. However, traditional databases & data warehouses are built for structured schemas, creating new challenges in storing & analyzing semi-structured formats. In this session, we'll explore:

Characteristics & challenges of semi-structured data

Limitations of traditional approaches

Apache Doris' native solution for semi-structured analytics

Comparison with Snowflake, Iceberg (VARIANT type), & Elasticsearch

Real-world applications in Log Analytics, Distributed Tracing, & IoT
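
As a rough illustration of the challenge this session addresses, the sketch below turns JSON log lines with differing shapes into per-path columns, padding missing fields with `None`. It is a generic example of columnarizing semi-structured data, not a description of how Apache Doris implements VARIANT; the helper names `flatten` and `to_columns` and the sample log records are made up for this sketch.

```python
import json


def flatten(doc, prefix=""):
    """Flatten nested objects into dotted paths, e.g. {"http": {"status": 200}} -> {"http.status": 200}."""
    out = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        else:
            out[path] = value
    return out


def to_columns(json_lines):
    """Build one column per observed path; rows missing a path get None."""
    rows = [flatten(json.loads(line)) for line in json_lines]
    paths = sorted({path for row in rows for path in row})
    return {path: [row.get(path) for row in rows] for path in paths}


# Two hypothetical log records; the second has a field the first lacks.
logs = [
    '{"level": "info", "http": {"status": 200}}',
    '{"level": "error", "http": {"status": 500}, "trace_id": "abc"}',
]

columns = to_columns(logs)
print(columns)
# {'http.status': [200, 500], 'level': ['info', 'error'], 'trace_id': [None, 'abc']}
```

The schema here is discovered from the data rather than declared up front, which is the flexibility that makes semi-structured formats attractive and also what makes them awkward for engines built around fixed schemas.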

Speaker Bio
Owen Xiao is a co-founder of VeloDB & a PMC member of Apache Doris, where he leads product strategy, observability, & AI-driven R&D for both open-source & enterprise data platforms. With over 10 years of experience in database kernel development & distributed systems architecture, he has helped scale analytical databases for global enterprises.
 
© 2025 GarysGuide