This is an in-person event! Registration is required to get in.
Topic: Connecting your unstructured data with Generative LLMs
What we'll do:
Have some food & refreshments. Hear three exciting talks about unstructured data, vector databases & generative AI.
5:30 - 6:00 - Welcome/Networking/Registration
6:00 - 6:20 - Tim Spann, Principal DevRel, Zilliz
6:20 - 6:45 - Uri Goren, Urimax
7:00 - 7:30 - Lisa N Cao, Product Manager, Datastrato
7:30 - 8:00 - Naren, Unstract
8:00 - 8:30 - Networking
Intro Talk:
Hiring?
Need a Job?
Cool project?
Meetup Logistics
Trick-Or-Treat
Using Milvus as a Ghost Trap
Tech talk 1: Introduction to Vector search
Uri Goren, Argmax CEO
Deep learning has been a game-changer for modern AI, but deploying it in production environments poses significant challenges. Vector databases (VDBs) have become the go-to solution for real-time, embedding-based queries. In this talk, we'll explore the problems VDBs address, the trade-offs between accuracy & performance, & what the future holds for this evolving technology.
Tech talk 2: Metadata Lakes for Next-Gen AI/ML
Lisa N Cao, Product Manager, Datastrato
![img](https://images.lumacdn.com/editor-images/n2/d5322175-dfc6-4b2d-8fe9-300432673f39.jpeg)
As data catalogs evolve to meet the growing & new demands of high-velocity, unstructured data, we see them taking a new shape as an emergent & flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level for AI-enablement in RAG pipelines in response to the new demands of the ecosystem. We will also discuss Apache (incubating) Gravitino & its open source-first approach to data cataloging across multi-cloud & geo-distributed architectures.
Tech talk 3: Unstructured Document Data Extraction at Scale with LLMs: Challenges & Solutions
Unstructured documents present a significant challenge for businesses, particularly those managing them at scale. Traditional Intelligent Document Processing (IDP) systems (let's call them IDP 1.0) rely heavily on machine learning & NLP techniques. These systems require extensive manual annotation, making them time-consuming & less effective as document complexity & variability increase.
The advent of Large Language Models (LLMs) is ushering in a new era: IDP 2.0. However, while LLMs offer significant advancements, they also come with their own set of challenges, particularly around accuracy & cost, which can become prohibitive at scale. In this talk, we will look at how Unstract, an open source IDP 2.0 platform purpose-built for structured document data extraction, solves these challenges. Processing over 5 million pages of unstructured documents per month, Unstract uses various techniques to extract structured data with accuracy & cost efficiency, chief among them the use of vector databases.
Naren H - Co-founder/COO, Unstract
Naren H is a co-founder of Unstract, an open source startup building an LLM-powered platform that extracts data from unstructured documents, helping automate critical business processes. Before Unstract, Naren founded Mediavak, a digital marketing agency, & co-founded Social Animal & Tweeple Search, building tools that made social media analytics & content marketing a breeze. He holds a Master's in Computer Science from the State University of New York at Buffalo. He has a knack for turning data chaos into order; occasionally, he even manages to keep his emails under control.
Speaker LinkedIn Profile: https://www.linkedin.com/in/naren87/
![img](https://images.lumacdn.com/editor-images/lw/7ebd4917-239a-4c95-918f-6511c836cd2b)
Who should attend:
Anyone interested in talking & learning about Unstructured Data & Generative AI Apps.
When:
October 23, 2024
5:30PM
Where:
This is an in-person event! Registration is required to get in. Registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.