Unstructured Data In LLMs | NYC Tech Events - GarysGuide

Unstructured Data In LLMs

With Lisa Cao (Product Mgr, Datastrato), Uri Goren (CEO, Argmx), Naren Hariparanthaman (Co-Founder/COO, Unstract).

	Jay Suites, 159 W 25th St
	Oct 23 (Wed) @ 05:30 PM
	FREE

DETAILS

This is an in-person event! Registration is required to get in.

Topic: Connecting your unstructured data with Generative LLMs

What we'll do:

Have some food & refreshments. Hear three exciting talks about unstructured data, vector databases & generative AI.

5:30 - 6:00 - Welcome/Networking/Registration

6:00 - 6:20 - Tim Spann, Principal DevRel, Zilliz

6:20 - 6:45 - Uri Goren, CEO, Argmx

7:00 - 7:30 - Lisa N Cao, Product Manager, Datastrato

7:30 - 8:00 - Naren, Unstract

8:00 - 8:30 - Networking

Intro Talk:

Hiring?

Need a Job?

Cool project?

Meetup Logistics

Trick-Or-Treat

Using Milvus as a Ghost Trap

Tech talk 1: Introduction to Vector search

Uri Goren, Argmx CEO

Deep learning has been a game-changer for modern AI, but deploying it in production environments poses significant challenges. Vector databases (VDBs) have become the go-to solution for real-time, embedding-based queries. In this talk, we'll explore the problems VDBs address, the trade-offs between accuracy & performance, & what the future holds for this evolving technology.

Tech talk 2: Metadata Lakes for Next-Gen AI/ML

Lisa N Cao, Product Manager, Datastrato

![img](https://images.lumacdn.com/editor-images/n2/d5322175-dfc6-4b2d-8fe9-300432673f39.jpeg)

As data catalogs evolve to meet the growing & new demands of high-velocity, unstructured data, we see them taking a new shape as an emergent & flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level for AI-enablement in RAG pipelines in response to the new demands of the ecosystem. We will also discuss Apache Gravitino (incubating) & its open source-first approach to data cataloging across multi-cloud & geo-distributed architectures.

Tech talk 3: Unstructured Document Data Extraction at Scale with LLMs: Challenges & Solutions

Unstructured documents present a significant challenge for businesses, particularly those managing them at scale. Traditional Intelligent Document Processing (IDP) systems (let's call them IDP 1.0) rely heavily on machine learning & NLP techniques. These systems require extensive manual annotation, making them time-consuming & less effective as document complexity & variability increase.

The advent of Large Language Models (LLMs) is ushering in a new era: IDP 2.0. However, while LLMs offer significant advancements, they also come with their own set of challenges, particularly around accuracy & cost, which can become prohibitive at scale. In this talk, we will look at how Unstract, an open source IDP 2.0 platform purpose-built for structured document data extraction, solves these challenges. Processing over 5 million pages of unstructured documents per month, Unstract uses various techniques to extract structured data with accuracy & cost efficiency, chief among them the use of vector databases.

Naren H - Co-founder/COO, Unstract

Naren H is the co-founder of Unstract, an open source startup building an LLM-powered platform that extracts data from unstructured documents, helping automate critical business processes. Before Unstract, Naren founded Mediavak, a digital marketing agency, & co-founded Social Animal & Tweeple Search, building tools that made social media analytics & content marketing a breeze. He holds a Master's in Computer Science from the State University of New York at Buffalo. He has a knack for turning data chaos into order; occasionally, he even manages to keep his emails under control.

Speaker LinkedIn Profile: https://www.linkedin.com/in/naren87/

![img](https://images.lumacdn.com/editor-images/lw/7ebd4917-239a-4c95-918f-6511c836cd2b)

Who should attend:

Anyone interested in talking & learning about Unstructured Data & Generative AI Apps.

When:

October 23, 2024

5:30PM

Where:

This is an in-person event! Registration is required to get in. Registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.