Unstructured Data In LLMs | NYC Tech Events - GarysGuide

Unstructured Data In LLMs

With Lisa Cao (Product Mgr, Datastrato), Tim Spann (Principal DevRel, Zilliz), Chris Joynt (Sr PMM, Cloudera).

	Venue, 101 5th Ave
	Jul 25 (Thu) @ 05:30 PM
	FREE

DETAILS

This is an in-person event! Registration is required to get in.

Topic: Connecting your unstructured data with Generative LLMs

What we'll do:

Have some food & refreshments. Hear three exciting talks about unstructured data & generative AI.

5:30 - 6:00 - Welcome/Networking/Registration

6:05 - 6:30 - Tim Spann, Principal DevRel, Zilliz

6:35 - 7:00 - Chris Joynt, Senior PMM, Cloudera

7:05 - 7:30 - Lisa N Cao, Product Manager, Datastrato

7:30 - 8:30 - Networking

Tech talk 1: Unstructured Data Processing From Cloud to Edge

Speaker: Tim Spann, Principal Dev Advocate, Zilliz

In this talk I will explain why you should add a cloud-native vector database to your Data & AI platform. I will also give a quick introduction to Milvus, vector databases & unstructured data processing. By adding Milvus to your architecture you can scale out & improve your AI use cases through RAG, real-time search, multimodal search, recommendation engines, fraud detection & many more emerging use cases.

As I will show, edge devices, even ones as small & inexpensive as a Raspberry Pi 5, can handle machine learning, deep learning & AI use cases & be enhanced with a vector database.

Tech talk 2: RAG Pipelines with Apache NiFi

Speaker: Chris Joynt, Senior PMM, Cloudera

Executing on a RAG architecture is not a set-it-and-forget-it endeavor. Unstructured or multimodal data must be cleansed, parsed, processed, chunked & vectorized before being loaded into knowledge stores & vector DBs. That needs to happen efficiently so that our GenAI always stays up to date with fresh contextual data. Beyond that, changes will have to be made on an ongoing basis. For example, new data sources must be added, & experimentation will be necessary to find the ideal chunking strategy. Apache NiFi is the perfect tool to build RAG pipelines that stream proprietary & external data into your RAG architectures. Come learn how to use this scalable & incredibly versatile tool to quickly build pipelines to activate your GenAI use case.

Tech talk 3: Metadata Lakes for Next-Gen AI/ML

Speaker: Lisa N Cao, Product Manager, Datastrato

Abstract: As data catalogs evolve to meet the new & growing demands of high-velocity, unstructured data, we see them taking a new shape as an emergent & flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level for AI enablement in RAG pipelines, in response to the new demands of the ecosystem. We will also discuss Apache Gravitino (incubating) & its open source-first approach to data cataloging across multi-cloud & geo-distributed architectures.

Who should attend:

Anyone interested in talking & learning about Unstructured Data & Generative AI Apps.

When:

July 25, 2024

5:30 PM

Where:

This is an in-person event! Registration is required to get in. Registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.