Agentic + AI Observability Meetup
With Jules Damji (Developer Advocate, Databricks), Corey Zumar (Software Engineer, Databricks), & Mengying Li (Head of Data & Product Growth, Braintrust).
Databricks, 160 Spear St, 15th Fl, San Francisco
Feb 17 (Tue), 2026 @ 5:00 PM
FREE
DETAILS
Join us for the Agentic + AI Observability meetup on Tuesday, February 17, from 5 to 8 PM PST at the Databricks SF office: an evening focused on agentic architectures & AI observability, & on how to design, ship, & monitor AI agents that actually work in production.
This meetup is built for engineers, ML practitioners, & AI startup founders who are already experimenting with agents (or planning to) & want to go deeper into the tech. We'll cover real-world patterns, failure modes, & tooling for building reliable agentic systems in the broader open-source ecosystem.
Whether you're at an early-stage startup or an established company, if you care about getting AI agents into production & keeping them healthy, this meetup is for you.
Why you should attend
See real architectures: Learn how teams are designing agentic systems on top of data/feature platforms, retrieval, & tools, not just calling a single LLM endpoint.
Learn how to observe what agents are doing: Go beyond logs & dashboards to structured traces, evals, & metrics that help you understand & improve agent behavior over time.
Get hands-on with MLflow & observability tools: Watch live demos of MLflow, tracing integrations, & evaluation workflows for agentic systems.
Connect with other builders: Meet engineers, founders, & practitioners working on similar problems, swap patterns, & find collaborators & potential hires.
Agenda
5:00pm: Registration/Mingling
6:00pm: Welcome Remarks by Jules Damji, Staff Developer Advocate, Databricks
6:15pm: Talk #1 - Building Trustworthy, High-Quality AI Agents with MLflow
6:45pm: Talk #2 - Evaluating AI in Production: A Practical Guide
7:15pm: Mingling with bites + dessert
8:00pm: Night Ends
Speakers
Corey Zumar, Staff Software Engineer, Databricks
Mengying Li, Head of Data & Product Growth, Braintrust
Session Descriptions
Building Trustworthy, High-Quality AI Agents with MLflow
Building trustworthy, high-quality agents remains one of the hardest problems in AI today. Even as coding assistants automate parts of the development workflow, evaluating, observing, & improving agent quality is still manual, subjective, & time-consuming.
Teams spend hours vibe-checking agents, labeling outputs, & debugging failures. But it doesn't have to be this slow or tedious. In this session, you'll learn how to use MLflow to automate & accelerate agent observability for quality improvement, applying proven patterns to deliver agents that behave reliably in real-world conditions.
Key Takeaways & Learnings
Understand the agent development lifecycle & where observability fits into it
Use MLflow's key components across the development lifecycle to enhance observability: tracking & debugging, evaluation with MLflow judges, & a prompt registry for versioning (see the sketch after this list)
Select appropriately from a suite of 60+ built-in & custom MLflow judges for evaluation, & use Judge Builder for automatic evaluation.
Use the MLflow UI to compare & interpret evaluation scores & metrics
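To make the tracing & judge-based evaluation pieces above concrete, here is a minimal sketch assuming MLflow 3.x's GenAI APIs (mlflow.trace, mlflow.genai.evaluate, & built-in judge scorers such as RelevanceToQuery & Safety). The agent body & dataset are placeholders, & the built-in judges need a configured judge LLM to actually score anything, so treat this as illustrative rather than a drop-in recipe.

# Minimal sketch: trace an agent step & run built-in MLflow judges over a tiny dataset.
# Assumes MLflow 3.x; the agent logic & the data below are placeholders.
import mlflow
from mlflow.genai.scorers import RelevanceToQuery, Safety  # built-in LLM judges

@mlflow.trace(span_type="AGENT")  # records inputs, outputs, & latency as a structured trace
def answer(question: str) -> str:
    # ... call your retriever, tools, & LLM here ...
    return "placeholder answer for: " + question

eval_data = [
    {"inputs": {"question": "What does MLflow tracing capture?"}},
    {"inputs": {"question": "How do I version prompts?"}},
]

# Runs each row through the agent & asks the judge scorers to grade the outputs;
# per-row scores & traces can then be compared in the MLflow UI.
results = mlflow.genai.evaluate(
    data=eval_data,
    predict_fn=answer,
    scorers=[RelevanceToQuery(), Safety()],
)
print(results.metrics)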
Evaluating AI in Production: A Practical Guide
Evaluations are essential for shipping reliable AI products, but many teams struggle to move beyond manual testing. In this talk, I'll walk through how to build a production-ready evaluation framework: from choosing the right metrics & creating effective test cases to setting up continuous evaluation pipelines that catch issues before your users do. You'll walk away with practical patterns you can apply right away.
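For a flavor of what a "catch issues before your users do" gate can look like, here is an illustrative, framework-agnostic sketch in plain Python. run_agent, the keyword checks, & the 90% pass threshold are all hypothetical stand-ins for your own agent call, metrics, & quality bar, not the speaker's actual setup.

# Illustrative only: a tiny CI-style evaluation gate in plain Python.
import sys

TEST_CASES = [
    {"input": "What is your refund policy for damaged items?", "must_contain": "refund"},
    {"input": "How do I reset my password?", "must_contain": "reset"},
]

def run_agent(prompt: str) -> str:
    # Hypothetical stand-in; replace with your real agent / LLM call.
    return f"stub answer about: {prompt.lower()}"

def passes(case: dict) -> bool:
    # Cheap deterministic check; swap in LLM judges or embedding metrics as needed.
    return case["must_contain"].lower() in run_agent(case["input"]).lower()

if __name__ == "__main__":
    results = [passes(c) for c in TEST_CASES]
    pass_rate = sum(results) / len(results)
    print(f"pass rate: {pass_rate:.0%}")
    # Fail the pipeline (e.g. a pre-deploy CI job) if quality drops below the bar.
    sys.exit(0 if pass_rate >= 0.9 else 1)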