High-Performance Inference For Open LLMs
With Charles Frye (Developer Advocate, Modal), Qiaolin Yu (Performance Optimization, SGLang), and Nishant Agrawal (Solutions Architect, Alibaba Cloud).
Venue, 375 Alabama St, San Francisco
Jan 29 (Thu), 2026 @ 5:30 PM
FREE
DETAILS
When do open models & inference engines beat proprietary solutions?
Join Modal, Qwen, & SGLang for an evening on optimizing performance & cost for LLM inference. Our speakers will cover:
- The Cold Start Issue - how can AI infrastructure enable seamless AI experiences, right from the start?
- Accelerating Open Models - how do inference engines work with model developers to achieve benchmarking goals?
- Choosing the Best Model - how should developers choose the most effective model for their use case?
With:
- Charles Frye, GPU Enjoyer & Developer Advocate at Modal
- Qiaolin Yu, Performance Optimization at SGLang
- Nishant Agrawal, Senior Solutions Architect at Alibaba Cloud
Agenda
We're excited to bring together founders, AI engineers, & ML systems researchers for an evening with:
- Demos & Lightning Talks
- Community, Pizza, Drinks
Your hosts,
Modal, Qwen & SGLang