
Speakers


Rebekah Westerlind

Software Engineer
Snorkel AI

Rebekah Westerlind is a full-stack software engineer on Snorkel AI's product engineering team. She graduated from Cornell University in 2022 with degrees in Computer Science and Operations Research & Information Engineering. Driven by a desire to keep learning, Rebekah loves jumping into new projects and surrounding herself with experts.


Venkatesh Rao

Staff Product Manager
Snorkel AI

Venkatesh Rao is a Staff Product Manager at Snorkel AI with a strong background in tech product leadership. Previously, he was a Director of Product Management at Microsoft and a Senior Product Manager at NVIDIA. He holds an MBA from UC Berkeley and specializes in driving innovation and delivering impactful AI solutions.

Evaluating LLM Systems

LLM evaluation is critical for deploying generative AI in the enterprise, but measuring how well an LLM answers questions or performs tasks is difficult. Effective LLM evaluations must therefore go beyond standard measures of “correctness” to capture a more nuanced and granular view of quality.

In practice, common approaches to enterprise LLM evaluation (e.g., open-source benchmarks) often come up short: they are slow, expensive, subjective, and incomplete, leaving AI initiatives blocked with no clear path to production quality.

In this session, Venkatesh Rao, Staff Product Manager at Snorkel AI, and Rebekah Westerlind, Software Engineer at Snorkel AI, will discuss the importance of LLM evaluation, highlight common challenges and approaches, and explain the core concepts behind Snorkel AI’s approach to data-centric LLM evaluation.

Join us to learn how to:

  • Understand the nuances of LLM evaluation
  • Evaluate LLM response performance at scale
  • Identify where additional LLM fine-tuning is needed