
Reading group

The Snorkel AI Reading Group (SARG) is a recurring forum for researchers and practitioners to explore the latest frontier developments in AI while building meaningful connections within the community.

We’ll dive into the most talked-about research in benchmarking and evaluation, pressure-test the ideas, share our point of view, and bring in the authors themselves for open discussion.

Past events

Past

Olmix: A Framework for Data Mixing Throughout LM Development

with Mayee Chen, PhD Candidate, Stanford AI Lab; Founding Scientist, Stealth Startup

↳ Watch recap

Past

Improving LLM Agents with Code World Models & AutoHarness

with Carter Wendelken, Google DeepMind

↳ Watch recap

Past

Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration

with Yijia Shao, PhD Candidate, Stanford NLP

↳ Watch recap

Past

JUDGMENTBENCH: Comparing Rubric and Preference Evaluation for Quality Assessment

with Russell Yang, AI Engineering Fellow, Stanford Law School

↳ Watch recap

Past

Agents’ Last Exam

with Yiyou Sun, Postdoctoral Researcher, UC Berkeley

with Xinyang Han, PhD Student, UC Berkeley

↳ Watch recap

Past

Senior SWE-Bench

with Henry Ehrenberg, Lead Researcher, Senior SWE-Bench