Snorkel at AI Engineer World's Fair
Join Snorkel and thousands of your peers for 800+ sessions, keynotes and training at the world’s largest data, analytics and AI conference.
June 29-July 2, 2026
Booth L-G12
Featured session
Towards Reliable Financial Agents: How a 4B Model Outsmarted a 235B Giant
June 30, 2026 · 3:45-4:05pm
Expo Stage 3
Bigger models often reason better, but they don’t always behave better – especially with tools. This talk shows how a 4B model was fine-tuned to outperform a 235B model on financial analysis tasks by learning strong tool discipline with reinforcement learning, demonstrating that better behavior – not bigger models – can drive stronger real-world results.


Charlie Dickens
Senior Applied Research Scientist
Featured session
From Agent Traces to Agent Simulations: The Next Era of Agent Evaluation
July 1, 2026 · 12:05-12:25pm
Evals / Room 2005
This talk explores how executable simulation environments let teams repeatedly test agents across realistic tasks, compare models and harnesses, and uncover failure modes that trace review alone misses. Drawing from Snorkel's experience building simulation datasets at scale for major labs and contributions to projects like Agents' Last Exam and Terminal-Bench, we'll cover concrete engineering patterns for building these environments.


Speaker
Rustem Feyzkhanov
Senior Engineering Manager, AI Platform
Accepted paper
Benchmarking Agents in Insurance Underwriting Environments
UNDERWRITE is an expert-first benchmark for evaluating AI agents in insurance underwriting, built in close collaboration with domain practitioners to capture enterprise-realistic complexity: proprietary business knowledge, noisy tool interfaces, and imperfect data. It fills the gap left by open-domain benchmarks that overemphasize code and narrow accuracy metrics.
Amanda Dsouza, Ramya Ramakrishnan, Charles Dickens, Bhavishya Pohani, Christopher M Glaze









