Snorkel at AI Engineer World's Fair

Join Snorkel and thousands of your peers for 800+ sessions, keynotes and training at the world’s largest data, analytics and AI conference.

San Francisco, CA

June 29-July 2, 2026

Booth L-G12

Accepted paper

Benchmarking Agents in Insurance Underwriting Environments

UNDERWRITE is an expert-first benchmark for evaluating AI agents in insurance underwriting, built in close collaboration with domain practitioners to capture enterprise-realistic complexity: proprietary business knowledge, noisy tool interfaces, and imperfect data. It fills the gap left by open-domain benchmarks that overemphasize code and narrow accuracy metrics.

Amanda Dsouza, Ramya Ramakrishnan, Charles Dickens, Bhavishya Pohani, Christopher M Glaze

Featured session

Towards Reliable Financial Agents: How a 4B Model Outsmarted a 235B Giant

Bigger models often reason better, but they don’t always behave better – especially with tools. This talk shows how a 4B model was fine-tuned to outperform a 235B model on financial analysis tasks by learning strong tool discipline with reinforcement learning, demonstrating that better behavior – not bigger models – can drive stronger real-world results.