author

Ramya Ramakrishnan

Applied Research Scientist, Snorkel AI

The latest from Ramya

Blog

Coding agents don’t need to be perfect, they need to recover

Error analysis of 8 models on Agentic Coding tasks

Successful completion of complex tasks doesn’t come from models always being right. It comes from models being resilient when things go wrong. To get a deeper understanding of model behavior in agentic environments, our team analyzed all of the errors found in the full traces of tasks from our Agentic Coding…

Feb 13, 2026 • Ramya Ramakrishnan

Benchmarking Agents in Insurance Underwriting Environments

As AI agents integrate into enterprise applications, their evaluation demands benchmarks that reflect the complexity of real-world operations. However, existing benchmarks overemphasize open domains such as code, use narrow accuracy metrics, and lack authentic complexity. We present UNDERWRITE, an expert-first, multi-turn insurance underwriting benchmark designed in close collaboration with domain experts to capture real-world enterprise challenges. UNDERWRITE introduces critical realism factors often absent in current benchmarks: proprietary business knowledge, noisy tool interfaces, and imperfect simulated users requiring careful information gathering. Evaluating 13 frontier models, we uncover significant gaps between research lab performance and enterprise readiness: the most accurate models are not...

Research Paper

Accepted to CAIS 2026

Jan 31, 2026 • Snorkel Team
