In this post, we unpack how Snorkel built a realistic benchmark dataset to evaluate AI agents in commercial insurance underwriting. From expert-driven data design to multi-tool reasoning tasks, see how our approach surfaces actionable failure modes that generic benchmarks miss—revealing what it really takes to deploy AI in enterprise workflows.
AI alignment is the practice of ensuring that AI systems adhere to human values, ethics, and policies. Here’s a primer on how developers can build safer AI.
Snorkel takes a step toward enterprise superalignment with new data development workflows for enterprise alignment.
Humans learn tasks better when taught in a logical order. So do LLMs. Researchers developed “Skill-it!”, a method that exploits this tendency.
The surest way to improve foundation models is with more and better data, but Snorkel researchers showed that FMs can also learn from themselves.
Getting better performance from foundation models (with less data)