Snorkel helps build Terminal-Bench 2.0.
From cutting-edge research to enterprise and frontier impact
Our research team advances the science of data-centric AI in partnership with leading enterprises and frontier labs. We translate these breakthroughs into production, powering the next generation of AI systems across industries, research, and government.
Deep research roots
Born out of the Stanford AI Lab in 2019 and in collaboration with leading research institutions, Snorkel-affiliated researchers have published more than 170 peer-reviewed research papers on weak supervision, AI data development techniques, foundation models, and more, with special recognition at events such as NeurIPS, ICML, and ICLR. Our researchers are closely affiliated with academic institutions including Stanford University, the University of Washington, Brown University, and the University of Wisconsin-Madison.
Featured Benchmarks
Exclusive to Snorkel, these benchmarks are meticulously designed and validated by subject matter experts to probe frontier AI models on demanding, specialized tasks.
These are just a few of our featured benchmarks — new ones are added regularly, so check back often to see the latest from our research team.
SnorkelUnderwrite
An expert-verified frontier benchmark with multi-turn conversations, focused on agentic reasoning and tool use in commercial underwriting settings.
Finance Reasoning
A benchmark co-created with Snorkel's financial expert network to test agents on financial reasoning questions through tool calling and planning.
SnorkelSequences
A procedurally generated and expert-verified benchmark for evaluating mathematical reasoning and compositional capabilities in LLMs.
Browse blog posts and 100+ peer-reviewed academic papers
Blog
Parsing Isn’t Neutral: Why Evaluation Choices Matter
Research Paper
Shrinking the Generation-Verification Gap with Weak Verifiers
Blog
Data quality and rubrics: how to build trust in your models
Research Paper
Theoretical Physics Benchmark (TPBench): a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Research Paper
WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks
Research Paper
The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators
Research Paper
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
Research Paper
Weak-to-Strong Generalization Through the Data-Centric Lens
Research Paper
Systems and Methods for Programmatic Labeling of Training Data for Machine Learning Models via Clustering and Language Model Prompting
Research Paper
Scalable Approach to Medical Wearable Post-Market Surveillance
Research Paper
Zero-Shot Robustification of Zero-Shot Models with Foundation Models
Join the Snorkel Research Team
Join our team of leading researchers and help shape the future of AI.