From cutting-edge research to enterprise and frontier impact

Our research team advances the science of data-centric AI in partnership with leading enterprises and frontier labs. We translate these breakthroughs into production, powering the next generation of AI systems across industries, research, and government.

Deep research roots

Snorkel was born out of the Stanford AI Lab in 2019. In collaboration with leading research institutions, Snorkel-affiliated researchers have published more than 170 peer-reviewed research papers on weak supervision, AI data development techniques, foundation models, and more, with special recognition at events such as NeurIPS, ICML, and ICLR. Our researchers are closely affiliated with academic institutions including Stanford University, the University of Washington, Brown University, and the University of Wisconsin-Madison.

Browse blog posts and 100+ peer-reviewed academic papers

Blog

Parsing Isn’t Neutral: Why Evaluation Choices Matter

Blog

The science of rubric design

Research Paper

Shrinking the Generation-Verification Gap with Weak Verifiers

Blog

Data quality and rubrics: how to build trust in your models

Research Paper

Theoretical Physics Benchmark (TPBench) - a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics

Research Paper

WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks

Research Paper

The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators

Research Paper

Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models

Research Paper

Weak-to-Strong Generalization Through the Data-Centric Lens

Research Paper

Systems and Methods for Programmatic Labeling of Training Data for Machine Learning Models via Clustering and Language Model Prompting

Research Paper

Scalable Approach to Medical Wearable Post-Market Surveillance

Research Paper

Zero-Shot Robustification of Zero-Shot Models with Foundation Models


Join the Snorkel Research Team

Join our team of leading researchers and help shape the future of AI.
View all careers