We develop methods, benchmarks, and training systems that turn expert data into frontier AI
from the lab
Featured research
key research areas
Vision and impact
We help labs advance frontier models by working with domain experts to design and build complex, realistic datasets that drive model performance.
Benchmarking & Evaluation
Build benchmarks that define and advance the AI frontier
Scaling Subject Matter Expertise
Define how subject matter experts encode their knowledge into data
RL, Training, & Data Valuation
Drive dataset development based on feedback from RL and model training
initiatives
Community and open science
Open benchmarks, conversations, and research for real-world AI performance.

Open Benchmarks Grants
Backed by a $3M commitment, the program funds open-source datasets, benchmarks, and evaluation artifacts that shape how frontier AI systems are built and evaluated.

Benchtalks
Our podcast series at the intersection of AI evaluation, data quality, and real-world impact.

Reading Group
A recurring forum for researchers and practitioners to explore the latest frontier developments in AI while building meaningful connections within the community.
Deep research expertise
Technical advisors and distinguished affiliates
PUBLICATIONS
Browse research blogs and academic papers
Data Programming With DDLite: Putting Humans in a Different Part of the Loop
Introducing DDLite, an interactive development framework for data programming.
Research Paper


Let’s research together
Join our team of leading researchers and help shape the future of AI.








