Alex Ratner

Co-Founder & CEO, Snorkel AI
Faculty, University of Washington

Alex Ratner is the co-founder and CEO at Snorkel AI, and an affiliate assistant professor of computer science at the University of Washington. Prior to Snorkel AI and UW, he completed his Ph.D. in computer science advised by Christopher Ré at Stanford, where he started and led the Snorkel open source project. His research focused on data-centric AI, applying data management and statistical learning techniques to AI data development and curation.

The latest from Alex

Data-centric development of an enterprise AI agent with Snorkel
Blog

See how we can use these two new products, Snorkel Evaluate and Expert Data-as-a-Service, to evaluate and develop a specialized agentic AI system for an enterprise use case.

May 29, 2025
Building the data development platform for specialized AI
Blog

Announcing two new products on our AI Data Development Platform that together create a complete solution for enterprises to specialize AI systems with expert data at scale.

May 29, 2025
On the Tradeoff of Intra-/Inter-class Diversity for Supervised Pre-training
Pre-training datasets are critical for building state-of-the-art machine learning models, motivating rigorous study on their impact on downstream tasks. In this work, we study the impact of the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset. Empirically, we found that with the size of the pre-training dataset fixed, the best downstream performance comes with a balance on the intra-/inter-class diversity. To understand the underlying mechanism, we show theoretically that the downstream performance depends monotonically on both types of diversity. Notably, our theory reveals that...
Research Paper

Sep 18, 2024
J. Zhang et al.
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization
Large language models (LLMs), even when specifically trained to process long input contexts, struggle to capture relevant information located in the middle of their input. This phenomenon has been known as the lost-in-the-middle problem. In this work, we make three contributions. First, we set out to understand the factors that cause this phenomenon. In doing so, we establish a connection between lost-in-the-middle and LLMs’ intrinsic attention bias: LLMs exhibit a U-shaped attention bias where the tokens at the beginning and at the end of the input receive higher attention, regardless of their relevance. Second, we mitigate this positional bias through...
Research Paper

Sep 18, 2024
C. Hsieh et al.
Walking safely before building flying saucer seatbelts: introducing Enterprise Alignment
Blog

Snorkel takes a step on the path to enterprise superalignment with new data development workflows for enterprise alignment.

Crossing the demo-to-production chasm with Snorkel Custom
Blog

We’re excited to announce Snorkel Custom to help enterprises cross the chasm from flashy chatbot demos to real production AI value.

Apr 11, 2024
Enterprises must shift their focus from models to data in AI development
Blog

Snorkel AI CEO Alex Ratner explains his view on the importance of AI in data development and illustrates his position with two case studies.

Feb 09, 2024
Characterizing the Impacts of Semi-supervised Learning for Weak Supervision
Labeling training data is a critical and expensive step in producing high-accuracy ML models, whether training from scratch or fine-tuning. To make labeling more efficient, two major approaches are programmatic weak supervision (WS) and semi-supervised learning (SSL). More recent works have either explicitly or implicitly used techniques at their intersection, but in various complex and ad hoc ways. In this work, we define a simple, modular design space to study the use of SSL techniques for WS more systematically. Surprisingly, we find that fairly simple methods from our design space match the performance of more complex state-of-the-art methods, averaging...
Research Paper

Jan 16, 2024
Jeffrey Li, Jieyu Zhang, Ludwig Schmidt & Alexander Ratner
Tool documentation enables zero-shot tool-usage with large language models
Today, large language models (LLMs) are taught to use new tools by providing a few demonstrations of the tool’s usage. Unfortunately, demonstrations are hard to acquire, and can result in undesirable biased usage if the wrong demonstration is chosen. Even in the rare scenario that demonstrations are readily available, there is no principled selection protocol to determine how many and which ones to provide. As tasks grow more complex, the selection search grows combinatorially and invariably becomes intractable. Our work provides an alternative to demonstrations: tool documentation. We advocate the use of tool documentation—descriptions for the individual tool usage—over demonstrations....
Research Paper

Oct 20, 2023
C.-Y. Hsieh et al.
