Chris Ré

Co-Founder, Snorkel AI
Professor @ Stanford University

Christopher (Chris) Ré is a professor in the department of computer science at Stanford University. He is in the Stanford AI Lab and is affiliated with the Statistical Machine Learning Group. His recent work is to understand how software and hardware systems will change as a result of machine learning along with a continuing, petulant drive to work on math problems. Research from his group has been incorporated into scientific and humanitarian efforts, such as the fight against human trafficking, along with widely used products from technology and enterprise companies including Google Ads, Gmail, YouTube, and Apple.

He has co-founded four companies based on his research into machine learning systems: SambaNova and Snorkel, along with two companies that are now part of Apple, Lattice (DeepDive), acquired in 2017, and Inductiv (HoloClean), acquired in 2020.

His research contributions have spanned database theory, database systems, and machine learning. His work has won the best paper or test-of-time awards at the premier venues in each area. He still can’t believe he won the MacArthur Foundation Fellowship.

The latest from Chris

Inferring Generative Model Structure With Static Analysis
Obtaining enough labeled data to robustly train complex discriminative models is a major bottleneck in the machine learning pipeline. A popular solution is combining multiple sources of weak supervision using generative models. The structure of these models affects the quality of the training labels, but is difficult to learn without any ground truth labels. We instead rely on weak supervision sources having some structure by virtue of being encoded programmatically. We present Coral, a paradigm that infers generative model structure by statically analyzing the code for these heuristics, thus significantly reducing the amount of data required to learn structure. We...
Research Paper
Dec 17, 2017
P. Varma, et al, 2017
SwellShark: A Generative Model for Biomedical Named Entity Recognition Without Labeled Data
Introducing SwellShark, a framework for building biomedical named entity recognition (NER) systems quickly.
Research Paper
Nov 13, 2017
J. Fries, et al, 2017
Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data
A challenge in training discriminative models like neural networks is obtaining enough labeled training data. Recent approaches use generative models to combine weak supervision sources, like user-defined heuristics or knowledge bases, to label training data. Prior work has explored learning accuracies for these sources even without ground truth labels, but they assume that a single accuracy parameter is sufficient to model the behavior of these sources over the entire training set. In particular, they fail to model latent subsets in the training data in which the supervision sources perform differently than on average. We present Socratic learning, a paradigm that...
Research Paper
Nov 13, 2017
P. Varma, et al, 2017
Snorkel: Rapid Training Data Creation With Weak Supervision
This paper presents a flexible interface layer to write labeling functions based on experience.
Research Paper
Oct 04, 2017
Alexander Ratner, Stephen H Bach, Henry Ehrenberg, Jason Fries, Sen Wu, Christopher Ré
Data Programming: Creating Large Training Sets, Quickly
A paradigm for labeling training datasets programmatically rather than by hand.
Research Paper
Dec 20, 2016
A. Ratner, et al, 2016
Data Programming With DDLite: Putting Humans in a Different Part of the Loop
Introducing DDLite, an interactive development framework for data programming.
Research Paper
Dec 19, 2016
H. Ehrenberg, et al, 2016