

Braden is a co-founder and Head of Technology at Snorkel AI. Before Snorkel, Braden spent four years developing new programmatic approaches for efficiently labeling, augmenting, and structuring training data with the Stanford AI Lab, Facebook, and Google. Prior to that, he performed NLP and ML research at Johns Hopkins University and MIT Lincoln Laboratory and earned a B.S. in Mechanical Engineering from Brigham Young University.
The latest from Braden
This work demonstrates how organizational resources, in the form of aggregate statistics, knowledge bases, and existing services, can be used to connect new and existing data modalities.
Proposing a framework for integrating and modeling such weak supervision sources by viewing them as labeling different related sub-tasks of a problem, which we refer to as the multi-task weak supervision setting.
Outlining a vision for a Software 2.0 lifecycle centered around the idea that labeling training data can be the primary interface to Software 2.0 systems.
This is a first-of-its-kind study showing how existing knowledge resources from across an organization can be used as weak supervision to bring development time and cost down by an order of magnitude. It also introduces Snorkel DryBell, a new weak supervision management system for this setting.
Describing GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms.
Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount…
Presenting Snorkel MeTaL, an end-to-end system for multi-task learning.
Introducing Fonduer, a machine-learning-based KBC system for richly formatted data.



