Snorkel AI is a Gartner Cool Vendor for Data-Centric AI.
Proposing Osprey, a weak-supervision system suited for highly imbalanced data, built on top of the Snorkel framework.
Proposing Dugong, the first framework to model multi-resolution weak supervision sources with complex correlations to assign probabilistic labels to training data.
Showcasing state-of-the-art deep learning methods that identify patient outcomes from clinical notes without requiring hand-labeled training data.
This work presents a robust PCA-based algorithm for learning the dependency structures among weak supervision sources, establishes improved theoretical recovery rates, and outperforms existing methods on various real-world tasks.
Demonstrating in synthetic and real-world experiments how two simple labeling function acquisition strategies outperform a random baseline.
This paper presents a framework called search, label, and propagate (SLP) for bootstrapping intents from existing chat logs using weak supervision.
Describing GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms.
This work develops a rule-based NLP algorithm to automatically generate labels for the training data, and then uses pre-trained word embeddings as deep representation features for training machine learning models.