Proposing Dugong, the first framework to model multi-resolution weak supervision sources with complex correlations to assign probabilistic labels to training data.
Showcasing state-of-the-art deep learning methods that identify patient outcomes from clinical notes without requiring hand-labeled training data.
This work presents a robust PCA-based algorithm for learning these dependency structures, establishes improved theoretical recovery rates, and outperforms existing methods on various real-world tasks.
Demonstrating in synthetic and real-world experiments how two simple labeling function acquisition strategies outperform a random baseline.
This paper presents a framework called search, label, and propagate (SLP) for bootstrapping intents from existing chat logs using weak supervision.
Describing GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms.
This work develops a rule-based NLP algorithm to automatically generate labels for the training data, and then uses pre-trained word embeddings as deep representation features for training machine learning models.
Introducing BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision.