Resource library
Explore our complete library of resources including blogs, benchmarks, research papers, and more.
Demonstrating in synthetic and real-world experiments how two simple labeling function acquisition strategies outperform a random baseline.
This paper presents a framework called search, label, and propagate (SLP) for bootstrapping intents from existing chat logs using weak supervision.
Describing GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms.
This work develops a rule-based NLP algorithm that automatically generates labels for the training data, and then uses pre-trained word embeddings as deep representation features for training machine learning models.
Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount…
This paper describes Snorkel, a system that enables users to help shape, create, and manage training data for Software 2.0 stacks.
Presenting Snorkel MeTaL, an end-to-end system for multi-task learning.
Introducing Fonduer, a machine-learning-based knowledge base construction (KBC) system for richly formatted data.
This paper showcases methods for unsupervised mining of fashion attributes from Instagram text, which can enable a new kind of user recommendation in the fashion domain.