This work demonstrates how organizational resources, in the form of aggregate statistics, knowledge bases, and existing services, can be used to connect new and existing data modalities.
Outlining a vision for a Software 2.0 lifecycle centered around the idea that labeling training data can be the primary interface to Software 2.0 systems.
This paper introduces a semi-supervised method that assigns probabilistic relationship labels to a large number of unlabeled images using only a few labeled examples.
Describing GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms.
Introducing BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision.
This paper describes Snorkel, a system that enables users to help shape, create, and manage training data for Software 2.0 stacks.
Introducing Fonduer, a machine-learning-based KBC system for richly formatted data.
Introducing Socratic learning, a paradigm that uses feedback from a discriminative model to automatically identify latent data subsets in training data.
Introducing DDLite, an interactive development framework for data programming.