Training classifiers with natural language explanations

Team Snorkel

Published: May 24, 2021

Updated: September 16, 2024

Machine Learning Whiteboard (MLW) Open-source Series

Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We keep the environment informal and open to everyone interested in learning about machine learning.

In this episode, our Co-founder and Head of Technology, Braden Hancock, speaks about “Training Classifiers with Natural Language Explanations,” a research paper by Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, and Christopher Ré, presented at ACL 2018.

This episode is part of the #MLwhiteboard video series hosted by the Snorkel AI team.

Abstract:

Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores from 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
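The pipeline the abstract describes (explanation, then labeling function, then noisy labels) can be illustrated with a small Python sketch. This is not BabbleLabble's code or API. The two explanations, the function names, the label convention, and the example sentences are all hypothetical, and the labeling functions are written by hand to stand in for what a semantic parser would generate. A plain majority vote stands in for the label aggregation step, and no classifier is trained here.

```python
import re
from collections import Counter

# Label convention assumed for this sketch: 1 = true, -1 = false, 0 = abstain.
POSITIVE, NEGATIVE, ABSTAIN = 1, -1, 0


# Hypothetical explanation: "True, because the words 'his wife' are right before person 2."
def lf_his_wife(example):
    text, person2 = example["text"], example["person2"]
    idx = text.find(person2)
    if idx == -1:
        return ABSTAIN
    left_context = text[:idx].rstrip()
    return POSITIVE if left_context.endswith("his wife") else ABSTAIN


# Hypothetical explanation: "False, because the word 'brother' is in the sentence."
def lf_brother(example):
    return NEGATIVE if re.search(r"\bbrother\b", example["text"].lower()) else ABSTAIN


def majority_vote(votes):
    """Combine labeling-function votes, abstaining on ties or when nobody votes."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    (top_label, top_count), *rest = counts.most_common()
    if rest and rest[0][1] == top_count:
        return ABSTAIN
    return top_label


labeling_functions = [lf_his_wife, lf_brother]

# Made-up unlabeled candidates for a spouse relation task.
unlabeled = [
    {"text": "Tom arrived with his wife Ann.", "person1": "Tom", "person2": "Ann"},
    {"text": "Bob called his brother Joe.", "person1": "Bob", "person2": "Joe"},
    {"text": "Eve met Dan at work.", "person1": "Eve", "person2": "Dan"},
]

noisy_labels = [
    majority_vote([lf(example) for lf in labeling_functions])
    for example in unlabeled
]
print(noisy_labels)
```

Each explanation is written once but labels every unlabeled example it matches, which is the source of the efficiency gain the abstract reports. In a fuller version, the non-abstain labels would become training data for a downstream classifier.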

If you are interested in learning with us, consider joining us at our biweekly ML whiteboard. If you’re interested in staying in touch with Snorkel AI, follow us on Twitter, LinkedIn, Facebook, YouTube, or Instagram, and if you’re interested in joining the Snorkel team, we’re hiring! Please apply on our careers page.
