Training classifiers with natural language explanations

Team Snorkel

Published: May 24, 2021

Updated: September 16, 2024

Machine Learning Whiteboard (MLW) Open-source Series

Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We aim for an informal environment that is open to everyone interested in learning about machine learning.

In this episode, our Co-founder and Head of Technology, Braden Hancock, speaks about “Training Classifiers with Natural Language Explanations,” a research paper by Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, and Christopher Ré, presented at ACL 2018.

This episode is part of the #MLwhiteboard video series hosted by the Snorkel AI team. Check out the episode here:

Abstract:

Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores from 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
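
To make the pipeline in the abstract concrete, here is a minimal Python sketch of the idea: explanations are parsed into labeling functions, and those functions vote on unlabeled sentences. It is not BabbleLabble's code or API. The single explanation template, the names `parse_explanation`, `majority_vote`, and `label_corpus`, and the example sentences are all hypothetical, and a plain majority vote stands in for the more careful label aggregation used in the paper.

```python
import re
from collections import Counter
from typing import Callable, List, Optional

# Label values are an assumption for this sketch, not taken from the paper.
POSITIVE, NEGATIVE, ABSTAIN = 1, 0, -1

LabelingFunction = Callable[[str], int]

# Hypothetical one-template "grammar". A real semantic parser covers far
# more phrasings; this only illustrates the explanation -> function step.
_TEMPLATE = re.compile(
    r"^(true|false), because the phrase '([^']+)' appears in the sentence$",
    re.IGNORECASE,
)


def parse_explanation(explanation: str) -> Optional[LabelingFunction]:
    """Turn one explanation into a labeling function, or None if unparseable."""
    match = _TEMPLATE.match(explanation.strip())
    if match is None:
        return None
    label = POSITIVE if match.group(1).lower() == "true" else NEGATIVE
    phrase = match.group(2).lower()

    def labeling_function(sentence: str) -> int:
        # Vote only when the phrase is present; otherwise abstain.
        return label if phrase in sentence.lower() else ABSTAIN

    return labeling_function


def majority_vote(votes: List[int]) -> int:
    """Simplified aggregation: most common non-abstain vote, abstain on ties."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    (top, top_n), *rest = counts.most_common()
    if rest and rest[0][1] == top_n:
        return ABSTAIN
    return top


def label_corpus(explanations: List[str], sentences: List[str]) -> List[int]:
    """Apply every parseable explanation to every unlabeled sentence."""
    lfs = [lf for lf in map(parse_explanation, explanations) if lf is not None]
    return [majority_vote([lf(s) for lf in lfs]) for s in sentences]


if __name__ == "__main__":
    explanations = [
        "True, because the phrase 'his wife' appears in the sentence",
        "False, because the phrase 'his sister' appears in the sentence",
        "this explanation does not match the template",
    ]
    sentences = [
        "Tom and his wife Ann arrived.",
        "Tom and his sister Ann arrived.",
        "Tom met Ann.",
    ]
    print(label_corpus(explanations, sentences))
```

The resulting noisy labels would then serve as training data for an ordinary classifier. Explanations that fail to parse are simply dropped, and sentences that no function fires on stay unlabeled.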

If you are interested in learning with us, consider joining us at our biweekly ML whiteboard.

To stay in touch with Snorkel AI, follow us on Twitter, LinkedIn, Facebook, YouTube, or Instagram. If you’re interested in joining the Snorkel team, we’re hiring! Please apply on our careers page.
