Epoxy: Using Semi-Supervised Learning to Augment Weak Supervision

December 16, 2021

Machine Learning Whiteboard (MLW) Open-source Series

We launched the machine learning whiteboard (MLW) series earlier this year as an open-invitation forum to brainstorm ideas and discuss the latest papers, techniques, and workflows in artificial intelligence. Everyone interested in machine learning can participate in an informal and open environment. If you are interested in learning about ML, we encourage you to join us at our next ML whiteboard.

In this episode, Humza Iqbal, a machine learning research engineer with our research team, talks about “Train and You’ll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings,” by Mayee F. Chen, Daniel Y. Fu, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, and Christopher Ré. The reproducible code for Epoxy can be found on GitHub. This episode is part of the #MLwhiteboard video series hosted by Snorkel AI. Check out the episode here:


With Epoxy, you can train models at programmatically interactive speeds (less than half a second) while retaining the performance of training deep networks. The GitHub repository contains a simple proof-of-concept implementation of Epoxy, approximately 100 lines long including documentation.

Using weak supervision, users can write noisy labeling functions to generate labels for their data. These labeling functions tend to be high in accuracy but low in coverage, since each one votes on only a subset of points. Until recently, the only ways to fill the gap were to write more labeling functions (which can be challenging as you start dealing with the long tail) or to use the labeling functions to train an end model (see, for example, FlyingSquid for more information).

Epoxy uses pre-trained embeddings to obtain some of the benefits of training an end model without actually training one. Using the embeddings and nearest-neighbor search, we build extended labeling functions, which have better coverage, and aggregate them with FlyingSquid. With this, you gain some of the benefits of deep learning at a fraction of the cost. You can also use Epoxy to generate labels to train a downstream model if you have time to train a deep network.
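To make the extension step concrete, here is a minimal NumPy sketch of the idea, not the code from the Epoxy repository. It assumes binary votes in {-1, +1} with 0 meaning "abstain", nonzero embedding vectors, and a per-function cosine-similarity threshold. The function names (`extend_lf_votes`, `majority_vote`), the toy data, and the threshold values are all hypothetical. A plain majority vote stands in for FlyingSquid's aggregation so that the example stays self-contained.

```python
import numpy as np


def extend_lf_votes(votes, embeddings, thresholds):
    """Copy each labeling function's votes to nearby abstained points.

    votes:      (n_points, n_lfs) int array in {-1, 0, +1}; 0 means abstain.
    embeddings: (n_points, dim) array of pre-trained embeddings (nonzero rows).
    thresholds: (n_lfs,) minimum cosine similarity required to copy a vote.
    """
    # Normalize rows so that dot products are cosine similarities.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    extended = votes.copy()
    for j in range(votes.shape[1]):
        voted = np.flatnonzero(votes[:, j] != 0)
        abstained = np.flatnonzero(votes[:, j] == 0)
        if voted.size == 0 or abstained.size == 0:
            continue
        # Similarity of every abstained point to every point the function voted on.
        sims = unit[abstained] @ unit[voted].T
        nearest = sims.argmax(axis=1)
        best = sims[np.arange(abstained.size), nearest]
        # Only extend the vote when the nearest voted point is close enough.
        close = best >= thresholds[j]
        extended[abstained[close], j] = votes[voted[nearest[close]], j]
    return extended


def majority_vote(votes):
    """Stand-in aggregator (assumption): sign of the summed votes, 0 on ties."""
    return np.sign(votes.sum(axis=1)).astype(int)


# Toy data: five points in a 2-D "embedding" space, two labeling functions.
embeddings = np.array(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0]]
)
votes = np.array([[1, 0], [0, 0], [0, -1], [0, 0], [0, 0]])
thresholds = np.array([0.9, 0.9])

extended = extend_lf_votes(votes, embeddings, thresholds)
labels = majority_vote(extended)
```

In this toy run, each labeling function originally votes on one point out of five. After extension, each also covers the one neighbor whose embedding is nearly parallel to a voted point, while the far-away fifth point stays unlabeled. The threshold controls the trade-off: a lower value raises coverage but risks copying votes to points with a different true label.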

Abstract:

Our goal is to enable machine learning systems to be trained interactively. This requires models that perform well and train quickly, without large amounts of hand-labeled data. We take a step forward in this direction by borrowing from weak supervision (WS), wherein models can be trained with noisy sources of signal instead of hand-labeled data. But WS relies on training downstream deep networks to extrapolate to unseen data points, which can take hours or days. Pre-trained embeddings can remove this requirement. We do not use the embeddings as features as in transfer learning (TL), which requires fine-tuning for high performance, but instead use them to define a distance function on the data and extend WS source votes to nearby points. Theoretically, we provide a series of results studying how performance scales with changes in source coverage, source accuracy, and the Lipschitzness of label distributions in the embedding space and compare this rate to standard WS without extension and TL without fine-tuning. On six benchmark NLP and video tasks, our method outperforms WS without extension by 4.1 points, TL without fine-tuning by 12.8 points, and traditionally-supervised deep networks by 13.1 points, and comes within 0.7 points of state-of-the-art weakly-supervised deep networks, all while training in less than half a second.

Where to connect with Humza: LinkedIn.

If you are interested in learning with us, consider joining us at our biweekly ML whiteboard.

To stay in touch with Snorkel AI, follow us on Twitter, LinkedIn, Facebook, YouTube, or Instagram. If you’re interested in joining the Snorkel team, we’re hiring! Please apply on our careers page.
