Instruction Tuning LLMs with Weak Supervision: A Case Study with RedPajama
In partnership with Together AI, Snorkel researchers recently demonstrated a 24% improvement in response win rate against ChatGPT for the open-source RedPajama chat LLM, achieved by programmatically categorizing, scoring, and filtering the original corpus of prompt/response training examples with less than one day of work.
Join this open discussion with Snorkel AI co-founder and Head of Technology Braden Hancock and Snorkel AI Principal Research Scientist Chris Glaze to learn more about our groundbreaking results. Get a first-hand look at how instruction tuning, combined with careful curation of training data through weak supervision, can improve the performance of open-source LLMs like Llama 2 and RedPajama.
In advance of the webinar, read our blog post for more detail, including the complete results of our research, and bring your questions to ask live during the event.
Presented by

Braden Hancock
Co-founder
Snorkel AI
Braden is a co-founder and Head of Technology at Snorkel AI. Before Snorkel, Braden spent four years developing new programmatic approaches for efficiently labeling, augmenting, and structuring training data with the Stanford AI Lab, Facebook, and Google. Prior to that, he performed NLP and ML research at Johns Hopkins University and MIT Lincoln Laboratory and earned a B.S. in Mechanical Engineering from Brigham Young University.

Chris Glaze
Principal Research Scientist
Snorkel AI
Chris is a Principal Research Scientist at Snorkel AI. He holds a PhD and has a history of developing novel machine learning tools and mathematical models in academia and industry, with accomplishments spanning data mining, experimental research, and applications to digital technologies.