Stephen Bach

Applied Research Scientist, Brown University
Eliot Horowitz Assistant Professor, Computer Science Department

Stephen Bach is the Eliot Horowitz Assistant Professor in the Computer Science Department at Brown University. Previously, he was a visiting scholar at Google, and a postdoctoral scholar in the computer science department at Stanford University advised by Christopher Ré.

He received his Ph.D. in computer science from the University of Maryland, where he was advised by Lise Getoor. His research focuses on weakly supervised, zero-shot, and few-shot machine learning. The goal of his work is to create methods and systems that drive down the labor cost of AI. He was a core contributor to the Snorkel framework, which was recognized with a Best of VLDB 2018 award. He also co-led the team that developed the T0 family of large language models. The team was also one of the proposers of instruction tuning, which is the process of fine-tuning language models with supervised training to follow instructions. Instruction tuning is now a standard part of training large language models. Stephen is also an advisor to Snorkel AI.

The latest from Stephen

An Adaptive Method for Weak Supervision with Drifting Data
We introduce an adaptive method with formal quality guarantees for weak supervision in a non-stationary setting. Our goal is to infer the unknown labels of a sequence of data by using weak supervision sources that provide independent noisy signals of the correct classification for each data point. This setting includes crowdsourcing and programmatic weak supervision. We focus on the non-stationary case, where the accuracy of the weak supervision sources can drift over time, e.g., because of changes in the underlying data distribution. Due to the drift, older data could provide misleading information to infer the label of the current data...
Research Paper
Oct 20, 2023
A. Mazzetto, et al.
Fairness via explanation quality: Evaluating disparities in the quality of post hoc explanations
As post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to ensure that the quality of the resulting explanations is consistently high across all subgroups of a population. For instance, it should not be the case that explanations associated with instances belonging to, e.g., women, are less accurate than those associated with other genders. In this work, we initiate the study of identifying group-based disparities in explanation quality. To this end, we first outline several key properties that contribute to explanation quality, namely fidelity (accuracy), stability, consistency, and sparsity, and discuss why and how...
Research Paper
Oct 20, 2023
J. Dai, et al.
Low-Resource Languages Jailbreak GPT-4
AI safety training and red-teaming of large language models (LLMs) are measures to mitigate the generation of unsafe content. Our work exposes the inherent cross-lingual vulnerability of these safety mechanisms, resulting from the linguistic inequality of safety training data, by successfully circumventing GPT-4’s safeguard through translating unsafe English inputs into low-resource languages. On the AdvBenchmark, GPT-4 engages with the unsafe translated inputs and provides actionable items that can get the users towards their harmful goals 79% of the time, which is on par with or even surpassing state-of-the-art jailbreaking attacks. Other high-/mid-resource languages have significantly lower attack success rate, which...
Research Paper
Oct 20, 2023
ZX. Yong, et al.
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models
Large-scale neural network models combining text and images have made incredible progress in recent years. However, it remains an open question to what extent such models encode compositional representations of the concepts over which they operate, such as correctly identifying red cube by reasoning over the constituents red and cube. In this work, we focus on the ability of a large pretrained vision and language model (CLIP) to encode compositional concepts and to bind variables in a structure-sensitive way (e.g., differentiating cube behind sphere from sphere behind cube). In order to inspect the performance of CLIP, we compare several architectures...
Research Paper
Oct 20, 2023
M. Lewis, et al.
Tasks Algorithmically Given Labels Established via Transferred Symbols (TAGLETS)
We conducted research to reduce the amount of labeled data required to train machine learning systems. The pinnacle of this effort is the development of TAGLETS, a machine learning system that seamlessly integrates widely known collections of labeled data with a diverse array of machine learning algorithms, known as weak labelers. The system's evolution has been significantly influenced by comprehensive theoretical explorations into effectively aggregating these weak labelers within the system. The research's scope expands to the application of large pre-trained models in low-resource settings. The result of these efforts is Alfred, a second-generation system tailored for programmatic weak supervision...
Research Paper
Sep 20, 2023
M. Littman, et al.
Enhancing CLIP with CLIP: Exploring Pseudolabeling for Limited-Label Prompt Tuning
The paper explores the use of pseudolabels, which are heuristic labels for unlabeled data, to enhance the performance of vision-language models like CLIP via prompt tuning. The authors investigate different learning paradigms and prompt modalities and find that iterative prompt-training strategies leveraging CLIP-based pseudolabels lead to significant improvements in CLIP's image classification performance.
Research Paper
Aug 02, 2023
Menghini et al.
Alfred: A System for Prompted Weak Supervision
The paper introduces Alfred, a system for programmatic weak supervision (PWS) that creates training data for machine learning by prompting. It enables users to encode their subject matter expertise via natural language prompts for language and vision-language models.
Research Paper
Aug 02, 2023
Yu and Brown
Learning to Compose Soft Prompts for Compositional Zero-Shot Learning
We introduce compositional soft prompting (CSP), a parameter-efficient learning technique to improve the zero-shot compositionality of large-scale pretrained vision-language models (VLMs) like CLIP. We develop CSP for compositional zero-shot learning, the task of predicting unseen attribute-object compositions (e.g., old cat and young tiger). VLMs have a flexible text encoder that can represent arbitrary classes as natural language prompts, but they often underperform task-specific architectures on the compositional zero-shot benchmark datasets. CSP treats the attributes and objects that define classes as learnable tokens of vocabulary. During training, the vocabulary is tuned to recognize classes that compose tokens in multiple ways (e.g.,...
Research Paper
Apr 24, 2023
N. Nayak et al.
Zero-Shot Learning with Common Sense Knowledge Graphs
Zero-shot learning with Common Sense Knowledge Graphs is a general-purpose framework with a novel transformer graph convolutional network for generating class representations from common sense knowledge graphs, which improves over existing WordNet-based methods on zero-shot learning tasks.
Research Paper
Mar 15, 2023
Snorkel Team