Stephen Bach

Preference Tuning For Toxicity Mitigation Generalizes Across Languages

Detoxifying multilingual Large Language Models (LLMs) has become crucial due to their increasing global use. In this work, we explore zero-shot cross-lingual generalization of preference tuning in detoxifying LLMs. In contrast to prior work that suggests limited crosslingual generalization for other safety tasks, we show that Direct Preference Optimization (DPO) training with only English data can significantly reduce toxicity in multilingual openended generations. For instance, the probability of mGPT-1.3B in generating toxic continuations drops from 46.8% to 3.9% across 17 different languages after training. Our results also generalize to other multilingual LLMs, such as BLOOM, Llama3, and Aya-23. Using mechanistic...

Research Paper

Preference Tuning For Toxicity Mitigation Generalizes Across Languages

Detoxifying multilingual Large Language Models (LLMs) has become crucial due to their increasing global use. In this work, we explore zero-shot cross-lingual generalization of preference tuning in detoxifying LLMs. In contrast to prior work that suggests limited crosslingual generalization for other safety tasks, we show that Direct Preference Optimization (DPO) training with only English data can significantly reduce toxicity in…

Sep 18, 2024 •

X. Li, et al.

Learn more about Preference Tuning For Toxicity Mitigation Generalizes Across Languages

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

Many recent works have explored using language models for planning problems. One line of research focuses on translating natural language descriptions of planning tasks into structured planning languages, such as the planning domain definition language (PDDL). While this approach is promising, accurately measuring the quality of generated PDDL code continues to pose significant challenges. First, generated PDDL code is typically evaluated using planning validators that check whether the problem can be solved with a planner. This method is insufficient because a language model might generate valid PDDL code that does not align with the natural language description of the task....

Research Paper

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

Many recent works have explored using language models for planning problems. One line of research focuses on translating natural language descriptions of planning tasks into structured planning languages, such as the planning domain definition language (PDDL). While this approach is promising, accurately measuring the quality of generated PDDL code continues to pose significant challenges. First, generated PDDL code is typically…

Sep 18, 2024 •

M. Zuo, et al.

Learn more about Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons

Data scarcity in low-resource languages can be addressed with word-to-word translations from labeled task data in high-resource languages using bilingual lexicons. However, bilingual lexicons often have limited lexical overlap with task data, which results in poor translation coverage and lexicon utilization. We propose lexicon-conditioned data generation (LexC-Gen), a method that generates lowresource-language classification task data at scale. Specifically, LexC-Gen first uses highresource-language words from bilingual lexicons to generate lexicon-compatible task data, and then it translates them into low-resource languages with bilingual lexicons via word translation. Across 17 extremely low-resource languages, LexC-Gen generated data is competitive with expert-translated gold data, and...

Research Paper

LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons

Data scarcity in low-resource languages can be addressed with word-to-word translations from labeled task data in high-resource languages using bilingual lexicons. However, bilingual lexicons often have limited lexical overlap with task data, which results in poor translation coverage and lexicon utilization. We propose lexicon-conditioned data generation (LexC-Gen), a method that generates lowresource-language classification task data at scale. Specifically, LexC-Gen first…

Sep 18, 2024 •

ZX. Yong, et al.

Learn more about LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons

Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation

We introduce Bonito, an open-source model for conditional task generation that converts unannotated text into task-specific training datasets for instruction tuning. We aim to enable zeroshot task adaptation of large language models on users’ specialized, private data. We train Bonito by fine-tuning a pretrained large language model on a new large-scale dataset with 1.65M examples created by remixing existing instruction tuning datasets into metatemplates. The meta-templates for a dataset produce training examples where the input is the unannotated text and the task attribute and the output consists of the instruction and the response. We use Bonito to generate synthetic tasks...

Research Paper

Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation

We introduce Bonito, an open-source model for conditional task generation that converts unannotated text into task-specific training datasets for instruction tuning. We aim to enable zeroshot task adaptation of large language models on users’ specialized, private data. We train Bonito by fine-tuning a pretrained large language model on a new large-scale dataset with 1.65M examples created by remixing existing instruction…

Sep 18, 2024 •

N. Nayak et al.

Learn more about Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation

If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions

"Recent works often assume that VisionLanguage Model (VLM) representations are based on visual attributes like shape. However, it is unclear to what extent VLMs prioritize this information to represent concepts. We propose Extract and Explore (EX2), a novel approach to characterize important textual features for VLMs. EX2 uses reinforcement learning to align a large language model with VLM preferences and generates descriptions that incorporate the important features for the VLM. Then, we inspect the descriptions to identify the features that contribute to VLM representations. We find that spurious descriptions have a major role in VLM representations despite providing no helpful...

Research Paper

If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions

“Recent works often assume that VisionLanguage Model (VLM) representations are based on visual attributes like shape. However, it is unclear to what extent VLMs prioritize this information to represent concepts. We propose Extract and Explore (EX2), a novel approach to characterize important textual features for VLMs. EX2 uses reinforcement learning to align a large language model with VLM preferences and…

Sep 18, 2024 •

R. Esfandiarpoor, et al.

Learn more about If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions

Language Models in the Loop: Incorporating Prompting into Weak Supervision

We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Rather than apply the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework. To create a classifier, we first prompt the model to answer multiple distinct queries about an example and define how the possible responses should be mapped to votes for labels and abstentions. We then denoise these noisy label sources using the Snorkel system and train an end classifier with the resulting training data....

Research Paper

Language Models in the Loop: Incorporating Prompting into Weak Supervision

We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Rather than apply the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework. To create a classifier, we first prompt the model to answer multiple distinct…

Aug 22, 2024 •

R. Smith et al.

Learn more about Language Models in the Loop: Incorporating Prompting into Weak Supervision

Blog

Large language model training: how three training phases shape LLMs

Training large language models is a multi-layered stack of processes, each with its unique role and contribution to the model’s performance.

Feb 27, 2024 •

Stephen Bach

Learn more about Large language model training: how three training phases shape LLMs

Learning to Generate Instructions to Adapt Language Models to New Tasks

We present Bonito, the first open-source model for conditional task generation: the problem of converting unannotated corpus into a collection of tasks for instruction tuning. Our goal is to enable efficient task adaptation of instruction tuned language models on users' specialized, private data without relying on proprietary API-access-only models like GPT-4. We create Bonito by remixing existing, general-purpose instruction tuning data into a new training mixture for conditional task generation. Bonito learns to generate new tasks conditioned on the text and desired task type. The generated instructions in the specialized domain can be used to further train language models. We...

Research Paper

Learning to Generate Instructions to Adapt Language Models to New Tasks

We present Bonito, the first open-source model for conditional task generation: the problem of converting unannotated corpus into a collection of tasks for instruction tuning. Our goal is to enable efficient task adaptation of instruction tuned language models on users’ specialized, private data without relying on proprietary API-access-only models like GPT-4. We create Bonito by remixing existing, general-purpose instruction tuning…

Nov 26, 2023 •

N. Nayak et al.

Learn more about Learning to Generate Instructions to Adapt Language Models to New Tasks

Follow-Up Differential Descriptions: Langauge Models Resolve Ambiguities for Image Classification

A promising approach for improving the performance of vision-language models like CLIP for image classification is to extend the class descriptions (i.e., prompts) with related attributes, e.g., using brown sparrow instead of sparrow. However, current zero-shot methods select a subset of attributes regardless of commonalities between the target classes, potentially providing no useful information that would have helped to distinguish between them. For instance, they may use color instead of bill shape to distinguish between sparrows and wrens, which are both brown. We propose Follow-up Differential Descriptions (FuDD), a zero-shot approach that tailors the class descriptions to each dataset and...

Research Paper

Follow-Up Differential Descriptions: Langauge Models Resolve Ambiguities for Image Classification

A promising approach for improving the performance of vision-language models like CLIP for image classification is to extend the class descriptions (i.e., prompts) with related attributes, e.g., using brown sparrow instead of sparrow. However, current zero-shot methods select a subset of attributes regardless of commonalities between the target classes, potentially providing no useful information that would have helped to distinguish…

Nov 10, 2023 •

R. Esfandiarpoor, et al.

Learn more about Follow-Up Differential Descriptions: Langauge Models Resolve Ambiguities for Image Classification

Stephen Bach

The latest from Stephen

For models that need to be right. Not just good enough.

How do you want to work with Snorkel?