Changho Shin
Postdoctoral Scholar at Princeton University

Changho Shin is a postdoctoral scholar at Princeton University. He completed his PhD in Computer Science at the University of Wisconsin-Madison, advised by Frederic Sala. His research centers on data-centric AI and foundation models. Changho’s focus is on developing efficient methods for creating and curating data for foundation models; he is the recipient of multiple awards for work in this area. Changho is a 2024 Qualcomm Innovation Fellowship Finalist.

The latest from Changho

Weak-to-strong generalization through the data-centric lens
The weak-to-strong generalization phenomenon is the driver for important machine learning applications including highly data-efficient learning and, most recently, performing superalignment. While decades of research have resulted in numerous algorithms that produce strong empirical performance, understanding what aspects of data enable weak-to-strong generalization has been understudied. We propose a simple data-centric mechanism that characterizes weak-to-strong generalization: the overlap density. Intuitively, generalization tracks the number of points that contain overlaps, i.e., both easy patterns (learnable by a weak model) and challenging patterns (only learnable by a stronger model), as with such points, weak predictions can be used to learn challenging patterns...
Research Paper

Mar 01, 2025
Changho Shin, John Cooper, Frederic Sala (Department of Computer Science, University of Wisconsin-Madison)
Weak supervision for non-categorical applications + superalignment
Blog
We need more labeled data than ever, so we explored weak supervision for non-categorical applications, with notable results.

Jul 02, 2024
Zero-Shot Robustification of Zero-Shot Models with Foundation Models
Zero-shot inference is a powerful paradigm that enables the use of large pretrained models for downstream classification tasks without further training. However, these models are vulnerable to inherited biases that can impact their performance. The traditional solution is fine-tuning, but this undermines the key advantage of pretrained models, which is their ability to be used out-of-the-box. We propose ROBOSHOT, a method that improves the robustness of pretrained model embeddings in a fully zero-shot fashion. First, we use zero-shot language models (LMs) to obtain useful insights from task descriptions. These insights are embedded and used to remove harmful and boost useful...
Research Paper

Oct 20, 2023
D. Adila, et al.
