“Fall in love with your data”—Snorkel AI’s Enterprise LLM Summit
Snorkel AI’s Jan. 25 Enterprise LLM Summit: Building GenAI with Your Data drew over a thousand engaged attendees across three and a half hours and nine sessions. The eight speakers at the event—the second in our Enterprise LLM series—united around one theme: AI data development drives enterprise AI success.
Generic large language models (LLMs) are becoming the new baseline for modern enterprise AI, and enterprises feel mounting pressure to deliver on the potential of generative AI (GenAI). To achieve the trust, quality, and reliability necessary for production applications, enterprise data science teams must develop proprietary data for use with specialized models. Regardless of where a company is in its journey to adopting LLMs, AI data development enables it to deliver performance and reliability for high-value AI use cases.
Each of the day’s sessions addressed the event’s core theme from different perspectives. You can watch a recording of the entire event and read individual session summaries below.
Opening keynote: Your enterprise will succeed or fail in AI depending on how you use your data
Setting the tone for the day’s events, Snorkel AI CEO Alex Ratner discussed the importance of data development in AI—especially for enterprise AI projects.
LLMs have progressed significantly, Alex noted, but they require fine-tuning to achieve production-grade performance on complex, domain-specific challenges. Data scientists can best improve LLM performance on specific tasks by feeding them the right data prepared in the right way. In the past, this meant slow manual processes, but Snorkel Flow allows data science teams to amplify the impact of subject matter experts to iteratively develop high-quality data sets faster than ever before.
To support the value of Snorkel Flow’s approach, he highlighted two recent case studies:
- A Snorkel engineer used Google’s PaLM 2 as a baseline to build a smaller and more accurate model.
- A Snorkel researcher recently used Snorkel Flow and a novel training approach to place a model at #2 on the AlpacaEval leaderboard.
Programmatically scale human preferences and alignment in GenAI
Hoang Tran, Machine Learning Engineer at Snorkel AI, explained how he used scalable tools to align language models with human preferences. On recent projects, he and his team developed reward models trained to mimic the preferences of human annotators. They then used these reward models to accept or reject LLM responses from a base model. This pipeline significantly improved the base model’s performance with far less end-user guidance than required by traditional feedback methods.
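For a concrete picture of that accept-or-reject step, the sketch below shows a generic best-of-n pipeline: a base model proposes several candidate responses, and a reward model trained to mimic human preference labels keeps the highest-scoring one. This is a minimal illustration rather than Hoang's actual pipeline, and the Hugging Face model names are placeholders.

```python
# Minimal best-of-n sketch: sample candidates from a base LLM, then let a
# reward model (trained on human preference labels) pick the best response.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

# Illustrative model names, not the models used in the talk.
BASE_MODEL = "gpt2"
REWARD_MODEL = "OpenAssistant/reward-model-deberta-v3-large-v2"

base_tok = AutoTokenizer.from_pretrained(BASE_MODEL)
base_lm = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
rm_tok = AutoTokenizer.from_pretrained(REWARD_MODEL)
rm = AutoModelForSequenceClassification.from_pretrained(REWARD_MODEL)


def best_of_n(prompt: str, n: int = 4, max_new_tokens: int = 64) -> str:
    """Sample n candidate completions and return the one the reward model prefers."""
    inputs = base_tok(prompt, return_tensors="pt")
    outputs = base_lm.generate(
        **inputs,
        do_sample=True,
        num_return_sequences=n,
        max_new_tokens=max_new_tokens,
        pad_token_id=base_tok.eos_token_id,
    )
    prompt_len = inputs["input_ids"].shape[1]
    candidates = [
        base_tok.decode(o[prompt_len:], skip_special_tokens=True) for o in outputs
    ]

    # Score each (prompt, response) pair; a higher logit means "more preferred."
    scores = []
    for cand in candidates:
        rm_inputs = rm_tok(prompt, cand, return_tensors="pt", truncation=True)
        with torch.no_grad():
            scores.append(rm(**rm_inputs).logits[0].item())
    return candidates[scores.index(max(scores))]


print(best_of_n("Explain what a reward model does in one sentence."))
```

In production, the same reward model can also filter or rank responses offline to build preference data sets, which is what makes the approach far cheaper than collecting fresh human feedback for every iteration.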
Data development for GenAI: A systems-level view
Chris Glaze, a staff research scientist at Snorkel AI, discussed the importance of taking a systems-level view of generative AI applications.
While generative models form their core, Glaze said, these applications contain many components, each of which can hold back the system’s performance. He walked through a recent case study in which his team improved the accuracy of a retrieval-augmented generation (RAG) system for a global bank by 54 points in just three weeks. Snorkel engineers and researchers, he noted, used scalable data development tools to improve many parts of that system, including its embedding and retrieval models.
Adapting and auditing LLMs in the age of instruction tuning
Stephen Bach, an assistant professor of computer science at Brown, discussed the importance of managing training data for LLMs. LLMs require three sequential stages of training, he noted, and harmonizing training data across these stages is crucial for their effectiveness.
Bach illustrated the value of data harmonization with two research vignettes from his lab. The first explained how his lab adapted GenAI models to new domains by automatically generating instruction tuning data. The second demonstrated what can happen with improperly harmonized data. A study from his lab found safety vulnerabilities in GPT-4 in low-resource languages.
Representation model fine-tuning
Trung Nguyen, an applied research scientist at Snorkel, explained how fine-tuning representation models can boost the performance of generative applications.
Representation models encode meaningful features from raw data for use in classification, clustering, or information retrieval tasks. Trung walked the audience through techniques and best practices for fine-tuning representation models, emphasizing the importance of data quality and augmentation. RAG-based pipelines, Trung noted, may use multiple representation models to optimize added context. One model may find all the right information, while another ranks retrieved chunks to maximize the impact of the context added to the final prompt.
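A minimal sketch of that retrieve-then-rerank pattern appears below, assuming the sentence-transformers library and illustrative model names rather than the models from Trung's talk: a bi-encoder embeds the query and document chunks for fast retrieval, and a cross-encoder reranks the candidates before they are added to the prompt.

```python
# Retrieve-then-rerank sketch: one representation model finds candidate
# chunks, a second model reranks them to maximize the value of the context.
from sentence_transformers import CrossEncoder, SentenceTransformer, util

retriever = SentenceTransformer("all-MiniLM-L6-v2")              # embedding model
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # reranking model

chunks = [
    "Snorkel Flow supports programmatic labeling of training data.",
    "Reward models score candidate LLM responses.",
    "Representation models encode raw data into dense vectors.",
]
query = "What do representation models do?"

# Stage 1: embed the query and chunks, keep the top-k by cosine similarity.
chunk_emb = retriever.encode(chunks, convert_to_tensor=True)
query_emb = retriever.encode(query, convert_to_tensor=True)
top_k = util.cos_sim(query_emb, chunk_emb)[0].topk(2).indices.tolist()
candidates = [chunks[i] for i in top_k]

# Stage 2: rerank candidates with the cross-encoder and order the context.
scores = reranker.predict([(query, c) for c in candidates])
context = [c for _, c in sorted(zip(scores, candidates), reverse=True)]
print(context)
```

Fine-tuning either stage on domain data (as the talk emphasized) follows the same structure; only the model checkpoints change.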
Skill-it! A data-driven skills framework for understanding and training language models
Fred Sala, an assistant professor at the University of Wisconsin-Madison, presented research on Skill-it!, a data-driven approach for training and understanding language models, as well as his work on zero-shot robustification.
Skill-it! centers on the idea that language models, like humans, learn skills more efficiently when they’re “taught” in order of increasing complexity. Fred and his collaborators used this insight to optimize data curation across training stages, and the resulting approach outperformed comparable methods on 11 out of 12 benchmarks.
Zero-shot robustification improves foundation model performance, Fred said, by elevating the importance of relevant understanding already present in the model. For example, he showed questionable responses from a CLIP model asked to identify pacifiers in pictures. Using zero-shot robustification, Fred asked the model to focus on relevant features, such as the shape of a pacifier, and ignore spurious ones, like the presence of baby bottles.
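The rough sketch below illustrates the general idea rather than Fred's exact method. Using Hugging Face's CLIP, it projects a spurious "baby bottle" direction out of the image embedding before zero-shot classification, so the prediction leans on features relevant to pacifiers instead; the prompts and the projection step are illustrative assumptions.

```python
# Hypothetical sketch of zero-shot robustification with CLIP: remove the
# component of the image embedding that points along a spurious concept
# before comparing it with the class prompts.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def embed_text(prompts):
    inputs = processor(text=prompts, return_tensors="pt", padding=True)
    with torch.no_grad():
        emb = model.get_text_features(**inputs)
    return emb / emb.norm(dim=-1, keepdim=True)


def classify(image: Image.Image, class_prompts, spurious_prompt):
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        img = model.get_image_features(**inputs)
    img = img / img.norm(dim=-1, keepdim=True)

    # Project out the spurious direction so it cannot drive the prediction.
    spurious = embed_text([spurious_prompt])[0]
    img = img - (img @ spurious)[:, None] * spurious
    img = img / img.norm(dim=-1, keepdim=True)

    logits = img @ embed_text(class_prompts).T
    return class_prompts[logits.argmax().item()]


# Example usage (requires an actual image file):
# photo = Image.open("photo.jpg")
# print(classify(photo,
#                ["a photo of a pacifier", "a photo with no pacifier"],
#                "a photo of a baby bottle"))
```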
Fireside chat: Building enterprise LLM applications in banking
Snorkel CEO Alex Ratner talked with Sarthak Pattanaik, an AI leader at the Bank of New York Mellon, about AI’s impact on the financial industry. AI is creating improvements in client services, business operations, and company culture, Sarthak said, and all of those improvements start with data.
“We say data is our biggest asset,” Sarthak said.
He believes that organizations should “fall in love” with their data. By focusing on data access and responsible AI practices, he said, BNY Mellon and other companies that take a similar approach position themselves advantageously in the evolving AI landscape.
Fireside chat: Insurance’s GenAI revolution: A business perspective
Alex Taylor, Global Head of Emerging Technology at QBE Ventures, explored the challenges and impact of AI in the enterprise—particularly in the insurance world—with Snorkel’s CEO, Alex Ratner.
The pair noted that data bottlenecks often hinder the implementation of AI in real-world settings. LLMs, Taylor said, make AI more accessible to people without engineering or data science backgrounds, but both he and Alex Ratner agreed that LLMs require fine-tuning for enterprise tasks.
The pair noted how Snorkel’s approach—keeping subject matter experts in the loop but amplifying their impact—leads to both better data sets and happier regulators.
Q&A panel
In a lively discussion, six of the day’s speakers returned to answer audience questions submitted during the event. Questions ranged from how direct preference optimization (DPO) can help enforce safety guardrails in model training to how to decide when a business challenge does and doesn’t call for a large language model.
The panel closed with each speaker predicting the biggest impact of enterprise AI systems this year. Alex Ratner noted that many large enterprises still hardly use AI and anticipated that, for many businesses, the biggest change would be going from “zero to one.” Alex Taylor anticipated that LLMs would take a greater role in orchestrating AI systems, yielding more powerful applications. Other panelists touched on virtual assistants and the value of smaller models built for specialized tasks.
Enterprises beginning to seize on LLM promise
Snorkel’s January Enterprise LLM Summit (the second in the series) brought together eight speakers and more than a thousand attendees under one theme: data development drives enterprise AI success. The day’s sessions illustrated this theme through case studies, academic research, and practical examples. Sarthak Pattanaik, however, may have said it best: “How do you fall in love with your data?”
Learn more about what Snorkel can do for your organization
Snorkel AI offers multiple ways for enterprises to uplevel their AI capabilities. Our Snorkel Flow data development platform empowers enterprise data scientists and subject matter experts to build and deploy high-quality models end-to-end in-house. Our Snorkel Custom program puts our world-class engineers and researchers to work on your most promising challenges to deliver data sets or fully built LLM or generative AI applications, fast.
See what Snorkel option is right for you. Book a demo today.