Snorkel AI placed a model at the top of the AlpacaEval leaderboard. Here’s how we built it, and how it changed AlpacaEval’s metrics.
As Snorkel AI prepares to build better enterprise LLM evaluations, we spoke with Yifan Mai from Stanford CRFM’s HELM project.
Google and Snorkel AI customized PaLM 2 using domain expertise and data development to improve performance by 38 F1 points in a matter of hours.
Humans learn tasks better when taught in a logical order. So do LLMs. Researchers developed a way to exploit this tendency called “Skill-it!”
Fine-tuned representation models are often the most effective way to boost the performance of AI applications. Learn why.
Snorkel AI CEO Alex Ratner explains his view on the importance of AI in data development and illustrates his position with two case studies.
We’ve developed new approaches to scale human preferences and align LLM output to enterprise users’ expectations by magnifying SME impact.
Enterprises that aim to build valuable GenAI applications must view them at the systems level. LLMs are just one part of an ecosystem.
Snorkel AI’s Jan. 25 Enterprise LLM Summit focused on one theme: AI data development drives enterprise AI success.
Snorkel researchers’ state-of-the-art methods created a 7B LLM that ranked 2nd, behind only GPT-4 Turbo, on the AlpacaEval 2.0 leaderboard.
When done right, advanced classification applications cultivate business value and automation, unlock new business lines, and reduce costs.
A brief guide to how financial institutions can use Google Dialogflow with Snorkel Flow to build better chatbots for retail banking.
A proof-of-concept project that combines predictive AI and generative AI to minimize LLMs’ risks while keeping their advantages.
Large language models open many new opportunities for data science teams, but enterprise LLM challenges persist—and customization is key.
LLMs have broad but shallow knowledge and fall short on specialized tasks. For best performance, enterprises must fine-tune their LLMs.