Category
Data development

Our picks

How Snorkel topped the AlpacaEval leaderboard (and why we’re not there anymore)

Snorkel AI placed a model at the top of the AlpacaEval leaderboard. Here’s how we built it, and how it changed AlpacaEval’s metrics.

April 9, 2024

How Skill-it! enables faster, better LLM training

Humans learn tasks better when taught in a logical order. So do LLMs. Researchers developed a way to exploit this tendency, called “Skill-it!”

March 12, 2024

Building better enterprise AI: incorporating expert feedback in system development

Enterprises that aim to build valuable GenAI applications must view them at a systems level. LLMs are just one part of an ecosystem.

January 30, 2024

Recommended for you

How a global financial services company built a specialized AI copilot accurate enough for production

Learn how Snorkel, Databricks, and AWS enabled the team to build and deploy small, specialized, and highly accurate models which met their AI production requirements and strategic goals.

September 9, 2024

Task Me Anything: innovating multimodal model benchmarks

“Task Me Anything” empowers data scientists to generate bespoke benchmarks to assess and choose the right multimodal model for their needs.

September 4, 2024

Alfred: Data labeling with foundation models and weak supervision

Introducing Alfred: an open-source tool for combining foundation models with weak supervision for faster development of academic data sets.

August 27, 2024

All articles on Data development

How a global financial services company built a specialized AI copilot accurate enough for production

Learn how Snorkel, Databricks, and AWS enabled the team to build and deploy small, specialized, and highly accurate models which met their AI production requirements and strategic goals.

September 9, 2024

Task Me Anything: innovating multimodal model benchmarks

“Task Me Anything” empowers data scientists to generate bespoke benchmarks to assess and choose the right multimodal model for their needs.

September 4, 2024

Alfred: Data labeling with foundation models and weak supervision

Introducing Alfred: an open-source tool for combining foundation models with weak supervision for faster development of academic data sets.

August 27, 2024

New GenAI features, data annotation: Snorkel Flow 2024.R2

This release features new GenAI tools and Multi-Schema Annotation, as well as new enterprise security tools and an updated home page.

August 7, 2024

How data slices transform enterprise LLM evaluation

Enterprises must evaluate LLM performance before production deployment. Custom, automated evaluation combined with data slices presents the best path to production.

August 1, 2024

Meta’s Llama 3.1 405B is the new Mr. Miyagi, now what?

Meta’s Llama 3.1 405B rivals GPT-4o in benchmarks, offering powerful AI capabilities. Despite high costs, it can enhance LLM adoption through fine-tuning, distillation, and use as an AI judge.

July 25, 2024

Meta’s new Llama 3.1 models are here! Are you ready for it?

Meta released Llama 3.1 405B today, signaling a new era of open-source AI. The model is ready to use on Snorkel Flow.

July 23, 2024

Data-centric AI with Snorkel and MinIO

High-performing AI systems require more than a well-designed model. They also require properly constructed training and testing data.

Weak supervision for non-categorical applications + superalignment

We need more labeled data than ever, so we have explored weak supervision for non-categorical applications—with notable results.

Changho Shin
July 2, 2024

Vision language models: how LLMs boost image classification

Vision language models demonstrate impressive image classification capabilities, but LLMs can help improve their performance. Learn how.

June 12, 2024

How Bonito helps fine-tune specialized LLMs faster than ever

Fine-tuning specialized LLMs demands a lot of time and money. We developed Bonito to make this process faster, cheaper, and easier.

May 28, 2024

Accelerating AI development in manufacturing with Snorkel Flow and AWS SageMaker

The manufacturing industry has experienced a massive influx of data. Snorkel AI and AWS SageMaker can make that data actionable.

The art of data development for Enterprise LLMs

Snorkel’s Paroma Varma and Google’s Ali Arsanjani discuss the role of data in the development and implementation of LLMs.

April 16, 2024

How Snorkel topped the AlpacaEval leaderboard (and why we’re not there anymore)

Snorkel AI placed a model at the top of the AlpacaEval leaderboard. Here’s how we built it, and how it changed AlpacaEval’s metrics.

Hoang Tran
April 9, 2024

CRFM’s HELM and enterprise LLM evaluation beyond accuracy

As Snorkel AI prepares to build better enterprise LLM evaluations, we spoke with Yifan Mai from Stanford’s CRFM HELM project.

Vivek Krishnamurthy
April 3, 2024
