Category: Data development

Our picks

LLM evaluation in enterprise applications: a new era in ML

Learn about the obstacles faced by data scientists in LLM evaluation and discover effective strategies for overcoming them.

November 25, 2024

AI data development: a guide for data science projects

What is AI data development? AI data development includes any action taken to convert raw information into a format useful to AI.

November 13, 2024

Building better enterprise AI: incorporating expert feedback in system development

Enterprises that aim to build valuable GenAI applications must view them at the systems level. LLMs are just one part of an ecosystem.

January 30, 2024


All articles on Data development

Building the Benchmark: Inside Our Agentic Insurance Underwriting Dataset

In this post, we unpack how Snorkel built a realistic benchmark dataset to evaluate AI agents in commercial insurance underwriting. From expert-driven data design to multi-tool reasoning tasks, see how our approach surfaces actionable failure modes that generic benchmarks miss—revealing what it really takes to deploy AI in enterprise workflows.

July 10, 2025

Evaluating AI Agents for Insurance Underwriting

In this post, we will show you a specialized benchmark dataset we developed with our expert network of Chartered Property and Casualty Underwriters (CPCUs). The benchmark uncovers several model-specific and actionable error modes, including basic tool use errors and a surprising number of insidious hallucinations from one provider. This is part of an ongoing series of benchmarks we are releasing across verticals…

June 26, 2025

LLM Observability: Key Practices, Tools, and Challenges

LLM observability is crucial for monitoring, debugging, and improving large language models. Learn key practices, tools, and strategies of LLM observability.

June 23, 2025

LLM-as-a-judge for enterprises: evaluate model alignment at scale

Discover how enterprises can leverage LLM-as-Judge systems to evaluate generative AI outputs at scale, improve model alignment, reduce costs, and tackle challenges like bias and interpretability.

March 26, 2025

Why enterprises should embrace LLM distillation

Unlock possibilities for your enterprise with LLM distillation. Learn how distilled, task-specific models boost performance and shrink costs.

February 18, 2025

LLM evaluation in enterprise applications: a new era in ML

Learn about the obstacles faced by data scientists in LLM evaluation and discover effective strategies for overcoming them.

November 25, 2024

AI data development: a guide for data science projects

What is AI data development? AI data development includes any action taken to convert raw information into a format useful to AI.

November 13, 2024

How a global financial services company built a specialized AI copilot accurate enough for production

Learn how Snorkel, Databricks, and AWS enabled the team to build and deploy small, specialized, and highly accurate models which met their AI production requirements and strategic goals.

September 9, 2024

Task Me Anything: innovating multimodal model benchmarks

“Task Me Anything” empowers data scientists to generate bespoke benchmarks to assess and choose the right multimodal model for their needs.

September 4, 2024

Alfred: Data labeling with foundation models and weak supervision

Introducing Alfred: an open-source tool for combining foundation models with weak supervision for faster development of academic data sets.

August 27, 2024

New GenAI features, data annotation: Snorkel Flow 2024.R2

This release features new GenAI tools and Multi-Schema Annotation, as well as new enterprise security tools and an updated home page.

August 7, 2024

How data slices transform enterprise LLM evaluation

Enterprises must evaluate LLM performance for production deployment. Custom, automated eval + data slices present the best path to production.

August 1, 2024

Meta’s Llama 3.1 405B is the new Mr. Miyagi, now what?

Meta’s Llama 3.1 405B rivals GPT-4o on benchmarks, offering powerful AI capabilities. Despite high costs, it can enhance LLM adoption through fine-tuning, distillation, and use as an AI judge.

July 25, 2024

Meta’s new Llama 3.1 models are here! Are you ready for it?

Meta released Llama 3.1 405B today, signaling a new era of open source AI. The model is ready to use on Snorkel Flow.

July 23, 2024

Data-centric AI with Snorkel and MinIO

High-performing AI systems require more than a well-designed model. They also require properly constructed training and testing data.


See how Snorkel can help you get up to:

100x faster data curation
40x faster model delivery
99% model accuracy