The Snorkel AI Blog

Building the Data Development Platform for Specialized AI

Announcing two new products on our AI Data Development Platform that together create a complete solution for enterprises to specialize AI systems with expert data at scale.

Alex Ratner

Why enterprise GenAI evaluation requires fine-grained metrics to be insightful

GenAI needs fine-grained evaluation for AI teams to gain actionable insights.

Shane Johnson

What is specialized GenAI evaluation, and why is it so critical to enterprise AI?

Shane Johnson

Why GenAI evaluation requires SME-in-the-loop for validation and trust

Shane Johnson

Latest posts

Building the Benchmark: Inside Our Agentic Insurance Underwriting Dataset

In this post, we unpack how Snorkel built a realistic benchmark dataset to evaluate AI agents in commercial insurance underwriting. From expert-driven data design to multi-tool reasoning tasks, see how our approach surfaces actionable failure modes that generic benchmarks miss—revealing what it really takes to deploy AI in enterprise workflows.

Snorkel Team

July 10, 2025

Evaluating AI Agents for Insurance Underwriting

Chris Glaze

June 26, 2025

LLM Observability: Key Practices, Tools, and Challenges

LLM observability is crucial for monitoring, debugging, and improving large language models. Learn key practices, tools, and strategies of LLM observability.

Snorkel Team

June 23, 2025

Anthropic Claude + AWS: revolutionizing pharma data analytics with Snorkel AI

Explore how Anthropic Claude + AWS help pharmaceutical companies leverage AI for enhanced data insights and revenue growth.

Shan Kandaswamy (AWS), Matt Casey

June 4, 2025

Latest videos

Expert Data, Specialized AI: Snorkel Summer 2025 Launch Event

Why Labeled Data is Essential for Generative AI Success

Enhancing LLM Observability and Gen AI Evaluation for Business Success

How to Evaluate LLM Performance for Domain-Specific Use Cases

Understand the basics of LLM training in under four minutes!

RAG Optimization: A Practical Overview for Improving Retrieval Augmented Generation

Applied AI

Why GenAI evaluation requires SME-in-the-loop for validation and trust

It’s critical that enterprises can trust and rely on GenAI evaluation results, and that requires SME-in-the-loop workflows. In my first blog post on enterprise GenAI evaluation, I discussed the importance of specialized evaluators as a scalable proxy for SMEs. It simply isn’t practical to task SMEs with performing manual evaluations: it can take weeks if not longer, unnecessarily…

Shane Johnson

March 20, 2025

Applied AI, Research

Research spotlight: is long chain-of-thought structure all that matters when it comes to LLM reasoning distillation?

We’re taking a look at the research paper, LLMs can easily learn to reason from demonstration (Li et al., 2025), in this week’s community research spotlight. It focuses on how the structure of reasoning traces impacts distillation from models such as DeepSeek R1. What’s the big idea regarding LLM reasoning distillation? The reasoning capabilities of powerful models such as DeepSeek…

Shane Johnson

March 19, 2025

Why enterprise GenAI evaluation requires fine-grained metrics to be insightful

GenAI needs fine-grained evaluation for AI teams to gain actionable insights.

Shane Johnson

March 18, 2025

What is specialized GenAI evaluation, and why is it so critical to enterprise AI?

Specialized GenAI evaluation ensures AI assistants meet business requirements, SME expertise, and industry regulations—critical for production-ready AI.

Shane Johnson

March 5, 2025

Customers

Anthropic Claude + AWS: revolutionizing pharma data analytics with Snorkel AI

Explore how Anthropic Claude + AWS help pharmaceutical companies leverage AI for enhanced data insights and revenue growth.

Shan Kandaswamy (AWS), Matt Casey

June 4, 2025

Call center AI for customer experience management: a case study

How one large financial institution used call center AI to inform customer experience management with real-time data.

Maxwell Williams

August 14, 2024

How we achieved 89% accuracy on contract question answering

A customer wanted an LLM system for complex contract question answering tasks. We helped them build it, beating the baseline by 64 points.

Minhajul Hoque

April 2, 2024

Applied AI, Customers

Content filtering breakthrough: Snorkel client reaches 96% recall in 3 days

Snorkel AI helped a client solve the challenge of social media content filtering quickly and sustainably. Here’s how.

Gabe Smith

March 26, 2024

Data development

Building the Benchmark: Inside Our Agentic Insurance Underwriting Dataset

Snorkel Team

July 10, 2025

Evaluating AI Agents for Insurance Underwriting

Chris Glaze

June 26, 2025

LLM Observability: Key Practices, Tools, and Challenges

LLM observability is crucial for monitoring, debugging, and improving large language models. Learn key practices, tools, and strategies of LLM observability.

Snorkel Team

June 23, 2025

LLM-as-a-judge for enterprises: evaluate model alignment at scale

Discover how enterprises can leverage LLM-as-a-judge systems to evaluate generative AI outputs at scale, improve model alignment, reduce costs, and tackle challenges like bias and interpretability.

Matt Casey, Tom Walshe

March 26, 2025

Product

Data-Centric Development of an Enterprise AI Agent with Snorkel

See how we can use these two new products, Snorkel Evaluate and Expert Data-as-a-Service, to evaluate and develop a specialized agentic AI system for an enterprise use case.

Alex Ratner

May 29, 2025

Building the Data Development Platform for Specialized AI

Announcing two new products on our AI Data Development Platform that together create a complete solution for enterprises to specialize AI systems with expert data at scale.

Alex Ratner

May 29, 2025

Databricks + Snorkel Flow: integrated, streamlined AI development

Discover the power of integrating Databricks and Snorkel Flow for efficient data ingestion, labeling, model development, and AI deployment.

Bryan Wood

January 8, 2025

How LLM evaluation drives better models in Snorkel Flow

Discover how Snorkel AI’s methodical workflow can simplify the evaluation of LLM systems. Achieve better model performance in less time.

Rebekah Westerlind

December 17, 2024

Research