Blog

Ideas, updates, and practical guidance from the Snorkel team.

Research

Closing the Evaluation Gap in Agentic AI

Announcing a $3M commitment to launch Open Benchmarks Grants

Vincent Sunn Chen

February 11, 2026

Anthropic Claude + AWS: revolutionizing pharma data analytics with Snorkel AI

Explore how Anthropic Claude + AWS help pharmaceutical companies leverage AI for enhanced data insights and revenue growth.

Jun 04, 2025 • Shan Kandaswamy (AWS), Matt Casey

Data-centric development of an enterprise AI agent with Snorkel

See how we can use these two new products, Snorkel Evaluate and Expert Data-as-a-Service, to evaluate and develop a specialized agentic AI system for an enterprise use case.

May 29, 2025 • Alex Ratner

Building the data development platform for specialized AI

Announcing two new products on our AI Data Development Platform that together create a complete solution for enterprises to specialize AI systems with expert data at scale.

May 29, 2025 • Alex Ratner

LLM-as-a-judge for enterprises: evaluate model alignment at scale

Discover how enterprises can leverage LLM-as-Judge systems to evaluate generative AI outputs at scale, improve model alignment, reduce costs, and tackle challenges like bias and interpretability.

Mar 26, 2025 • Matt Casey, Tom Walshe

Why GenAI evaluation requires SME-in-the-loop for validation and trust

It’s critical that enterprises can trust and rely on GenAI evaluation results, and for that, SME-in-the-loop workflows are needed. In my first blog post on enterprise GenAI evaluation, I discussed the importance of specialized evaluators as a scalable proxy for SMEs. It simply isn’t practical to task SMEs with performing manual evaluations: it can take weeks if not longer, unnecessarily…

Mar 20, 2025 • Shane Johnson

Research spotlight: is long chain-of-thought structure all that matters when it comes to LLM reasoning distillation?

We’re taking a look at the research paper, LLMs can easily learn to reason from demonstration (Li et al., 2025), in this week’s community research spotlight. It focuses on how the structure of reasoning traces impacts distillation from models such as DeepSeek R1. What’s the big idea regarding LLM reasoning distillation? The reasoning capabilities of powerful models such as DeepSeek…

Mar 19, 2025 • Shane Johnson

Why enterprise GenAI evaluation requires fine-grained metrics to be insightful

GenAI needs fine-grained evaluation for AI teams to gain actionable insights.

Mar 18, 2025 • Shane Johnson

What is specialized GenAI evaluation, and why is it so critical to enterprise AI?

Specialized GenAI evaluation ensures AI assistants meet business requirements, SME expertise, and industry regulations—critical for production-ready AI.

Mar 05, 2025 • Shane Johnson

LLM alignment techniques: 4 post-training approaches

Ensure your LLMs align with your values and goals using LLM alignment techniques. Learn how to mitigate risks and optimize performance.

Mar 04, 2025 • Tom Walshe
