Explore the new GenAI Evaluation Suite: Snorkel 2024.R3

Marty Moesta

Published: October 09, 2024

Updated: November 26, 2024

Over the last two years, we’ve been obsessive in our commitment to help our customers get Generative AI (GenAI) into production. We’ve delivered some exciting results, and along the way have built powerful workflows into our product.

Just like traditional machine learning (ML) development, data drives enterprise generative AI development. We’re excited to share the latest product updates to our data-centric workflows for generative AI.

Learn more about:

Snorkel’s GenAI Evaluation Suite is available in public preview

Our customers are moving beyond “vibe checks” and using Snorkel’s new Generative AI Evaluation Suite to ensure their pipelines are ready for production.

We built our Evaluation Suite on three core principles:

GenAI evaluation needs to be specialized in an efficient and flexible way

Snorkel allows users to rapidly onboard and evaluate their data. The platform lets users measure adherence to criteria using out-of-the-box or customized criteria and a mix of ground truth-based and evergreen auto-evaluators. Users can also compare the results of multiple experiments side by side.

GenAI evaluation needs to be fine-grained

In Snorkel, users can now programmatically slice their data via slicing functions. Data slices allow users to identify high-priority subsets of inputs like:

Question topics
Different languages
Specific customer scenarios
Jailbreak attempts

GenAI evaluation must enable users to find and fix errors in one platform, in an iterative and programmatic way

AI developers need a platform that transitions them from evaluation to development in a single click. Evaluation dashboards give you direct insight into data errors, and allow users to launch workflows in Snorkel to directly address those errors.

[Screenshot of data hotspot modal]

To complement the new Evaluation Suite, we’ve developed a cookbook, available in our documentation. This provides a guided evaluation experience for your AI teams. If you’re interested in learning how Snorkel’s Evaluation Suite could help your team, please reach out at [insert best way to contact us].

Key enhancements to the LLM fine-tuning workflow

After working closely with our public preview customers, we are thrilled to announce that key improvements to our LLM fine-tuning workflow! As a reminder, our fine-tuning workflow follows five distinct steps:

This release offers core stability improvements, as well as new tooling to enhance your data development efforts when crafting your fine-tuning training sets.

AI developers can now safely connect to their LLM providers and leverage freeform prompting in the SDK.
Snorkel now supports synthetic data generation techniques in the SDK to address issues with sparse or missing data.
We’ve added key improvements to logging and performance with fine-tuning providers, notably Amazon SageMaker.

Snorkel offers guided workflows for fine-tuning and alignment via our LLM-fine tuning and alignment cookbook, available in our documentation.

Key improvements to Generative AI annotation workflows

Our annotation work enables data scientists to seamlessly collaborate with subject matter experts (SMEs) to scale their expertise. To help SMEs share their feedback, we introduced two new views for our annotation studio:

Single response view: SMEs can annotate, per their defined label schema, the LLM response and individual pieces of context used to generate the response.
Ranking view: This view enables SMEs to rank different responses to help create a preference dataset.

Ready to accelerate AI development?

Deploy production AI and ML applications 10-100x faster with Snorkel’s experts, using our proprietary technology.

Request a demo

Marty Moesta

Marty Moesta is the lead product manager for Snorkel’s Generative AI products and services, before that, Marty was part of the founding go to market team here at Snorkel, focusing on success management and field engineering with fortune 100 strategic customers across financial services, insurance and health care. Prior to Snorkel, Marty was a Director of Technical Product Management at Tanium.

Explore the new GenAI Evaluation Suite: Snorkel 2024.R3

Snorkel’s GenAI Evaluation Suite is available in public preview

GenAI evaluation needs to be specialized in an efficient and flexible way

GenAI evaluation needs to be fine-grained

GenAI evaluation must enable users to find and fix errors in one platform, in an iterative and programmatic way

Key enhancements to the LLM fine-tuning workflow

Key improvements to Generative AI annotation workflows

Ready to accelerate AI development?

Recommended
articles

Data-centric development of an enterprise AI agent with Snorkel

Building the data development platform for specialized AI

Databricks + Snorkel Flow: integrated, streamlined AI development

Join our newsletter for expert advice, the latest research, and exclusive events.

Explore the new GenAI Evaluation Suite: Snorkel 2024.R3

Snorkel’s GenAI Evaluation Suite is available in public preview

GenAI evaluation needs to be specialized in an efficient and flexible way

GenAI evaluation needs to be fine-grained

GenAI evaluation must enable users to find and fix errors in one platform, in an iterative and programmatic way

Key enhancements to the LLM fine-tuning workflow

Key improvements to Generative AI annotation workflows

Ready to accelerate AI development?

Recommended articles

Data-centric development of an enterprise AI agent with Snorkel

Building the data development platform for specialized AI

Databricks + Snorkel Flow: integrated, streamlined AI development

Join our newsletter for expert advice, the latest research, and exclusive events.

Recommended
articles