Over the last two years, we’ve been obsessive in our commitment to help our customers get Generative AI (GenAI) into production. We’ve delivered some exciting results, and along the way have built powerful workflows into our product.

Just like traditional machine learning (ML) development, data drives enterprise generative AI development. We’re excited to share the latest product updates to our data-centric workflows for generative AI.

Learn more about:

Snorkel’s GenAI Evaluation Suite is available in public preview

Our customers are moving beyond “vibe checks” and using Snorkel’s new Generative AI Evaluation Suite to ensure their pipelines are ready for production.

We built our Evaluation Suite on three core principles:

GenAI evaluation needs to be specialized in an efficient and flexible way

Snorkel allows users to rapidly onboard and evaluate their data. The platform lets users measure adherence to criteria using out-of-the-box or customized criteria and a mix of ground truth-based and evergreen auto-evaluators. Users can also compare the results of multiple experiments side by side.

GenAI evaluation needs to be fine-grained

In Snorkel, users can now programmatically slice their data via slicing functions. Data slices allow users to identify high-priority subsets of inputs like:

  • Question topics
  • Different languages
  • Specific customer scenarios
  • Jailbreak attempts

GenAI evaluation must enable users to find and fix errors in one platform, in an iterative and programmatic way

AI developers need a platform that transitions them from evaluation to development in a single click. Evaluation dashboards give you direct insight into data errors, and allow users to launch workflows in Snorkel to directly address those errors.

[Screenshot of data hotspot modal]

To complement the new Evaluation Suite, we’ve developed a cookbook, available in our documentation. This provides a guided evaluation experience for your AI teams. If you’re interested in learning how Snorkel’s Evaluation Suite could help your team, please reach out at [insert best way to contact us].

Key enhancements to the LLM fine-tuning workflow

After working closely with our public preview customers, we are thrilled to announce that key improvements to our LLM fine-tuning workflow! As a reminder, our fine-tuning workflow follows five distinct steps:

This release offers core stability improvements, as well as new tooling to enhance your data development efforts when crafting your fine-tuning training sets.

  • AI developers can now safely connect to their LLM providers and leverage freeform prompting in the SDK.
  • Snorkel now supports synthetic data generation techniques in the SDK to address issues with sparse or missing data.
  • We’ve added key improvements to logging and performance with fine-tuning providers, notably Amazon SageMaker.

Snorkel offers guided workflows for fine-tuning and alignment via our LLM-fine tuning and alignment cookbook, available in our documentation.

Key improvements to Generative AI annotation workflows

Our annotation work enables data scientists to seamlessly collaborate with subject matter experts (SMEs) to scale their expertise. To help SMEs share their feedback, we introduced two new views for our annotation studio:

  1. Single response view: SMEs can annotate, per their defined label schema, the LLM response and individual pieces of context used to generate the response.
  2. Ranking view: This view enables SMEs to rank different responses to help create a preference dataset.

Learn how to get more value from your PDF documents!

Transforming unstructured data such as text and documents into structured data is crucial for enterprise AI development. On December 17, we’ll hold a webinar that explains how to capture SME domain knowledge and use it to automate and scale PDF classification and information extraction tasks.

Sign up here!