
Discover what’s new in Snorkel Flow: Flexible data and LLM connectivity, secure data controls, and more!

April 24, 2024

Snorkel AI is excited to announce several new features in the latest release of Snorkel Flow, our programmatic AI data development platform. This release enables enterprises to accelerate the customization of large language models (LLMs) on their own data for production environments, adds features for retrieval augmented generation (RAG) that power chunking and retrieval over long documents, and introduces support for a new data modality: images.

With a comprehensive suite of enterprise features spanning security, accessibility, and support for diverse data types, this latest version of Snorkel Flow empowers businesses to capitalize on the untapped potential of their data. By leveraging the platform’s new multimodal data support, organizations can now use their image data to fine-tune LLMs and drive strategic value through production-grade AI solutions.

Release Highlights: Flexible data and LLM connectivity

Snorkel Flow is the only trusted AI data development platform that can be used to fine-tune LLMs with any type of data from any enterprise source:

  • New LLM integrations for Google’s Gemini model family and Meta’s Llama 3 add to an existing library of native LLM integrations.  
  • New model deployment integrations with Databricks Unity Catalog, Google Cloud Vertex AI, and Microsoft Azure Machine Learning to accelerate shipping fine-tuned models to production.
  • New features for retrieval augmented generation (RAG) to power chunking and retrieval over long documents.
  • Multimodal (image/computer vision) use case support for programmatic data labeling to meet the wave of interest in other data modalities, such as image, video, and audio. 
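
To make the RAG item above concrete, here is a minimal sketch of what chunking and retrieval over a long document involve. It is a generic illustration and not Snorkel Flow's API: the function names, the word-based chunk size, the overlap, and the word-overlap scoring are all assumptions chosen for brevity.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word-based chunks that share `overlap` words with the previous chunk."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the document
    return chunks


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Rank chunks by how many distinct query words they contain (toy scoring, an assumption)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk. A production system would typically replace the word-overlap score with embedding similarity.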

Raising the Standard on Enterprise Readiness

One of the leading features included in this release is the newly added Role-Based Access Controls (RBAC) to solve one of AI’s biggest challenges: protecting data. This new addition gives admins more control over who has access to what data and who can upload data for AI development.

With RBAC, enterprise admins can now regulate access to their data connectors, ensuring granular control over sensitive information and maintaining the highest standards of data security.
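
As a rough illustration of the idea, role-based access to data connectors can be modeled as a mapping from roles to the connectors they may use, with access denied by default. The class, role names, and connector names below are hypothetical and are not Snorkel Flow's actual configuration or SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AccessPolicy:
    """Hypothetical model: roles grant connectors, users hold roles."""
    role_connectors: Dict[str, Set[str]] = field(default_factory=dict)
    user_roles: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, role: str, connector: str) -> None:
        """Allow every user holding `role` to use `connector`."""
        self.role_connectors.setdefault(role, set()).add(connector)

    def assign(self, user: str, role: str) -> None:
        """Give `user` the named role."""
        self.user_roles.setdefault(user, set()).add(role)

    def can_use(self, user: str, connector: str) -> bool:
        # Deny by default: the user needs at least one role that grants the connector.
        return any(
            connector in self.role_connectors.get(role, set())
            for role in self.user_roles.get(user, set())
        )


policy = AccessPolicy()
policy.grant("data_admin", "s3_upload")   # hypothetical role and connector names
policy.assign("alice", "data_admin")
policy.assign("bob", "annotator")
```

With this policy, `policy.can_use("alice", "s3_upload")` is true, while the same check for `bob` or for an unknown user is false.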

FM-Powered Document Intelligence Workflows

This latest release of Snorkel Flow also includes a foundation model-powered PDF workflow with a dedicated PDF prompting UI in Studio, allowing users to quickly get started by using prompting to label their PDFs.

This feature unlocks faster, smarter document processing. It enables users to apply the latest foundation models for immediate, intuitive PDF labeling and analysis, streamlining the extraction of valuable insights from complex documents.

Effortless integration with the multimodal AI stack

Also included in the release is access to the latest Google Gemini models and an enhanced SDK that allows customers to easily integrate with a wide spectrum of custom LLM services.

We also enhanced our integration with Databricks, ensuring seamless compatibility with the modern AI stack. Users can now deploy models to the Databricks Unity Catalog, Vertex AI, and Azure Machine Learning, simplifying the process of integrating Snorkel Flow with existing enterprise infrastructure and workflows.

Programmatic Image Classification

We’re also excited to introduce the ability to programmatically curate, annotate, and operate on images in this release, opening up new opportunities for valuable AI-driven insights and efficiency.

With this feature, currently in beta, users can accurately enrich millions of images using programmatic labeling functions, turning visual data into actionable insights through a streamlined curation and analysis process.
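
For readers new to programmatic labeling, the sketch below shows the general pattern: several small heuristic functions each vote on a label or abstain, and the votes are combined into one label per image. Everything here is an assumption made for illustration. The functions inspect image metadata rather than pixels, the label names and thresholds are invented, and simple majority voting stands in for whatever aggregation Snorkel Flow uses, which this post does not describe.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

ABSTAIN = None  # returned when a labeling function has no opinion

ImageRecord = Dict[str, object]
LabelingFunction = Callable[[ImageRecord], Optional[str]]


def lf_caption_mentions_invoice(image: ImageRecord) -> Optional[str]:
    caption = str(image.get("caption", "")).lower()
    return "document" if "invoice" in caption else ABSTAIN


def lf_portrait_aspect_ratio(image: ImageRecord) -> Optional[str]:
    width, height = image.get("width"), image.get("height")
    if not isinstance(width, int) or not isinstance(height, int) or width <= 0:
        return ABSTAIN
    # Assumed heuristic: scanned pages tend to be much taller than they are wide.
    return "document" if height / width > 1.3 else ABSTAIN


def lf_camera_metadata(image: ImageRecord) -> Optional[str]:
    return "photo" if image.get("camera_model") else ABSTAIN


def majority_label(image: ImageRecord, lfs: List[LabelingFunction]) -> Optional[str]:
    """Combine the non-abstaining votes; abstain if no function voted."""
    votes = [lf(image) for lf in lfs]
    votes = [vote for vote in votes if vote is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


LFS = [lf_caption_mentions_invoice, lf_portrait_aspect_ratio, lf_camera_metadata]
scan = {"caption": "Scanned invoice page", "width": 800, "height": 1200}
label = majority_label(scan, LFS)
```

Because each function is ordinary code, the same set can be applied to millions of records without manual annotation, and a weak or wrong heuristic can be fixed once and rerun.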

Streamlined data annotation for SMEs (R2 release preview)

We’ve simplified the data annotation process even further, enabling SMEs to annotate for multiple tasks simultaneously in one unified project. This efficiency boost means faster preparation and analysis, and a more streamlined workflow for SMEs whose time is costly.

Wrapping up

This latest release of Snorkel Flow sets a new standard for enterprise AI development, combining robust security measures, advanced data connectivity, and seamless integrations to accelerate the customization of LLMs for production use cases.

With its powerful new features and enhancements, Snorkel Flow empowers enterprises to unlock the full potential of their proprietary data and build enterprise-grade AI solutions up to 100x faster.

Learn more about how Snorkel Flow’s new features can help you harness the power of LLMs and drive transformative results.

Nick Harvey
Director of Product Marketing
