
Discover what’s new in Snorkel Flow: Flexible data and LLM connectivity, secure data controls, and more!

April 24, 2024

Snorkel AI is excited to announce several new features in the latest release of Snorkel Flow, our programmatic AI data development platform. This release enables enterprises to accelerate the customization of large language models (LLMs) on their own data for production environments, adds retrieval augmented generation (RAG) features that power chunking and retrieval over long documents, and introduces support for a new data modality: images.

With a comprehensive suite of enterprise features spanning security, accessibility, and support for diverse data types, this latest version of Snorkel Flow empowers businesses to capitalize on the untapped potential of their data. By leveraging the platform’s new multimodal data support, organizations can now use their image data to fine-tune LLMs and drive strategic value through production-grade AI solutions.

Release Highlights: Flexible data and LLM connectivity

Snorkel Flow is the only trusted AI data development platform that can be used to fine-tune LLMs with any type of data from any enterprise source:

  • New LLM integrations for Google’s Gemini model family and Meta’s Llama 3 add to an existing library of native LLM integrations.  
  • New model deployment integrations with Databricks Unity Catalog, Google Cloud Vertex AI, and Microsoft Azure Machine Learning to accelerate shipping fine-tuned models to production.
  • New features for retrieval augmented generation (RAG) to power chunking and retrieval over long documents.
  • Multimodal (image/computer vision) use case support for programmatic data labeling to meet the wave of interest in other data modalities, such as image, video, and audio. 
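
To make the RAG item above concrete, here is a minimal Python sketch of what chunking and retrieval over a long document can look like. It is a generic illustration, not Snorkel Flow's implementation or API: the function names (chunk_text, retrieve), the word-based chunk sizes, and the bag-of-words cosine scoring are all assumptions chosen to keep the example dependency-free. A production system would more likely use learned embeddings.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks


def _bag_of_words(text: str) -> Counter:
    """Lowercased token counts (hypothetical stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks ranked by similarity to the query."""
    query_vec = _bag_of_words(query)
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, _bag_of_words(c)),
        reverse=True,
    )
    return ranked[:top_k]
```

The overlap between chunks is a design choice: it reduces the chance that a sentence relevant to a query is split across a chunk boundary, at the cost of storing some text twice.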

Raising the Standard on Enterprise Readiness

One of the leading features included in this release is the newly added Role-Based Access Controls (RBAC) to solve one of AI’s biggest challenges: protecting data. This new addition gives admins more control over who has access to what data and who can upload data for AI development.

With RBAC, enterprise admins can now regulate access to their data connectors, ensuring granular control over sensitive information and maintaining the highest standards of data security.
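
As a rough illustration of the idea, role-based access to data connectors can be modeled as a deny-by-default lookup from roles to permitted (connector, action) pairs. The sketch below is an assumption-laden example, not Snorkel Flow's RBAC model: the AccessPolicy class, the role names, the connector names, and the "read"/"upload" actions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical policy: maps a role to the (connector, action) pairs it may perform."""

    grants: dict[str, set[tuple[str, str]]] = field(default_factory=dict)

    def grant(self, role: str, connector: str, action: str) -> None:
        """Allow `role` to perform `action` on `connector`."""
        self.grants.setdefault(role, set()).add((connector, action))

    def is_allowed(self, user_roles: list[str], connector: str, action: str) -> bool:
        """Deny by default; allow only if one of the user's roles holds the grant."""
        return any(
            (connector, action) in self.grants.get(role, set())
            for role in user_roles
        )


# Example setup with made-up role and connector names.
policy = AccessPolicy()
policy.grant("data_admin", "claims_bucket", "read")
policy.grant("data_admin", "claims_bucket", "upload")
policy.grant("annotator", "claims_bucket", "read")
```

With this setup, a user holding only the annotator role could read from the hypothetical claims_bucket connector but not upload to it, and a user with no roles is denied everything.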

FM-Powered Document Intelligence Workflows

This latest release of Snorkel Flow also includes a foundation model-powered PDF workflow with a dedicated PDF prompting UI in Studio, allowing users to quickly get started by using prompting to label their PDFs.

This feature unlocks faster, smarter document processing. It enables users to apply the latest foundation models for immediate, intuitive PDF labeling and analysis, streamlining the extraction of valuable insights from complex documents.

Effortless integration with the multimodal AI stack

Also included in the release is access to the latest Google Gemini models and an enhanced SDK that allows customers to easily integrate with a wide spectrum of custom LLM services.

We also enhanced our integration with Databricks, ensuring seamless compatibility with the modern AI stack. Users can now deploy models to Databricks Unity Catalog, Google Cloud Vertex AI, and Microsoft Azure Machine Learning, simplifying the process of integrating Snorkel Flow with existing enterprise infrastructure and workflows.

Programmatic Image Classification

We’re also excited to introduce the ability to programmatically curate, annotate, and operate on images in this release, opening up new opportunities for valuable AI-driven insights and efficiency.

With this feature, currently in beta, users can accurately enrich millions of images using programmatic labeling functions, turning visual data into actionable insights through a streamlined curation and analysis process.
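
For readers new to programmatic labeling, the following Python sketch shows the general pattern: several small labeling functions each vote on a label or abstain, and the votes are combined. It is a generic illustration and not Snorkel Flow's image workflow or SDK. The image records are plain dictionaries with hypothetical fields (caption, width, height, camera_model), the heuristics are made up, and a simple majority vote stands in for whatever label aggregation the platform actually uses.

```python
from collections import Counter
from typing import Callable, Optional

ABSTAIN = None  # a labeling function returns this when it has no opinion
ImageRecord = dict  # hypothetical record of image metadata
LabelingFunction = Callable[[ImageRecord], Optional[str]]


def lf_caption_mentions_invoice(img: ImageRecord) -> Optional[str]:
    """Vote 'document' when the caption mentions an invoice."""
    return "document" if "invoice" in img.get("caption", "").lower() else ABSTAIN


def lf_tall_aspect_ratio(img: ImageRecord) -> Optional[str]:
    """Vote 'document' for tall, page-like images (made-up threshold)."""
    width, height = img.get("width", 0), img.get("height", 0)
    return "document" if width > 0 and height / width > 1.3 else ABSTAIN


def lf_has_camera_metadata(img: ImageRecord) -> Optional[str]:
    """Vote 'photo' when camera metadata is present."""
    return "photo" if img.get("camera_model") else ABSTAIN


def majority_vote(img: ImageRecord, lfs: list[LabelingFunction]) -> Optional[str]:
    """Combine labeling function votes; abstain on no votes or on a tie."""
    votes = [v for lf in lfs if (v := lf(img)) is not ABSTAIN]
    if not votes:
        return ABSTAIN
    (top_label, top_count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == top_count:
        return ABSTAIN  # tie between the two most common labels
    return top_label


LFS = [lf_caption_mentions_invoice, lf_tall_aspect_ratio, lf_has_camera_metadata]
```

Because each function encodes one heuristic, the same small set of functions can be applied to an entire image collection, which is what makes the approach scale better than labeling each image by hand.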

Streamlined data annotation for SMEs (R2 release preview)

We’ve simplified the data annotation process even further, enabling subject matter experts (SMEs) to annotate for multiple tasks simultaneously in one unified project. This efficiency boost means faster preparation and analysis, and it further streamlines workflows for SMEs, whose time is expensive.

Wrapping up

This latest release of Snorkel Flow sets a new standard for enterprise AI development, combining robust security measures, advanced data connectivity, and seamless integrations to accelerate the customization of LLMs for production use cases.

With its powerful new features and enhancements, Snorkel Flow empowers enterprises to unlock the full potential of their proprietary data and build enterprise-grade AI solutions up to 100x faster.

To learn more about how Snorkel Flow’s new features can help you harness the power of LLMs and drive transformative results, reach out to the Snorkel AI team.

Nick Harvey
Director of Product Marketing
