Snorkel Research Project

The Snorkel AI founding team started the Snorkel Research Project at the Stanford AI Lab in 2015, where we set out to explore a higher-level interface to machine learning through training data. The project was sponsored by Google, Intel, DARPA, and several other leading organizations, and the research was represented in over 40 academic conferences and journals, such as ACL, NeurIPS, and Nature.
The Snorkel Open Source Research Library was developed primarily from 2015 to 2017 as a prototyping tool. Unlike Snorkel Flow, it is not a comprehensive platform for AI development. It is a Python library that contains a legacy base class for defining code-based Labeling Functions (LFs) and some early algorithms for combining LF votes.
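To make the idea of code-based Labeling Functions and vote combining concrete, here is a minimal sketch in plain Python. It does not use the Snorkel library's API: the function names, the label constants, and the example texts are all hypothetical, and a simple majority vote stands in for the library's own combining algorithms.

```python
from collections import Counter

# Hypothetical label constants for a toy spam-detection task.
ABSTAIN = -1
HAM = 0
SPAM = 1


def lf_contains_link(text):
    """Vote SPAM when the text contains a URL; otherwise abstain."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    """Vote SPAM when the text mentions a prize; otherwise abstain."""
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_message(text):
    """Vote HAM for messages of four words or fewer; otherwise abstain."""
    return HAM if len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_message]


def apply_lfs(texts, lfs=LABELING_FUNCTIONS):
    """Build a label matrix: one row per text, one column per LF."""
    return [[lf(text) for lf in lfs] for text in texts]


def majority_vote(votes):
    """Combine one row of LF votes; abstain when there are no votes or a tie."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


texts = [
    "Claim your prize at http://example.com now",
    "See you at noon",
    "The quarterly report is attached for your review",
]
label_matrix = apply_lfs(texts)
labels = [majority_vote(row) for row in label_matrix]
print(labels)
```

Each LF encodes one heuristic and abstains when it does not apply, so any single data point may receive several votes, one vote, or none. The combining step turns those noisy votes into a single training label per data point.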

Snorkel Flow

Snorkel Flow is a platform built by the original creators of the Snorkel Research Project, incorporating years of experience from applying weak supervision and programmatic labeling concepts to real-world ML problems.

In Snorkel Flow, users can label and manage data programmatically, train models, and identify model error modes to iteratively improve them in a rapid, data-centric workflow, using both SDK and no-code interfaces.

This shortens the development cycle and improves application quality significantly while also making it easier to manage bias and adapt to changes in production data or business objectives.

Snorkel Flow is used by some of the world’s most advanced organizations in banking, insurance, biotech, and telecommunications, and by several government agencies.

Snorkel Flow: a data-centric AI platform

Feature evolution

Snorkel Flow is an enterprise-grade platform built to make the core concepts of the Snorkel Research Project and data-centric AI practical for the enterprise. With Snorkel Flow, enterprises build and deploy accurate and adaptable AI applications rapidly.


Programmatic labeling
Snorkel Research Project:
  Data scientists write Labeling Functions (LFs) in Python code
Snorkel Flow:
  Data scientists and domain expert users create LFs in a no-code, push-button UI
  UI-based analysis, feedback, and suggestions to guide iterative LF development
  Auto-suggest and auto-tuning of LFs
  Built-in interactive data visualization with support for building LFs by drawing directly on data plots

Training dataset management
Snorkel Research Project:
  Basic algorithms for denoising and combining LF outputs
Snorkel Flow:
  Advanced algorithms and automated tuning for denoising and combining LF outputs, including correlation analysis
  One-click LF execution with automated parallelization and label model optimization

Model training and analysis
Snorkel Research Project:
  Train custom models using Python
Snorkel Flow:
  One-click training and tuning of pre-configured, state-of-the-art models via the built-in model zoo
  Auto-generated UI-based model analysis with suggestions for model and LF improvement

Application and model serving
Snorkel Research Project:
  Train custom models using Python
Snorkel Flow:
  One-click endpoint creation for model/application serving

Deployment and security
Snorkel Flow:
  REST API, monitoring services, and managed workers for job execution
  Snorkel AI-hosted and managed hybrid cloud (AWS) deployment
  Support for distributed deployment via Kubernetes
  Encryption (in-transit and at-rest), authentication, and role-based access control (RBAC)

Training and support
