Snorkel Flow


The data-first AI platform powered by programmatic labeling. Turn data into accurate and adaptive applications—fast.








Technology developed and deployed with the world’s leading organizations



How Snorkel Flow works —




Powered By Programmatic Labeling




Label data programmatically, train models efficiently, improve performance iteratively, and deploy applications rapidly—all in one platform. 





01


Label & Build
Label and build training data programmatically in hours instead of spending months hand-labeling

02


Integrate & Manage
Automatically clean, integrate, and manage programmatic training data from all sources

03


Train & Deploy
Train and deploy state-of-the-art machine learning models in-platform or via Python SDK

04


Analyze & Monitor
Analyze and monitor model performance to rapidly identify and correct error modes in the data






01 Label & Build


Rather than hand-labeling thousands of data points, use Data Studio to programmatically label massive amounts of training data using labeling functions (rules, heuristics, and other custom complex operators) via a push-button UI or the Python SDK in integrated notebooks. Get started quickly with ready-made labeling function (LF) builders, data exploration tools, or auto-suggest features. Receive instant feedback with coverage and accuracy estimates of your LFs to develop a high-quality training data set.
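
To make the idea of a labeling function concrete, here is a minimal sketch in plain Python. It does not use the Snorkel Flow SDK; the label values, function names, and sample texts are all hypothetical and chosen only for illustration. Each function votes for a label or abstains, and coverage is the fraction of data points on which a function does not abstain.

```python
import re

# Hypothetical label values for a spam task; ABSTAIN means "no vote".
ABSTAIN, HAM, SPAM = -1, 0, 1

def lf_keyword_free(text):
    """Keyword-style LF: vote SPAM if the word 'free' appears."""
    return SPAM if re.search(r"\bfree\b", text, re.IGNORECASE) else ABSTAIN

def lf_regex_url(text):
    """Regex-style LF: vote SPAM if the text contains a URL."""
    return SPAM if re.search(r"https?://", text) else ABSTAIN

def lf_short_message(text):
    """Numerical-style LF: vote HAM for messages of four words or fewer."""
    return HAM if len(text.split()) <= 4 else ABSTAIN

LFS = [lf_keyword_free, lf_regex_url, lf_short_message]

def apply_lfs(texts, lfs=LFS):
    """Return one row of votes per text, one column per labeling function."""
    return [[lf(t) for lf in lfs] for t in texts]

def coverage(votes):
    """Fraction of rows on which each labeling function did not abstain."""
    n = len(votes)
    return [sum(row[j] != ABSTAIN for row in votes) / n
            for j in range(len(votes[0]))]

docs = [
    "Claim your FREE prize at http://example.com now",
    "See you at lunch",
    "Free shipping on all orders this week only",
    "Can you review the attached quarterly report before Friday?",
]
votes = apply_lfs(docs)
print(coverage(votes))
```

Low coverage on a function is not necessarily a problem; a narrow rule can still be useful if its votes are reliable.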


Ready-made Labeling Function Builders


REGEX
KEYWORD
NUMERICAL
CROWDWORKERS
DICTIONARY






02 Integrate & Manage


Snorkel Flow automatically learns the different labeling functions’ accuracies, denoises and integrates them, and stores versioned LF packages and training data in Data Manager. Unlike with hand-labeled data, you create training data in Snorkel Flow using code, so you can audit, modify, or serve it almost instantly. Snorkel Flow makes it easy to share resources, both LFs and training data, with others on the team, so there is no need to reinvent the wheel.
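
As a rough illustration of combining noisy labeling function votes into one training label, the sketch below uses a simple majority vote. This is a stand-in technique, not the method Snorkel Flow uses to learn LF accuracies. The vote matrix and function names are hypothetical. The agreement rate it reports is only a crude proxy for accuracy, since it compares each LF against the combined label rather than against ground truth.

```python
from collections import Counter

ABSTAIN = -1  # hypothetical "no vote" value

def majority_vote(row):
    """Most common non-abstain vote, or ABSTAIN when there are no votes or a tie."""
    counts = Counter(v for v in row if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

def agreement_rates(votes, labels):
    """Per LF: share of its non-abstain votes that match the combined label."""
    rates = []
    for j in range(len(votes[0])):
        pairs = [(row[j], y) for row, y in zip(votes, labels)
                 if row[j] != ABSTAIN and y != ABSTAIN]
        rates.append(sum(v == y for v, y in pairs) / len(pairs) if pairs else None)
    return rates

# Rows are data points, columns are three hypothetical labeling functions.
votes = [
    [1, 1, -1],
    [1, 0, 1],
    [0, 0, -1],
    [-1, 1, 0],
    [-1, -1, -1],
]
labels = [majority_vote(row) for row in votes]
rates = agreement_rates(votes, labels)
print(labels, rates)
```

Rows that end in a tie or receive no votes stay unlabeled, which is one reason adding or refining LFs tends to grow the usable training set.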








03 Train & Deploy


Train state-of-the-art ML models at the push of a button, or use the Python SDK in integrated notebooks to plug into your existing modeling pipelines. Snorkel Flow provides access to popular open-source model libraries that you can train on CPU- or GPU-based computing infrastructure. Snorkel Flow makes it easy to tune your models with automated hyperparameter search. Deploy these high-accuracy models immediately as real-time or batch APIs or via the SDK.
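
For readers unfamiliar with hyperparameter search, the following is a minimal sketch of an exhaustive grid search in plain Python. It is not Snorkel Flow's search procedure; the toy validation set, the threshold classifier, and the parameter names are assumptions made up for this example.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Score every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score

# Toy validation set of (feature value, true label) pairs.
VALID = [(0.1, 0), (0.35, 0), (0.4, 1), (0.6, 1), (0.8, 1), (0.2, 0)]

def validation_accuracy(threshold, invert):
    """Accuracy of a one-feature threshold classifier on VALID."""
    correct = 0
    for x, y in VALID:
        pred = int(x >= threshold)
        if invert:
            pred = 1 - pred
        correct += pred == y
    return correct / len(VALID)

best_params, best_score = grid_search(
    {"threshold": [0.25, 0.375, 0.5, 0.75], "invert": [False, True]},
    validation_accuracy,
)
print(best_params, best_score)
```

Grid search is the simplest option; with many parameters, random or adaptive search usually covers the space with far fewer trials.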


Snorkel Flow Model Zoo


SCIKIT-LEARN
HUGGINGFACE TRANSFORMERS
XGBOOST
TENSORFLOW
CUSTOM






04 Analyze & Monitor


Snorkel Flow includes several commonly used and custom analysis tools to compare multiple models over different data splits. It offers suggestions on improving model quality by adding or editing LFs or optimizing the model to target specific errors. You can also monitor performance drift in LFs or the model and rapidly adapt to changes without relabeling from scratch. The result: AI development is now an iterative process rather than a one-and-done exercise that leaves performance on the table.
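
One simple way to picture drift monitoring for labeling functions is to compare each LF's coverage on a reference split against a newer split. The sketch below assumes this coverage-based check; it is an illustration only and not how Snorkel Flow detects drift. The vote matrices, the function names, and the tolerance value are hypothetical.

```python
ABSTAIN = -1  # hypothetical "no vote" value

def coverage(votes):
    """Fraction of rows on which each labeling function did not abstain."""
    n = len(votes)
    return [sum(row[j] != ABSTAIN for row in votes) / n
            for j in range(len(votes[0]))]

def coverage_drift(reference_votes, current_votes, tolerance=0.2):
    """Indices of LFs whose coverage changed by more than tolerance."""
    ref, cur = coverage(reference_votes), coverage(current_votes)
    return [j for j, (r, c) in enumerate(zip(ref, cur)) if abs(r - c) > tolerance]

# Rows are data points, columns are two hypothetical labeling functions.
reference = [[1, -1], [1, 0], [-1, 0], [1, -1]]  # coverage 0.75 and 0.5
current = [[-1, -1], [-1, 0], [1, 0], [-1, -1]]  # coverage 0.25 and 0.5
drifted = coverage_drift(reference, current)
print(drifted)
```

A flagged LF points to the rule that needs editing, so the fix is a code change followed by relabeling rather than a fresh round of hand-labeling.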


















Why Snorkel Flow —



Data-First AI Development




Rather than relying solely on generic third-party models, brittle rule-based systems, or armies of human labelers, accelerate AI development with a new, data-first approach using Snorkel Flow.


With Snorkel Flow

  • Customize state-of-the-art models by training with your data & adapt to changing data or goals with a few lines of code.
  • Leverage cutting-edge ML to go beyond simple rules, and retain the flexibility to audit and adapt.
  • Label thousands of data points programmatically in hours, while keeping your data in-house and private.


With Conventional Approaches

  • Black box models or APIs ignore the nuances of your data and objectives, and offer no way to customize, adapt, or audit their behavior.
  • Rules-based approaches often don’t generalize as well as ML models on complex data or adapt easily to changing data or goals.
  • Hand-labeled ML is notoriously expensive & slow, with limited ability to iterate, adapt, audit, or remain privacy compliant.






Features —


Enterprise-Grade Capabilities


Advanced features to foster collaboration across roles, from data scientists and developers to subject matter experts, and leverage data at enterprise scale to build highly accurate models.










Application Studio*


Application Studio is a visual builder with pre-built solutions for industry-specific use cases and common AI tasks, giving you a head start developing ML-based applications over your data. Packaged application-specific pre- and post-processors, labeling functions, and models, along with a library of operator DAGs, make customizing applications as easy as dragging and dropping new operators into the application flow.

*Application Studio is in preview and will be generally available later in 2021.








Easy Integrations & Interoperability



Integrating Snorkel Flow with other machine learning and data systems is as easy as writing a line of Python. Quickly integrate your existing training labels, data, models, and full applications using Snorkel Flow’s SDK and API access points at all stages of the development and deployment pipeline.







Diverse Data Types


Snorkel’s technology has been proven to work with a wide range of data types, including cross-modal data, enabling solutions for use cases that weren’t possible before.


TEXT
DOCUMENTS
TIME SERIES
PDF
TABLES
FORMS
WEBPAGES
NETWORK DATA






Flexible UI for All Teams


Snorkel Flow’s no-code GUI, Python SDK, and developer APIs are fully interoperable, so your entire team, from engineers and data scientists to subject matter experts, can collaborate in the development workflow.








Secure, Scalable Deployment


A range of enterprise-grade, secure deployment options is available, including single-node or distributed deployment on Kubernetes, based on your needs.


SECURE PRIVATE CLOUD/ON-PREM
SECURE PUBLIC CLOUD
SECURE SNORKEL CLOUD







User community —


What the Data Scientists Are Saying










Solutions —



AI Use Cases


Build and deploy use cases previously blocked by a lack of training data by combining state-of-the-art ML with industry-specific best practices and flexible API-based integrations using Snorkel Flow.



INSURANCE



Risk Classification

Classify policy documents on the basis of behavior or occupation to assess risk.

TELECOM & CYBER



Customer Segmentation

Build customized promotions by analyzing customer behavior and demographics.

HEALTHCARE



Clinical Trial Matching

Determine clinical trial candidates by categorizing patient records.

FINANCIAL SERVICES




News Analytics


Extract entities, events, and relationships to improve investment and risk strategies and more.

TELECOM & CYBER




Interaction Analytics


Understand every customer interaction deeply by analyzing chats, emails, and tickets.

FINANCIAL SERVICES




Financial Spreading


Manage credit risk by collecting financial and non-financial data in any format from statements.

SOFTWARE



Search Engine Optimization

Identify named entities in customer search queries and optimize content on websites.

RETAIL



Product Recommendation

Enhance recommender systems by identifying entities (price, keywords, etc.) in product descriptions.

FINANCIAL SERVICES



Contract Intelligence

Extract and organize data from a wide variety of complex contracts efficiently.

RETAIL




Product Catalogs


Extract product attributes from tables, lists, and forms for cataloging.

SOFTWARE




Email Filtering & Routing


Classify emails to remove spam and route queries to the correct channels.