Case Studies




Technology proven in production at some of the world’s leading organizations.







Technology developed and deployed with the world’s leading organizations



Google

Google used Snorkel to replace 100K+ hand-annotated labels in critical ML pipelines.



Problem




Content, product, and event classification problems change too fast to hand-label, even with significant annotation budget.

Solution




We deployed early versions of Snorkel Flow's core technology with three high-impact teams at Google, repurposing many organizational resources as labeling functions.
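As a rough illustration of what a labeling function is, the sketch below writes three small heuristics that each vote on a label or abstain, then combines the votes. Everything in it is hypothetical: the label names, the heuristics, and the `majority_label` helper are invented for this example and are not taken from the Google deployment. A simple majority vote stands in for whatever aggregation a real system would use.

```python
from collections import Counter

# Hypothetical label values for a toy "is this a product listing?" task.
ABSTAIN, NOT_PRODUCT, PRODUCT = -1, 0, 1


def lf_mentions_price(text):
    # Assumed heuristic: a dollar sign suggests a product listing.
    return PRODUCT if "$" in text else ABSTAIN


def lf_news_words(text):
    # Assumed heuristic: reporting verbs suggest news, not a listing.
    words = ("reported", "announced")
    return NOT_PRODUCT if any(w in text.lower() for w in words) else ABSTAIN


def lf_buy_phrase(text):
    # Assumed heuristic: a call to action suggests a product listing.
    return PRODUCT if "buy now" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_price, lf_news_words, lf_buy_phrase]


def majority_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]
```

Each heuristic is cheap to write and noisy on its own; the point is that many of them applied to unlabeled data can stand in for hand annotation, and they can be edited in minutes when the classification problem changes.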

Results




Hours of labeling function development replaced 10-100K+ hand labels, significantly improving the bottom line and accelerating the adoption of ML solutions.

6 MONTHS
of hand-labeling data replaced in 30 mins
52%
Performance improvement
100K
hand labels replaced with programmatic approach






Top U.S. Bank

A top U.S. bank uses Snorkel Flow to quickly build AI applications that classify and extract information from their documents.



Problem




The bank estimated that, for a time-sensitive use case, hand-labeling data would take over a month.

Solution




With Snorkel Flow, the team produced a solution that was over 99% accurate in under 24 hours.

Results




The resulting AI application could be quickly and easily adapted to new problems and business lines.

99.1%
Snorkel Flow Accuracy
<24hrs
From problem start
>250K
Documents processed







Apple


Apple built applications with an internal Snorkel-based system that answered billions of queries in multiple languages and processed trillions of records with up to 2.9x fewer errors.




Problem




Apple needed a system that supported engineers facing contradictory or incomplete supervision data.


Solution




Apple built a solution called Overton which utilized Snorkel’s framework of weak supervision to overcome cost, privacy, and cold-start issues.


Results




Overton achieved a 12%+ bump in F1 score by going from 30K to 1M data labels. 


12%

bump in F1 score

2.9x

fewer errors with Snorkel-based applications

32x

more labels generated








Fortune 500 Biotech


A Fortune 500 biotech pioneer leveraged Snorkel Flow to extract critical chronic disease data from clinical trials, accurately processing 300K documents in minutes.




Problem




Building AI applications to extract entities requires deep domain expertise and large amounts of labeled training data, which are expensive and time-consuming to obtain.

Solution





The team used Snorkel Flow to build a custom model with 99.1% accuracy, with label schema adjustments and re-labeling done in hours.

Results




With Snorkel Flow, this biotech giant programmatically labeled ~300K documents in minutes rather than labeling them manually, saving $10M in costs.


$10M
Saved on labeling for extraction
99.1%
Accuracy on complex ML pipeline
1 Day
Vs. 1 year to adjust label schema






Intel

Intel used Snorkel to replace a high-cost, high-latency crowdsourcing pipeline and accelerate sales and marketing agents.




Problem




Rapidly changing sales goals make social media monitoring difficult to maintain.

Solution




Intel deployed a prototype version of Snorkel (Snorkel Osprey) to replace months of crowdworker labeling with cheap, fast programmatic labeling.


Results




Better performance and major cost savings in Sales & Marketing and Advanced Analytics.

6 MONTHS
of crowdworker labels replaced
+18.5
Performance improvement
+28.5
Coverage percentage points







Fortune 500 Telco


A Fortune 500 telecom provider used Snorkel Flow to classify encrypted network data flows into their associated application categories.




Problem




AI-enabled network applications are blocked by a lack of training data, which is typically slow to create and requires network expertise.


Solution




They deployed Snorkel’s unique programmatic labeling to precisely classify network traffic, while taking advantage of unlabeled/partially labeled data.


Results




The telco trained 200K labels in hours and achieved +25% accuracy above ground truth baseline, all using Snorkel’s comprehensive network data exploration and analysis tools.


200K

Labels trained in hours

+25%

Accuracy above ground truth baseline

+75%

Accuracy improved on critical data slice









Fortune 50 Bank


A Fortune 50 bank achieved a 25+ point performance gain over a black box vendor solution for a news analytics application with Snorkel Flow in just a few weeks.




Problem




The bank needed an accurate way to tag companies in unstructured news text, link them to identifiers (e.g., stock tickers), and classify mentions by sentiment and other aspects. 


Solution




The bank used Snorkel Flow to develop an AI-powered news analytics application that monitors target companies' press coverage in unstructured data feeds. 


Results




With Snorkel Flow, the team achieved a 25+ point performance gain over a legacy vendor system and internal heuristic approaches.


45x

faster compared to hand-labeling

+90

F1 score for news analytics application

25+

point performance gain over black box vendor system







Stanford Medicine

Researchers at Stanford Medicine used Snorkel to label medical imaging & monitoring datasets, replacing person-years of hand labeling with several hours of using Snorkel.




Problem




Labeling training data for triaging models takes person-months to person-years of radiologist time.

Solution




We deployed a cross-modal Snorkel pipeline, matching or exceeding the performance of painstakingly gathered manual labels in hours.

Results




The pipeline is currently being tested for deployment in Stanford and Department of Veterans Affairs (VA) hospital systems.


8
Person-months of labeling replaced
94%
ROC AUC Performance
50K+
Images labeled in minutes







Global Financial Services Leader


A global financial services leader extracts financial information from PDFs with 99% accuracy in milliseconds using a financial spreading application built with Snorkel Flow.




Problem




The bank needed to extract structured financial data from the balance sheets and income statements (hOCR PDF) in private company financials.


Solution




The bank used Snorkel Flow to develop an AI-powered financial spreading application that parses textual and spatial/visual data features. 


Results




With Snorkel Flow, the team achieved superior performance with greater generalizability (2x coverage) compared to a purely rules-based approach.


2x

coverage compared to rules-based approach

99%

extraction accuracy

45x

faster compared to hand-labeling







Tide

Tide, a UK-based fintech company, used Snorkel to match receivable invoices from the mobile app with incoming transactions.



Problem




Tide needed labeled matches between invoices and transactions, which required investing highly paid subject matter experts’ time in hand-labeling historical data.

Solution




Tide used Snorkel to programmatically label data, extract information, and harness business knowledge by creating labeling functions.

Results




Tide achieved 97.6% accuracy in detecting the transactions made for a particular invoice, and created its training data programmatically, replacing 1,000 hours of hand labeling.

15
Days to create training dataset & deploy model
97%
ML model accuracy
5M
Invoices processed