Case studies

Technology proven in production at some of the world’s leading organizations.



Case study

Google

Google used Snorkel to replace 100K+ hand-annotated labels in critical ML pipelines for text classification.



Problem

Content, product, and event classification problems change too fast to hand-label, even with significant annotation budget.

Solution

Google deployed early versions of Snorkel's core technology with three high-impact teams, repurposing many resources as labeling functions.

Results

Hours of labeling function development replaced 10-100K+ hand labels, significantly impacting the bottom line and accelerating ML adoption.

6 months

of hand-labeling data replaced in 30 mins

52%

performance improvement

100K+

hand labels replaced with a programmatic approach



Case study

Top U.S. bank

A top U.S. bank uses Snorkel Flow to quickly build AI applications that classify and extract information from their documents.



Problem

The bank estimated that, for a time-sensitive use case, hand-labeling data would take over a month.

Solution

With Snorkel Flow, the team produced a solution that was over 99% accurate in under 24 hours.

Results

The resulting AI application could be quickly and easily adapted to new problems and business lines.

99.1%

Snorkel Flow accuracy

<24hrs

from problem start

>250K

documents processed



Case study

Apple

Apple built applications with an internal Snorkel-based system that answered billions of queries in multiple languages and processed trillions of records with up to 2.9x fewer errors.



Problem

Apple needed a system that supported engineers facing contradictory or incomplete supervision data.

Solution

Apple built a solution called Overton, which used Snorkel’s weak supervision framework to overcome cost, privacy, and cold-start issues.

Results

Overton achieved a 12%+ bump in F1 score by going from 30K to 1M data labels.

12%

bump in F1 score

2.9x

fewer errors with Snorkel-based applications

32x

more labels generated



Case study

Fortune 500 biotech

A Fortune 500 biotech pioneer leveraged Snorkel Flow to extract critical chronic disease data from clinical trials, accurately processing 300K documents in minutes.



Problem

Building AI applications to extract entities requires deep domain expertise and large amounts of labeled training data, which is expensive and time-consuming to produce.

Solution

With Snorkel Flow, they built a custom model with 99.1% accuracy by adjusting the label schema and re-labeling programmatically.

Results

With Snorkel Flow, this biotech giant programmatically labeled ~300K documents in minutes instead of labeling them manually, saving $10M in costs.

$10M

saved on labeling for extraction

99.1%

accuracy on complex ML pipeline

1 day

vs. 1 year to adjust label schema



Case study

Intel

Intel used Snorkel to replace a high-cost, high-latency crowdsourcing pipeline and accelerate sales and marketing agents.



Problem

Rapidly changing sales goals make social media monitoring difficult to maintain.

Solution

Intel deployed a prototype version of Snorkel (Snorkel Osprey) to replace crowdworker labels that took months to collect with programmatically generated labels.

Results

Better performance and major cost savings in Sales & Marketing and Advanced Analytics.

6 months

of crowdworker labels replaced

+18.5

point performance improvement

+28.5

coverage percentage points



Case study

Fortune 500 telecom

A Fortune 500 telecom provider used Snorkel Flow to classify encrypted network data flows into their associated application categories.



Problem

AI-enabled network applications are blocked by the lack of training data, which is typically slow and time-consuming to create and requires network expertise.

Solution

They used Snorkel’s programmatic labeling to precisely classify network traffic, taking advantage of unlabeled/partially labeled data.

Results

The telco trained 200k labels in hours and achieved +25% accuracy above their ground truth baseline using Snorkel Flow’s comprehensive network data exploration and analysis tools.

100k

labels trained in hours

+25%

accuracy above ground truth baseline

+75%

accuracy improved on critical data slice



Case study

Fortune 50 bank

In just weeks, a Fortune 50 bank achieved a 25+ point performance gain over a black box vendor solution for a news analytics application with Snorkel Flow.



Problem

The bank needed an accurate way to tag companies in unstructured news text, link them to identifiers (e.g., stock tickers), and classify mentions by sentiment and other aspects.

Solution

The bank used Snorkel Flow to develop an AI-powered news analytics application that monitors target companies' press coverage in unstructured data feeds.

Results

With Snorkel Flow, the team achieved a 25+ point performance gain over a legacy vendor system and internal heuristic approaches.

45x

faster compared to hand-labeling

+90

F1 score for news analytics application

+25

point performance gain over black box vendor system



Case study

Stanford Medicine

Researchers at Stanford Medicine used Snorkel to label medical imaging and monitoring datasets, replacing person-years of hand-labeling with several hours of using Snorkel.



Problem

Labeling training data for triaging models takes person-months to person-years of radiologist time.

Solution

Stanford Medicine deployed a cross-modal Snorkel pipeline, matching or exceeding the performance of painstakingly gathered manual labels in hours.

Results

The pipeline is currently being tested for deployment in hospital systems at Stanford and the Department of Veterans Affairs (VA).

8

person-months of labeling replaced

94%

ROC AUC performance

50k+

images labeled in minutes



Case study

Global financial services leader

A global financial services leader extracts financial information from PDFs with 99% accuracy in milliseconds using a financial spreading application built with Snorkel Flow.



Problem

The bank needed to extract structured financial data from the balance sheets and income statements (hOCR PDF) of private companies.

Solution

The bank used Snorkel Flow to develop an AI-powered financial spreading application that parses textual and spatial/visual data features.

Results

With Snorkel Flow, the team achieved superior performance with greater generalizability (2x coverage) compared to a purely rules-based approach.

2x

coverage compared to rules-based approach

99%

extraction accuracy

45x

faster compared to hand-labeling



Case study

Tide

A UK-based fintech company used Snorkel to match receivable invoices from the mobile app with incoming transactions.



Problem

Tide needed to label which invoices matched which transactions, a task that required highly paid subject matter experts to spend their time hand-labeling historical data.

Solution

Tide used Snorkel to programmatically label data, extract information, and harness business knowledge by creating labeling functions.

Results

Tide achieved 97.6% accuracy in detecting the transactions made for a particular invoice, and created training data programmatically, replacing 1,000 hours of hand-labeling.

15

days to create training dataset and deploy model

97.6%

ML model accuracy

5M

invoices processed

