Document Classification




Build AI-powered document classification applications in a fraction of the time without hand-labeling data using Snorkel Flow.







Technology developed and deployed with the world’s leading organizations



Overview —

One Size Fits You, Not All


Achieve greater performance gains by exploiting domain-specific text features of your own data.



Faster, Lower-cost Development
Use programmatic labeling to develop high-quality AI applications in hours instead of spending weeks or months on expensive hand-labeling.
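As a rough illustration of what programmatic labeling means, the sketch below writes a few labeling functions (small heuristics that either vote for a label or abstain) and combines their votes by simple majority. This is plain Python, not the Snorkel Flow API: the function names, keywords, and label set are hypothetical, and majority vote is a stand-in for the more sophisticated vote-combining models used in practice.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical labeling functions: each one encodes a single heuristic.
def lf_mentions_invoice(text):
    return "INVOICE" if "invoice" in text.lower() else ABSTAIN

def lf_mentions_agreement(text):
    return "CONTRACT" if "agreement" in text.lower() else ABSTAIN

def lf_mentions_amount_due(text):
    return "INVOICE" if "amount due" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [
    lf_mentions_invoice,
    lf_mentions_agreement,
    lf_mentions_amount_due,
]

def weak_label(text):
    """Majority vote over non-abstaining functions; abstain on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]
```

Running `weak_label` over an unlabeled corpus yields a noisy training set without any hand annotation; documents on which every heuristic abstains or the vote ties are simply left unlabeled.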
Higher-accuracy Models
Iterate on your application, using a closed-loop approach with intermediate results and analysis at every step to zero in on errors.
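One way to picture the "zero in on errors" step: after each iteration, compare predictions against a small set of trusted labels and group the mistakes by (true label, predicted label) pair, so the most common confusion is addressed first. The sketch below is a minimal, assumed version of that analysis in plain Python; the function name and tuple layout are hypothetical and not part of any Snorkel product.

```python
from collections import Counter

def error_buckets(examples):
    """Count misclassifications by (true_label, predicted_label) pair.

    `examples` is an iterable of (text, true_label, predicted_label) tuples.
    Returns a list of ((true, predicted), count), most frequent first.
    """
    buckets = Counter()
    for _text, true_label, predicted in examples:
        if true_label != predicted:
            buckets[(true_label, predicted)] += 1
    return buckets.most_common()
```

The largest bucket points at where a new or revised labeling heuristic is most likely to pay off in the next iteration.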
Flexible Integrations
Easily integrate labeling, training, and analysis pipelines defined over diverse input types (text, PDF, HTML, and more) with downstream applications using APIs or a Python SDK.
Easier SME Collaboration
Build complex classification apps intuitively, collaborating with subject matter experts (SMEs) to preserve the natural structure of your data taxonomies.






Industry Use Cases —

Explore Enterprise Solutions For Classification


Build industry-specific AI applications combining state-of-the-art machine learning approaches with industry-specific best practices and last-mile connectors, all on an enterprise-scale platform.



FINANCIAL SERVICES



Contract Intelligence

Banks can classify contracts by terms and conditions to ensure regulatory compliance.
TELECOM & CYBER



Customer Segmentation

Telecom organizations can classify customer usage documents to target promotional offers.
HEALTHCARE



Clinical Trial Matching

Biotech organizations can classify patient records to identify actionable clinical trial candidates.
INSURANCE



Risk Classification

Insurance underwriters can classify policy documents by behavioral or occupational variables to assess risk.
SOFTWARE



Search Engine Optimization

Software companies can recognize named entities in customer search queries to optimize website content.
RETAIL



Product Recommendation

E-commerce sites can recognize entities in product descriptions (price, keywords, etc.) to improve recommender systems.






Case Study —

Google used Snorkel to replace 100K+ hand-annotated labels in critical ML pipelines for text classification.



Problem




Content, product, and event classification problems change too fast to hand-label, even with a significant annotation budget.

Solution




Google deployed early versions of Snorkel's core technology with three high-impact teams, repurposing many resources as labeling functions.

Results




Hours of labeling function development replaced 10-100K+ hand labels, significantly impacting the bottom line and accelerating ML adoption.

6 MONTHS of hand-labeling data replaced in 30 mins
<0hrs to develop the first custom ML model
52% performance improvement
+0% accuracy for contract classification
100K+ hand labels replaced with programmatic approach
0K contracts processed in minutes







An End-to-end ML Platform —

Designed for Collaboration





For Data Scientists


  • Ready-to-use model zoo
  • Auto-generated analysis tools
  • Integrated Python notebooks

For Domain Experts


  • Rich data annotation suite
  • Intuitive, no-code labeling UI
  • Model error analysis reports

For Developers


  • Fully interoperable API and web UI
  • Write custom operators with Python SDK
  • Integrations to deploy models at scale






Resources —

Explore More About Snorkel


Learn more about groundbreaking techniques for programmatic labeling and weak supervision developed by Team Snorkel and the broader data science community.