of crowd-worker labels replaced in a fraction of the time using Snorkel
Targeted Applications to Tackle Any Entity
Train custom, high-accuracy NER models on your data without hand-labeling.
Faster, Lower-cost Development
Use programmatic labeling to develop high-quality AI applications in hours instead of spending weeks or months on expensive hand-labeling.
Monitor for changes in the data, and rapidly adapt using built-in error analysis tools. Zoom in on errors to fine-tune training data & models with guided iteration.
Leverage large amounts of labeled and unlabeled data, NLP primitives, and state-of-the-art model architectures to build high-accuracy models.
Easily integrate labeling, training, and analysis pipelines defined over diverse input types (text, PDF, HTML, and more) with downstream applications using APIs or a Python SDK.
NER Customized for Your Workflow
Banks can extract entities like client ID, IBAN, and transaction details to automate account identification.
TELECOM & CYBER
Malicious Attack Prevention
Cybersecurity teams can automatically identify and track CVEs in real time.
Hospitals can find entities in patient records to identify health patterns and improve diagnoses.
Insurance firms can recognize entities like insured person, loss amount, and policyholder to process claims faster.
Search Engine Optimization
Software companies can recognize named entities in customer search queries to optimize website content.
Intel used Snorkel to replace a high-cost, high-latency crowdsourcing pipeline and accelerate sales & marketing agents.
Deployed Snorkel to replace a months-long crowd-worker effort with cheap, fast, template-based programmatic labeling.
Better performance and major cost savings for sales & marketing and customer analytics.
of crowd-worker labels replaced
precision percentage points
coverage percentage points
How Snorkel Flow Works
Snorkel Flow allows you to recognize entities with ease and accuracy across an extensive collection of texts and documents. Collect and clean data with built-in pre-processors and taggers. Programmatically label documents with custom labeling functions. Build end-to-end pipelines using your custom-trained models or tailored heuristics to perform entity tagging, linking, or classification. Deploy your NER application or seamlessly integrate it into downstream NLP tasks.
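To make the idea of a labeling function concrete, here is a minimal sketch in plain Python. It does not use the Snorkel Flow SDK or reproduce its API: the function names (`lf_cve_id`, `lf_cve_prefix`, `lf_iban`, `label_spans`), the regular expressions, and the simple majority vote are all assumptions chosen for illustration. Each labeling function proposes entity spans, and the votes for each span are combined into a single label.

```python
import re
from collections import Counter

# Hypothetical labeling functions (illustrative only, not the Snorkel Flow API).
# Each one takes raw text and returns a list of (start, end, label) spans.

def lf_cve_id(text):
    # CVE identifiers look like CVE-<4-digit year>-<4 or more digits>.
    return [(m.start(), m.end(), "CVE")
            for m in re.finditer(r"\bCVE-\d{4}-\d{4,}\b", text)]

def lf_cve_prefix(text):
    # Noisier heuristic: any token that starts with "CVE-".
    return [(m.start(), m.end(), "CVE")
            for m in re.finditer(r"\bCVE-[\w-]+", text)]

def lf_iban(text):
    # Rough IBAN shape: country code, two check digits, 11 to 30 alphanumerics.
    # This checks the shape only; it does not validate the checksum.
    return [(m.start(), m.end(), "IBAN")
            for m in re.finditer(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b", text)]

LABELING_FUNCTIONS = [lf_cve_id, lf_cve_prefix, lf_iban]

def label_spans(text, labeling_functions):
    """Combine labeling-function votes with a simple majority vote per span."""
    votes = {}
    for lf in labeling_functions:
        for start, end, label in lf(text):
            votes.setdefault((start, end), Counter())[label] += 1
    # For each span, keep the label that received the most votes.
    return [
        (start, end, text[start:end], counter.most_common(1)[0][0])
        for (start, end), counter in sorted(votes.items())
    ]

if __name__ == "__main__":
    sample = "Patch CVE-2021-44228 before wiring funds to DE89370400440532013000."
    for span in label_spans(sample, LABELING_FUNCTIONS):
        print(span)
```

In practice, heuristics like these disagree and overlap, so a weak-supervision system would typically replace the majority vote with a learned model of each function's accuracy. The vote here is only the simplest stand-in.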
An End-to-end ML Platform
Designed for Collaboration
Data Scientist Friendly
- Integrated Jupyter notebooks
- Instant analysis tools
- Ready-to-use models
Domain Expert Friendly
- Intuitive, no-code UI
- Rich dashboards and visualizations
- Full-featured, push-button error analysis
- Platform access via Python SDK
- Online or batch API deployment
- Containerized software for cloud or on-premises deployments
Trove: Ontology-driven Weak Supervision for Medical Entity Classification
Train and You’ll Miss It: Interactive Model Iteration with Weak Supervision… (M. Chen, et al., 2020)
The Role of Massively Multi-Task and Weak Supervision in Software 2.0 (A. Ratner, et al., CIDR 2019)
SwellShark: A Generative Model for Biomedical NER without Labeled Data (J. Fries, et al., 2017)
Slice-based Learning: A Programming Model for Residual Learning… (V. Chen, et al., NeurIPS 2019)
Medical Device Surveillance With Electronic Health Records (A. Callahan, et al., NPJ Digital Medicine 2019)