Snorkel AI Raises $85M Series C at $1B Valuation for Data-Centric AI
We started the Snorkel project at the Stanford AI lab in 2015 around two core hypotheses:
- As models became increasingly powerful and commoditized, success or failure in AI was going to be all about the training data, and as a result, AI development was going to shift from being model-centric to data-centric.
- If AI development was going to be data-centric, tasks like labeling, augmenting, slicing, cleaning, and monitoring data would all have to be increasingly programmatic for AI to be as practical and accessible as any other type of software development.
Over half a decade later, and two years since the start of our journey as a company, we couldn’t be more excited or humbled to see these bets become so relevant and resonant. Today, an increasing number of organizations see that data is the arbiter not only of their AI success or failure but also of their risk, governance and privacy compliance, fairness and equitability, and agility. And, as data moves to the forefront, so too does the pain of labeling and managing it all by hand. Even the largest organizations in the world are blocked from using AI when person-months of manual effort are needed every time a model needs to be built or updated.
At the forefront of this shift from model-centric to data-centric AI, our platform, Snorkel Flow, has enabled some of the world’s largest and most sophisticated organizations to bridge the gap between the challenges of real-world data and the power of modern AI. Snorkel Flow is the first and only data-centric AI development platform that enables users to programmatically label datasets, train and deploy models, identify error modes in their data, and rapidly iterate to improve and adapt, all without spending months manually labeling data. Snorkel Flow now supports customers including top U.S. banks, government agencies, and global enterprises across insurance, telecommunications, biotech, and more, accelerating their AI development and driving seven- to eight-figure business ROIs.
Today, we are delighted to announce that BlackRock and Addition are leading an $85 million Series C investment in Snorkel to help us drive even greater momentum in our product and technology development, team growth, and go-to-market velocity. As part of our Series C, we also welcomed Factory and Cooley as new investors. We are also grateful for the continued strong support of our existing investors, including Greylock, GV, Lightspeed Venture Partners, Nepenthe Capital, Walden International, and Stone Bridge Ventures.
We are beyond excited to work closely with this group to bring the full scope of our data-centric AI vision to market.
The Future of AI is Data-Centric
Five-plus years into the resurgent wave of modern deep learning techniques, machine learning (ML) models have never been as powerful, automated, or accessible as they are for most enterprise use cases today. However, ML models have also never been more data-hungry, requiring massive labeled training datasets to learn from for each new task. For most organizations, AI success or failure is now determined by how quickly and accurately this training data is developed, not by which model is picked. This shift from model-centric to data-centric development is transforming the landscape of AI and making it clear that the legacy approach of manually labeling and managing data is simply not a feasible path forward.
Snorkel Flow is the first data-centric AI platform to move beyond the bottleneck of manual data labeling, via a unique programmatic approach to labeling and managing the data at the heart of AI development. In Snorkel Flow, users manage data throughout the full AI lifecycle by writing programs to label, manipulate, and monitor training data. These programmatic inputs are modeled and integrated using theoretically grounded statistical techniques, made accessible to developers and non-developers alike through a no-code UI and a Python SDK, and used to drive a whole new programmatic, data-centric development and lifecycle process within the Snorkel Flow platform.
The end result: rather than AI development projects in which 80-90% of the effort goes to manually labeling data (or that are simply abandoned due to infeasibility), Snorkel Flow enables users to develop performant AI applications in hours or days, writing code or pushing buttons to rapidly and iteratively program ML models.
This leads to a process that is not just faster but also more agile, more transparent, and more collaborative between data scientists and subject matter experts.
With Snorkel Flow, our customers have reduced development time from months to days, delivered higher-accuracy models that are also more privacy-compliant, adaptable, auditable, and responsibly governable, and driven seven- to eight-figure ROIs. Examples include:
- A top U.S. bank developed a contract processing application with over 99% accuracy in under 24 hours.
- A Fortune 50 bank built a news analytics application 45x faster and with 25% higher accuracy than a previous system.
- A global telco improved the quality of over 200,000 labels for network classification, resulting in a 25% improvement in accuracy over the ground-truth baseline.
- A large biotech firm saved an estimated $10M on unstructured data extraction, achieving 99.1% accuracy with Snorkel Flow.
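To make the idea of "writing programs to label data" a little more concrete, here is a minimal sketch in plain Python. It is not the Snorkel Flow SDK or its label model: the contract-clause task, the function names (`lf_mentions_termination`, `majority_vote`, and so on), and the keyword rules are all hypothetical, and a simple majority vote stands in for the statistical techniques that combine programmatic inputs in the platform.

```python
from collections import Counter
from typing import Callable, List

# Label values for a hypothetical task: is this contract clause a termination clause?
ABSTAIN = -1
NEGATIVE = 0
POSITIVE = 1

LabelingFunction = Callable[[str], int]


def lf_mentions_termination(text: str) -> int:
    # Heuristic: clauses that use the stem "terminat" are likely termination clauses.
    return POSITIVE if "terminat" in text.lower() else ABSTAIN


def lf_mentions_notice(text: str) -> int:
    # Heuristic: termination clauses often require written notice.
    return POSITIVE if "written notice" in text.lower() else ABSTAIN


def lf_mentions_renewal(text: str) -> int:
    # Heuristic: renewal language suggests the clause is about something else.
    return NEGATIVE if "renew" in text.lower() else ABSTAIN


def majority_vote(votes: List[int]) -> int:
    # Combine the non-abstaining votes; abstain when there are none or on a tie.
    # (Assumption: this is a stand-in, not a learned label model.)
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def label_dataset(texts: List[str], lfs: List[LabelingFunction]) -> List[int]:
    # Apply every labeling function to every text and aggregate the votes.
    return [majority_vote([lf(text) for lf in lfs]) for text in texts]


LFS = [lf_mentions_termination, lf_mentions_notice, lf_mentions_renewal]

if __name__ == "__main__":
    clauses = [
        "Either party may terminate this agreement with 30 days written notice.",
        "This agreement renews automatically each year.",
        "Payment is due within 15 days of invoice.",
    ]
    for clause, label in zip(clauses, label_dataset(clauses, LFS)):
        print(label, clause)
```

The point of the sketch is the workflow rather than the rules themselves: when the labels look wrong, you edit or add a labeling function and relabel the whole dataset in seconds, instead of re-annotating examples by hand.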
If you’d like to see how Snorkel Flow can be deployed in your organization, request a demo today.
Accelerating Data-Centric AI Adoption with Increased Investment
One year out of stealth, and two years since our formation, Snorkel AI is one of the fastest-growing startups in the AI industry. However, this growth rests on patiently built foundations: we have spent over half a decade and counting driving the development of core data-centric AI algorithms, systems, theory, and applications at places like Stanford, University of Washington, Brown, and University of Wisconsin, spanning areas from weak supervision to multitask and transfer learning, and data modalities from text and semi-structured to image, video, time-series, and beyond.
With this $85 million investment, we plan to accelerate turning our years of breakthrough AI research, and the continued advances of our team, into core product capabilities within Snorkel Flow. We also plan to significantly accelerate the build-out of our go-to-market and customer success infrastructure so as to redouble our commitment to every customer we onboard: that Snorkel Flow will drive immediate and measurable AI acceleration and business impact.
And of course, to support all this, we plan to continue building one of the best teams in AI (I’m allowed to be openly biased here, right?) on both engineering and go-to-market fronts, so please check out open opportunities on our careers page!
Alex Ratner is the co-founder and CEO at Snorkel AI, and an affiliate assistant professor of computer science at the University of Washington. Prior to Snorkel AI and UW, he completed his Ph.D. in computer science advised by Christopher Ré at Stanford, where he started and led the Snorkel open source project. His research focused on data-centric AI, applying data management and statistical learning techniques to AI data development and curation.