Latest posts

Augmenting the clinical trial design process with information extraction

The Future of Data-Centric AI Talk Series. Background: Michael D’Andrea is the Principal Data Scientist at Genentech. He earned his MBA from Cornell University and a Master’s degree in Computing and Education from Columbia University. He currently works on using unstructured data sources for clinical trial analytics, and his team is partnered with the Stanford “AI For Health” initiative as…

February 22, 2022

Q4 LTS Release of Snorkel Flow

We’re excited to announce the Q4 2021 LTS release of Snorkel Flow, our data-centric AI development platform powered by programmatic labeling. This latest release introduces a number of new product capabilities and enhancements, from a streamlined programmatic data development interface, to enhanced auto-suggest for labeling functions, to new machine learning capabilities like AutoML, to significant performance enhancements for PDF data…

February 8, 2022

Making Automated Data Labeling a Reality in Modern AI

Moving from Manual to Programmatic Labeling. Labeling training data by hand is exhausting. It’s tedious, slow, and expensive, and it is the de facto bottleneck most AI/ML teams face today. Eager to alleviate this pain point of AI development, machine learning practitioners have long sought ways to automate this labor-intensive labeling process (i.e., “automated data labeling”), and have reached for classic approaches…

February 4, 2022

The Principles of Data-Centric AI Development

The Future of Data-Centric AI Talk Series. Background: Alex Ratner is CEO and co-founder of Snorkel AI and an Assistant Professor of Computer Science at the University of Washington. He recently joined the Future of Data-Centric AI event, where he presented the principles of data-centric AI and where it’s headed. If you would like to watch his presentation in full,…

January 25, 2022

Prompting Methods with Language Models and Their Applications to Weak Supervision

Machine Learning Whiteboard (MLW) Open-source Series. Today, Ryan Smith, machine learning research engineer at Snorkel AI, talks about prompting methods with language models and some of their applications to weak supervision. In this talk, we’re essentially going to be using this paper as a template; this paper is a great survey of some methods in prompting from the last few years…

January 19, 2022

Advancing Snorkel from research to production

The Snorkel AI founding team started the Snorkel Research Project at Stanford AI Lab in 2015, where we set out to explore a higher-level interface to machine learning through training data. This project was sponsored by Google, Intel, DARPA, and several other leading organizations and the research was represented in over 40 academic conferences such as ACL, NeurIPS, Nature and…

January 18, 2022

Building AI Applications Collaboratively Using Data-centric AI

The Future of Data-Centric AI Talk Series. Background: Roshni Malani received her PhD in Software Engineering from the University of California, San Diego, and has previously worked on Siri at Apple and as a founding engineer for Google Photos. She gave a presentation at the Future of Data-Centric AI virtual conference in September 2021. Her presentation is below, lightly edited…

January 14, 2022

Epoxy: Using Semi-Supervised Learning to Augment Weak Supervision

Machine Learning Whiteboard (MLW) Open-source Series. We launched the machine learning whiteboard (MLW) series earlier this year as an open-invitation forum to brainstorm ideas and discuss the latest papers, techniques, and workflows in artificial intelligence. Everyone interested in learning about machine learning can participate in an informal and open environment. If you are interested in learning about ML,…

December 16, 2021

Artificial Intelligence (AI) Facts and Myths

ScienceTalks with Abigail See: diving into the misconceptions of AI, the challenges of natural language generation (NLG), and the path to large-scale NLG deployment. In this episode of Science Talks, Snorkel AI’s Braden Hancock chats with Abigail See, an expert natural language processing (NLP) researcher and educator from Stanford University. We discuss Abigail’s path into machine learning (ML), her previous…

November 23, 2021

PonderNet: Learning to Ponder by DeepMind

Machine Learning Whiteboard (MLW) Open-source Series. For our new visitors: we started our machine learning whiteboard (MLW) series earlier this year as an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine learning. So, if you are interested…

November 10, 2021

Design Principles for Iteratively Building AI Applications

Enabling iterative development workflows with Snorkel Flow’s Application Studio. Consider this scenario: we’re AI engineers, and we’re building a social media monitoring application to track the sentiment of Fortune 500 company mentions in the news.

November 8, 2021

Snorkel’s Journey to Data-Centric AI, with Chris Ré

The Future of Data-Centric AI Talk Series. Background: Snorkel co-founder Chris Ré is an associate professor of Computer Science at Stanford University and an award-winning researcher in data-based theory and machine learning. He has co-founded four companies based on his research in machine learning systems. Chris recently presented at the Future of Data-Centric AI virtual event in September, where he…

November 3, 2021

Building a Successful AI Startup

ScienceTalks with Saam Motamedi. We at Snorkel AI have received many requests from data scientists and machine learning engineers who aspire to be founders: where do they start, and how should they begin their entrepreneurial journey? We genuinely believe that data scientists and machine learning engineers will build the next generation of mega-enterprises. Over the summer, we’ve recorded…

October 18, 2021

Forager: Rapid Data Exploration for Rapid Model Development

Machine Learning Whiteboard (MLW) Open-source Series. We started our machine learning whiteboard (MLW) series earlier this year as an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine learning. In this episode, Fait Poms, a Ph.D. student at Stanford…

October 14, 2021

Recap: The Future of Data-Centric AI Event

Main takeaways from The Future of Data-Centric AI Event. We recently hosted The Future of Data-Centric AI, where experts and practitioners from academia, research, and industry came together to discuss the shift from model-centric AI development to data-centric AI and what lies ahead. This post gives you a quick overview of the event and top takeaways from over eight hours of…

October 11, 2021

Building Malleable Machine Learning (ML) Systems

Defining and Building Malleable ML Systems – Machine Learning Whiteboard (MLW) Open-Source Series. As you may know, earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine…

September 22, 2021

Web Virtualization — Optimizing Data-Intensive App Performance

Frontend Development Best Practices for Working With Lots of Data, From Snorkel AI Engineering. As a frontend engineer, it’s easy to run into limitations when scaling large applications. At Snorkel AI, our users often work with data that scales into the gigabytes when using Snorkel Flow. We have built Snorkel Flow around two core…

Shubham Naik portrayed, front end software engineer at Snorkel AI
September 16, 2021

Multi-Label Classification, Sequence Labeling, and More

Snorkel Flow LTS Release, Summer ’21. By adopting Snorkel Flow, a data-centric AI development platform powered by programmatic labeling, our customers have changed how they build and deploy AI applications. We’ve seen our customers save tens of millions of dollars in manual labeling costs and person-years of time by applying weak supervision with Snorkel Flow. Over the last few months, we’ve been hard…

Patrick Kolencherry portrayed
September 15, 2021

Applying Weak Supervision Research

ScienceTalks with Paroma Varma. In this episode of Science Talks, Snorkel AI’s Braden Hancock chats with Paroma Varma – a co-founder of Snorkel AI and one of the first and leading contributors to the Snorkel project. We discuss Paroma’s path into machine learning, her work in optimization and signal processing during her undergrad, weak supervision and image data during her…

September 13, 2021

Sliceline: Fast, Linear-Algebra-Based Slice Finding for ML Model Debugging

Diving Into SliceLine – Machine Learning Whiteboard (MLW) Open-source Series. Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine learning. In this episode, Kaushik Shivakumar dives into…

September 8, 2021

The Future of Data-Centric AI – Virtual Live Event

Join the live discussion. Learn how to unlock data-centric AI and make AI development practical in your organization. Working with vast unstructured and unlabeled data is one of the bottlenecks in the machine learning lifecycle. Machine learning models can only be as reliable and accurate as the data fed to them. With a data-centric approach, your data science…

August 31, 2021

Snorkel AI Raises $85m Series C at $1b Valuation for Data-Centric AI

We started the Snorkel project at the Stanford AI lab in 2015 around two core hypotheses:

August 9, 2021

Developing and Managing Systems to Extract Structured Data

Machine Learning Whiteboard (MLW) Open-source Series. Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine learning. In this episode, Manan Shah dives into “Glean: Structured Extractions from…

August 2, 2021

How to Use Snorkel to Build AI Applications

The how, what, and why of Snorkel’s programmatic data labeling approach and the state-of-the-art Snorkel Flow platform. The year was 2015. For the first time, machine learning (ML) had outperformed humans in the annual ImageNet challenge.

July 9, 2021

Multi-Resolution Weak Supervision for Sequential Data

Machine Learning Whiteboard (MLW) Open-source Series. Our machine learning whiteboard (MLW) is an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in discovering more about machine learning. In this episode, Hiromu Hota, Vincent Sunn Chen, Daniel Y. Fu, and Frederic Sala dive…

June 25, 2021

Weak Supervision in Biomedicine

In this episode of Science Talks, Snorkel AI’s Braden Hancock chats with Jason Fries – a research scientist at Stanford University’s Biomedical Informatics Research lab and Snorkel Research, and one of the first contributors to the Snorkel open-source library. We discuss Jason’s path into machine learning, empowering doctors and scientists with weak supervision, and utilizing organizational resources in biomedical applications of Snorkel. This episode is part…

June 16, 2021

Training Classifiers With Natural Language Explanations

Machine Learning Whiteboard (MLW) Open-source Series. Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine learning. In this episode, our Co-founder and Head of Technology, Braden Hancock…

May 24, 2021

Applying Information Theory to ML With Fred Sala

In this episode of Science Talks, Frederic Sala – an assistant professor of Computer Science at the University of Wisconsin-Madison and a research scientist at Snorkel – discusses his path into machine learning, the central thesis that ties together his multidisciplinary research, his thoughts on the future of weak supervision, and his decision to go into academia.

May 19, 2021

3 Impractical Assumptions About AI to Avoid

Impractical assumptions about ML are made every day in research, and they limit its adoption because they do not hold up in the real world. Learn more about how to avoid making these assumptions about AI application development.

May 4, 2021

Building Industrial-Strength NLP Applications With Ines Montani

In this episode of Science Talks, Explosion AI’s Ines Montani sat down with Snorkel AI’s Braden Hancock to discuss her path into machine learning, key design decisions behind the popular spaCy library for industrial-strength NLP, the importance of bringing together different stakeholders in the ML development process, and more. This episode is part of the #ScienceTalks video series hosted by the…

April 29, 2021