The Future of Data-Centric AI Talk Series Background Roshni Malani received her PhD in Software Engineering from the University of California, San Diego, and has previously worked on Siri at Apple and as a founding engineer for Google Photos. She gave a presentation at the Future of Data-Centric AI virtual conference in September 2021. Her presentation is below, lightly edited…
Machine Learning Whiteboard (MLW) Open-source Series We launched the machine learning whiteboard (MLW) series earlier this year as an open-invitation forum to brainstorm ideas and discuss the latest papers, techniques, and workflows in artificial intelligence. Everyone interested in learning about machine learning can participate in an informal and open environment. If you are interested in learning about ML,…
ScienceTalks with Abigail See. Diving into the misconceptions of AI, the challenges of natural language generation (NLG), and the path to large-scale NLG deployment In this episode of Science Talks, Snorkel AI’s Braden Hancock chats with Abigail See, an expert natural language processing (NLP) researcher and educator from Stanford University. We discuss Abigail’s path into machine learning (ML), her previous…
Machine Learning Whiteboard (MLW) Open-source Series For our new visitors, we started our machine learning whiteboard (MLW) series earlier this year as an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment for everyone interested in learning about machine learning. So, if you are interested…
Enabling iterative development workflows with Snorkel Flow’s Application Studio. Consider this scenario: we’re AI engineers, and we’re building a social media monitoring application to track the sentiment of Fortune 500 company mentions in the news.
The Future of Data-Centric AI Talk Series Background Snorkel co-founder Chris Ré is an associate professor of Computer Science at Stanford University and an award-winning researcher in data-based theory and machine learning. He has co-founded four companies based on his research in machine learning systems. Chris recently presented at the Future of Data-Centric AI virtual event in September, where he…
ScienceTalks with Saam Motamedi We at Snorkel AI have received many requests from data scientists and machine learning engineers who aspire to be founders: where do they start, and how should they approach their entrepreneurial journey? We genuinely believe that data scientists and machine learning engineers will build the next generation of mega-enterprises. Over the summer, we’ve recorded…
Machine Learning Whiteboard (MLW) Open-source Series We started our machine learning whiteboard (MLW) series earlier this year as an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment to everyone interested in learning about machine learning. In this episode, Fait Poms, a Ph.D. student at Stanford…
Main takeaways from The Future of Data-Centric AI Event We recently hosted The Future of Data-Centric AI, where academia, research, and industry experts and practitioners came together to discuss the shift from model-centric AI development to data-centric AI and what lies ahead. This post gives you a quick overview of the event and top takeaways from over eight hours of…
Defining and Building Malleable ML Systems – Machine Learning Whiteboard (MLW) Open-Source Series As you may know, earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment to everyone interested in learning about machine…
Frontend Development Best Practices for Working With Lots of Data From Snorkel AI Engineering As a frontend engineer, it’s often easy to run into limitations when scaling large applications. At Snorkel AI, we often run into times where our users work with data that scales into the gigabytes when using Snorkel Flow. We have built Snorkel Flow around two core…
Snorkel Flow LTS Release Summer ’21 By adopting Snorkel Flow, a data-centric AI development platform powered by programmatic labeling, our customers have changed how they build and deploy AI applications. We’ve seen our customers save tens of millions of dollars in manual labeling costs and person-years of time by applying weak supervision with Snorkel Flow. Over the last few months, we’ve been hard…
ScienceTalks with Paroma Varma In this episode of Science Talks, Snorkel AI’s Braden Hancock chats with Paroma Varma – a co-founder of Snorkel AI and one of the first and leading contributors to the Snorkel project. We discuss Paroma’s path into machine learning, her work in optimization and signal processing during her undergrad, weak supervision and image data during her…
Diving Into SliceLine – Machine Learning Whiteboard (MLW) Open-source Series Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment to everyone interested in learning about machine learning. In this episode, Kaushik Shivakumar dives into…
Join the live discussion. Learn how to unlock data-centric AI and make AI development practical in your organization Working with vast unstructured and unlabeled data is one of the bottlenecks in the machine learning lifecycle. Machine learning models can only get as reliable and accurate as the data being fed to them. With a data-centric approach, your data science…
We started the Snorkel project at the Stanford AI lab in 2015 around two core hypotheses:
Machine Learning Whiteboard (MLW) Open-source Series Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment to everyone interested in learning about machine learning. In this episode, Manan Shah dives into “Glean: Structured Extractions from…
The how, what, and why of Snorkel’s programmatic data labeling approach and the state-of-the-art Snorkel Flow platform. The year was 2015. For the first time, machine learning (ML) had outperformed humans in the annual ImageNet challenge.
Machine Learning Whiteboard (MLW) Open-source Series Our machine learning whiteboard (MLW) is an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment to everyone interested in discovering more about machine learning. In this episode, Hiromu Hota, Vincent Sunn Chen, Daniel Y. Fu, and Frederic Sala dive…
In this episode of Science Talks, Snorkel AI’s Braden Hancock chats with Jason Fries – a research scientist at Stanford University’s Biomedical Informatics Research lab and Snorkel Research, and one of the first contributors to the Snorkel open-source library. We discuss Jason’s path into machine learning, empowering doctors and scientists with weak supervision, and utilizing organizational resources in biomedical applications of Snorkel. This episode is part…
Machine Learning Whiteboard (MLW) Open-source Series Earlier this year, we started our machine learning whiteboard (MLW) series, an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal and open environment to everyone interested in learning about machine learning. In this episode, our Co-founder and Head of Technology, Braden Hancock…
In this episode of Science Talks, Frederic Sala – an assistant professor of Computer Science at the University of Wisconsin-Madison and a research scientist at Snorkel – discusses his path into machine learning, the central thesis that ties together his multidisciplinary research, his thoughts on the future of weak supervision, and his decision to go into academia.
Impractical ML assumptions are made every day in research, which limit its adoption. In the real world, these assumptions do not hold up. Learn more about how to avoid making these assumptions about AI application development.
In this episode of Science Talks, Explosion AI’s Ines Montani sat down with Snorkel AI’s Braden Hancock to discuss her path into machine learning, key design decisions behind the popular spaCy library for industrial-strength NLP, the importance of bringing together different stakeholders in the ML development process, and more. This episode is part of the #ScienceTalks video series hosted by the…
Over the past year, we’ve worked hard to deliver Snorkel Flow, the first AI platform to provide all the power of machine learning without the pains of hand-labeling. Snorkel Flow lets you label data programmatically, train models flexibly, improve performance iteratively, and deploy AI applications quickly. We are incredibly proud of the value that our customers, including two of the…
In this episode of Science Talks, Sebastian Ruder, Research Scientist at DeepMind, shares his thoughts on making AI practical with Snorkel AI’s Braden Hancock. This conversation covers progress made in the NLP domain with emerging research, new benchmarks like SuperGLUE, rich repositories and news sources that keep you in the loop and on top of what’s new in NLP, and more.
In this episode of ScienceTalks, Snorkel AI’s Braden Hancock chats with Hugging Face’s Chief Science Officer, Thomas Wolf. Thomas shares his story about how he got into machine learning and discusses important design decisions behind the widely adopted Transformers library, as well as the challenges of bringing research projects into production. ScienceTalks is an interview series from Snorkel AI, highlighting some of the best work and ideas to make AI practical.
We’ll analyze major sources of errors during the four steps of building AI applications: data labeling, feature engineering, model training, and model evaluation.
AI is already transforming the business of government. But the positive impacts of this transformation, from increasing the efficiency of public services to enhancing the effectiveness of tax dollars, are still in the earliest stages. Public sector organizations generally have access to the same talent, software models, and hardware infrastructure as any private sector company, but they face a number of relatively unique practical challenges that hinder their operationalization of AI.
Advancements in artificial intelligence promise efficiency gains for financial institutions. AI-powered applications can revolutionize an organization’s risk management, fraud detection, compliance monitoring, and other processes. Financial services companies have smart data scientists and good infrastructure needed for deploying AI. But their ability to rapidly develop and deploy AI applications is hampered by several unique challenges.