Applied AI Archives

Data extraction from SEC filings (10-Ks) with Snorkel Flow

Leveraging Snorkel Flow to extract critical data from annual reports (10-Ks). Introduction: Those who have never logged into EDGAR may be surprised by how much information is available in annual reports from public companies. You can find tactical details, like the names of senior leadership and top shareholders, and more strategic information, like earnings, risk factors, and the company strategy and vision. Warren…

Banking & Finance, Data Development, Data Labeling, Data-Centric AI, NLP

Jonathan Dahlberg

May 10, 2022

AI in cybersecurity: an introduction and case studies

An introduction to AI in cybersecurity with real-world case studies in a Fortune 500 organization and a government agency. Despite all the recent advances in artificial intelligence and machine learning (AI/ML) applied to a vast array of application areas and use cases, success in AI in cybersecurity remains elusive. The key component to building AI/ML applications is training data, which…

Data Labeling, Data-Centric AI, Evaluation, MLOps

Nic Acton

May 5, 2022

Accelerating AI in healthcare

How data-centric AI can speed your end-to-end healthcare AI development and deployment. Healthcare is a field that is awash in data, and managing it all is complicated and expensive. As an industry, it benefits tremendously from the ongoing development of machine learning and data-centric AI. The potential benefits of AI integration in healthcare can be broken down into two categories:…

Data Development, Data Labeling, Data-Centric AI, Healthcare, Partners

Team Snorkel

April 29, 2022

Bill of materials for responsible AI: collaborative labeling

In our previous posts, we discussed how explainable AI is crucial to ensure the transparency and auditability of your AI deployments and how trustworthy AI adoption and its successful integration into our country’s critical infrastructure and systems are paramount. In this post, we dive into making trustworthy and responsible AI possible with Snorkel Flow, the data-centric AI platform for government and federal agencies. Collaborative labeling and…

Alignment, Annotation, Data Development, Data Labeling, Data-Centric AI, Evaluation, Foundation Models, Partners

Alexis Zumwalt

April 28, 2022

Explainability through provenance and lineage

In our previous post, we discussed how trustworthy AI adoption and its successful integration into our country’s critical infrastructure and systems are paramount. In this post, we discuss how explainability in AI is crucial to ensure the transparency and auditability of your AI deployments. Outputs from trustworthy AI applications must be explainable in understandable terms based on the design and implementation of…

Data Labeling

Alexis Zumwalt

April 19, 2022

Introduction to trustworthy AI

The adoption of trustworthy AI and its successful integration into our country’s most critical systems is paramount to achieving the goal of employing AI applications to accelerate economic prosperity and national security. However, traditional approaches to developing AI applications suffer from a critical flaw that leads to significant ethics and governance concerns. Specifically, AI today relies on massive, hand-labeled training datasets…

Data Labeling

Alexis Zumwalt

April 7, 2022

How to better govern ML models? Hint: auditable training data

ML models will always have some level of bias. Rather than relying on black-box algorithms, how can we make the entire AI development workflow more auditable? How do we build applications where bias can be easily detected and quickly managed? Today, most organizations focus their model governance efforts on investigating model performance and the bias within the predictions. Data science…

Annotation, Data Labeling, Data-Centric AI, MLOps

Jonathan Dahlberg

April 6, 2022

Tips for using a data-centric AI approach

The future of data-centric AI talk series. Background: An AI system consists of two parts: the model (an algorithm or some code) and data. The dominant paradigm in machine-learning research has been for most data scientists, including myself, to download a fixed dataset and iterate on the model. That this has become conventional is a tribute to how successful this model-centric approach…

Computer vision, Data Development, Data Labeling, Data-Centric AI, Evaluation

Team Snorkel

March 9, 2022

Resilient enterprise AI application development

Using a data-centric approach to capture the best of rule-based systems and ML models for enterprise AI. One of the biggest challenges to making AI practical for the enterprise is keeping the AI application relevant (and therefore valuable) in the face of ever-changing input data and evolving business objectives. Practitioners typically use one of two approaches to build these AI applications:…

Data Development, Data Labeling, Data-Centric AI, MLOps

Arjun Prakash

March 3, 2022

How AI can be used to rapidly respond to information warfare in the Russia-Ukraine conflict

Proliferating web technology has contributed to information warfare in recent conflicts. Artificial Intelligence (AI) can play a significant role in stemming disinformation campaigns and cyber-attacks and in informing diplomacy in the rapidly evolving situation in Ukraine. Snorkel AI is dedicated to supporting the National Security community and other enterprise organizations with state-of-the-art AI technology. We see this as our responsibility in the…

Annotation, Data Labeling, NLP, Partners

Charlie Greenbacker, Nic Acton

February 28, 2022

How Genentech extracted information for clinical trial analytics with Snorkel Flow

Genentech, a global biotech leader and member of the Roche Group, leveraged Snorkel Flow to extract critical information from lengthy clinical trial protocol (CTP) PDF documents. They built AI applications that used NER, entity linking, text extraction, and classification models to determine inclusion/exclusion criteria and to analyze Schedules of Assessments. Genentech’s team achieved 95-99% model accuracy by using Snorkel…

Annotation, Data Labeling, Data-Centric AI, Evaluation, NLP

Team Snorkel

February 26, 2022

Augmenting the clinical trial design process with information extraction

The future of data-centric AI talk series. Background: Michael D’Andrea is the Principal Data Scientist at Genentech. He earned his MBA from Cornell University and a Master’s degree in Computing and Education from Columbia University. He currently works on using unstructured data sources for clinical trial analytics, and his team is partnered with the Stanford “AI For Health” initiative as…

Annotation, Data Development, Data-Centric AI, Evaluation, NLP

Team Snorkel

February 22, 2022

Building AI Applications Collaboratively Using Data-centric AI

The Future of Data-Centric AI Talk Series. Background: Roshni Malani received her PhD in Software Engineering from the University of California, San Diego, and has previously worked on Siri at Apple and as a founding engineer for Google Photos. She gave a presentation at the Future of Data-Centric AI virtual conference in September 2021. Her presentation is below, lightly edited…

Annotation, Data Development, Data Labeling, Data-Centric AI, Evaluation

Team Snorkel

January 14, 2022

Design Principles for Iteratively Building AI Applications

Enabling iterative development workflows with Snorkel Flow’s Application Studio. Consider this scenario: we’re AI engineers, and we’re building a social media monitoring application to track the sentiment of Fortune 500 company mentions in the news.

Annotation, Data Development, Data Labeling, Evaluation, Fine-Tuning, MLOps, NLP

Vincent Sunn Chen

November 8, 2021

Building a Successful AI Startup

ScienceTalks with Saam Motamedi. We at Snorkel AI have received many requests from data scientists and machine learning engineers who aspire to be founders: where do they start, and how should they approach their entrepreneurial journey? We genuinely believe that data scientists and machine learning engineers will build the next generation of mega-enterprises. Over the summer, we’ve recorded…

Data Development, MLOps, NLP, Partners, Product Releases

Team Snorkel

October 18, 2021
