Top-10 US bank uses AI/ML to triage loan documents based on risk exposure

September 30, 2022

To meet the requirements of unexpected regulatory changes brought on by the pandemic, a top-10 US bank needed to urgently move from its underperforming model-centric approach to AI/ML development to a data-centric one. The team used Snorkel Flow to automatically classify thousands of loan documents and extract critical clauses in just 24 hours, saving loan managers thousands of hours of manual document review.

Triaging loan documents based on risk exposure with AI/ML
In March 2020, the benchmark interest rates at which major global banks borrow from one another plummeted due to the pandemic. LIBOR, which stands for London Interbank Offered Rate, fell from 1.8% earlier in the year to 0.7% in March. Banks all over the globe had to reevaluate their risk exposure as many went from earning interest to owing interest practically overnight.

Challenge

When faced with the sudden onset of the pandemic, this top-10 US bank had to reassess its lending policies and risk exposure quickly. The bank needed to review loan contracts to determine the impact of rate changes. Given the urgency and the volume of loans to review, the bank couldn’t rely on humans to triage risk. The team knew they needed a machine-learning model. However, their experience with a recent project (before Snorkel Flow) had revealed a few blocking challenges to address:

  • Labeling training data for the ML solution was prohibitively slow, given the reliance on manual labeling by domain experts and the inability to outsource.
  • Models adapted poorly to varied document structures, including unseen PDF and tabular formats.
  • Poor collaboration between domain experts and data scientists made it difficult to resolve ambiguous labels.
Sample loan document templates to illustrate the diversity of formats

Goal

Reduce data labeling and development time by auto-labeling—without reducing data or AI application quality.

Solution

Leveraging Snorkel Flow, the data science team, working closely with Snorkel AI experts, built an ML model that achieved better-than-human accuracy on more than 250,000 documents in under 24 hours. They worked alongside the bank’s finance and tax subject matter experts to capture their heuristics as labeling functions. The Snorkel Flow platform intelligently combined these to auto-label a high-quality training dataset. This labeled data was then used to train a model that successfully extracted the key information from the 250k documents, greatly expediting the bank’s ability to review loans. Moving forward, whenever a central bank changes its interest rate or the team needs to process a new loan format or document structure, they can quickly change a few labeling functions instead of going back through the slog of relabeling data by hand.
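To make the idea of labeling functions concrete, here is a minimal sketch in plain Python. It is not the Snorkel Flow API and not the bank's actual heuristics: the label names, the keyword rules, and the `majority_vote` combiner are all hypothetical stand-ins chosen for illustration. In particular, a simple majority vote is used here in place of whatever combination method the platform applies.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical label scheme for illustration only.
ABSTAIN = -1        # the heuristic has no opinion on this document
NO_LIBOR = 0        # document looks unaffected by LIBOR changes
LIBOR_EXPOSED = 1   # document looks exposed to LIBOR changes

LabelingFunction = Callable[[str], int]


def lf_mentions_libor(text: str) -> int:
    """Vote LIBOR_EXPOSED when the benchmark is named directly."""
    lowered = text.lower()
    if "libor" in lowered or "london interbank offered rate" in lowered:
        return LIBOR_EXPOSED
    return ABSTAIN


def lf_fixed_rate(text: str) -> int:
    """Vote NO_LIBOR when the loan is described as fixed rate."""
    return NO_LIBOR if "fixed rate" in text.lower() else ABSTAIN


def lf_floating_rate(text: str) -> int:
    """Vote LIBOR_EXPOSED when the loan is described as floating or variable."""
    lowered = text.lower()
    if "floating rate" in lowered or "variable rate" in lowered:
        return LIBOR_EXPOSED
    return ABSTAIN


LABELING_FUNCTIONS: List[LabelingFunction] = [
    lf_mentions_libor,
    lf_fixed_rate,
    lf_floating_rate,
]


def majority_vote(text: str, lfs: List[LabelingFunction]) -> int:
    """Combine labeling-function votes; abstain on no votes or on a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    (top_label, top_count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == top_count:
        return ABSTAIN  # conflicting heuristics: leave for expert review
    return top_label


documents = [
    "Interest accrues at a floating rate equal to LIBOR plus 2.00%.",
    "The Borrower shall pay interest at a fixed rate of 4.5% per annum.",
    "This agreement is governed by the laws of New York.",
]
training_labels = [majority_vote(doc, LABELING_FUNCTIONS) for doc in documents]
```

Under this sketch, supporting a new document format means adding or editing one small function and re-running the labeling step, rather than relabeling documents one by one. Documents on which the heuristics abstain or disagree are natural candidates for discussion between domain experts and data scientists.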

Using Snorkel Flow, this top-10 US bank overcame its original challenges. The team:

  • Auto-labeled training data by capturing labeling expertise as labeling functions and applying them intelligently to 250k documents with better-than-human quality (99.1%).
  • Ensured adaptability through rapid code edits to labeling functions rather than wholesale manual relabeling.
  • Improved collaboration between domain experts and data scientists across labeling, troubleshooting, and iteration.

Instead of spending six months labeling data by hand to get the training datasets they need to improve their model, the team relies on Snorkel Flow. Shifting from a model-centric approach to a data-centric one yielded significant improvements in both productivity and model performance.

250k

documents processed with better-than-human quality

99.1%

accuracy achieved with the extraction model

Nick Harvey
Director of Product Marketing
