CUSTOMER STORY

How SLB uses Snorkel Flow to enhance proactive well management

Industry: Energy
Solution: Information Extraction
<3 days to build a performant ML application

47% improved generalization over the old system

99% reduction in document processing time

SLB is the world’s leading provider of technology and services for the energy industry, operating in over 120 countries. The company provides well maintenance and analytics services to the world’s biggest oil companies, and it believes that large-scale data analysis and artificial intelligence/machine learning (AI/ML) will help it remain a market leader. One way SLB has pursued this is by collaborating with Snorkel’s experts, who used our proprietary technology to build an AI application that automatically extracts geological entities and critical field data across the variety of document structures and report types SLB receives from its customers.


Providing proactive well maintenance with automated information extraction

SLB is a technology company that partners with customers to access energy. The Software Technology Innovation Center (STIC), part of the 85,000-person industry leader, is dedicated to using new AI/ML applications to support the company’s mission of improving the performance and sustainability of the global energy industry. One such application is streamlining information extraction from critical field data, which underpins SLB’s efforts to analyze business operations at scale and deliver data-driven insights into their performance.

Challenge

The energy industry generates a large volume of reports every day, ranging from daily drilling reports to well maintenance logs. Each document has its own structure and format, which makes it difficult for the SLB team to extract crucial information quickly. Automating extraction of the text within these unstructured PDFs using Named Entity Recognition (NER) would greatly accelerate the team toward its goal of delivering highly accurate large-scale analysis.

The team explored typical off-the-shelf ML models, but these could not identify the scientific terms specific to the Exploration and Production (E&P) industry. They also tried to create a domain-specific training dataset using various labeling tools and scarce subject matter expert (SME) time, but that took 1–3 hours per document, which wasn’t scalable. Ultimately, the team needed to identify 18 different industry-specific entities and automatically associate data with these entities, but a few things stood in the way:

  • Rich information was buried in tables and raw text within PDFs, with formatting that varied across reports from different companies.
  • Collaboration between domain experts and data scientists was poor, relying on cumbersome file sharing and ad hoc meetings.
  • The time needed to label training data manually was a bottleneck to building AI that could automate this effort.

“What would have taken us months to go through an iteration can happen in minutes now. Literally!”


Swaroop Kalasapur
Head of SLB Technology Innovation Center

Goal

Minimize the time SMEs spend labeling training data while ensuring that the system can adapt to new or changing document formats.

Solution

In just three days, a team of SLB and Snorkel experts using our proprietary technology built an AI application that automatically extracts key scientific data from geological and field data reports and uses it to guide recommendations for better well management across multiple clients. By applying a data-centric artificial intelligence (AI) development lifecycle accelerated by programmatic labeling, Monisha Manoharan, a machine learning engineer at SLB, and her team, together with Snorkel’s experts, built a classification task that reached an 85% F1 score in those initial three days.

After a few rounds of rapid iteration using our proprietary technology’s model-guided error analysis and programmatic labeling, the joint team improved their F1 score to 91.4%, which was “impressive compared to what we had achieved previously,” Monisha said.
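For readers unfamiliar with the metric, the F1 score quoted above is the harmonic mean of precision and recall. The sketch below shows the standard calculation; the function name and the counts in the usage line are illustrative assumptions and are not figures from the SLB project.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall)."""
    if true_pos == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Made-up counts, purely to show the call.
example_f1 = f1_score(true_pos=80, false_pos=10, false_neg=20)
print(f"F1 = {example_f1:.3f}")
```

Because F1 penalizes both missed entities and false alarms, it is a common choice for extraction tasks where one class is much rarer than the other.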

The AI application developed in partnership with Snorkel reduced processing time from 1–3 hours per report to just a few seconds. Using this new AI capability, the team extracted several different entities from unstructured data, including well maintenance activity description (textual), time of activity (numerical), and more. They also overcame the challenge of non-standard reporting formats, successfully identifying entities across 15 different document structures.

  • ML solution generalized to a variety of document structures, including unseen PDF and tabular formats.
  • Improved collaboration between domain experts and data scientists across labeling, troubleshooting, and iteration.
  • Auto-labeled training data by capturing labeling expertise as labeling functions and applying them intelligently en masse.
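To make the idea of labeling functions concrete, here is a minimal plain-Python sketch of programmatic labeling for a binary task. It does not use Snorkel Flow's API. The keyword patterns, label names, and example report lines are hypothetical, and a simple majority vote stands in for whatever aggregation the real system applies.

```python
import re
from collections import Counter

# Label values for a hypothetical binary task: is this report line
# a well maintenance activity description?
ABSTAIN, NOT_ACTIVITY, ACTIVITY = -1, 0, 1

# Hypothetical patterns, invented for illustration only.
ACTIVITY_VERBS = re.compile(
    r"\b(replaced|installed|pulled|ran|circulated|pressure[- ]tested)\b", re.I
)
HEADER_PATTERN = re.compile(r"^(report date|well name|operator|page \d+)\b", re.I)
TIME_RANGE = re.compile(r"\b\d{1,2}:\d{2}\s*-\s*\d{1,2}:\d{2}\b")


def lf_activity_verb(line: str) -> int:
    """Vote ACTIVITY when the line contains a maintenance verb."""
    return ACTIVITY if ACTIVITY_VERBS.search(line) else ABSTAIN


def lf_header(line: str) -> int:
    """Vote NOT_ACTIVITY when the line looks like a report header."""
    return NOT_ACTIVITY if HEADER_PATTERN.search(line.strip()) else ABSTAIN


def lf_time_range(line: str) -> int:
    """Vote ACTIVITY when the line carries a start-end time range."""
    return ACTIVITY if TIME_RANGE.search(line) else ABSTAIN


LABELING_FUNCTIONS = [lf_activity_verb, lf_header, lf_time_range]


def majority_vote(line: str) -> int:
    """Combine non-abstaining votes; abstain when there are none or a tie."""
    votes = [v for v in (lf(line) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


if __name__ == "__main__":
    for text in [
        "06:00 - 08:30 Pulled tubing and replaced ESP pump",
        "Report Date: 2021-03-04",
        "Weather clear, no incidents",
    ]:
        print(majority_vote(text), text)
```

Each function encodes one rule of thumb an SME might otherwise apply by hand. Running all of them over every line yields training labels in bulk, and fixing a labeling mistake means editing a rule rather than relabeling documents.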

Not only did Monisha and the STIC team, in partnership with Snorkel’s experts, successfully develop an AI-enhanced tool that helps SLB extract key field and scientific data automatically, they also established a repeatable data-centric AI development lifecycle as a foundation for future data science development at SLB.

“We created a binary classification task and we were able to reach an 85 F1 score in under three days… later improving that score to 91.4 which is highly impressive compared to what we had before.”


Monisha Manoharan
Senior Machine Learning Engineer, SLB
