Advancements in artificial intelligence promise efficiency gains for financial institutions. AI-powered applications can revolutionize an organization’s risk management, fraud detection, compliance monitoring, and other processes. Financial services companies have smart data scientists and good infrastructure needed for deploying AI. But their ability to rapidly develop and deploy AI applications is hampered by several unique challenges. These challenges mostly stem from having to wait for data to be labeled, or re-labeled, any time something changes. Snorkel Flow, our end to end ML platform, addresses these challenges using a unique programmatic approach to creating training data.

Practical Challenges

  1. Privacy: Financial data such as PII in customer records, investment information, etc is private and sensitive, and often cannot be taken off-premises. This makes it difficult or impossible to outsource or crowdsource manual labeling to large groups of annotators, increasing the time and expertise cost of labeled training data.
  2. Deep Subject Matter Expertise: Real-world AI solutions need input from both data scientists and subject matter experts (SME) such as financial analysts and legal experts. But these contributor groups don’t have the same skill sets or may not work on the same team. SMEs may have insufficient familiarity with data science or coding, and data scientists may not have a deep understanding of the domain. For example, an SME might know the precise difference between a Term Loan Termination Date and a Term Loan Maturity Date, while a data scientist might have the knowledge needed to express the difference as a function. Thus these groups need a platform that enables collaboration and lets SMEs express their domain expertise in a push-button way.

Image credit: Londonmedarb.com

  1. Unstructured, Multimodal, Long-tail Data: Financial documents (see the ‘Motor Vehicle Title Pledge Agreement’ Example above) often contain a combination of plain unstructured text and other modalities like tables, images, and structures rendered through OCR. Simple pre-packaged NLP models cannot deal with all the data modalities, and models for complex data types can be very data-hungry. To handle these complex data types without spending months hand labeling at each step, practitioners need more automated labeling and managing training data.
  2. Training Data Auditability: Because processes are strictly regulated in financial services, transparency at every step is vital. Many AI-powered decisions are made inside a proverbial black box, which makes it impossible to trace or understand the reasoning behind every outcome.  And while the machine learning field is just beginning to wrap its head around general theories of interpretability, being able to perform basic audits of the data that is informing production ML models is an immediate must-have.  However, with hand-labeled training data, even this is a near impossibility.
  3. Data Drift: Data often changes quite rapidly in financial services, directly affecting model performance and rendering existing hand-labeled training data obsolete. For example, fraudulent behavior is designed to evade detection. Applications built to detect fraud see a high rate of data drift, or shift over time to new tactics that make old detection mechanisms obsolete. To combat drift, data scientists need to re-label and re-train models frequently on fresh data to remain effective. Reliance on frequent re-labeling by hand quickly becomes a bottleneck for adapting to data drift.
  4. Changing Pipelines and Business Objectives: The downstream consuming applications and end business goals of an AI application, and how they are precisely defined, nearly always change over time due to changes in regulations, market conditions, and so on. This can make previously labeled data inapplicable and necessitate expensive re-labeling.
  5. Bias: Models can encode and reinforce societal biases. This happens when the underlying data comes from interactions that are not representative of society or from humans who have overt or unconscious world views about particular groups. When these biases are not transparent in models, they can creep into financial decision-making processes—for example, channeling certain ethnic groups toward higher-interest credit cards. Relying on hand-labeled data makes it harder to audit the source of training labels and prevent bias from creeping into models.

Putting Snorkel Flow to Use for Financial Services

At Snorkel AI, we are helping our financial services customers unlock value from AI using Snorkel Flow, our end-to-end ML platform. Snorkel Flow uses a unique programmatic approach to create training data, which enables rapid development and deployment of custom AI applications. It drastically decreases the time to value for AI-powered solutions and addresses many of the practical challenges to adoption of AI methods in financial institutions.

Data Privacy: With Snorkel Flow, training data can be labeled programmatically in a fully eyes-off environment, and all data can be kept on-premises.

Easier Collaboration for Subject Matter Experts: Snorkel Flow’s no-code platform allows subject matter experts to encode their domain expertise into AI applications. Its intuitive user interface facilitates cross-functional and cross-team collaboration.

Unstructured, Multimodal, Long-tail Data Support:  Snorkel Flow enables users to generate labels and features based on different modalities of data, including raw text, structural metadata (like font size), and visual data (like a row or column alignment within tables).

Auditability at Every Step: The data and rules used to train AI systems in Snorkel Flow are auditable and interpretable. Users can analyze all sources of supervision used to train the model as well as outputs from any step in the pipeline.

Rapid Development: Snorkel Flow supports fast, iterative deployment of AI systems that is not blocked by the time required for hand labeling. This means that even in fields plagued by high rates of data drift, fresh data can be quickly labeled and models can be instantly retrained to retain their real-world power.

Transparency: Snorkel Flow allows users to create labeling functions that are transparent and interpretable and provides analysis tools to help identify sources of bias. 

The unique logistics, ethics, and practical and legal requirements of financial services organizations present challenges for data scientists working to implement state-of-the-art AI technology to solve industry problems. However, with a transparent, user-friendly platform, these obstacles can be overcome, and financial services providers can harness the power of cutting-edge technology for practical use.

Sign up for a demo if you are interested in learning more about how Snorkel Flow can help make AI practical for your organization.

Let’s put AI to work.