How Pixability uses foundation models to accelerate NLP application development by months

Pixability is a data and technology company that allows advertisers to quickly pinpoint the right content and audience on YouTube. To help brands maximize their reach, Pixability needs to continuously and accurately categorize billions of YouTube videos. Using Snorkel Flow, Pixability leveraged foundation models to build small, deployable classification models capable of categorizing videos across more than 600 different classes with 90% accuracy in just a few weeks.

Using AI to help customers optimize ad spending and maximize their reach on YouTube.

There are billions of videos on YouTube. Every minute, another 500 hours are added to the platform. Given this deluge of content, advertisers struggle to identify whether videos are brand-aligned (and brand-safe). Pixability uses machine learning to automatically identify and categorize YouTube content so that advertisers can maximize their reach with suitable content and optimize ad spend.

Challenge

As of 2022, viewers watch, on average, over 700 million hours of YouTube content daily¹. To maintain relevancy, Pixability needs to continuously and accurately categorize billions of videos to provide advertisers with the necessary insight to be sure their ads run on brand-suitable content. To do this, Pixability had trained a natural language processing (NLP) model to classify videos automatically, yet the performance wasn’t strong enough.

To improve the training data quality (and reduce the number of revision cycles required to translate domain knowledge to a third-party service), the team realized they needed an alternative to hand-labeling data.

Labeling training data for the ML solution was prohibitively slow because it relied on external data labeling services that required multiple iterations.
Collaboration was constrained because domain experts and data scientists had limited time to resolve ambiguous labels, which blocked their ability to iterate quickly.
Rich information was buried within titles, descriptions, content, and tags and was difficult to normalize.

Goal

Minimize the time spent labeling high-cardinality training data while expanding Pixability's ability to provide more granular insights to its customers.

Solution

Using Snorkel Flow’s Data-centric Foundation Model Development workflow, Pixability was able to build an NLP application in less time than it took a third-party data labeling service to label a single dataset. This data-centric workflow allowed Pixability to scale the number of classes its models could predict to over 600 while also increasing model accuracy to over 90%. The large increase in possible classes means Pixability can better place their customers’ ads on the most suitable YouTube content, improving the return on customer video ad spend and satisfaction with Pixability’s services.

The team began by using Snorkel Flow’s Foundation Model Warm Start with zero-shot learning to jump-start training data creation using foundation model (FM) knowledge. From there, they used Foundation Model Prompt Builder to develop and refine prompts to correct out-of-the-box FM errors and pull more domain-specific knowledge from various FMs (rather than relying on a single one). As an example, they used the unstructured video titles, tags, and descriptions stored in their Snowflake data warehouse and created prompts that asked the FM to classify videos based on the description.
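To make the prompt-based step more concrete, here is a minimal Python sketch of zero-shot classification from a video description. It is an illustration under stated assumptions, not Snorkel Flow's API or Pixability's actual prompts: the function names, the prompt wording, the three example classes, and the `fake_model` stand-in for a foundation model are all hypothetical.

```python
from typing import Callable, Optional, Sequence

# Hypothetical class list; the real application covered more than 600 classes.
CLASSES = ["basketball", "hockey", "cooking"]


def build_prompt(description: str, classes: Sequence[str]) -> str:
    """Format a zero-shot classification prompt for one video description."""
    options = ", ".join(classes)
    return (
        f"Classify the YouTube video described below into one of: {options}.\n"
        f"Description: {description}\n"
        "Answer with the category name only."
    )


def parse_answer(answer: str, classes: Sequence[str]) -> Optional[str]:
    """Map raw model text to a known class; return None (abstain) otherwise."""
    cleaned = answer.strip().strip(".").lower()
    for name in classes:
        if cleaned == name.lower():
            return name
    return None


def zero_shot_label(
    description: str,
    classes: Sequence[str],
    complete: Callable[[str], str],
) -> Optional[str]:
    """Prompt a model (any text-in, text-out callable) and parse its answer."""
    return parse_answer(complete(build_prompt(description, classes)), classes)


def fake_model(prompt: str) -> str:
    """Assumption: a toy stand-in for a real foundation model call."""
    return "Basketball." if "dunk" in prompt.lower() else "not sure"


label = zero_shot_label("Top 10 dunk highlights of the season", CLASSES, fake_model)
unknown = zero_shot_label("A quiet walk in the park", CLASSES, fake_model)
print(label, unknown)
```

Returning `None` when the answer does not match a known class lets an unusable model response count as an abstention rather than a wrong label, which matters when the outputs are later combined with other labeling signals.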

**Using foundation models to classify videos with Snorkel Flow**

Referencing the results of a 50-class multi-label classification model, Jackie Swansburg, Pixability’s Chief Product Officer, said, “With Snorkel Flow, we can apply data-centric workflows to distill knowledge from foundation models and build high-cardinality classification models with more than 90% accuracy in days.”

With this programmatic approach to labeling data using knowledge from foundation models, the team generated 500,000 labeled training data points (with virtually no ground truth) that were used to train a model with 90% accuracy. Additionally, the team was able to unlock multi-label NLP capabilities, further improving the granularity Pixability can provide its customers. Now, instead of only classifying a video as related to “sports,” they can classify it more specifically as “basketball” or “hockey.”

Auto-labeled training data by capturing domain expertise and foundation model knowledge as labeling functions and applying them at scale.
Improved collaboration with domain experts across lines of business to drive programmatic data labeling and iteration, unblocking the data science team.
Unified platform for training data creation and model training, including guided error analysis for efficient, effective iteration.
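
The idea of capturing domain expertise and foundation model knowledge as labeling functions can be sketched in plain Python. This is a simplified illustration, not Snorkel Flow's implementation: the keyword rules, the `fm_guess` field, and the video records are hypothetical, and a simple majority vote stands in for the learned label model that a programmatic labeling system would typically use to combine votes.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Video = Dict[str, str]
LabelingFunction = Callable[[Video], Optional[str]]  # None means "abstain"


def lf_title_keywords(video: Video) -> Optional[str]:
    """Domain-expert rule: vote based on keywords in the title."""
    title = video.get("title", "").lower()
    if "nba" in title or "dunk" in title:
        return "basketball"
    if "nhl" in title or "slap shot" in title:
        return "hockey"
    return None


def lf_tags(video: Video) -> Optional[str]:
    """Domain-expert rule: vote based on the comma-separated tags."""
    tags = [t.strip() for t in video.get("tags", "").lower().split(",")]
    if "basketball" in tags:
        return "basketball"
    if "hockey" in tags:
        return "hockey"
    return None


def lf_foundation_model(video: Video) -> Optional[str]:
    """Assumption: a precomputed foundation model answer stored on the record."""
    return video.get("fm_guess") or None


def majority_vote(video: Video, lfs: List[LabelingFunction]) -> Optional[str]:
    """Combine non-abstaining votes; abstain on no votes or a tie for first."""
    votes = [v for v in (lf(video) for lf in lfs) if v is not None]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


LFS = [lf_title_keywords, lf_tags, lf_foundation_model]
videos = [
    {"title": "NBA Finals best dunks", "tags": "sports, basketball", "fm_guess": "hockey"},
    {"title": "Morning vlog", "tags": "lifestyle"},
    {"title": "NHL slap shot compilation", "tags": "basketball"},
]
labels = [majority_vote(v, LFS) for v in videos]
print(labels)
```

In the first record, two rule-based votes outweigh a mistaken model guess; the second record gets no votes and the third is a tie, so both are left unlabeled rather than guessed.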

Pixability was able to create a model in weeks instead of months by relying on the Snorkel AI team’s expertise with foundation models and Snorkel Flow’s ability to integrate easily into their existing cloud data warehouse. Furthermore, by labeling programmatically in-house, the Pixability team gained greater control over their NLP training data creation and could iterate rapidly, freeing capacity to expand to more use cases. As a result, Pixability advanced their product roadmap by several months, unlocking new capabilities that will help them provide deeper insights and improved services to their customers.