Google Patents | 2024

Systems and methods for programmatic labeling of training data for machine learning models via clustering and language model prompting

RN Smith, et al.

Abstract

Embodiments introduce an approach for semi-automatically generating labels for data using a clustering or language model prompting technique. The approach can be used to implement a form of programmatic labeling that accelerates the development of classifiers and other types of models. The disclosed methodology is particularly helpful for generating labels or annotations for unstructured data. In some embodiments, it may be applied to text, images, or other forms of unstructured data.
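To make the clustering idea concrete, the following is a minimal sketch and not the patented method: the abstract does not specify a clustering algorithm, a similarity measure, or how labels are assigned to clusters. As assumptions for illustration only, the sketch groups short texts with a greedy token-overlap (Jaccard) clustering and then copies a handful of hand-assigned seed labels to every member of the same cluster. The function names (`greedy_cluster`, `propagate_labels`), the `threshold` value, and the example texts are all hypothetical. The language model prompting variant is not covered.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization into a set of tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(texts, threshold=0.3):
    """Assign each text to the first cluster whose representative
    (the cluster's first member) is at least `threshold` similar;
    otherwise start a new cluster. Returns one cluster id per text."""
    representatives = []
    assignments = []
    for text in texts:
        tokens = tokenize(text)
        for cluster_id, rep in enumerate(representatives):
            if jaccard(tokens, rep) >= threshold:
                assignments.append(cluster_id)
                break
        else:
            representatives.append(tokens)
            assignments.append(len(representatives) - 1)
    return assignments


def propagate_labels(assignments, seed_labels):
    """Give every item the majority seed label of its cluster.
    `seed_labels` maps item index -> hand-assigned label.
    Items in clusters without any seed stay unlabeled (None)."""
    votes = {}
    for index, label in seed_labels.items():
        votes.setdefault(assignments[index], Counter())[label] += 1
    cluster_label = {cid: c.most_common(1)[0][0] for cid, c in votes.items()}
    return [cluster_label.get(cid) for cid in assignments]


# Hypothetical example data: five short texts, two of them hand-labeled.
texts = [
    "refund my order please",
    "please refund my payment",
    "reset my password now",
    "cannot reset my password",
    "where is the store located",
]
assignments = greedy_cluster(texts)
labels = propagate_labels(assignments, {0: "billing", 2: "account"})
```

Here two seed labels are enough to label four of the five texts; the fifth falls into a cluster with no seed and is left unlabeled for review. In practice the token-overlap step would likely be replaced by clustering over learned embeddings, which is also how the same idea could extend to images.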
