Tasks Algorithmically Given Labels Established via Transferred Symbols (TAGLETS)
Abstract
We conducted research to reduce the amount of labeled data required to train machine learning systems. The central product of this effort is TAGLETS, a machine learning system that integrates widely known collections of labeled data with a diverse array of machine learning algorithms, known as weak labelers. The system’s design was shaped by theoretical work on how to aggregate these weak labelers effectively. The research also extended to the application of large pre-trained models in low-resource settings. This line of work produced Alfred, a second-generation system for programmatic weak supervision that allows data to be labeled via prompts. Finally, we investigated strategies for fine-tuning large models on domain-specific tasks with limited labeled data, with the aim of improving adaptability and performance.