The paper explores the use of prompts that specify explicit, documented attributes to generate synthetic training data with large language models (LLMs) for natural language processing (NLP) tasks. The authors present an empirical study of this data-generation approach, examining aspects such as bias, diversity, and efficiency.
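The paper's actual prompt design is not reproduced here, but the core idea of attributed prompting can be sketched as expanding a template over a set of attribute dimensions. The attribute names, values, and template below are invented for illustration only:

```python
import itertools

# Hypothetical attribute dimensions for a text-classification task.
# Enumerating them explicitly documents (and diversifies) what the
# generated training data covers.
ATTRIBUTES = {
    "topic": ["economy", "sports"],
    "style": ["formal report", "casual blog post"],
    "length": ["about 30 words", "about 80 words"],
}

TEMPLATE = (
    "Write a {style} about {topic}, {length} long, "
    "suitable as a training example for text classification."
)


def attributed_prompts(attrs):
    """Expand every combination of attribute values into a concrete prompt."""
    keys = list(attrs)
    prompts = []
    for combo in itertools.product(*(attrs[k] for k in keys)):
        prompts.append(TEMPLATE.format(**dict(zip(keys, combo))))
    return prompts


prompts = attributed_prompts(ATTRIBUTES)
print(len(prompts))  # 2 * 2 * 2 = 8 distinct prompts
print(prompts[0])
```

Each resulting prompt would then be sent to an LLM, and the attribute combination stored alongside the generated example, making coverage and bias auditable after the fact.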