Data-centric AI starts here
Opening Keynote
Bridging the gap between LLMs and enterprise AI
Citi: Driving Better Customer Outcomes and Customer-Centric Growth with Data-Centric AI
with Murli Buluswar
Head of Analytics, US Personal Bank, Citi
Featured speakers
Alex Ratner
Co-founder and CEO
Snorkel AI
Murli Buluswar
Head of Analytics, US Personal Bank
Citi
Aarti Bagul
Head of Field Engineering
Snorkel AI
James Lin
Head of AI/ML Innovation
Experian
Fred Sala
Professor of CS
University of Wisconsin
Vinny DeGenova
Associate Director of Machine Learning
Wayfair
Chris Glaze
Principal Research Scientist
Snorkel AI
Vincent Chen
Product Leader
Snorkel AI
Day 1
Oct 16: Hands-On Training Day
Attendees will learn the fundamentals of AI data development and how to apply them with Snorkel Flow – for both predictive and GenAI use cases.
The workshops will include exercises that walk attendees through the process of curating training data and using it to fine-tune and evaluate an LLM for specialized tasks in the enterprise.
- The morning workshop will include an exercise on intent classification for chatbots.
- The afternoon workshop will include an exercise on information extraction from PDF documents.
What to expect
- Learn about the principles of AI data development
- Get hands-on with the Snorkel Flow platform
- Curate training data and fine-tune an LLM
- Build a model to classify customer requests
- Build a model to extract information from financial documents
Ink 48 Hotel, 653 11th Avenue,
New York, NY 10036
Day 2
Oct 17: Conference Day
Registration and Breakfast
Opening Keynote
Alex Ratner
CDAO Round Table
Coffee Break
Graduating from Data Labeling to AI Data Development
Chris Glaze
Venkatesh Rao
Experian: The Importance of LLM Evaluation for Domain-Specific Use Cases
James Lin
James Lin will discuss the importance of LLM evaluation along with common challenges and strategies to overcome them.
Unlocking Hidden Insights: Snorkel’s Solution for Complex and High-Value PDF Documents
Jennifer Lei
PDF documents represent a vast, untapped source of enterprise data, posing unique challenges due to their diverse formats and structures. From messy, massive files to varied content such as plain text, multi-column layouts, and tables, extracting valuable insights from these documents requires innovative solutions.
In this session, we will explore how to unlock the full value of your PDF documents, from simple classification to the most complex, high-value extraction use cases, all using a single Snorkel Flow platform. By leveraging OCR and parsing, using a collaborative platform for domain experts and data scientists, and utilizing leading LLMs or models of your choice, you can easily overcome format variations, ensure accurate information extraction, and efficiently build custom PDF use cases.
Join us to discover how Snorkel AI’s integrated platform can help you navigate the complexities of PDF document processing and fully leverage the potential of your enterprise data.
Citi: Driving Better Customer Outcomes and Customer-Centric Growth with Data-Centric AI
Murli Buluswar
Lunch
Evaluating LLM Systems
Vincent Chen
Rebekah Westerlind
LLM evaluation is critical for generative AI in the enterprise, but measuring how well an LLM answers questions or performs tasks is difficult. Thus, LLM evaluations must go beyond standard measures of “correctness” to include a more nuanced and granular view of quality.
In practice, enterprise LLM evaluations (e.g., OSS benchmarks) often come up short because they’re slow, expensive, subjective, and incomplete. They leave AI initiatives blocked because there is no clear path to production quality.
In this session, Vincent Sunn Chen, Founding Engineer at Snorkel AI, and Rebekah Westerlind, Software Engineer at Snorkel AI, will discuss the importance of LLM evaluation, highlight common challenges and approaches, and explain the core concepts behind Snorkel AI's approach to data-centric LLM evaluation.
Join us to learn more about:
- Understanding the nuances of LLM evaluation
- Evaluating LLM response performance at scale
- Identifying where additional LLM fine-tuning is needed
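The idea of going beyond a single "correct/incorrect" judgment can be made concrete with a rubric that scores each response on several quality dimensions. The sketch below is illustrative only: the dimensions and the crude keyword-based scoring functions are my assumptions, not Snorkel AI's evaluation method.

```python
# Rubric-style evaluation sketch: score a response on multiple quality
# dimensions instead of a single correctness bit. All heuristics here are
# deliberately simple stand-ins for real, model- or expert-based scorers.

def score_response(question: str, response: str, reference: str) -> dict:
    return {
        # crude relevance proxy: response shares vocabulary with the question
        "relevance": len(set(response.lower().split())
                         & set(question.lower().split())) > 0,
        # crude faithfulness proxy: the reference answer appears verbatim
        "faithful": reference.lower() in response.lower(),
        # format check: enterprise answers often have length requirements
        "concise": len(response.split()) <= 50,
    }

report = score_response(
    "What is the APR on this card?",
    "The APR on this card is 24.99% for purchases.",
    "24.99%",
)
```

A granular report like this points to *which* dimension fails, which is what makes targeted fine-tuning possible.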
How Wayfair is Transforming Customer Experiences with Data-Centric AI
Vinny DeGenova
Learn how Wayfair is harnessing the power of machine learning and data to make it easier for customers to find the exact home products they’re looking for with Snorkel AI.
You’ll find out how highly accurate product tags can be extracted from supplier-provided labels and product images to clean and enrich online catalogs. This delivers higher-quality content for customers and the ability to adapt quickly as customer searches evolve. Bottom line: Wayfair increased add-to-cart rates, reduced cart abandonment, and grew average order size and customer lifetime value.
Enhancing RAG Pipelines for Enterprise-Specific Tasks: Strategies for Accuracy and Reliability
Bryan Woods
In the realm of LLM-powered AI applications, Retrieval-Augmented Generation (RAG) is a pivotal component for enterprise use cases. However, to ensure responses are consistently accurate, helpful, and compliant, RAG pipelines must undergo meticulous optimization.
Critical to this process is the incorporation of only the most relevant information as context. This can be achieved through techniques such as semantic document chunking, fine-tuned embeddings, reranking models, and efficient context-window utilization.
In this presentation, we will:
- Introduce fundamental RAG concepts and outline a standard pipeline.
- Detail optimization strategies for each stage of a sophisticated RAG pipeline to ensure the LLM receives proper context.
- Demonstrate how to leverage Snorkel Flow to optimize RAG pipelines.
By attending, you will gain insights on how to:
- Enhance LLM responses by minimizing retrieval errors.
- Fine-tune various stages of the RAG pipeline.
- Expedite the deployment of production-grade RAG applications.
Join us to elevate your RAG systems and drive superior AI outcomes for your enterprise.
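The retrieval stage described above can be sketched in a few lines: chunk documents, score chunks against the query, and pass only the top-ranked chunks to the LLM as context. This is a toy sketch under stated assumptions: the bag-of-words "embedding" and cosine scorer stand in for the fine-tuned embedding and reranking models the session discusses.

```python
from collections import Counter
import math

def chunk(text: str, size: int = 40) -> list[str]:
    # Stand-in for semantic chunking: overlapping fixed-size word windows.
    words = text.split()
    return [" ".join(words[i:i + size])
            for i in range(0, len(words), size // 2)] or [text]

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would use a trained model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Score every chunk against the query, return the top-k as LLM context.
    chunks = [c for d in documents for c in chunk(d)]
    return sorted(chunks, key=lambda c: cosine(embed(query), embed(c)),
                  reverse=True)[:k]

docs = ["The quarterly report shows revenue grew 12 percent year over year.",
        "Employee onboarding requires a signed NDA and security training."]
context = retrieve("How much did revenue grow?", docs)
```

Each stage (chunking, embedding, scoring) is exactly where the optimization strategies above plug in: swap the toy functions for semantic chunkers, fine-tuned embeddings, and reranking models.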
AI in Healthcare
Fine-Tuning and Aligning LLMs with Enterprise Data
Marty Moesta
LLMs often require fine-tuning and alignment on domain-specific knowledge before they can accurately and reliably perform specialized tasks within the enterprise.
The key to transforming foundation models such as Meta's Llama 3 into specialized LLMs is high-quality training data, applied through fine-tuning and alignment.
In this session, we'll provide an overview of methods such as SFT and DPO, show how to curate high-quality instruction and preference data 10-100x faster (and at scale), and demonstrate how to fine-tune, align, and evaluate an LLM.
Join us, and learn more about:
- Curating high-quality training data 10-100x faster
- Emerging LLM fine-tuning and alignment methods
- Evaluating LLM accuracy for production deployment
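The difference between the two data types the session covers (instruction data for SFT, preference pairs for DPO) comes down to record shape. The field names below are common conventions, not a Snorkel Flow schema, and the sample text is invented for illustration.

```python
# SFT record: a prompt and one demonstration of the desired response.
sft_example = {
    "prompt": "Summarize the attached loan agreement in two sentences.",
    "response": "The agreement sets a 5-year term at a fixed rate "
                "with monthly payments and no prepayment penalty.",
}

# DPO record: the same prompt with a preferred and a dispreferred response.
dpo_example = {
    "prompt": "Summarize the attached loan agreement in two sentences.",
    "chosen": "The agreement sets a 5-year term at a fixed rate "
              "with monthly payments and no prepayment penalty.",
    "rejected": "This document is a loan agreement.",  # terse, unhelpful
}

def validate_preference(record: dict) -> bool:
    """DPO needs a prompt plus a chosen/rejected pair that actually differ."""
    required = {"prompt", "chosen", "rejected"}
    return required <= record.keys() and record["chosen"] != record["rejected"]
```

Curation effort goes into producing many such records quickly while keeping the chosen/rejected distinction meaningful, which is where programmatic labeling claims its 10-100x speedup.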
2025 AI Trends
Closing Keynote
Ronaldo Ama
Ajay Singh
Conference Party
Special Hotel Room Rate:
Ink 48 Hotel, 653 11th Avenue,
New York, NY 10036
Deadline: Sept. 16, 2024
*First come, first served
Reserve your spot