Snorkel Flow 2024.R3: Supercharge your AI development with enhanced data-centric workflows
Snorkel AI has made building production-ready, high-value enterprise AI applications faster and easier. The 2024.R3 update to our Snorkel Flow AI data development platform streamlines data-centric workflows, from generative AI evaluation to multi-schema annotation.
Let’s dive in.
Revolutionizing generative AI development
One of the biggest highlights of R3 is the introduction of Snorkel’s GenAI Evaluation Suite. This suite tackles a major challenge in generative AI development: ensuring your models are ready for real-world use.
Here’s how the GenAI Evaluation Suite empowers you:
- Specialized and flexible evaluation: Go beyond subjective assessments or off-the-shelf benchmarks. Define custom acceptance criteria and leverage ground truth data alongside automatic response evaluators to measure your model’s performance against requirements specific to your use case and domain.
- Fine-grained analysis: Snorkel Flow allows you to programmatically slice your data to focus data development on critical subsets. Tag data according to specific topics, languages, or customer scenarios.
- Actionable insights: Snorkel doesn’t just identify errors; it empowers you to fix them. Evaluation dashboards provide clear insights, allowing you to seamlessly transition from evaluation to data development workflows within the platform.
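To make slice-based evaluation concrete, here is a minimal sketch in plain Python. It does not use the Snorkel Flow SDK; the `Example` class, the `meets_criteria` acceptance check, and the two slicing functions are hypothetical names chosen for illustration. The idea is to score each response against a custom criterion and then report the pass rate per slice rather than as a single overall number.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    prompt: str
    response: str   # model output being evaluated
    reference: str  # ground-truth answer


def meets_criteria(ex: Example, max_words: int = 50) -> bool:
    """Hypothetical acceptance criterion: the response contains the
    reference answer and stays within a word limit."""
    contains_answer = ex.reference.lower() in ex.response.lower()
    concise = len(ex.response.split()) <= max_words
    return contains_answer and concise


# Hypothetical slicing functions: each one tags the examples that
# belong to a subset worth tracking separately.
SLICES: Dict[str, Callable[[Example], bool]] = {
    "billing": lambda ex: any(k in ex.prompt.lower() for k in ("invoice", "refund")),
    "spanish": lambda ex: ex.prompt.startswith("¿"),
}


def pass_rate_by_slice(examples: List[Example]) -> Dict[str, float]:
    """Return the fraction of examples meeting the criterion, per slice
    and overall. Empty slices are omitted."""
    rates: Dict[str, float] = {}
    for name, in_slice in SLICES.items():
        members = [ex for ex in examples if in_slice(ex)]
        if members:
            rates[name] = sum(meets_criteria(ex) for ex in members) / len(members)
    if examples:
        rates["overall"] = sum(meets_criteria(ex) for ex in examples) / len(examples)
    return rates


# Toy data for illustration only.
EXAMPLES = [
    Example("How do I get a refund?", "You can request a refund within 30 days.", "30 days"),
    Example("Where is my invoice?", "Please contact support.", "billing portal"),
    Example("¿Cuál es el horario?", "Abrimos de 9 a 17.", "9 a 17"),
]
```

On this toy data, `pass_rate_by_slice(EXAMPLES)` reports a 0.5 pass rate for the billing slice and 1.0 for the Spanish slice, which points data development at the billing examples first.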
The GenAI Evaluation Suite complements our comprehensive LLM fine-tuning workflow. This workflow guides you through five distinct steps, from connecting to your large language model (LLM) inference provider (such as Amazon SageMaker or Databricks Mosaic AI) to curating high-quality training data.
2024.R3 also brings exciting new features such as:
- Freeform LLM prompting: Safely connect to your LLM provider and leverage freeform prompting.
- Synthetic data generation: Address data sparsity by leveraging synthetic data generation techniques directly within the SDK.
- Enhanced LLM provider integrations: We’ve improved logging and performance with major LLM providers to ensure a smoother development experience.
Learn more about new GenAI and LLM features in 2024.R3.
Enhanced NLP workflows in Snorkel Flow 2024.R3
Snorkel Flow’s 2024.R3 release introduces significant advancements in natural language processing (NLP) capabilities, designed to streamline workflows and enhance the ability to extract insights from unstructured and structured text.
The new release includes the following:
- Named entity recognition (NER) for PDFs (beta): Extract key information directly from your PDFs—including scanned documents. Snorkel Flow now supports word-based NER, bounding boxes, and pattern-based labeling functions, making it easier to capture the structure of your documents and enhance model performance.
- Improved annotation suite: We’ve upgraded the annotation suite with multi-schema support for PDFs, annotation instructions, and a “Highlight-to-Label” feature for faster sequence tagging tasks.
- Spotlight mode for focused debugging: Spotlight mode highlights and isolates incorrectly predicted entities, allowing you to identify and resolve errors faster.
- Greater visibility of class-level metrics: See how your model performs on an entity-by-entity basis.
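As an illustration of the pattern-based labeling functions mentioned in the NER item above, here is a minimal sketch in plain Python. It is not the Snorkel Flow API; the `INVOICE_ID` pattern, the label name, and the `label_words` function are assumptions made for this example. The function assigns a label to each word that matches a regular expression and abstains (returns `None`) on everything else.

```python
import re
from typing import List, Optional

# Hypothetical pattern: invoice IDs such as "INV-20417".
INVOICE_ID = re.compile(r"INV-\d{4,}")


def label_words(words: List[str]) -> List[Optional[str]]:
    """Word-based labeling function: return "INVOICE_ID" for words that
    match the pattern and None (abstain) for all other words."""
    labels: List[Optional[str]] = []
    for word in words:
        # Ignore surrounding punctuation so "INV-20417." still matches.
        core = word.strip(".,;:()")
        labels.append("INVOICE_ID" if INVOICE_ID.fullmatch(core) else None)
    return labels
```

Abstaining rather than predicting a negative label lets several narrow functions like this one be combined, with each contributing only where its pattern applies.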
Learn more about the new NLP and PDF features in 2024.R3.
Strengthened and expanded enterprise-readiness features
At the beginning of the year, we introduced our first wave of role-based access control features, and we’ve built upon them in this release.
Snorkel Flow administrators can now keep their data safer than ever with:
- Feature access control: Grant granular access to specific features, ensuring data security and scalability within your organization.
- Audit trails and support bundles: Improved audit coverage and more robust support bundle exports enhance data privacy and system stability.
Learn more about our substantial data compliance and security upgrades.
A streamlined user experience
Snorkel’s engineers have worked to make Snorkel Flow more pleasant to use in the 2024.R3 release.
The platform boasts a revamped user interface (UI) with improvements to:
- Labeling function (LF) table and suggestions: Navigate labeling functions more intuitively for faster data labeling.
- Homepage and sidebar navigation: Enhanced navigation and organization streamline workflows and improve efficiency.
Ready to take your AI projects to the next level?
Snorkel Flow’s 2024.R3 release offers a comprehensive suite of tools to accelerate your AI development journey. From specialized GenAI evaluation to enhanced NLP workflows and a user-friendly interface, R3 empowers you to build robust, production-ready models faster.
Learn more about what Snorkel can do for your organization
Snorkel AI offers multiple ways for enterprises to advance their AI capabilities. Our Snorkel Flow data development platform empowers enterprise data scientists and subject matter experts to build and deploy high-quality models end-to-end in-house. Our Snorkel Custom program puts our world-class engineers and researchers to work on your most promising challenges to deliver data sets or fully built LLM or generative AI applications, fast.
See which Snorkel option is right for you. Book a demo today.
Matt Casey leads content production at Snorkel AI. In prior roles, Matt built machine learning models and data pipelines as a data scientist. As a journalist, he produced written and audio content for outlets including The Boston Globe and NPR affiliates.