Awards

The AI 50 2023

Date: April 11, 2023
Updated: September 27, 2024

