Image
author

Tom Walshe

Staff Research Scientist
,
Snorkel AI

Tom Walshe is a Staff Research Scientist at Snorkel AI. Before Snorkel, Tom worked in LegalTech and finance services, where he focussed on building end-to-end AI systems and researching data-centric AI. Prior to industry, Tom completed a PhD in Computer Science from the University of Oxford.

The latest from Tom Walshe

Evaluating coding agent capabilities with Terminal-Bench: Snorkel’s role in building the next generation benchmark
Blog
Evaluating coding agent capabilities with Terminal-Bench: Snorkel’s role in building the next generation benchmark

Terminal-Bench, developed through a collaboration between Stanford University and Laude Institute, has quickly become the gold standard benchmark for evaluating AI agent capabilities in a command line environment. This comprehensive evaluation framework measures how effectively AI agents can perform complex, real-world tasks within terminal environments. At Snorkel AI, we’re excited to share that we’re one of the top collaborators contributing…

Sep 30, 2025
Learn more about Evaluating coding agent capabilities with Terminal-Bench: Snorkel’s role in building the next generation benchmark
The right tool for the job: An A-Z of rubrics
Blog
The right tool for the job: An A-Z of rubrics

Rubrics turn fuzzy “good vs. bad” into measurable criteria for GenAI. In Part 2, we map what to measure (granularity and dataset-level vs instance-specific), where to measure (process vs outcome), and how to measure (humans, LLM-as-judge, code, reward models)—with examples like HHH, FLASK, HealthBench, and PaperBench.

Sep 02, 2025
Learn more about The right tool for the job: An A-Z of rubrics
LLM alignment techniques: 4 post-training approaches
Blog
LLM alignment techniques: 4 post-training approaches

Ensure your LLMs align with your values and goals using LLM alignment techniques. Learn how to mitigate risks and optimize performance.

Mar 04, 2025
Learn more about LLM alignment techniques: 4 post-training approaches
Walking safely before building flying saucer seatbelts: introducing Enterprise Alignment
Blog
Walking safely before building flying saucer seatbelts: introducing Enterprise Alignment

Snorkel takes a step on the path to enterprise superalignment with new data development workflows for enterprise alignment

Learn more about Walking safely before building flying saucer seatbelts: introducing Enterprise Alignment

For models that need to be right. Not just good enough.