In this post, we will show you a specialized benchmark dataset we developed with our expert network of Chartered Property and Casualty Underwriters (CPCUs). The benchmark uncovers several model-specific and actionable error modes, including basic tool use errors and a surprising number of insidious hallucinations from one provider. This is part of an ongoing series of benchmarks we are releasing across verticals…
The Snorkel Flow label model plays an instrumental role in driving the enterprise value we create. Here’s a peek at how it works.
Snorkel takes a step toward enterprise superalignment with new data development workflows for enterprise alignment.
Enterprises that aim to build valuable GenAI applications must view them at the systems level. LLMs are just one part of an ecosystem.
We used weak supervision to programmatically curate instruction-tuning data for open-source LLMs to build better GenAI.