Snorkel AI researchers work on the cutting edge of AI innovation to help expand the boundaries of AI knowledge.

That’s a bold statement, but accurate.

As part of a research-first culture, the Snorkel team has contributed to over 150 academic papers on topics including LLM data curation, LLM evaluation, model distillation, and more. Snorkel AI founders and researchers present regularly at distinguished AI conferences such as NeurIPS, where the team was recognized with a Best Paper award for “Low-Resource Languages Jailbreak GPT-4.” A recent co-presentation on MedAlign, a curated open-source benchmark dataset for evaluating LLMs on EHR data retrieval, was awarded Best Findings Paper in GenAI for Health.

Snorkel’s partnership with Microsoft plays a critical role in equipping its research team to experiment with new techniques and approaches. Snorkel is a member of the Microsoft for Startups Pegasus Program, Microsoft’s flagship go-to-market program. Through the Pegasus program, Snorkel has access to premier sales resources and technical assets that accelerate AI workloads, including early access to Azure AI services, leading models from OpenAI and Mistral, and accelerated high-performance compute. The ability to execute the research team’s most demanding projects on Azure AI infrastructure powered by NVIDIA GPUs has been a game changer for Snorkel.

Snorkel’s recent top-tier ranking on the AlpacaEval 2.0 LLM leaderboard would not have been possible without the program’s dedicated GPU cluster benefit for startups. Access to state-of-the-art Azure NDm NVIDIA A100 instances through a seamless Azure experience has enabled Snorkel to drive cutting-edge research in programmatic alignment and DPO quickly and efficiently. Azure AI infrastructure VMs, which come pre-configured with InfiniBand and NVLink for optimized scale-out and scale-up, let Snorkel researchers reliably run everything from small experiments to large-scale distributed jobs across multiple GPUs, with full monitoring throughout.

The value of this research extends far beyond academic papers and benchmark results. The Snorkel Flow data development platform was intentionally designed with flexible abstractions and extensible interfaces that allow continual integration of the latest techniques emerging from academic collaborations, creating an ever more powerful tool for users. Ultimately, Snorkel’s cutting-edge research plays a key role in helping enterprises successfully move AI projects from prototype to production.

Today, Snorkel is proud to congratulate Microsoft on the new generally available Azure NC H100 v5 VMs, which are tailored to accelerate large-scale AI model training and batch inference. As Microsoft advances the state-of-the-art with optimized AI GPU VMs leveraging the latest NVIDIA technologies, the Snorkel research team can launch increasingly ambitious and challenging projects.

To learn more, we invite you to read the Microsoft blog post “Microsoft and NVIDIA partnership continues to deliver on the promise of AI” and visit us at NVIDIA GTC.

Learn more in person at NVIDIA GTC

Visit the Microsoft booth at NVIDIA GTC to learn more about the Snorkel research that resulted in a top-tier ranking on the AlpacaEval 2.0 LLM leaderboard. Plus, Snorkel will share how designing projects on Azure AI infrastructure powered by NVIDIA GPUs helps our researchers deliver value for our customers and the OSS community even faster.

Topic: Snorkel AI research leverages Azure AI Infrastructure powered by NVIDIA GPUs for cutting-edge AI/ML.
Location: Microsoft booth #1108
Timing: March 19, 3:40–4:00 pm

This blog post is a collaboration between Microsoft for Startups Senior AI Advisor Doug Kelly and Snorkel Head of Partnerships Friea Berg.