LIVE WEBINAR WITH DEMO

How to classify and extract information from complex PDFs at scale

December 17, 2024
10 AM PT / 1 PM ET
Grace King

Senior Product Manager
Snorkel AI

Shane Johnson

Senior Director of Product Marketing
Snorkel AI

Transforming unstructured data such as text and documents into structured data is crucial for enterprise AI development, whether it’s training specialized models to extract information from complex documents or improving RAG retrieval accuracy.

In this webinar, we’ll explain how to capture domain knowledge from subject matter experts (SMEs) and use it to automate and scale PDF classification and information extraction tasks. Specifically, we’ll demonstrate techniques such as candidate span generation, sequence tagging, and LLM prompting.

Join this live webinar and demo, and learn how to:

  • Classify diverse and complex PDFs accurately and at scale
  • Automate the extraction of relevant information within large PDFs
  • Leverage classification and information extraction to improve RAG

Register now


Speakers


Grace King

Senior Product Manager
Snorkel AI

Grace King is a Senior Product Manager at Snorkel, where she leads various use cases. A passionate problem solver, she has a background in generative AI for synthetic data generation through her product role at Gretel AI. Grace graduated from Thayer School of Engineering at Dartmouth College with degrees in Electrical and Computer Engineering and from Vassar College with degrees in Physics and Mathematics.


Shane Johnson

Senior Director of Product Marketing
Snorkel AI

I started out as a developer and architect before pivoting to product/marketing. I'm still a developer at heart (and love coding for fun), but I love advocating for innovative products -- particularly to developers.

I've spent most of my time in the database space, but lately I've been going down the LLM rabbit hole.