

I’m an Assistant Professor of Biomedical Data Science and of Medicine at Stanford University. My research focuses on training and evaluating foundation models for healthcare and is positioned at the intersection of computer science, medical informatics, and hospital systems. Much of my work explores using electronic health record (EHR) data to contextualize human health, leveraging longitudinal patient information to inform model development and evaluation. My work has appeared in NeurIPS, ICLR, AAAI, Nature Communications, and npj Digital Medicine.
The latest from Jason
Background: The aortic valve is an important determinant of cardiovascular physiology and anatomic location of common human diseases. Methods: From a sample of 34 287 white British ancestry participants, we estimated functional aortic valve area by planimetry from prospectively obtained cardiac magnetic resonance imaging sequences of the aortic valve. Aortic valve area measurements were submitted to genome-wide association testing, followed by…
Objective: Responding to the COVID-19 pandemic requires accurate forecasting of health system capacity requirements using readily available inputs. We examined whether testing and hospitalization data could help quantify the anticipated burden on the health system given shelter-in-place (SIP) order. Materials and Methods: 16,103 SARS-CoV-2 RT-PCR tests were performed on 15,807 patients at Stanford facilities between March 2 and April 11,…
There is substantial interest in using presenting symptoms to prioritize testing for COVID-19 and establish symptom-based surveillance. However, little is currently known about the specificity of COVID-19 symptoms. To assess the feasibility of symptom-based screening for COVID-19, we used data from tests for common respiratory viruses and SARS-CoV-2 in our health system to measure the ability to correctly classify virus…
Accurate transcription of audio recordings in psychotherapy would improve therapy effectiveness, clinician training, and safety monitoring. Although automatic speech recognition software is commercially available, its accuracy in mental health settings has not been well described. It is unclear which metrics and thresholds are appropriate for different clinical use cases, which may range from population descriptions to individual safety monitoring. Here…
A third of adults in America use the Internet to diagnose medical concerns, and online symptom checkers are increasingly part of this process. These tools are powered by diagnosis models similar to clinical decision support systems, with the primary difference being the coverage of symptoms and diagnoses. To be useful to patients and physicians, these models must have high accuracy…
This work formalizes a deep learning baseline for aortic valve classification and outlines a general strategy for using weak supervision to train machine learning models using unlabeled medical images at scale.
Since manually labeling training data is slow and expensive, recent industrial and scientific research efforts have turned to weaker or noisier forms of supervision sources. However, existing weak supervision approaches fail to model multi-resolution sources for sequential data, like video, that can assign labels to individual elements or collections of elements in a sequence. A key challenge in weak supervision…
Showcasing state-of-the-art deep learning methods that identify patient outcomes from clinical notes without requiring hand-labeled training data.
Introducing SwellShark, a framework for building biomedical named entity recognition (NER) systems quickly.



