The Stanford Medicine data science ecosystem for clinical and translational research
Abstract
Research patient data repositories are essential for health systems to learn from the experiences of their patients and for advancing the mission of academic medical centers. In this paper, we describe the methods, tools, and practices used at Stanford Medicine to maintain its research patient data repository and computing resources for clinical and translational research, which together comprise the Stanford Medicine Data Science Resources (SDSR). The SDSR includes computing infrastructure and tools to create, search, retrieve, and analyze patient data. Data are made available on secure computers through self-service and staff-supported access. The Stanford Medicine Research Data Repository functions as the SDSR data integration point and includes patient records such as clinical images, text, bedside monitoring data, and administrative records. SDSR tools include a search engine for patient data and data analysis tools for identifying and retrieving data about groups of patients with shared characteristics, such as a diagnosis or treatment. The SDSR also supports patient data collection, reproducible research, and teaching using healthcare data, and facilitates industry collaborations and observational studies. Challenges to maintaining the SDSR include ensuring sufficient financial support while providing researchers and clinicians with maximal access to data and digital infrastructure, balancing tool development with user training, and supporting the diverse needs of users.