Serverless OpenHealth at Data Commons Scale - Traversing the 20 Million Patient Records of New York's SPARCS dataset in real-time

BMI Grand Rounds March 28, 2018

Speaker: Jonas Almeida, PhD, Professor and Chief Technology Officer, Graduate Program Director, Department of Biomedical Informatics

Title: Serverless OpenHealth at Data Commons Scale - Traversing the 20 Million Patient Records of New York's SPARCS dataset in real-time

Time: Wednesday, March 28, 2018, 3:00 pm - 4:00 pm

Location: Health Science Center – Level 3 Classroom 152

Abstract: The serverless OpenHealth approach to the Web as a Global Compute space brings data-intensive computation to consumer-facing platforms with no need for download or installation. This approach is validated with an accompanying interactive web application (bit.ly/loadsparcs) capable of real-time traversal of the 20 million patient records in New York’s Statewide Planning and Research Cooperative System (SPARCS). The application puts the data in the hands of those who use it through an interactive graphical display that includes mapping tools and the ability to export results to Plotly for hands-on data visualization and to the Google cloud, where artificial intelligence can be applied for immediate data analysis, all on a cell phone. The approach relies on the modern browser full stack and, in particular, on its configuration for application assembly by code injection. The opportunity, and need, to expand this approach has since increased markedly, reflecting wider adoption of Open Data policies by public health agencies. Here, we describe how the serverless scaling challenge can be met by an isomorphic mapping between the remote data layer API and a local (client-side, in-browser) operator. This strengthens the argument that the FAIR reproducibility needed for Population Science applications in the age of P4 Medicine is particularly well served by the Web platform.