Translational Research Informatics and Bioinformatics

Active Projects

High Performance Computing and Databases

Research studies benefit from large datasets, which are needed to obtain robust, statistically significant results, but the scale of an image-based study is often limited by how efficiently image datasets can be processed through image segmentation, feature computation, and classification pipelines. Modern HPC systems provide significant processing power through clusters of hybrid computation nodes with multi-core CPUs and multiple graphics processing units (GPUs), as well as large memory capacity that is distributed across computation nodes or accessible via shared-memory mechanisms. Nevertheless, implementing analysis applications on HPC systems is not an easy task because of the heterogeneity, complexity, and scale of contemporary systems. We research and develop methods and runtime middleware systems that carry out high-throughput processing of large numbers of images through coordinated use of multi-core CPUs and GPUs on computing clusters.

Algorithm evaluation provides a means to characterize variability across image analysis algorithms, validate algorithms by comparing multiple results, and facilitate algorithm sensitivity studies. Sensitivity quantification in algorithm evaluation is a data-intensive and computationally expensive process: it involves processing datasets with variations of an analysis pipeline, comparing the results from these variations, and quantifying agreements and disagreements between them. Analysis parameter tuning is another important process, in which the parameter space of an analysis workflow is searched by comparing analysis results with ground truth to find the set of parameters that produces results closest to the ground truth with respect to some comparison metric. The sizes of images and analysis results in pathology image analysis pose significant challenges for both processes. We develop methods and an integrated software framework that address the processing and data management challenges of sensitivity quantification and parameter tuning by carefully distributing and coordinating operations and data across multiple machines, multi-core CPUs, and co-processors, and by reducing data movement and computation costs.
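The parameter tuning process described above can be illustrated with a minimal sketch. It is not the framework's implementation: the `segment` function is a hypothetical stand-in for an analysis pipeline, the parameter names `threshold` and `border` are invented for illustration, the Dice coefficient is assumed as the comparison metric, and an exhaustive grid search stands in for whatever search strategy is actually used.

```python
from itertools import product


def segment(image, threshold, border):
    """Hypothetical stand-in for an analysis pipeline.

    Foreground is every pixel with intensity >= threshold, ignoring
    `border` rows and columns at each edge of the image.
    """
    height, width = len(image), len(image[0])
    return {
        (r, c)
        for r in range(border, height - border)
        for c in range(border, width - border)
        if image[r][c] >= threshold
    }


def dice(result, truth):
    """Dice coefficient between two sets of foreground pixel coordinates."""
    if not result and not truth:
        return 1.0
    return 2 * len(result & truth) / (len(result) + len(truth))


def tune(image, truth, grid):
    """Search every parameter combination and keep the one closest to truth."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = dice(segment(image, **params), truth)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy 5x5 image: a bright 2x2 object plus one bright noise pixel in a corner.
image = [
    [10, 12, 11, 10, 99],
    [11, 80, 90, 12, 10],
    [10, 85, 95, 11, 12],
    [12, 10, 11, 10, 11],
    [10, 11, 12, 10, 10],
]
truth = {(1, 1), (1, 2), (2, 1), (2, 2)}
grid = {"threshold": [5, 50, 92], "border": [0, 1]}

best_params, best_score = tune(image, truth, grid)
print(best_params, best_score)
```

Each parameter combination is evaluated independently of the others, which is why this kind of search lends itself to distribution across machines and processors. On pathology-scale images, the cost of each evaluation and of moving the results is what dominates.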

Past Projects

In Silico Brain Tumor Research Center - funded by the National Cancer Institute as one of the In Silico Research Centers of Excellence, this center explores new ideas in brain tumor translational research through multi-scale, integrative in silico experiments. The experiments make use of molecular data, pathology image data, radiology data, and clinical outcome data. The experiments involve execution of pipelines of image processing operations on large image datasets, execution of multiple bioinformatics analysis methods, and comparison and correlation of data from multiple data types and sources. The center is a collaborative effort between Stony Brook University, Emory University, Henry Ford Hospital, and Thomas Jefferson University.

Related Publications

  • A. Post, T. Kurc, S. Cholleti, J. Gao, X. Lin, W. Bornstein, D. Cantrell, D. Levine, S. Hohmann, J. Saltz: The Analytic Information Warehouse (AIW): A platform for analytics using electronic health record data, Journal of Biomedical Informatics, 46(3), pp. 410-424, 2013. [paper]
  • A. Post, T. Kurc, R. Willard, H. Rathod, M. Mansour, A. Pai, W. Torian, S. Agravat, S. Sturm, J. Saltz, “Temporal Abstraction-based Clinical Phenotyping with Eureka!”, accepted for presentation and publication at the AMIA 2013 Annual Symposium, 2013.
  • Post A, Kurc T, Overcash M, Cantrell D, Morris T, Eckerson K, Tsui C, Willey T, Quyyumi A, Eapen D, Umpierrez G, Ziemer D, Saltz J. A Temporal Abstraction-based Extract, Transform and Load Process for Creating Registry Databases for Research. AMIA Joint Clinical Research Informatics and Translational Bioinformatics Summit; San Francisco, 2011. [paper]
  • Winslow, R. L., Saltz, J., Foster, I., Carr, J. J., Ge, Y., Miller, M. I., Younes, L., Geman, D., Graniote, S., Kurc, T., Madduri, R., Ratnanather, T., Larkin, J., Ardekani, S., Brown, T., Kolasny, A., Reynolds, K., Shipway, M., Toerper, M. (2011) The CardioVascular Research Grid (CVRG) Project, Proceedings of the AMIA Summit on Translational Bioinformatics, 2011, pp. 77-81. [paper]
  • Post A, Kurc T, Butler J, Saltz J. Architecture of an Analytic Information Warehouse for Discovering Risk Factor Models of Disease in Quality Improvement and Research. American Medical Informatics Association (AMIA) Summit on Translational Bioinformatics. San Francisco, CA; 2010.
  • T. Kurc, S. Hastings, V.S. Kumar, S. Langella, A. Sharma, T. Pan, S. Oster, D. Ervin, J. Permar, S. Narayanan, Y. Gil, E. Deelman, M. Hall and J. Saltz: HPC and Grid Computing for Integrative Biomedical Research. International Journal of High Performance Computing Applications, Special Issue, the Workshop on Clusters and Computational Grids for Scientific Computing, Vol. 23(3), pp. 252-264, 2009. [paper]
  • S. Langella, S. Hastings, S. Oster, T. Pan, A. Sharma, J. Permar, D. Ervin, B. Cambazoglu, T. Kurc, and J. Saltz, “Sharing Data and Analytical Resources Securely in a Biomedical Research Grid Environment”, Journal of the American Medical Informatics Association, Vol. 15(3), pp. 363-373, 2008. [paper]
  • S. Oster, S. Langella, S. L. Hastings, D. W. Ervin, R. Madduri, J. Phillips, T. Kurc, F. Siebenlist, P. A. Covitz, K. Shanbhag, I. Foster, J. H. Saltz, “caGrid 1.0: An Enterprise Grid Infrastructure for Biomedical Research”, Journal of the American Medical Informatics Association (JAMIA), Vol. 15, pp. 138-149, 2008. [paper]
  • S. Langella, S. Oster, S. Hastings, F. Siebenlist, J. Phillips, D. Ervin, J. Permar, T. Kurc, J. Saltz, The Cancer Biomedical Informatics Grid (caBIG) Security Infrastructure, The American Medical Informatics Association (AMIA) Symposium, November 2007. [paper]
  • S. Oster, S. Langella, S. Hastings, E. David, R. Madduri, T. Kurc, F. Siebenlist, I. Foster, K. Shanbhag, P. Covitz, J. Saltz, caGrid 1.0: A Grid Enterprise Architecture for Cancer Research, The American Medical Informatics Association (AMIA) Symposium, November 2007. [paper]