Joel Saltz

 

Distinguished Professor and Founding Chair, Department of Biomedical Informatics
Endowed Cherith Professor
Department of Biomedical Informatics
HSC L3-043
Stony Brook, NY 11794
 
INTERESTS
 
Artificial Intelligence, Digital Pathology, Biomedical Informatics
 
 
BIOGRAPHY
 
Dr. Saltz has led numerous cross‑disciplinary initiatives spanning biomedicine, engineering, and computer science and has founded and chaired three highly successful Biomedical Informatics departments at Ohio State, Emory and now Stony Brook. The Stony Brook Department of Biomedical Informatics is dually housed in the School of Medicine and Engineering with a strong focus on artificial intelligence methods. Dr. Saltz also founded the Clinical Informatics group at Stony Brook; this group offers ACGME Clinical Informatics subspecialty fellowship positions.
 
Dr. Joel Saltz is a leader in research on advanced information technologies for large-scale data science and biomedical/scientific research. He has developed innovative pathology informatics methods, including the first published whole slide virtual microscope system, pioneering pathology computer-aided diagnosis techniques, and methods for decomposing pathology images into features and linking those features to cancer “omics”, response to treatment and outcome. He has broken new ground in big data through development of the filter-stream based DataCutter system, the map-reduce style Active Data Repository and the inspector-executor runtime compiler framework. He has also been an active contributor in clinical informatics, having developed predictive models for hospital readmissions, point-of-care laboratory testing quality assurance systems, decision support systems for electrophoresis interpretation and graphical user interfaces to support clinical data warehouse queries. Dr. Saltz is trained both as a computer scientist and as a physician through the Medical Scientist Training Program (MSTP) at Duke University. He has deep experience in computer science, having served on the computer science faculties at Yale University and the University of Maryland. He completed his residency in clinical pathology at Johns Hopkins University and is a board-certified clinical pathologist with a subspecialty certification in Clinical Informatics.
 
RESEARCH
 

Dr. Saltz is a pioneer in developing Digital Pathology tools, methods and algorithms with the ultimate goal of extracting and leveraging digitized Pathology information to better predict cancer outcome and to steer cancer therapy. He is also an expert in high-end computing and has developed a variety of highly cited systems software methods.

 

His research in Pathology spans twenty years and consists of closely coordinated efforts in image analysis, machine learning, database design and high-end computing. He has developed tools and methods through years of funded projects supported by a wide range of institutes and agencies including NCI, NLM, NIBIB, NSF, DARPA, AFOSR, NASA, DOD and DOE. His seminal work in digital imaging laid the foundation for digital pathology as it is today. He was the first to develop the “Virtual Microscope” and pioneered developments in digital pathology whole slide image navigation, data management and computer-aided classification.

 

Dr. Saltz’s initial efforts included developing the first whole slide image viewer and devising efficient methods for managing, caching and supporting analytics on whole slide datasets. This work became the foundation of the new field of Pathology Imaging Informatics. Today, investigators he mentored can be found carrying out exciting research in institutions across the country.

 

Over the years, Dr. Saltz has developed a rich set of Pathology informatics tools, methods and algorithms. Dr. Saltz's team applies generative AI models to digital pathology, developing systems that can create realistic microscopic tissue images. Their diffusion models can generate pathology images from text descriptions and synthesize large-scale tissue samples without requiring time-consuming manual annotations. The team's ZoomLDM system can create gigapixel-sized images that maintain both microscopic detail and overall tissue structure across different magnification levels—similar to how a pathologist would zoom in and out when examining slides. These AI models have shown utility for disease diagnosis, with their learned features performing better than existing methods in detecting breast cancer and genetic mutations. Through their Gen-SIS framework, the team has demonstrated how synthetic images can enhance AI training without additional human labeling, which could help develop diagnostic tools while reducing the burden on medical professionals.

 

The team has developed a variety of methods for interpretable AI. SI-MIL (Self-Interpretable MIL) integrates handcrafted pathological features into a linear prediction branch for Multiple Instance Learning, enabling interpretable predictions through human-understandable descriptors like tumor cellularity and necrosis rather than opaque attention maps. GECKO (Gigapixel Vision-Concept Contrastive Pretraining) aligns whole slide images with pathology concept priors through contrastive learning, producing concept-aware embeddings that allow pathologists to inspect predictions via concept activation maps. HIPPO (Histopathology Interpretability via Prototypes and Perturbations of WSI) leverages prototypical learning and counterfactual explanations to enable visual interpretation of model decisions through human-understandable image patches and perturbation-based feature importance. These frameworks demonstrate that computational pathology models can achieve strong performance while maintaining clinical interpretability through concept-based reasoning and visual explanations.

 

Members of the Saltz team are carrying out research with the ultimate aim of creating an autonomous pathology system that would function in a manner analogous to a self-driving car. Attention heatmap predictions would serve as perception models that identify diagnostic regions (like detecting road features), while pathologist scanpath prediction would act as navigation planning that determines optimal viewing sequences through the whole slide image (WSI). The system would maintain dynamic working memory of examined areas and automatically adjust magnification levels based on tissue complexity, similar to how autonomous vehicles track surroundings and adapt speed to conditions. The groundwork for this project has been laid in a set of carefully controlled experiments involving development of models to predict pathologist attention and scanpath during prostate cancer classification.


He and his group have developed a rich set of pipelines to compute a variety of biologically significant Pathology features, including spatial maps of tumor infiltrating lymphocytes (TILs). The quantification of TILs is well documented to have prognostic value in many contexts; understanding patient immune response to tumors is increasingly important with the advance of cancer immunotherapy. For most cancer types, Pathologists do not comment on TILs in their reports, and even in cancer types where TIL comments are sometimes made, they use only high level terms such as “brisk” or “sparse”. Dr. Saltz's methods precisely quantify the percentage of tumor infiltrating lymphocytes and generate detailed TIL maps. He has used these methods to generate publicly available whole slide TIL maps and spatial statistics for 13 cancer types and 5,000 subjects; these are available on The Cancer Imaging Archive. His team is now extending and leveraging these methods in many other studies. His MSTP student, Jakub Kaczmarzyk, has developed WSInfer, QuPath-linked software that efficiently computes patch-based predictions, including tumor/TIL maps.

 

TIL Map
An H&E image, its TIL map, and the diagram of clusters of TIL locations for spatial pattern analysis [3].
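
As a rough illustration of how a percent-TIL figure can be derived from a patch-level TIL map, the sketch below assumes a hypothetical grid in which each entry is 1 (TIL-positive tissue patch), 0 (TIL-negative tissue patch) or None (background). The function name, the grid encoding and the example values are assumptions made for illustration only; this is not the group's pipeline, which relies on trained deep learning classifiers and richer spatial statistics.

```python
from typing import Optional, Sequence


def percent_til(til_map: Sequence[Sequence[Optional[int]]]) -> float:
    """Return the percentage of tissue patches labelled TIL-positive.

    Hypothetical encoding: 1 = TIL-positive, 0 = TIL-negative,
    None = background (ignored in the denominator).
    """
    tissue = 0
    positive = 0
    for row in til_map:
        for label in row:
            if label is None:  # skip background patches
                continue
            tissue += 1
            positive += label
    if tissue == 0:
        raise ValueError("TIL map contains no tissue patches")
    return 100.0 * positive / tissue


# Made-up 3 x 4 patch grid, for illustration only.
example_map = [
    [None, 0, 1, 1],
    [0, 0, 1, None],
    [0, 1, 0, 0],
]
til_percentage = percent_til(example_map)
print(f"Percent TIL: {til_percentage:.1f}%")
```

A single percentage discards the spatial arrangement of the TIL-positive patches, which is why cluster-level pattern analysis of the kind shown in the figure is reported alongside it.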

 

Dr. Saltz and collaborators have also developed a variety of machine learning methods for the analysis of whole slide images. These include methods that target Neuroblastoma classification as well as methods that leverage coarse-grained whole slide training data by combining patch-level convolutional neural networks with supervised decision fusion. This work achieved state-of-the-art results in predicting the subtypes of brain and lung tumors.


Finally, Dr. Saltz made foundational contributions to data science through development of innovative methods for moving computation to data, development of a very early prototype map/reduce type framework, foundational methods for runtime compiler analysis of adaptive applications and analyses of key data structures employed in the management of spatial data (Hilbert Space-Filling Curves). Tools derived from these methods were awarded the 2024 VLDB Test of Time Award (https://www.vldb.org/awards_10year.html).
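
For readers unfamiliar with Hilbert space-filling curves, the sketch below shows the standard textbook conversion from a 2D grid cell to its position along the curve. It is a generic illustration, not code from the cited publications; the function name and the `order` parameter are assumptions introduced here. The property that makes the curve useful for spatial data management is that cells adjacent along the curve are always adjacent in the grid, so a one-dimensional ordering preserves much of the two-dimensional locality.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Map cell (x, y) of a 2**order by 2**order grid to its Hilbert index."""
    n = 1 << order  # side length of the grid
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("coordinates out of range for this order")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Visit every cell of a 4 x 4 grid in Hilbert order.
order = 2
side = 1 << order
cells = [(x, y) for x in range(side) for y in range(side)]
curve = sorted(cells, key=lambda c: hilbert_index(order, c[0], c[1]))
print(curve)
```

Sorting spatial records by such an index is one common way to lay them out on disk so that nearby objects tend to land in nearby blocks.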


Joel Saltz received his MD and Computer Science PhD from the Medical Scientist Training Program at Duke University. He was trained in Clinical Pathology at Johns Hopkins Medical School and served on the Johns Hopkins faculty as Professor and Director of Pathology Informatics. He has launched new Departments of Biomedical Informatics at Stony Brook, Emory and Ohio State and has also served on the Computer Science faculty at Yale University and the University of Maryland College Park.


Joel Saltz, Dimitris Samaras, Tahsin Kurc, Chao Chen, Prateek Prasanna, Raj Gupta, Gregory Zelinsky and Fusheng Wang form the Digital Pathology group at Stony Brook. This closely integrated research group targets development of AI and machine learning algorithms, numerical methods, software tools and database architectures for digital Pathology and multi-scale tissue analysis.

 
SELECTED REFERENCES:

2025

1. Kapse, S., Pati, P., Yellapragada, S., Das, S., Gupta, R. R., Saltz, J., Samaras, D., & Prasanna, P. (2025). Gecko: Gigapixel vision-concept contrastive pretraining in histopathology. arXiv preprint arXiv:2504.01009.

2. Yellapragada, S., Graikos, A., Triaridis, K., Prasanna, P., Gupta, R., Saltz, J., & Samaras, D. (2025). ZoomLDM: Latent Diffusion Model for multi-scale image generation. In Proceedings of the Computer Vision and Pattern Recognition Conference (pp. 23453–23463).

3. Zelinsky, G., Chakraborty, S., Saltz, J., & Samaras, D. (2025). Predicting pathologist attention during cancer-image readings. Journal of Vision, 25(9), 2736.

2024
4. Kapse, S., Das, S., Zhang, J., Gupta, R. R., Saltz, J., Samaras, D., & Prasanna, P. (2024). Attention de-sparsification matters: Inducing diversity in digital pathology representation learning. Medical Image Analysis, 93, 103070. https://doi.org/10.1016/j.media.2023.103070

5. Kaczmarzyk, J. R., O'Callaghan, A., Inglis, F., Gat, S., Kurc, T., Gupta, R., Bremer, E., Bankhead, P., & Saltz, J. H. (2024). Open and reusable deep learning for pathology with WSInfer and QuPath. npj Precision Oncology, 8(1). https://doi.org/10.1038/s41698-024-00499-9

6. Kapse, S., Pati, P., Das, S., Zhang, J., Chen, C., Vakalopoulou, M., Saltz, J., Samaras, D., Gupta, R. R., & Prasanna, P. (2024). Si-MIL: Taming deep MIL for self-interpretability in gigapixel histopathology. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11226–11237).

7. Kaczmarzyk, J. R., Saltz, J. H., & Koo, P. K. (2024). Explainable AI for computational pathology identifies model limitations and tissue biomarkers. arXiv preprint arXiv:2409.

8. Graikos, A., Yellapragada, S., Le, M. Q., Kapse, S., Prasanna, P., Saltz, J., & Samaras, D. (2024). Learned representation-guided diffusion models for large-image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8532–8542).

9. Belagali, V., Yellapragada, S., Graikos, A., Kapse, S., Li, Z., Nandi, T. N., Madduri, R. K., Prasanna, P., Saltz, J., & Samaras, D. (2024). Gen-SIS: Generative self-augmentation improves self-supervised learning. arXiv preprint arXiv:2412.01672.

2023
10. Yellapragada, S., Graikos, A., Prasanna, P., Kurc, T., Saltz, J., & Samaras, D. (2023). PathLDM: Text conditioned latent diffusion model for histopathology. https://doi.org/10.48550/arXiv.2309.00748

11. Abousamra, S., Gupta, R., Kurc, T., Samaras, D., Saltz, J., & Chen, C. (2023). Topology-guided multi-class cell context generation for digital pathology. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr52729.2023.00324

2021
12. Abousamra, S., Belinsky, D., Van Arnam, J., Allard, F., Yee, E., Gupta, R., Kurc, T., Samaras, D., Saltz, J., & Chen, C. (2021). Multi-class cell detection using spatial context representation. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). https://doi.org/10.1109/iccv48922.2021.00397

2019
13. Hou, L., Agarwal, A., Samaras, D., Kurc, T. M., Gupta, R. R., & Saltz, J. H. (2019). Robust histopathology image analysis: To label or to synthesize? In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2019.00873

2018
14. Saltz, J., Gupta, R., Hou, L., Kurc, T., Singh, P., Nguyen, V., Samaras, D., Shroyer, K. R., Zhao, T., Batiste, R., & Van Arnam, J. (2018). Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Reports, 23(1), 181. https://doi.org/10.1016/j.celrep.2018.03.086

2017
15. Saltz, J., Sharma, A., Iyer, G., Bremer, E., Wang, F., Jasniewski, A., DiPrima, T., Almeida, J. S., Gao, Y., Zhao, T., Saltz, M., & Kurc, T. (2017). A containerized software system for generation, management and exploration of features from whole slide tissue images. Cancer Research, 77(21). PMID: 29092946

2016
16. Hou, L., Samaras, D., Kurc, T., Gao, Y., Davis, J., & Saltz, J. (2016). Patch-based convolutional neural network for whole slide tissue image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV.

2013
17. Kong, J., et al. (2013). Machine-based morphologic analysis of glioblastoma using whole-slide pathology images uncovers clinically relevant molecular correlates. PLoS One, 8, e81049. https://doi.org/10.1371/journal.pone.0081049

18. Aji, A., Wang, F., Vo, H., Lee, R., Liu, Q., Zhang, X., & Saltz, J. (2013). Hadoop-GIS: A high performance spatial data warehousing system over MapReduce. In Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, 6(11), 1009.

2012
19. Cooper, L. A., Kong, J., Gutman, D. A., Wang, F., Gao, J., Appin, C., Cholleti, S., Pan, T., Sharma, A., Scarpace, L., Mikkelsen, T., Kurc, T., Moreno, C. S., Brat, D. J., & Saltz, J. H. (2012). Integrated morphologic analysis for the identification and characterization of disease subtypes. Journal of the American Medical Informatics Association, 19, 317–323. https://doi.org/10.1136/amiajnl-2011-000700

2010

20. Cooper, L. A., et al. (2010). An integrative approach for in silico glioma research. IEEE Transactions on Biomedical Engineering, 57, 2617–2621. https://doi.org/10.1109/tbme.2010.2060338

2001
21. Moon, B., Jagadish, H. V., Faloutsos, C., & Saltz, J. H. (2001). Analysis of the clustering properties of the Hilbert space-filling curve. IEEE Transactions on Knowledge and Data Engineering, 13, 124–141.

22. Kurc, T., Çatalyürek, Ü., Chang, C., Sussman, A., & Saltz, J. (2001). Visualization of large data sets with the active data repository. IEEE Computer Graphics and Applications, 21, 24–33.

23. Beynon, M. D., et al. (2001). Distributed processing of very large datasets with DataCutter. Parallel Computing, 27, 1457–1478.

1991
24. Saltz, J. H., Mirchandaney, R., & Crowley, K. (1991). Run-time parallelization and scheduling of loops. IEEE Transactions on Computers, 40, 603–612.

 
AWARDS AND ACTIVITIES
 
Dr. Saltz is a fellow of the American College of Medical Informatics, holds the Cherith Chair of Biomedical Informatics, is a recent winner of the Very Large Data Bases (VLDB) Test of Time Award (2024), and co-leads the Institute for Engineering Driven Medicine (IEDM). He has participated in over 70 grants and contracts, serving as principal investigator on roughly half of those, and has an extensive publication track record with over 39,000 citations.
 
PUBLICATIONS

Scholar  |   NCBI   |   DBLP  |   PubMed

 
TEACHING SUMMARY
 
Stony Brook Medicine: CS 595 Topics in Computer Science: Data Analytics Software - Stacks
At Emory, Georgia Tech, Ohio State and the University of Maryland: graduate seminars in Data Analytics, graduate seminars in High Performance Computing, Biomedical Informatics I and II, senior undergraduate database course, senior undergraduate operating systems, senior undergraduate computer architectures