Ramana V Davuluri



Ph.D., Professor

Department of Biomedical Informatics
Stony Brook Cancer Center
Stony Brook, NY 11794
Phone:  (631) 638-2590
Email:  Ramana.Davuluri@stonybrookmedicine.edu

Bioinformatics; Cancer Genomics; Isoform-level Gene Regulation; Epigenetics; Alternative Promoters and Alternative Splicing. 

Dr. Davuluri is a leader in bioinformatics and data science research. He has more than 20 years of experience in bioinformatics and computational genomics, a strong background in statistics and computer science, and complementary expertise in statistical pattern recognition, machine learning, genomics and bioinformatics. He has served as PI of several multi-investigator and multi-site projects. His lab currently focuses on developing machine learning algorithms and informatics solutions for problems in isoform-level gene regulation and precision medicine. The overarching goal of the lab is to translate data from high-dimensional (-omic) platforms (e.g., NextGen sequencing) into experimentally interpretable and testable discovery models that support genomics-based decision support systems. His group is also developing bioassays that can rapidly identify biomarkers from human tissue and blood samples. Towards these goals, his group combines statistically rigorous data-mining methods with high-throughput experimental procedures in a systems biology setting.
Hypothesis: The central hypothesis of his laboratory research is that the isoform-level gene products, “transcript variants” and “protein isoforms”, are the basic functional units in the mammalian cell. Accordingly, informatics platforms (ranging from basic molecular biology data management systems to biomarker and therapeutic drug target discovery for precision medicine) should adopt “gene isoform centric” rather than “gene centric” approaches.
Figure: The complexity of mammalian gene structure and its regulation (see our review paper in the journal Pharmacology and Therapeutics for further discussion).

Summary of Ongoing and Past Research Projects:

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in the genome: Understanding the hidden instructions within the genome that govern gene regulation is crucial for biological research. However, DNA contains complex language patterns, such as polysemy and distant semantic relationships, which previous methods often fail to capture, especially in data-scarce scenarios. The Davuluri group (in collaboration with Dr. Han Liu, Department of Computer Science, Northwestern University) is developing DNABERT to form a global understanding of genomic sequences based on upstream and downstream sequence contexts. Using a global contextual embedding of input sequences, DNABERT attempts to tackle the problem of sequence specificity prediction with a “top-down” approach: it first develops a general understanding of DNA language via self-supervised pre-training and then applies it to specific tasks (for example, prediction of promoters, transcription factor binding sites and splice sites). This contrasts with the traditional “bottom-up” approach, which relies on task-specific data. Various modules of DNABERT are currently under development. It is anticipated that DNABERT pre-trained on the human genome can also be readily applied to data from other organisms with strong performance.
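The self-supervised pre-training idea described above can be illustrated with a minimal sketch. It is not the DNABERT implementation: the overlapping k-mer tokenization, the `[MASK]` token, the 15% masking rate, and the function names `kmer_tokens` and `mask_tokens` are all assumptions chosen for illustration. The sketch only shows how unlabeled DNA can be turned into (masked input, hidden label) pairs that a model could learn to fill in from surrounding context.

```python
import random

MASK = "[MASK]"  # assumed placeholder token, for illustration only


def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns the masked token list and a dict {position: original token}
    holding the labels a model would be trained to recover.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    labels = {}
    for p in positions:
        labels[p] = tokens[p]
        masked[p] = MASK
    return masked, labels


# Toy sequence, not taken from any real genome.
tokens = kmer_tokens("ATGCGTACGTTAGC", k=3)
masked, labels = mask_tokens(tokens)
print(tokens)
print(masked)
print(labels)
```

Because adjacent overlapping k-mers share bases, a single masked token can be largely reconstructed from its neighbors; a practical scheme would probably need to mask contiguous runs of tokens so the task requires wider context.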

Algorithms and bioinformatics software for promoter prediction: As a postdoctoral fellow in Michael Zhang's group at Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, Davuluri developed novel algorithms and computer programs for predicting gene promoters and non-coding first exons, a highly complex problem that had remained a critical gap in gene prediction for several years. These approaches facilitated the prediction and annotation of Pol-II promoters in the human and mouse genomes.

Data-mining methods for integrative analysis of transcriptome and epigenome data: In collaboration with Dr. Tim Huang's group at the Ohio State University Comprehensive Cancer Center, Davuluri combined microarray technology with statistical modeling to predict which proteins work with estrogen to contribute to breast cancer development. The computational predictions in this study indicated that the interaction of estrogen with one of seven different partner proteins determines whether a gene is activated or suppressed in breast cancer cells. This was a noteworthy advance in data-mining methodology because it allowed integrative analysis of large datasets, often consisting of expression data, chromatin landscape data and TF-binding information for thousands of genomic loci, to predict small networks of transcription factors, which could then be validated by more traditional experimental biology techniques. This was one of the early studies to integrate chromatin immunoprecipitation (ChIP)-chip/ChIP-seq with gene expression profiling, and it laid the foundations for efficient methods for integrative analysis of multi-omics data on a genomic scale.

Isoform-level gene expression and regulation in mammalian development and cancer: Recent genome-wide studies have discovered that the majority of human genes produce multiple transcript variants/protein isoforms, which could be involved in different functional pathways. Moreover, altered expression of specific isoforms of numerous genes is linked with cancer and its prognosis, as cancer cells manipulate regulatory mechanisms to express specific isoforms that confer drug resistance and survival advantages. For example, cancer-associated alterations in alternative exons and the splicing machinery have been identified in cancer samples, suggesting that specific transcript variants could be more effective diagnostic and prognostic markers than the corresponding genes. Using integrative NextGen sequencing-based experimental approaches and bioinformatics analysis, the Davuluri group discovered that the majority of genes associated with neurological diseases express multiple transcripts through alternative promoters. The study also observed aberrant use of alternative promoters and splice variants in different cancers. Subsequently, his group demonstrated that cancer cell lines, regardless of their tissue of origin, can be effectively discriminated from non-cancer cell lines at the isoform level but not at the gene level. These informatics methods have been successfully applied by his collaborators in different cancer studies.

Platform-independent isoform-level gene signatures for stratification of cancer patients into molecular subgroups: Studies from the Davuluri group and others have observed significant expression differences between sample groups (e.g., developmental stages, cancer subtypes, normal vs. cancer) for numerous genes at the isoform level but not at the overall gene level. The Davuluri group investigated whether isoform-level transcriptome changes could provide better patient stratification in terms of overall prognosis and classification accuracy. By integrating data discretization, feature selection, and meta-classification algorithms, his group developed novel methods for deriving platform-independent gene signatures for multi-label molecular stratification of cancer patients from exon-array and RNA-seq data. The application of these algorithms has led to new methods for the diagnosis of glioblastoma and other cancers, and to the investigation of the effects of alternative splicing on drug-target gene interactions.
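To illustrate why a discretization step can help make a signature platform-independent, here is a minimal sketch. It is not the group's published method: the rank-based binning rule, the function name `discretize`, the isoform labels, and the expression values are all hypothetical. The idea shown is that replacing raw values with within-sample rank bins removes the measurement scale, so profiles measured on different platforms become directly comparable.

```python
def discretize(sample, n_bins=3):
    """Map each isoform's expression value to a bin index (0 = lowest)
    based on its rank within the sample, ignoring the measurement scale."""
    ranked = sorted(sample, key=sample.get)  # isoform ids, lowest value first
    n = len(ranked)
    return {iso: (rank * n_bins) // n for rank, iso in enumerate(ranked)}


# Toy values on two different scales (hypothetical isoform labels).
array_like = {"isoA1": 7.1, "isoA2": 9.8, "isoB1": 5.2,
              "isoB2": 11.4, "isoC1": 8.0, "isoC2": 6.3}
rnaseq_like = {"isoA1": 12.0, "isoA2": 85.5, "isoB1": 1.3,
               "isoB2": 240.0, "isoC1": 30.2, "isoC2": 4.9}

array_bins = discretize(array_like)
rnaseq_bins = discretize(rnaseq_like)
print(array_bins)
print(array_bins == rnaseq_bins)
```

In this toy case both samples rank the isoforms in the same order, so they yield identical discrete profiles even though the raw numbers differ by an order of magnitude. Feature selection and classification would then operate on the bin indices rather than on platform-specific values.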

Algorithms and bioinformatics software for analyses of NextGen sequence data: Mapping genome-wide data to human subtelomeres has been problematic because of incomplete assembly and the challenges posed by low-copy repetitive DNA elements. The Davuluri group developed novel bioinformatics pipelines that incorporate multi-read mapping to annotate the updated assemblies using short-read ChIP-seq and RNA-seq data sets. As part of other collaborative efforts, his group developed bioinformatics methods for identifying single-nucleotide polymorphisms (SNPs) that alter miRNA gene regulation and influence tumor susceptibility. Similarly, his group played a pivotal role in developing the informatics methods required for analysis of small-RNA sequence data, with the Nishikura group at The Wistar Institute, Philadelphia, PA.


Google Scholar | PubMed

Selected publications:

  1. Davuluri RV, Grosse I, Zhang MQ. Computational identification of promoters and first exons in the human genome. Nat Genet. 2001;29(4):412-7. PubMed PMID: 11726928.
  2. Sun H, Wu J, Wickramasinghe P, Pal S, Gupta R, Bhattacharyya A, Agosto-Perez FJ, Showe LC, Huang TH, Davuluri RV. Genome-wide mapping of RNA Pol-II promoter usage in mouse tissues by ChIP-seq. Nucleic Acids Res. 2011;39(1):190-201. PubMed PMID: 20843783; PMCID: 3017616.
  3. Pal S, Gupta R, Kim H, Wickramasinghe P, Baubet V, Showe LC, Dahmane N, Davuluri RV. Alternative transcription exceeds alternative splicing in generating the transcriptome diversity of cerebellar development. Genome Res. 2011;21(8):1260-72. PubMed PMID: 21712398; PMCID: 3149493.
  4. Cheng AS, Jin VX, Fan M, Smith LT, Liyanarachchi S, Yan PS, Leu YW, Chan MW, Plass C, Nephew KP, Davuluri RV, Huang TH. Combinatorial analysis of transcription factor partners reveals recruitment of c-MYC to estrogen receptor-alpha responsive promoters. Mol Cell. 2006;21(3):393-404. PubMed PMID: 16455494.
  5. Jin VX, Leu YW, Liyanarachchi S, Sun H, Fan M, Nephew KP, Huang TH, Davuluri RV. Identifying estrogen receptor alpha target genes using integrated computational genomics and chromatin immunoprecipitation microarray. Nucleic Acids Res. 2004;32(22):6627-35. PubMed PMID: 15608294; PMCID: 545447.
  6. Jin HJ, Jung S, DebRoy AR, Davuluri RV. Identification and validation of regulatory SNPs that modulate transcription factor chromatin binding and gene expression in prostate cancer. Oncotarget. 2016;7(34):54616-26. doi: 10.18632/oncotarget.10520. PubMed PMID: 27409348; PMCID: PMC5338917.
  7. Pal S, Gupta R, Davuluri RV. Alternative transcription and alternative splicing in cancer. Pharmacol Ther. 2012;136(3):283-94. PubMed PMID: 22909788.
  8. Zhang Z, Pal S, Bi Y, Tchou J, Davuluri RV. Isoform-level expression profiles provide better cancer signatures than gene-level expression profiles. Genome Med. 2013;5(4):33. PubMed PMID: 23594586.
  9. Wang G, Biswas AK, Ma W, Kandpal M, Coker C, Grandgenett PM, Jain R, Tanji K, López-Pintado S, Borczuk A, Hebert D, Jenkitkasemwong S, Knutson MD, Fukada T, Davuluri R, Sage J, Acharyya S. Metastatic cancers promote cachexia through Zip14 upregulation in skeletal muscle. Nat Med. 2018;24(6):770-781. PMCID: PMC6015555.
  10. Pal S, Bi Y, Macyszyn L, Showe LC, O'Rourke DM, Davuluri RV. Isoform-level gene signature improves prognostic stratification and accurately classifies glioblastoma subtypes. Nucleic Acids Res. 2014;42(8):e64. PubMed PMID: 24503249; PMCID: 4005667.
  11. Shilpi A, Kandpal M, Ji Y, Seagle BL, Shahabi S, Davuluri RV. Platform-independent classification system for predicting high-grade serous ovarian carcinoma molecular subtypes. JCO Clin Cancer Inform. 2019;3:1-9. PMCID: PMC5224237.
  12. Ji Y, Mishra R, Davuluri RV. In silico analysis of alternative splicing on drug target genes. Sci Rep. 2020;10(1):134. PMCID: PMC6954184.
  13. Dapas M, Kandpal M, Bi Y, Davuluri RV. Comparative evaluation of isoform-level gene expression estimation algorithms for RNA-seq and exon-array platforms. Brief Bioinform. 2017;18(2):260-9. doi: 10.1093/bib/bbw016. PubMed PMID: 26944083; PMCID: PMC5444266.
  14. Ota H, Sakurai M, Gupta R, Valente L, Wulff BE, Ariyoshi K, Iizasa H, Davuluri RV, Nishikura K. ADAR1 forms a complex with Dicer to promote microRNA processing and RNA-induced gene silencing. Cell. 2013;153(3):575-89. PubMed PMID: 23622242; PMCID: 3651894.
Dr. Davuluri’s awards include the Young Scientist Award – Merit Certificate (Statistics) from the Indian Science Congress Association, 84th annual session (1996-97), and the V Scholar Award from The V Foundation for Cancer Research. He held the Philadelphia Healthcare Trust Endowed Chair and the Tobin Kestenbaum Family Endowed Professorship while on the faculty at The Wistar Institute, Philadelphia. He is currently serving as a regular member of the Biomedical Informatics, Library and Data Science (BILDS) Review Committee, National Library of Medicine, NIH.

Davuluri has trained 20 graduate students and postdocs, and 4 junior investigators so far. These include trainees who have gone on to faculty positions in academia (Sharmistha Pal, Scientist, Dana Farber Cancer Institute, Harvard University, Boston, MA; Hao Sun, Associate Professor, The Chinese University of Hong Kong, Hong Kong; Victor Jin, Professor, Dept of Molecular Medicine, UTHSCSA, San Antonio, TX) or leadership positions in industry or academia (Yingtao Bi, Bioinformatics Director, AbbVie Inc, Boston).

Course Director and Teaching:

  1. Course Director, Advanced Bioinformatics and Genome Informatics, offered to PhD (Informatics Track) and MS (Biostatistics/Bioinformatics) students at Northwestern University, Chicago, IL.
  2. Course Director, Cancer Genetics: High throughput Technologies, offered to PhD and MS students in genomics and bioinformatics tracks at The Ohio State University, Columbus, OH.