We develop comparative genomics and phylogenetic profiling concepts and approaches to analyze the genomes of thousands of species. We use them to uncover novel gene networks, predict gene function, suggest drug repositioning, and target specific genomic regions (at the nucleotide level) that enhance super-traits or reduce the risk of cancer. We aim to optimize phylogenetic profiling methods, which describe the conservation pattern of each gene across all species, and look for genes with similar patterns as possible partners in common biological pathways. We also draw on the large volume of omics data available today and integrate it using machine learning.

Comparative Genomics

We use sequence conservation patterns to assess the significance of variants in genetic diseases and cancer. In another project, we compare protein sequences across species to look for unique sequences that may render some species resistant to cancer.
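As a rough illustration of what a sequence conservation pattern can look like in practice, the sketch below scores each column of a small protein alignment by Shannon entropy. This is a generic textbook measure, not necessarily the scoring used in our projects; the function name `column_conservation`, the toy alignment, and the normalization by a 20-letter amino acid alphabet are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def column_conservation(alignment):
    """Return one conservation score per alignment column.

    Scores are 1 - (Shannon entropy / log2(20)), so 1.0 means a fully
    conserved column and lower values mean more variable columns.
    Gap characters ("-") are ignored. Assumes equal-length sequences.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)  # column of gaps only: no evidence of conservation
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((c / total) * log2(c / total) for c in counts.values())
        scores.append(1.0 - entropy / log2(20))
    return scores


# Hypothetical toy alignment of one protein region in four species.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKQA"]
scores = column_conservation(toy_alignment)
```

Under this toy scoring, a variant that falls in a column scoring close to 1.0 changes a position that is invariant across the aligned species, which is one common reason to prioritize it for follow-up.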

Phylogenetic Profiling

Phylogenetic profiling (PP) follows the conservation pattern of each gene across thousands of genomes and identifies genes with similar evolutionary patterns to infer functional interactions. We aim to use PP as a discovery tool to characterize gene function, understand biological pathways, and identify disease-causing genes.
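The core idea can be sketched in a few lines: represent each gene as a vector of conservation values across species, then rank other genes by how similar their vectors are to a query gene. The example below is a minimal sketch, assuming binary presence/absence profiles and Pearson correlation as the similarity measure; published PP methods often use continuous conservation scores and other metrics. The gene names, the toy profiles, and the helper `rank_coevolving_genes` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-evolution signal
    return cov / sqrt(var_x * var_y)


def rank_coevolving_genes(profiles, query):
    """Rank all other genes by profile similarity to the query gene."""
    query_profile = profiles[query]
    scores = [
        (gene, pearson(query_profile, profile))
        for gene, profile in profiles.items()
        if gene != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical profiles: one entry per species, 1 = ortholog present, 0 = absent.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0],  # same pattern as geneA
    "geneC": [0, 0, 1, 1, 0, 1],  # opposite pattern
    "geneD": [1, 0, 1, 0, 1, 1],  # unrelated pattern
}
ranking = rank_coevolving_genes(profiles, "geneA")
```

In this toy data, geneB ranks first because it is gained and lost in the same species as geneA, which is the kind of signal PP treats as a hint of a shared pathway.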

Over the years we have published many papers on optimizing PP, including the addition of clade-wise analysis to improve its predictive power. Recently we employed PP to look for the genetic networks of MECP2, the gene responsible for Rett syndrome, and ACE2, the receptor targeted by SARS-CoV-2, the virus that causes COVID-19.

Currently we are searching for disease-causing genes in patients with hereditary diseases, and building gene networks to identify novel biological pathways in humans and plants.

Omics (Data Integration)

An exponentially growing amount of data is freely available today. These data include genomic sequences, clinical data, mutation data, protein-protein interactions, gene and protein expression, phenotypes of different model species, disease annotations, and more.

We take advantage of these large datasets to perform functional genomics, and use machine learning to integrate the different sources and uncover new functions for uncharacterized genes.