Integrated multi-omics analysis of ovarian cancer using variational autoencoders

Muta Tah Hira, M A Razzaque, Claudio Angione, James Scrivens, Saladin Sawan, Mosharraf Sarker

Research output: Contribution to journalArticlepeer-review

265298 Downloads (Pure)


Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high dimensional multi-omics data are generally imbalanced with too many molecular features and relatively few patient samples. This imbalance makes a DL based integrated multi-omics analysis difficult. DL-based dimensionality reduction technique, including variational autoencoder (VAE), is a potential solution to balance high dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pancancer. In this work, we did an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics and tri-omics data analysis of ovarian cancer through cancer samples identification, molecular subtypes clustering and classification, and survival analysis. The results show that MMD-VAE and VAE-based compressed features can respectively classify the transcriptional subtypes of the TCGA datasets with an accuracy in the range of 93.2-95.5% and 87.1-95.7%. Also, survival analysis results show that VAE and MMD-VAE based compressed representation of omics data can be used in cancer prognosis. Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better or similar compared to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE in most omics dataset.

Original languageEnglish
Article number6265
JournalNature Scientific Reports
Issue number1
Publication statusPublished - 18 Mar 2021


Dive into the research topics of 'Integrated multi-omics analysis of ovarian cancer using variational autoencoders'. Together they form a unique fingerprint.

Cite this