Motivation

Cancer is a molecularly complex and heterogeneous disease. Each type of cancer is usually composed of several subtypes with different treatment responses and clinical outcomes. Therefore, subtyping is a crucial step in cancer diagnosis and therapy. Rapid advances in high-throughput sequencing technologies provide an increasing amount of multi-omics data, which benefits our understanding of the genetic architecture of cancer, yet poses new challenges for multi-omics data integration.

Results

We propose a graph convolutional network model, called MRGCN, for integrative representation of multi-omics data. MRGCN simultaneously encodes and reconstructs multiple omics expression and similarity relationships into a shared latent embedding space. In addition, MRGCN adopts an indicator matrix to record which values are missing in partial omics, so that the processing of full and partial multi-omics data is combined in a unified framework. Experimental results on 11 multi-omics datasets show that the cancer subtypes obtained by MRGCN are superior to those of many typical integrative methods in terms of enriched clinical parameters and log-rank test P-values in survival analysis.
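
As a rough illustration of how an indicator matrix can fold full and partial multi-omics data into one objective, the sketch below computes a reconstruction error that skips every (sample, omics) pair marked as missing. It is a minimal sketch under stated assumptions, not the MRGCN loss: the function name `masked_reconstruction_loss`, the squared-error form, and the toy data are all hypothetical, and the graph convolutional encoder and similarity reconstruction are omitted.

```python
from typing import List

Matrix = List[List[float]]  # rows = samples, columns = features


def masked_reconstruction_loss(
    views: List[Matrix],
    reconstructions: List[Matrix],
    indicator: List[List[int]],
) -> float:
    """Sum of squared reconstruction errors over observed (sample, omics) pairs.

    indicator[v][i] is 1 when sample i was measured in omics v, else 0.
    (Hypothetical helper for illustration; not taken from the MRGCN code.)
    """
    total = 0.0
    for v, (x, x_hat) in enumerate(zip(views, reconstructions)):
        for i, (row, row_hat) in enumerate(zip(x, x_hat)):
            if indicator[v][i] == 0:
                continue  # sample i is missing in omics v: it contributes nothing
            total += sum((a - b) ** 2 for a, b in zip(row, row_hat))
    return total


# Toy data (assumed): 3 samples, 2 omics; the third sample lacks the second omics.
views = [
    [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]],
    [[0.5], [1.5], [0.0]],  # last row is only a placeholder for the missing sample
]
recon = [
    [[1.0, 1.0], [0.0, 1.0], [2.0, 1.0]],
    [[0.5], [0.5], [9.0]],
]
indicator = [[1, 1, 1], [1, 1, 0]]

loss = masked_reconstruction_loss(views, recon, indicator)
full_loss = masked_reconstruction_loss(views, recon, [[1, 1, 1], [1, 1, 1]])
```

With an all-ones indicator the same function reduces to the ordinary full-data error, which is the sense in which one formulation can cover both the complete and the partial case.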

Availability and implementation

https://github.com/Polytech-bioinf/MRGCN.git and https://figshare.com/articles/software/MRGCN/23058503.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.