Abstract
Building gene co-expression networks (GCNs) from gene expression data is an important area of bioinformatics research. RNA-seq data provide high-dimensional information that quantifies gene expression in terms of read counts for the individual exons of genes. This increase in the dimension of expression data during the transition from the microarray to the RNA-seq era has made many earlier co-expression analysis algorithms based on simple univariate correlation no longer applicable. Recently, two vector-based methods, SpliceNet and RNASeqNet, have been proposed to build GCNs. However, they fail when the sample size is smaller than the number of exons.
To overcome this dimensionality problem, we develop an algorithm called VCNet that constructs GCNs from RNA-seq data. VCNet performs a new statistical hypothesis test based on the correlation matrix of a gene–gene pair using the Frobenius norm. The asymptotic distribution of the test statistic is obtained under the null model. Simulation studies demonstrate that VCNet outperforms SpliceNet and RNASeqNet in detecting the edges of a GCN. We also apply VCNet to two expression datasets from the TCGA database, normal breast tissue and kidney tumour tissue; the results show that the GCNs constructed by VCNet contain more biologically meaningful interactions than those built by existing methods.
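As an illustration of the general idea only, the sketch below computes a Frobenius-norm statistic on the exon-by-exon cross-correlation block of a gene pair and calibrates it with a permutation test. It is not VCNet's actual statistic or its asymptotic null distribution, neither of which is specified here; the function names `cross_corr_frobenius` and `permutation_pvalue`, the use of Pearson correlation, and the permutation calibration are all assumptions made for the example. The sketch also assumes that no exon column is constant across samples.

```python
import numpy as np


def cross_corr_frobenius(x, y):
    """Squared Frobenius norm of the exon-by-exon cross-correlation block.

    x: (n_samples, p) exon-level expression for one gene (hypothetical input)
    y: (n_samples, q) exon-level expression for another gene
    Assumes every column has non-zero variance.
    """
    n = x.shape[0]
    # Standardise each exon column to zero mean and unit sample variance.
    xs = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    ys = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
    r = xs.T @ ys / (n - 1)  # p x q matrix of Pearson correlations
    return float(np.sum(r ** 2))


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for the statistic above (illustrative calibration).

    Shuffling the sample order of y breaks any dependence between the two
    genes while keeping the within-gene exon correlations intact.
    """
    rng = np.random.default_rng(seed)
    observed = cross_corr_frobenius(x, y)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(x.shape[0])
        if cross_corr_frobenius(x, y[perm]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Because the statistic involves only pairwise correlations and no matrix inversion, it remains well defined when the number of samples is smaller than the number of exons, which is the regime described above.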
VCNet is a useful tool for constructing co-expression networks.
VCNet is open source and freely available from https://github.com/wangzengmiao/VCNet under the GNU LGPL v3.
Supplementary data are available at Bioinformatics online.