Motivation

In high-dimensional genetic/genomic data, identifying genes related to a clinical survival trait is a challenging and important problem. In particular, right-censored survival outcomes and contaminated biomarker data make feature screening difficult. Several independence screening methods have been developed, but they fail to account for gene–gene dependency information and may be sensitive to outlying feature data.

Results

We improve the inverse probability-of-censoring weighted (IPCW) Kendall’s tau statistic by using Google’s PageRank Markov matrix to incorporate feature dependency network information. To handle outlying feature data, the nonparanormal approach, which transforms the feature data to multivariate normal variates, is used in the graphical lasso procedure to estimate the network structure of the features. Simulation studies under various scenarios show that the proposed network-adjusted weighted Kendall’s tau approach yields more accurate feature selection and survival prediction than methods that do not account for the feature dependency network or outlying feature data. Applications to clinical survival outcome data from diffuse large B-cell lymphoma patients and from The Cancer Genome Atlas lung adenocarcinoma patients clearly demonstrate the advantages of the proposed method over the alternatives.
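
As a rough illustration of two ingredients named above, the following minimal Python sketch shows (i) a rank-based normal-score transform, one simple way to map a feature toward normality so that outliers lose their leverage, and (ii) a PageRank-style Markov ("Google") matrix built from a feature network and used to smooth per-feature screening scores. This is not the authors' implementation. The function names, the damping value 0.85, the self-loops in the example network, and the single matrix-vector smoothing step are all assumptions made for illustration. The scores stand in for marginal statistics such as absolute IPCW Kendall's tau values, which are not computed here, and the network is given by hand rather than estimated by the graphical lasso.

```python
from statistics import NormalDist


def normal_scores(x):
    """Rank-based normal-score transform of one feature.

    Illustrative stand-in for a nonparanormal-style step: only the ranks
    of the values matter, so extreme outliers have no extra influence.
    Ties are broken by input order (a simplification).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    nd = NormalDist()
    # r / (n + 1) keeps the argument strictly inside (0, 1).
    return [nd.inv_cdf(r / (n + 1)) for r in ranks]


def google_matrix(adj, damping=0.85):
    """Row-stochastic PageRank Markov matrix from a 0/1 adjacency matrix.

    Each row is a mix of the edge-based transition probabilities and a
    uniform jump; rows with no edges fall back to a uniform transition.
    """
    p = len(adj)
    G = []
    for row in adj:
        deg = sum(row)
        trans = [a / deg for a in row] if deg > 0 else [1.0 / p] * p
        G.append([damping * t + (1.0 - damping) / p for t in trans])
    return G


def network_adjust(scores, adj, damping=0.85):
    """Smooth per-feature scores with one multiplication by the Google matrix.

    Hypothetical adjustment rule: each feature's new score is a weighted
    average of the scores of its network neighbours (and of itself, if
    the adjacency matrix carries self-loops).
    """
    G = google_matrix(adj, damping)
    p = len(scores)
    return [sum(G[i][j] * scores[j] for j in range(p)) for i in range(p)]


# Toy network: features 0 and 1 are connected, feature 2 is isolated.
# Self-loops on the diagonal are an illustrative choice, not a requirement.
adj = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
scores = [0.8, 0.2, 0.1]  # placeholder marginal screening statistics
adjusted = network_adjust(scores, adj)
```

In this toy setting, feature 1 has a weak marginal score but is linked to the strongly associated feature 0, so its adjusted score rises, while the isolated feature 2 stays low. This is the kind of borrowing of strength across the network that a network-adjusted screening statistic aims for.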

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)