Motivation

Clustering analysis in a biological network is to group biological entities into functional modules, thus providing valuable insight into the understanding of complex biological systems. Existing clustering techniques make use of lower-order connectivity patterns at the level of individual biological entities and their connections, but few of them can take into account of higher-order connectivity patterns at the level of small network motifs.

Results

Here, we present a novel clustering framework, namely HiSCF, to identify functional modules based on the higher-order structure information available in a biological network. Taking advantage of higher-order Markov stochastic process, HiSCF is able to perform the clustering analysis by exploiting a variety of network motifs. When compared with several state-of-the-art clustering models, HiSCF yields the best performance for two practical clustering applications, i.e. protein complex identification and gene co-expression module detection, in terms of accuracy. The promising performance of HiSCF demonstrates that the consideration of higher-order network motifs gains new insight into the analysis of biological networks, such as the identification of overlapping protein complexes and the inference of new signaling pathways, and also reveals the rich higher-order organizational structures presented in biological networks.

Availability and implementation

HiSCF is available at https://github.com/allenv5/HiSCF.

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)