Motivation

Identifying kinase–substrate relationships is vital to understanding phosphorylation events and various biological processes, especially signal transduction. Although a large number of phosphorylation sites have been detected, it is rarely known which kinases phosphorylate those sites. Various computational approaches have been proposed to predict kinase–substrate interactions, but their prediction accuracy still needs to be improved.

Results

In this paper, we propose a novel probabilistic model named PhosD to predict kinase–substrate relationships based on protein domains, under the assumption that kinase–substrate interactions are accomplished through kinase–domain interactions. By further taking protein–protein interactions into account, PhosD outperforms other popular approaches on several benchmark datasets, achieving higher precision. In addition, some of our predicted kinase–substrate relationships are validated by signaling pathways, indicating the predictive power of our approach. Furthermore, we notice that the more substrates are known for a given kinase, the more accurate its predicted substrates will be, and that the domains involved in kinase–substrate interactions are more conserved across proteins phosphorylated by multiple kinases. These findings can help develop more efficient computational approaches in the future.

Availability and Implementation

The data and results are available at http://comp-sysbio.org/phosd

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)