Motivation

Cells are deemed the basic unit of life. However, many important functions of cells, as well as their growth and reproduction, are performed by protein molecules located in their different organelles or locations. Facing the explosive growth of protein sequences, we are challenged to develop fast and effective methods to annotate their subcellular localization. This is by no means an easy task. In particular, mounting evidence indicates that proteins can have a multi-label feature, meaning that they may simultaneously exist at, or move between, two or more different subcellular location sites. Unfortunately, most existing computational methods can only deal with single-label proteins. Although the recently developed ‘iLoc-Animal’ predictor is quite powerful and can also deal with animal proteins that have multiple locations, its prediction quality needs to be improved, particularly by enhancing the absolute true rate and reducing the absolute false rate.

Results

Here we propose a new predictor called ‘pLoc-mAnimal’, which is superior to iLoc-Animal. When tested by the most rigorous cross-validation on the same high-quality benchmark dataset, the absolute true rate achieved by the new predictor is 37% higher, and the absolute false rate is four times lower, in comparison with the state-of-the-art predictor.
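
The two rates compared above are not defined in this abstract. As a minimal sketch, assuming the definitions commonly used for multi-label predictors (absolute true as the fraction of proteins whose predicted location set exactly matches the true set, and absolute false as the number of wrongly assigned or missed labels averaged over all proteins and all candidate locations), they could be computed as follows. The function names, location labels, and toy data are hypothetical and are not taken from pLoc-mAnimal or its benchmark dataset.

```python
from typing import List, Set


def absolute_true(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Fraction of proteins whose predicted label set equals the true set exactly."""
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("need two non-empty lists of equal length")
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact / len(true_sets)


def absolute_false(
    true_sets: List[Set[str]], pred_sets: List[Set[str]], num_labels: int
) -> float:
    """Mismatched labels per protein, averaged and normalised by the number of labels.

    The symmetric difference t ^ p counts labels that are in exactly one of the
    two sets, i.e. |union| - |intersection|.
    """
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("need two non-empty lists of equal length")
    mismatches = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return mismatches / (len(true_sets) * num_labels)


# Toy example with made-up labels (assumed: 4 candidate locations in total).
true_locations = [{"nucleus"}, {"cytoplasm", "nucleus"}, {"mitochondrion"}]
pred_locations = [{"nucleus"}, {"cytoplasm"}, {"nucleus"}]

print(absolute_true(true_locations, pred_locations))      # one exact match out of three
print(absolute_false(true_locations, pred_locations, 4))  # 3 mismatches / (3 * 4)
```

Under these assumed definitions, a higher absolute true rate and a lower absolute false rate both indicate better multi-label prediction, which is the direction of improvement reported above.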

Availability and implementation

For the convenience of most experimental scientists, a user-friendly web server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mAnimal/, with which users can easily obtain their desired results without needing to go through the complicated mathematics involved.

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)