Motivation

Whole-exome sequencing (WES) is now used in clinical practice to identify the causal genes of Mendelian diseases. To reach a diagnosis, however, the patient’s clinical phenotypes [e.g. Human Phenotype Ontology (HPO) terms] are needed to prioritize the variants called from the patient’s WES data. Computational tools are therefore needed to standardize and accelerate this process.

Results

Here, we introduce PhenoPro, a tool for prioritizing the causal gene of a Mendelian disease given both the HPO terms assigned to a patient and the variants called from the patient’s WES data. PhenoPro has been benchmarked using both simulated patients and 287 real diagnosed patients of Chinese ancestry, and shows significant improvements over five previous tools. Moreover, adding an internal variant dataset of Chinese ancestry and variant data from the patients’ parents can further improve PhenoPro’s performance. To make PhenoPro a fully automated tool, we also include a natural language processing component for automated HPO term assignment from clinical reports, and demonstrate using real clinical reports that this component is as effective as manual HPO assignment. In conclusion, PhenoPro can be used as a pre-screening tool to assist in the diagnosis of Mendelian disease genes.

Availability and implementation

The web server of PhenoPro is freely available at http://app.tianlab.cn.

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)