Motivation

The efficiency of CRISPR/Cas9-mediated protein knockout is determined by three factors: sequence-specific sgRNA activity, frameshift probability and the characteristics of targeted amino acids. A number of computational methods have been developed for predicting sgRNA efficiency from different perspectives. However, an integrative method that combines all three factors for rational sgRNA selection is still lacking.

Results

We developed GuidePro, a two-layer ensemble predictor that integrates multiple factors to prioritize sgRNAs for protein knockouts. Tested on independent datasets, GuidePro outperforms existing methods and shows consistently superior performance in predicting phenotypes caused by protein loss-of-function, suggesting that it is robust for prioritizing sgRNAs across various applications of CRISPR/Cas9 knockouts.
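
To make the idea of a two-layer ensemble concrete, the following is a minimal sketch and not GuidePro's actual model. It assumes that a first layer has already produced one score in [0, 1] for each of the three factors named above, and that a second layer combines them with a logistic function. The class `GuideFeatures`, the functions `combined_score` and `rank_guides`, and the weights are all hypothetical and chosen only for illustration; a real predictor would learn its parameters from screening data.

```python
import math
from dataclasses import dataclass


@dataclass
class GuideFeatures:
    """Hypothetical first-layer outputs for one sgRNA, each in [0, 1]."""
    sgrna_activity: float   # sequence-specific cutting activity score
    frameshift_prob: float  # probability that the edit causes a frameshift
    residue_score: float    # score for the targeted amino acids


# Hypothetical second-layer parameters (illustrative, not fitted to data).
WEIGHTS = (2.0, 1.5, 1.0)
BIAS = -2.25


def combined_score(g: GuideFeatures) -> float:
    """Second layer: logistic combination of the three first-layer scores."""
    z = (BIAS
         + WEIGHTS[0] * g.sgrna_activity
         + WEIGHTS[1] * g.frameshift_prob
         + WEIGHTS[2] * g.residue_score)
    return 1.0 / (1.0 + math.exp(-z))


def rank_guides(guides: dict[str, GuideFeatures]) -> list[str]:
    """Return guide names ordered from highest to lowest combined score."""
    return sorted(guides, key=lambda name: combined_score(guides[name]),
                  reverse=True)
```

Under these assumptions, a guide that scores well on all three factors ranks ahead of one that scores poorly on any of them, which is the sense in which an integrative score differs from ranking by sgRNA activity alone.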

Availability and implementation

GuidePro is available at https://github.com/MDhewei/GuidePro. A web application for prioritizing sgRNAs that target protein-coding genes in human, monkey and mouse genomes is available at https://bioinformatics.mdanderson.org/apps/GuidePro.

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).