Summary: Tests for differential gene expression with RNA-seq data tend to identify certain types of transcripts as significant, e.g. longer and highly expressed transcripts. This tendency has been shown to bias gene set enrichment (GSE) testing, which is used to find over- or under-represented biological functions in the data. Yet there remains a surprising lack of GSE testing tools specific to RNA-seq. We present a new GSE method for RNA-seq data, RNA-Enrich, that accounts for this tendency empirically by adjusting for average read count per gene. RNA-Enrich is a quick, flexible method and web-based tool, with 16 available gene annotation databases. It does not require a P-value cut-off to define differential expression, and it works well even in experiments with small sample sizes. We show that adjusting for read counts per gene improves both the type I error rate and the detection power of the test.
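The idea of cut-off-free enrichment testing with a read-count adjustment can be illustrated with a minimal sketch. It is not the RNA-Enrich implementation: it assumes a plain logistic regression of gene set membership on -log10(P-value), with log10 average read count entered as a simple linear covariate, and the function names (`fit_logistic`, `enrichment_coefficient`) and the synthetic data are hypothetical. The actual adjustment used by RNA-Enrich may take a different form.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic regression by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)          # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible; an assumption of this sketch.
        hess = (X * w[:, None]).T @ X + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def enrichment_coefficient(p_values, mean_counts, in_set):
    """Slope of set membership on significance, adjusted for read count.

    A positive slope suggests that more significant genes are
    over-represented in the gene set. No P-value cut-off is applied:
    every gene contributes through its -log10(P-value).
    """
    sig = -np.log10(p_values)
    log_count = np.log10(mean_counts + 1.0)       # assumed linear adjustment term
    X = np.column_stack([np.ones_like(sig), sig, log_count])
    beta = fit_logistic(X, np.asarray(in_set, dtype=float))
    return beta[1]


# Usage on synthetic data (illustrative values only, not real results).
rng = np.random.default_rng(0)
n_genes = 2000
p_values = rng.uniform(1e-6, 1.0, n_genes)
mean_counts = rng.lognormal(mean=5.0, sigma=1.5, size=n_genes)
# Simulate a gene set in which membership is more likely for significant genes.
member_prob = 1.0 / (1.0 + np.exp(-(-3.0 + 1.5 * -np.log10(p_values))))
in_set = rng.uniform(size=n_genes) < member_prob
print(enrichment_coefficient(p_values, mean_counts, in_set))
```

Entering read count as a covariate means the significance slope is estimated among genes with comparable read counts, which is one simple way to keep highly expressed genes from driving the enrichment signal.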

Availability and implementation: RNA-Enrich is available at http://lrpath.ncibi.org or as R code in the supplementary material.

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.