Studies, mostly from the operations/management literature, have shown that the rate of human error increases with task complexity. What is not known is how many errors make it into the published literature, given that they must slip past peer review. By identifying paired, dependent values within text for reported calculations of varying complexity, we can identify discrepancies, quantify error rates, and identify mitigating factors.


We extracted statistical ratios from MEDLINE abstracts (hazard ratio, odds ratio, relative risk), their 95% CIs, and their P-values. We re-calculated the ratios and P-values using the reported CIs. For comparison, we also extracted percent–ratio pairs, one of the simplest calculation tasks. Over 486 000 published values were found and analyzed for discrepancies, allowing for rounding and significant figures. Per reported item, discrepancies were less frequent in percent–ratio calculations (2.7%) than in ratio–CI and P-value calculations (5.6–7.5%), and smaller discrepancies were more frequent than large ones. Systematic discrepancies (multiple incorrect calculations of the same type) were higher for more complex tasks (14.3%) than simple ones (6.7%). Discrepancy rates decreased with increasing journal impact factor (JIF) and increasing number of authors, but with diminishing returns and JIF accounting for most of the effect. Approximately 87% of the 81 937 extracted P-values were ≤ 0.05.
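The re-calculation step described above can be sketched as follows. This is a minimal illustration, assuming the standard log-normal back-calculation commonly used for hazard ratios, odds ratios, and relative risks; the function name and the use of the 1.96 normal critical value are our assumptions, not details taken from the paper:

```python
import math

def recalc_from_ci(lower, upper, z_crit=1.959964):
    """Recompute a ratio point estimate and a two-sided P-value
    from its reported 95% CI, assuming the CI is symmetric on the
    log scale (hypothetical helper; not the authors' code)."""
    log_l, log_u = math.log(lower), math.log(upper)
    ratio = math.exp((log_l + log_u) / 2)   # geometric mid-point of the CI
    se = (log_u - log_l) / (2 * z_crit)     # standard error on the log scale
    z = math.log(ratio) / se                # Wald z-statistic
    # two-sided P from the standard normal CDF via the error function
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return ratio, p
```

A reported ratio or P-value that disagrees with these back-calculated values, beyond what rounding and significant figures allow, would be flagged as a discrepancy. Note that a 95% CI excluding 1 corresponds to P ≤ 0.05 under this model, which is why the two checks are mutually reinforcing.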


Using a simple yet accurate approach to identifying paired values within text, we offer the first quantitative evaluation of published error frequencies within these types of calculations.

Supplementary information

Supplementary data are available at Bioinformatics online.
