Motivation

The existing epistasis analysis approaches have been criticized mainly for their: (i) ignoring heterogeneity during epistasis analysis; (ii) high computational costs; and (iii) volatility of performances and results. Therefore, they will not perform well in general, leading to lack of reproducibility and low power in complex disease association studies. In this work, a fast scheme is proposed to accelerate exhaustive searching based on multi-objective optimization named ESMO for concurrently analyzing heterogeneity and epistasis phenomena. In ESMO, mutual entropy and Bayesian network approaches are combined for evaluating epistatic SNP combinations. In order to be compatible with heterogeneity of complex diseases, we designed an adaptive framework based on non-dominant sort and top k selection algorithm with improved time complexity O(k*M*N). Moreover, ESMO is accelerated by strategies such as trading space for time, calculation sharing and parallel computing. Finally, ESMO is nonparametric and model-free.

Results

We compared ESMO with other recent or classic methods using different evaluating measures. The experimental results show that our method not only can quickly handle epistasis, but also can effectively detect heterogeneity of complex population structures.

Availability and implementation

https://github.com/XiongLi2016/ESMO/tree/master/ESMO-common-master.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)