Motivation: To discover and study periodic processes in biological systems, we sought to identify periodic patterns in their gene expression data. We surveyed a large number of available methods for identifying periodicity in time series data and chose representatives of different mathematical perspectives that performed well on both synthetic data and biological data. Synthetic data were used to evaluate how each algorithm responds to different curve shapes, periods, phase shifts, noise levels and sampling rates. The biological datasets we tested represent a variety of periodic processes from different organisms, including the cell cycle and metabolic cycle in Saccharomyces cerevisiae, circadian rhythms in Mus musculus and the root clock in Arabidopsis thaliana.

Results: From these results, we discovered that each algorithm had different strengths. Based on our findings, we make recommendations for selecting and applying these methods depending on the nature of the data and the periodic patterns of interest. Additionally, these results can also be used to inform the design of large-scale biological rhythm experiments so that the resulting data can be used with these algorithms to detect periodic signals more effectively.

Contact:  [email protected]

Supplementary information:  Supplementary data are available at Bioinformatics online.