Summary

Metagenomics and single-cell genomics have revolutionized the study of microorganisms, increasing our knowledge of microbial genomic diversity by orders of magnitude. A major challenge with metagenome-assembled genomes (MAGs) and single-cell amplified genomes (SAGs) is estimating their completeness and redundancy. Most approaches rely on counting conserved marker genes. In miComplete, we introduce a weighting strategy in which the presence or absence of each marker is normalized by its median distance to the next marker in a set of complete reference genomes. This approach alleviates biases introduced by the presence or absence of short DNA fragments that contain many markers, e.g. ribosomal protein operons.
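
The weighting idea can be illustrated with a minimal Python sketch. It is not miComplete's actual code or API: the function names, the choice to normalize the weights so that they sum to 1, and the toy distances are all assumptions made for illustration. Each marker's weight is taken as its median distance (in bp) to the next marker across the reference genomes, and weighted completeness is the share of total weight carried by the markers found in the genome being assessed.

```python
from statistics import median


def marker_weights(distances_by_marker):
    """Return one weight per marker, normalized to sum to 1.

    distances_by_marker maps a marker name to a list of distances (bp)
    to the next marker, one value per complete reference genome.
    Normalizing to a sum of 1 is an assumption of this sketch.
    """
    medians = {m: median(d) for m, d in distances_by_marker.items()}
    total = sum(medians.values())
    return {m: v / total for m, v in medians.items()}


def weighted_completeness(present, weights):
    """Sum the weights of the markers found in the assessed genome."""
    return sum(w for m, w in weights.items() if m in present)


def unweighted_completeness(present, weights):
    """Plain marker counting: fraction of markers found."""
    return sum(1 for m in weights if m in present) / len(weights)


# Hypothetical toy data: three tightly clustered ribosomal protein
# markers and two markers that sit far from their neighbours.
reference_distances = {
    "rplC": [1000, 1200, 800],
    "rplB": [500, 500, 500],
    "rpsC": [500, 400, 600],
    "gyrB": [800000, 900000, 1000000],
    "recA": [1200000, 1100000, 1000000],
}
weights = marker_weights(reference_distances)

# A fragment carrying only the ribosomal protein cluster.
fragment = {"rplC", "rplB", "rpsC"}
print(unweighted_completeness(fragment, weights))
print(weighted_completeness(fragment, weights))
```

In this toy case, plain counting credits the fragment with three of five markers, whereas the weighted estimate stays close to zero because the three clustered markers together span only a small part of the reference genomes.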

Availability and implementation

miComplete is written in Python 3 and released under GPLv3. Source code and documentation are available at https://bitbucket.org/evolegiolab/micomplete.

Supplementary information

Supplementary data are available at Bioinformatics online.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.