Motivation

The introduction of portable DNA sequencers such as the Oxford Nanopore Technologies MinION has enabled real-time, in-the-field DNA sequencing. However, in-the-field sequencing is actionable only when coupled with in-the-field DNA classification. This poses new challenges for metagenomic software, since mobile deployments are typically in remote locations with limited network connectivity and without access to capable computing devices.

Results

We propose new strategies to enable in-the-field metagenomic classification on mobile devices. We first introduce a programming model for expressing metagenomic classifiers that decomposes the classification process into well-defined and manageable abstractions. The model simplifies resource management in mobile setups and enables rapid prototyping of classification algorithms. Next, we introduce the compact string B-tree, a practical data structure for indexing text in external storage, and we demonstrate its viability as a strategy to deploy massive DNA databases on memory-constrained devices. Finally, we combine both solutions into Coriolis, a metagenomic classifier designed specifically to operate on lightweight mobile devices. Through experiments with actual MinION metagenomic reads and a portable supercomputer-on-a-chip, we show that, compared with state-of-the-art solutions, Coriolis offers higher throughput and lower resource consumption without sacrificing classification quality.

Availability and implementation

Source code and test data are available from http://score-group.org/?id=smarten.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.