Motivation

Reference sequences provide the baseline of knowledge on which many common bioinformatics methods depend, especially those that use genomic sequencing.

Results

We have created refget, a Global Alliance for Genomics and Health (GA4GH) API specification for accessing reference sequences and sub-sequences using an identifier derived from the sequence itself. We present four reference implementations across in-house and cloud infrastructure, together with a compliance suite and a web report that are used to ensure that implementations conform to the specification.
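
To illustrate what an identifier derived from the sequence itself can look like, the following Python sketch hashes a sequence and uses the digest as the lookup key for sequence and sub-sequence retrieval. The choice of MD5 and of a truncated SHA-512 digest, the uppercase normalisation, the 0-based half-open coordinates, and every function and class name below are assumptions made for illustration only. The refget specification remains the authoritative source for the exact algorithms and encodings.

```python
import base64
import hashlib
from typing import Dict, Optional


def md5_identifier(sequence: str) -> str:
    """Hex MD5 digest of the uppercased sequence (assumed normalisation)."""
    return hashlib.md5(sequence.upper().encode("ascii")).hexdigest()


def truncated_sha512_identifier(sequence: str, nbytes: int = 24) -> str:
    """First `nbytes` bytes of the SHA-512 digest, base64url encoded.

    The truncation length and encoding are illustrative assumptions.
    """
    digest = hashlib.sha512(sequence.upper().encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest[:nbytes]).decode("ascii")


class SequenceStore:
    """Hypothetical in-memory store keyed by a sequence-derived identifier."""

    def __init__(self) -> None:
        self._sequences: Dict[str, str] = {}

    def add(self, sequence: str) -> str:
        """Store a sequence and return the identifier computed from it."""
        normalized = sequence.upper()
        identifier = md5_identifier(normalized)
        self._sequences[identifier] = normalized
        return identifier

    def get(self, identifier: str, start: int = 0,
            end: Optional[int] = None) -> str:
        """Return the full sequence, or the sub-sequence [start, end)."""
        return self._sequences[identifier][start:end]


if __name__ == "__main__":
    store = SequenceStore()
    seq_id = store.add("acgtACGT")
    print(seq_id)                   # identifier depends only on the sequence
    print(store.get(seq_id, 2, 6))  # sub-sequence request
```

Because the identifier is computed from the sequence content alone, two servers that hold the same sequence would assign it the same key under this scheme, independent of any naming authority.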

Availability and implementation

The refget specification can be found at: https://w3id.org/ga4gh/refget.

Supplementary information

Supplementary data are available at Bioinformatics online.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.