Summary

Based on the Genomic Data Sharing Policy issued in August 2007, the National Institutes of Health (NIH) has supported several repositories such as the database of Genotypes and Phenotypes (dbGaP). dbGaP is an online repository that provides access to large-scale genetic and phenotypic datasets with more than 1000 studies. However, navigating the website and understanding the relationship between the studies are not easy tasks. Moreover, the decryption of the files is a complex procedure. In this study we propose the dbgap2x R package that covers a broad range of functions for searching dbGaP studies, exploring the characteristics of a study and easily decrypting the files from dbGaP.

Availability and implementation

dbgap2x is an R package with the code available at https://github.com/gversmee/dbgap2x. A containerized version including the package, a Jupyter server and with a Notebook example is available at https://hub.docker.com/r/gversmee/dbgap2x.

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)