Summary

Efficient storage and querying of large amounts of genetic and phenotypic data is crucial to contemporary clinical genetic research. This introduces computational challenges for classical relational databases, due to the sparsity and sheer volume of the data. Our Java based solution loads annotated genetic variants and well phenotyped patients into a graph database to allow fast efficient storage and querying of large volumes of structured genetic and phenotypic data. This abstracts technical problems away and lets researchers focus on the science rather than the implementation. We have also developed an accompanying webserver with end-points to facilitate querying of the database.

Availability and implementation

The Java and Python code are available at https://github.com/phenopolis/pheno4j.

Supplementary information

Supplementary data are available at Bioinformatics online.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)