Motivation

Generalized Read-Across (GenRA) is a data-driven approach to estimate physico-chemical, biological or eco-toxicological properties of chemicals by inference from analogues. GenRA attempts to mimic a human expert’s manual read-across reasoning for filling data gaps about new chemicals from known chemicals with an interpretable and automated approach based on nearest-neighbors. A key objective of GenRA is to systematically explore different choices of input data selection and neighborhood definition to objectively evaluate predictive performance of automated read-across estimates of chemical properties.

Results

We have implemented genra-py as a python package that can be freely used for chemical safety analysis and risk assessment applications. Automated read-across prediction in genra-py conforms to the scikit-learn machine learning library's estimator design pattern, making it easy to use and integrate in computational pipelines. We demonstrate the data-driven application of genra-py to address two key human health risk assessment problems namely: hazard identification and point of departure estimation.

Availability and implementation

The package is available from github.com/i-shah/genra-py.

This work is written by US Government employees and is in the public domain in the US.