Motivation

The inference of gene regulatory networks (GRNs) from DNA microarray measurements is a core element of systems biology-based phenotyping. In recent years, numerous computational methods have been developed to derive reliable and testable predictions from such data. However, little attention has been paid to quantifying how well existing state-of-the-art GRNs correspond to measured gene-expression profiles.

Results

Here, we present a computational framework that combines probabilistic graphical modeling, standard statistical estimation, and the integration of high-throughput biological data. We use it to explore the global behavior of biological systems and the global consistency between experimentally verified GRNs and the corresponding large microarray compendia. The model is represented as a probabilistic bipartite graph, which can handle highly complex network systems and accommodates partial measurements of diverse biological entities, e.g., messenger RNAs, proteins, metabolites and various stimulators participating in regulatory networks. The method was tested on microarray expression data from the M3D database, corresponding to sub-networks of one of the best-studied model organisms, Escherichia coli. The results show a surprisingly high correlation between the observed states and the inferred behavior of the system under various experimental conditions.

Availability and implementation

The processed data and a Matlab implementation of the software are freely available at https://github.com/kotiang54/PgmGRNs. The full dataset is available from the M3D database.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)