A growing amount of noncoding genetic variants, including single-nucleotide polymorphisms, are found to be associated with complex human traits and diseases. Their mechanistic interpretation is relatively limited and can use the help from computational prediction of their effects on epigenetic profiles. However, current models often focus on local, 1D genome sequence determinants and disregard global, 3D chromatin structure that critically affects epigenetic events.


We find that noncoding variants of unexpected high similarity in epigenetic profiles, with regards to their relatively low similarity in local sequences, can be largely attributed to their proximity in chromatin structure. Accordingly, we have developed a multimodal deep learning scheme that incorporates both data of 1D genome sequence and 3D chromatin structure for predicting noncoding variant effects. Specifically, we have integrated convolutional and recurrent neural networks for sequence embedding and graph neural networks for structure embedding despite the resolution gap between the two types of data, while utilizing recent DNA language models. Numerical results show that our models outperform competing sequence-only models in predicting epigenetic profiles and their use of long-range interactions complement sequence-only models in extracting regulatory motifs. They prove to be excellent predictors for noncoding variant effects in gene expression and pathogenicity, whether in unsupervised “zero-shot” learning or supervised “few-shot” learning.

Availability and implementation

Codes and data can be accessed at and

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.