Motivation

The rapid development of spatial transcriptomics (ST) technologies has provided new insights into tissue architecture and function. However, the gene expression measured at a spot may contain contributions from multiple cells, owing to the low resolution of current ST technologies. Although many computational methods have been developed to disentangle discrete cell types from such spatial mixtures, the community still lacks a thorough evaluation of these deconvolution methods.

Results

Here, we present a comprehensive benchmarking of 14 deconvolution methods on four datasets. Furthermore, we investigate the robustness of the different methods to sequencing depth, spot size and the choice of normalization. Moreover, we propose a new ensemble learning-based deconvolution method (EnDecon) that integrates multiple individual methods for more accurate deconvolution. Our major findings include: (i) cell2location, RCTD and spatialDWLS are more accurate than the other ST deconvolution methods, based on three evaluation metrics: root mean square error (RMSE), Pearson correlation coefficient (PCC) and Jensen-Shannon divergence (JSD); (ii) cell2location and spatialDWLS are more robust to variation in sequencing depth than RCTD; (iii) the accuracy of the existing methods tends to decrease as the spot size becomes smaller; (iv) most deconvolution methods perform best when ST data are normalized as described in their original papers; and (v) the integrative method, EnDecon, achieves more accurate ST deconvolution. Our study provides valuable information and guidelines for applying ST deconvolution tools in practice and for developing new, more effective methods.
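To make the three evaluation metrics concrete, the following is a minimal sketch of how RMSE, PCC and JSD can be computed between ground-truth and estimated cell-type proportion matrices (spots × cell types). This is an illustrative implementation, not the benchmarking pipeline itself; in particular, the per-spot averaging shown here is an assumption, and the function and argument names are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import pearsonr


def deconvolution_metrics(true_prop, est_prop):
    """Compare estimated vs. ground-truth cell-type proportions.

    true_prop, est_prop: (n_spots, n_cell_types) arrays whose rows are
    proportion vectors summing to 1. Returns the mean per-spot RMSE,
    PCC and JSD. (Per-spot averaging is an assumption of this sketch.)
    """
    # RMSE: root mean square error of each spot's proportion vector
    rmse = np.sqrt(np.mean((true_prop - est_prop) ** 2, axis=1)).mean()
    # PCC: Pearson correlation between true and estimated proportions per spot
    pcc = np.mean([pearsonr(t, e)[0] for t, e in zip(true_prop, est_prop)])
    # JSD: scipy's jensenshannon returns the JS *distance* (square root
    # of the divergence), so square it to recover the divergence
    jsd = np.mean([jensenshannon(t, e, base=2) ** 2
                   for t, e in zip(true_prop, est_prop)])
    return rmse, pcc, jsd
```

Lower RMSE and JSD and higher PCC indicate more accurate deconvolution; a perfect prediction yields RMSE = 0, PCC = 1 and JSD = 0.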

Availability and implementation

The benchmarking pipeline is available at https://github.com/SunXQlab/ST-deconvoulution. An R package for EnDecon is available at https://github.com/SunXQlab/EnDecon.

Supplementary information

Supplementary data are available at Bioinformatics online.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.