Motivation

The proteasome is the main proteolytic machine for targeted protein degradation in archaea and eukaryotes. While some bacteria also possess the proteasome, most contain a simpler and more specialized homolog, the heat shock locus V protease (HslV). In recent years, three further homologs of the proteasome core subunits have been characterized in prokaryotes: Anbu, BPH and connectase. With the inclusion of these members, the family of proteasome-like proteins now exhibits a range of architectural and functional forms, from the canonical proteasome, a barrel-shaped protease without pronounced intrinsic substrate specificity, to the monomeric connectase, a highly specific protein ligase.

Results

Using systematic sequence searches, we show that the proteasome homologs known so far are only the tip of the iceberg: beyond them lies a wealth of distantly related, uncharacterized homologs. We describe a total of 22 novel proteasome homologs in bacteria and archaea. Through sequence and structure analysis, we trace their evolutionary history and assess structural differences that may modulate their function. With this initial description, we aim to stimulate the experimental investigation of these novel proteasome-like family members.

Availability and implementation

The protein sequences in this study are searchable in the MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) with ProtBLAST/PSI-BLAST and with HHpred (database ‘proteasome_homologs’). The following data are available at https://data.mendeley.com/datasets/t48yhff7hs/3: (i) sequence alignments for each proteasome-like homolog, (ii) the coordinates for their structural models and (iii) a cluster-map file, which can be navigated interactively in CLANS and gives direct access to all the sequences in this study.

Supplementary information

Supplementary data are available at Bioinformatics online.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.