Bacteria use a variety of defense systems to protect themselves from phage infection. In turn, phages have evolved diverse counter-defense measures to overcome host defenses. Here, we use protein structural similarity and gene co-occurrence analyses to screen >66 million viral protein sequences and >330,000 metagenome-assembled genomes to identify anti-phage and counter-defense systems. We predict structures for ~300,000 proteins and perform large-scale pairwise comparisons to known anti-CRISPR (Acr) and anti-phage proteins to identify structural homologs that might otherwise be missed by primary-sequence searches. Using this approach, we identify a Bacteroidota phage Acr protein that inhibits Cas12a, and an Akkermansia muciniphila anti-phage defense protein, termed BxaP. The bxaP gene is found in loci encoding Bacteriophage Exclusion (BREX) and restriction-modification defense systems, but confers immunity independently. Our work highlights the advantage of combining protein structural features and gene co-localization information in studying host-phage interactions.