Created by Yanbin Yin (yyin@unl.edu) 6/30/2019

This folder has flat files with all the computationally predicted Acrs and Acr-Aca loci, as well as known/published Acr and Aca proteins and their homologs.

Statistics: 13732 bacterial genomes (completely assembled), 748 archaeal genomes (completely or partially assembled), and 7851 viral genomes (completely assembled) from the NCBI RefSeq database were analyzed.

Please find the pipeline at http://bcb.unl.edu/AcrDB/Download/knownAcrAca/web-pipeline.pdf. Briefly, each genome was analyzed:
1. with a homology-based approach: diamond search with 45 experimentally characterized Acr proteins as queries; and
2. with non-homology-based methods:
2.1. guilt-by-association approach: finding Aca homologs first and then looking at the genomic neighborhood;
2.2. self-targeting spacer approach: using CRISPRCas-Finder to find complete CRISPR-Cas loci first and then searching the same genome for identical blastn matches to the CRISPR spacers (self-targeting spacers).

The knownAcrAca/ folder has a PDF of the computational pipeline and two folders: (i) 45 known Acr proteins and (ii) 401 Aca proteins. Each folder has a readme.txt file explaining how and where the data were collected. These data were used to identify new Acr-Aca loci using the pipeline (http://bcb.unl.edu/AcrDB/Download/knownAcrAca/web-pipeline.pdf).

The other three folders, archaea_result/ (237 genomes), bacteria_result/ (1790 genomes), and virus_result/ (80 genomes), organize the predictions with respect to the three kingdoms.
- In each folder, you will find guilt_by_association/ and homologs/, which hold Acr-Aca loci predicted by guilt by association (GBA) and Acr homologs predicted by homology search, respectively. Inside the two folders, you will see two tarballs and two folders. The tarballs were created from the folders. The folders are organized with each genome (NCBI GCF ID) as a file. Many files have size=0, meaning that no Acrs or Acr-Aca loci were found in that genome.
In particular, none of the archaeal genomes have homologs of known Acr proteins.
- In archaea_result/ and bacteria_result/, you will also see a CRISPRCas-Finder folder, which contains the CRISPR-Cas_summary.tar.gz file. This file was created by running CRISPRCas-Finder on all the genomes to identify putative CRISPR-Cas systems.
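
Because empty files mark genomes without predictions, a quick way to see which genomes have hits is to check file sizes. The following is a minimal Python sketch, not part of the AcrDB pipeline. The function name split_by_size is made up for illustration, and the folder you pass in is assumed to be one of the per-genome folders (one file per GCF ID) after download or extraction.

```python
from pathlib import Path


def split_by_size(folder):
    """Return (non_empty, empty) lists of file names found directly in `folder`.

    Hypothetical helper: a file with size 0 is taken to mean that no Acrs or
    Acr-Aca loci were predicted for that genome, as described above.
    """
    non_empty, empty = [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # ignore sub-folders
        if path.stat().st_size > 0:
            non_empty.append(path.name)
        else:
            empty.append(path.name)
    return non_empty, empty


# Example call (the path is a placeholder, replace it with a real per-genome folder):
# hits, misses = split_by_size("path/to/per_genome_folder")
# print(len(hits), "genomes with predictions;", len(misses), "without")
```

Point the function at a per-genome folder rather than its parent, otherwise the tarballs stored next to the folders would be counted as non-empty files.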