What is dbCAN3

dbCAN3 server is a web server for automated Carbohydrate-active enzyme ANnotation, funded by the NSF (DBI-1933521) and NIH (R01GM140370). Similar resources on the web include CAZy, CAT (obsolete), and CUPP. dbCAN3 server is an updated version of dbCAN (obsolete) and dbCAN2 (obsolete), and has the following new features (thanks to dbCAN users all over the world for their suggestions):

  • dbCAN3 server allows users to predict glycan substrates for CAZymes by searching against dbCAN-sub, and for CAZyme gene clusters (CGCs) by two approaches: searching against the PULs of dbCAN-PUL, and dbCAN-sub majority voting
  • dbCAN3 server, like dbCAN2, allows submission of nucleotide sequences: prokaryotic genomes (fna file) or metagenome assembled genomes (MAGs); for eukaryotic genomes, please still submit protein sequences (faa file)
  • dbCAN3 server, like dbCAN2, integrates three state-of-the-art tools/databases for automated CAZyme annotation:
    1. HMMER search for CAZyme family annotation vs. dbCAN CAZyme domain HMM database
    2. DIAMOND search for BLAST hits in the CAZy database
    3. HMMER search for CAZyme subfamily annotation vs. dbCAN-sub HMM database of CAZyme subfamilies (derived from eCAMI classification of CAZyDB families)
  • dbCAN3 server can identify transcription factors (TFs), transporters (TCs), and signal transduction proteins (STPs), and can further identify CAZyme gene clusters (CGCs) using CGC-Finder if users submit faa+gff files or an fna file
  • dbCAN3 server combines the results from the three tools and allows visualization of detailed results as tables/graphs
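
As a rough illustration of two ideas in the feature list above (combining the calls of the three tools, and majority voting over substrate labels), here is a minimal Python sketch. It is not the server's actual code: the function names, the input format (gene ID mapped to a label), the example labels, and the tie-handling rule are all assumptions made for this example.

```python
from collections import Counter


def combine_tool_calls(hmmer, diamond, dbcan_sub):
    """Merge per-tool CAZyme calls into one overview entry per gene.

    Each argument is a dict mapping a gene ID to the label that tool
    assigned (hypothetical input format, not the server's file format).
    """
    tools = {"HMMER": hmmer, "DIAMOND": diamond, "dbCAN-sub": dbcan_sub}
    overview = {}
    for gene in sorted(set(hmmer) | set(diamond) | set(dbcan_sub)):
        # Keep only the tools that reported a hit for this gene.
        calls = {name: hits[gene] for name, hits in tools.items() if gene in hits}
        overview[gene] = {"calls": calls, "n_tools": len(calls)}
    return overview


def majority_substrate(substrates):
    """Return the most frequent substrate label.

    Returns None for empty input or when the top two labels tie
    (the tie rule is an assumption of this sketch).
    """
    top = Counter(substrates).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


# Hypothetical example data: gene IDs and labels are made up.
overview = combine_tool_calls(
    {"gene1": "GH5", "gene2": "GT2"},
    {"gene1": "GH5"},
    {"gene1": "GH5_sub", "gene3": "CBM50_sub"},
)
# Genes supported by at least two tools (an assumed, illustrative filter).
supported = [gene for gene, row in overview.items() if row["n_tools"] >= 2]
cgc_substrate = majority_substrate(["xylan", "xylan", "pectin"])
```

Counting how many tools agree on a gene gives a simple confidence signal for filtering, and the vote returns no answer on a tie rather than picking a label arbitrarily.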

dbCAN3 server will be updated once a year to use the latest CAZy database, dbCAN HMMdb, and dbCAN-sub HMMdb.

News

  • 8/2/2023: dbCAN HMMdb v12 is released (based on CAZyDB 7/26/2023). Now the HMMdb contains 783 CAZyme HMMs (470 family HMMs + 3 bacterial cellulosome HMMs + 2 fungal cellulosome HMMs + 308 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 2,816,770 fasta sequences. See readme for details.
  • 05/01/2023: dbCAN3 paper is published in Nucleic Acids Research, featuring substrate prediction
  • 02/11/2023: dbCAN2 is updated to dbCAN3 with glycan substrate prediction functions: 1. CAZyme substrate prediction based on dbCAN-sub; 2. CGC substrate prediction based on dbCAN-PUL searching and dbCAN-sub majority voting. For CGC substrate prediction, please see our dbCAN-seq update paper for details. With these new functions (esp. the dbCAN-sub search), dbCAN3 now takes longer to return results. Please be patient!
  • 8/9/2022: dbCAN HMMdb v11 is released (based on CAZyDB 8/7/2022). Now the HMMdb contains 699 CAZyme HMMs (452 family HMMs + 3 cellulosome HMMs + 244 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 2,428,817 fasta sequences. See readme for details.
  • 06/29/2022: dbCAN-sub (an HMMdb built from eCAMI subfamilies that allows EC and substrate inference) is now deployed on the dbCAN meta server and replaces eCAMI, which consumed too much RAM and was too slow.
  • 12/21/2021: updated run_dbcan python package to V3.0.1. Major updates include: (1) replaced Hotpep with eCAMI (recommended by an evaluation study); (2) added EC number in the overview output file (inferred by eCAMI); (3) formatted cgc.out to make it more readable. The web server has been updated accordingly.
  • 10/03/2021: updated CAZyDB for Diamond search. Now this file contains 2,161,786 fasta sequences. The old CAZyDB fasta file CAZyDB.07292021.fa was deleted from the download folder. See readme for details.
  • 8/17/2021: dbCAN HMMdb v10 is released (based on CAZyDB 7/26/2020). Now the HMMdb contains 692 CAZyme HMMs (445 family HMMs + 3 cellulosome HMMs + 244 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 1,776,583 fasta sequences. See readme for details.
  • 04/28/2021: We received an NIH R01 award to continue the development of dbCAN family tools
  • 8/04/2020: dbCAN HMMdb v9 is released (based on CAZyDB 7/30/2020). Now the HMMdb contains 681 CAZyme HMMs (434 family HMMs + 3 cellulosome HMMs + 244 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 1,716,043 fasta sequences. See readme for details.
  • 04/21/2020: dbCAN2 Hotpep PPR patterns updated to the most recent release of CAZyDB (2019). Missing group EC# files for families were also added.
  • 10/07/2019: run_dbcan python package is released. In addition to running pip install run-dbcan==2.0.0 to download it, you should install Miniconda or Anaconda to install the dependency packages (conda install -c bioconda diamond hmmer=3.1b2 prodigal fraggenescan). Then use a single command to download and compress all the related databases from the Download section. Read more on run_dbcan2.
  • /08/2019: dbCAN HMMdb v8 is released (based on CAZyDB 7/26/2019). Now the HMMdb contains 641 CAZyme HMMs (421 family HMMs + 3 cellulosome HMMs + 217 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 1,386,849 fasta sequences. See readme for details.
  • 4/01/2019: dbCAN2 has a docker version written by Haidong Yi.
  • 3/19/2019: dbCAN2 web server has moved to UNL and has a new URL
  • 1/20/2019: dbCAN2 standalone package is available on github; if you prefer to still use the old hmmscan way, the data are available in the download page
  • 8/25/2018: dbCAN HMMdb v7 is released (based on CAZyDB 7/31/2018): HMMs of 15 new families were added (AA14, AA15, CBM82, CBM83, GH146, GH147, GH148, GH149, GH150, GH151, GH152, GH153, GT105, GT106, PL28), and the GT2 family HMM is now replaced with 8 Pfam HMMs (GT2_Chitin_synth_1, GT2_Chitin_synth_2, GT2_Glycos_transf_2, GT2_Glyco_tranf_2_2, GT2_Glyco_tranf_2_3, GT2_Glyco_tranf_2_4, GT2_Glyco_tranf_2_5, GT2_Glyco_trans_2_3)
  • 5/2/2018: dbCAN2 meta server paper is accepted for publication in Nucleic Acids Research
  • 8/15/2017: Tanner and Le Huang begin to work on dbCAN2 meta server
  • 7/1/2017: Yanbin is awarded the NSF CAREER grant for CAZyme bioinformatics research