Introduction to dbCAN-PUL

dbCAN-PUL is a data repository of prokaryotic CAZyme-containing gene clusters, also known as polysaccharide utilization loci (PULs), that have been experimentally validated to act on a carbohydrate substrate. In contrast to similar resources such as PULDB, this repository serves as a database containing the most experimentally verified PULs with a confirmed carbohydrate substrate. Its PULs come from a range of phyla and comprise different metabolic systems, rather than only the Bacteroidetes and the Starch utilization system (Sus) gene homologs. dbCAN-PUL has the following features:

PULs from 10 different phyla and 173 species, as well as loci from metagenomic sequences
Contains both degradative and synthetic CAZyme-containing loci
Metadata for PULs such as substrate and method of experimental verification
Users can retrieve protein sequence, enzyme commission number and dbCAN2 annotations for CAZymes and other proteins
View homologous loci in NCBI GenBank through the integration of the MultiGeneBlast tool
Users can query their own sequences against proteins in PULs in dbCAN-PUL using a BLASTX search
Batch download of data for PULs

dbCAN-PUL will be updated once a year to include new experimentally validated CAZyme-containing gene clusters.

1. dbCAN-PUL updated date: 05-09-2021
[Figure: Top 20 distribution of PUL according to substrate. Barplot of PUL counts per substrate: capsule polysaccharide (62), xylan (40), O-antigen (39), N-glycan (34), O-glycan (30), pectin (28), beta-glucan (28), cellobiose (23), alginate (21), lactose (17), glucomannan (15), galactomannan (15), arabinan (15), chitin (15), cellulose (15), host glycan (14), mucin (14), lichenan (13), starch (11), exopolysaccharide (11).]
The characterized gene clusters in dbCAN-PUL have 126 different types of substrates. The most abundant characterized substrate is capsule polysaccharide. Displayed here are the top 20 most frequent substrates in dbCAN-PUL; an extended version of this barplot that illustrates the substrates across all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that share that type of substrate.
[Figure: Top 20 distribution of PUL according to taxonomy rank (genus). Barplot of PUL counts per genus or taxonomic group: Bacteroides (179), Escherichia (39), Bifidobacterium (39), uncultured (34), Streptococcus (24), Lactobacillus (19), Acinetobacter (17), Flavobacterium (16), Bacillus (16), Ruminiclostridium (13), Zobellia (12), Chitinophaga (10), Pseudoalteromonas (10), Prevotella (9), Geobacillus (9), Clostridium (8), Xanthomonas (8), Alteromonas (8), Vibrio (8), Gramella (7).]
dbCAN-PUL features PULs from 87 prokaryotic genera as well as metagenomically derived organisms. Here we show the 20 most abundant genera/taxonomic groups, with Bacteroides being the most frequent. The extended version of this barplot that illustrates the genera and taxonomic groups across all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that belong to the same genus or taxonomic group.
[Figure: Top 20 distribution of PUL according to characterization method. Barplot of PUL counts per method: enzyme activity assay (165), sequence homology analysis (119), microarray (94), RNA-Seq (57), gene deletion mutant and growth assay (55), NMR (47), qPCR (45), sugar utilization assay (41), RT-PCR (40), fosmid library screen (36), microscopy (33), mass spectrometry (32), qRT-PCR (30), growth assay (25), thin layer chromatography (23), clone and expression (21), high performance anion exchange chromatography (19), Northern Blot (18), RT-qPCR (17), differential gene expression (17).]
All of the PULs in dbCAN-PUL have been experimentally characterized as either degrading or synthesizing glycan substrates. A total of 74 characterization methods have been used to verify PULs. The barplot displays the top 20 characterization methods among PULs, with enzyme activity assay being the most frequent. The extended version of this barplot that illustrates the characterization methods across all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that share that type of characterization method.
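The bar links in the three barplots on this page use relative paths of the form Get_Substrate_entries?substrate=..., Get_Organism_entries?organism=... and Query_Entry_By_Experimental?method=.... The following is a minimal Python sketch of how such a link could be assembled for a given bar. The function name bar_link, the dictionary labels, and the choice of percent-encoding for spaces are assumptions made for illustration; the site's base URL is not stated here, so only the relative path is built.

```python
from urllib.parse import quote

# Relative link patterns as they appear in the bar links on this page.
# The dictionary keys are informal labels chosen for this sketch only.
LINK_PATTERNS = {
    "substrate": ("Get_Substrate_entries", "substrate"),
    "genus": ("Get_Organism_entries", "organism"),
    "method": ("Query_Entry_By_Experimental", "method"),
}


def bar_link(kind: str, value: str) -> str:
    """Build the relative link behind one bar.

    Percent-encoding of spaces is an assumption; the server may also
    accept other encodings.
    """
    path, param = LINK_PATTERNS[kind]
    return f"{path}?{param}={quote(value, safe='')}"


if __name__ == "__main__":
    print(bar_link("substrate", "capsule polysaccharide"))
    print(bar_link("genus", "Bacteroides"))
    print(bar_link("method", "enzyme activity assay"))
```

To form a full address, the returned path would need to be joined to the address of the dbCAN-PUL site, which this sketch deliberately leaves out.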