Introduction to dbCAN-PUL

dbCAN-PUL is a data repository of prokaryotic CAZyme-containing gene clusters, also known as polysaccharide utilization loci (PULs), that have been experimentally validated to act on a carbohydrate substrate. In contrast to similar resources such as PULDB, this repository collects the largest number of experimentally verified PULs with a confirmed carbohydrate substrate. Its PULs come from a range of phyla and cover different metabolic systems, rather than only the Bacteroidetes and homologs of the starch utilization system (Sus) genes. dbCAN-PUL has the following features:

PULs from 10 different phyla and 173 species, as well as loci from metagenomic sequences
Contains both degradative and synthetic CAZyme-containing loci
Metadata for PULs such as substrate and method of experimental verification
Users can retrieve protein sequence, enzyme commission number and dbCAN2 annotations for CAZymes and other proteins
Users can view homologous loci in NCBI GenBank through the integration of the MultiGeneBlast tool
Users can query their own sequences against proteins in PULs in dbCAN-PUL using a BLASTX search
Batch download of data for PULs
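
For readers who prefer to post-process sequence search results themselves, the following is a minimal sketch of how BLASTX hits could be filtered. It assumes the results are available in the standard BLAST+ tabular format (12 tab-separated columns, as produced by `-outfmt 6`); whether the dbCAN-PUL web search exports this format is an assumption, and the function name, threshold and sequence identifiers below are hypothetical.

```python
import csv
import io

# Column names of the default BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query_id: row} with the highest-bitscore hit per query.

    Hits whose e-value exceeds `max_evalue` are discarded. The default
    threshold is illustrative, not a dbCAN-PUL recommendation.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for fields in reader:
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(BLAST_COLUMNS, fields))
        if float(row["evalue"]) > max_evalue:
            continue
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Made-up hits for illustration only; the identifiers are not real entries.
sample = (
    "query1\tprotein_X\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t350\n"
    "query1\tprotein_Y\t60.0\t180\t72\t0\t1\t540\t5\t184\t1e-20\t150\n"
    "query2\tprotein_Z\t30.0\t50\t35\t0\t1\t150\t1\t50\t0.5\t25\n"
)
hits = best_hits(sample)
for query_id, row in hits.items():
    print(query_id, row["sseqid"], row["bitscore"])
```

Keeping only the top-scoring hit per query is one simple design choice; a real analysis might also filter on alignment length or percent identity.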

dbCAN-PUL will be updated once a year to include new experimentally validated CAZyme-containing gene clusters.

05-10-2023 update: 671 experimentally verified PULs in dbCAN-PUL.

03-02-2023 update: 654 experimentally verified PULs in dbCAN-PUL. PUL0185 and PUL0293 were removed due to repetition.

05-09-2021 update: 612 experimentally verified PULs in dbCAN-PUL.

Figure: Top 20 distribution of PUL according to substrate (barplot; number of PULs per substrate): capsule polysaccharide 70, xylan 48, O-antigen 39, pectin 38, N-glycan 35, O-glycan 31, beta-glucan 29, alginate 24, sucrose 23, cellobiose 23, lactose 17, galactomannan 16, arabinan 16, cellulose 16, glucomannan 15, chitin 15, host glycan 14, mucin 14, lichenan 13, exopolysaccharide 12.
The characterized gene clusters in dbCAN-PUL act on 127 different types of substrates. The most abundant characterized substrate is capsule polysaccharide. Displayed here are the 20 most frequent substrates in dbCAN-PUL; an extended version of this barplot covering the substrates of all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that share that type of substrate.
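
A top-20 ranking of this kind amounts to a frequency count over PUL records. The sketch below shows one way such a tally could be computed. It is not the code used by dbCAN-PUL: the `(pul_id, substrate)` record layout, the function name and the example identifiers are all hypothetical, and the same approach would apply to genus or characterization method.

```python
from collections import Counter


def top_substrates(records, n=20):
    """Return the n most frequent substrates as (substrate, count) pairs.

    `records` is an iterable of (pul_id, substrate) tuples. This layout is
    hypothetical and is not the actual dbCAN-PUL download format.
    """
    counts = Counter(substrate for _pul_id, substrate in records)
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up records for illustration only; the IDs are not real PUL entries.
example = [
    ("PUL_A", "xylan"),
    ("PUL_B", "pectin"),
    ("PUL_C", "xylan"),
    ("PUL_D", "alginate"),
    ("PUL_E", "pectin"),
    ("PUL_F", "xylan"),
]
print(top_substrates(example, n=2))  # [('xylan', 3), ('pectin', 2)]
```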
Figure: Top 20 distribution of PUL according to taxonomy rank (genus) (barplot; number of PULs per genus or taxonomic group): Bacteroides 190, Bifidobacterium 41, Escherichia 39, uncultured 34, Acinetobacter 25, Streptococcus 25, Prevotella 22, Lactobacillus 20, Flavobacterium 16, Bacillus 16, Ruminiclostridium 13, Zobellia 12, Chitinophaga 10, Pseudoalteromonas 10, Geobacillus 9, Clostridium 8, Xanthomonas 8, Alteromonas 8, Vibrio 8, Roseburia 8.
dbCAN-PUL features PULs from 97 prokaryotic genera as well as metagenomically derived organisms. Here we show the 20 most abundant genera and taxonomic groups, with Bacteroides being the most frequent. An extended version of this barplot covering the genera and taxonomic groups of all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that belong to the same genus or taxonomic group.
Figure: Top 20 distribution of PUL according to characterization method (barplot; number of PULs per method): enzyme activity assay 178, sequence homology analysis 125, microarray 94, RNA-Seq 73, NMR 60, gene deletion mutant and growth assay 58, qPCR 52, sugar utilization assay 44, RT-PCR 40, mass spectrometry 39, fosmid library screen 36, microscopy 33, qRT-PCR 33, thin layer chromatography 28, growth assay 28, high performance anion exchange chromatography 26, clone and expression 25, recombinant protein expression 24, RT-qPCR 20, liquid chromatography and mass spectrometry 18.
All of the PULs in dbCAN-PUL have been experimentally characterized as either degrading or synthesizing glycan substrates. A total of 77 characterization methods have been used to verify PULs. The barplot displays the 20 most frequent characterization methods among PULs, with enzyme activity assay being the most frequent. An extended version of this barplot covering the characterization methods of all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that share that type of characterization method.