You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000002968_00297


Basic Information

Species: Collinsella sp900546455
Lineage: Bacteria; Actinobacteriota; Coriobacteriia; Coriobacteriales; Coriobacteriaceae; Collinsella; Collinsella sp900546455
CAZyme ID: MGYG000002968_00297
CAZy Family: GH31
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 2136
CGC: (none listed)
Molecular Weight: 229466.77
Isoelectric Point: 4.3712

Genome Property
Genome Assembly ID: MGYG000002968
Genome Size: 2129771
Genome Type: MAG
Country: United States
Continent: North America
Gene Location: Start: 25238; End: 31648; Strand: -
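
The gene coordinates and the protein length can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the span includes the stop codon; neither convention is stated on this page, and the function name is hypothetical.

```python
def protein_length_from_cds(start: int, end: int, includes_stop: bool = True) -> int:
    """Amino-acid count implied by CDS coordinates.

    Assumes 1-based, inclusive coordinates (an assumption, not stated in the record).
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = nucleotides // 3
    # Drop one codon if the span is assumed to include the stop codon.
    return codons - 1 if includes_stop else codons


# Coordinates taken from the Gene Location field of this record.
print(protein_length_from_cds(25238, 31648))
```

Under these assumptions the span covers 6411 nucleotides, or 2137 codons, which agrees with the listed protein length of 2136 once the stop codon is subtracted.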

Full Sequence

MRRNSIFTKR  WVKGSAVLVA  TAATLVFASP  QQAVATPLKG  VESATVSQSA  SNVVDIDFAG60
GIKGRITFLE  DGIFRYNVDP  KGEFSEYATP  RSKSHTAKIQ  AQPDASDTYS  KPSATVADKG120
NTIEISGGKA  TVILDKATGK  MSIKSGDRVV  VSESASLDLD  KKGTVQTLAK  EKSENFFGGG180
TQNGRFLHTN  QTIAISNTNN  WTDGGVASPN  PFYWTSDGYG  VLRNTFAEGS  YDFGSTDAST240
VTTSHSENEF  DAYYFVATNE  GASNVATEML  QDYYKVTGNP  VLLPEYGFYL  GHLNAYNRDG300
WSTTSGQKKW  ETKGSKSSSK  EGDVKYESGM  ATGYKLDGTL  AAETLNGEGP  TVDTENMKAT360
DFDRQFSAQR  VIDDYEDNDM  PLGWFLPNDG  YGAGYGQNGY  YKTGGVNSDG  TSSAERLNAV420
RANVDNLKKF  TEYANSKGVS  TGLWTQSNLE  PDSNSETPWH  LLRDFDAEVK  TGGITTLKTD480
VAWVGSGYSF  GLNGIKTAYD  TVTTGANKRP  NIITLDGWAG  TQRFGGIWTG  DQYGGNWEYI540
RFHIPTYIGQ  SLSGNPNIGS  DMDGIFGGSP  IISARDYQWK  TFTPSMLDMD  GWGTYRKSPM600
THGDPYTGIS  RFYLKLKAQL  MPYIYTSAAS  AANIDTGNGD  TGLPMIRAML  LSDNSEYAAS660
TSTQYQYMFG  ENFLVAPVYQ  DTQADENGND  VRNGIYLPNY  GTDENPTIWI  DYFSGKQYRG720
GQVLNNYEAP  LWKLPLFVKA  NAIVPMYAEN  NNPEAVTETN  TKGTNRSQRI  VEFFATEGEG780
SFTQYEDDGS  SIENNTTEDD  SYGTIDNISY  GEHVSTVYKS  SVVGGTATFT  AEASQGGYSG840
YDSNRTTTFV  ANVSAKPTSV  SAKNGGSALN  VVEVDSQEAF  DVATPEAGQA  VYFYNETPNL900
NHYGSDAGST  EAEQGESFNN  TKITTNPKVY  VKFAKTDVAT  AAQTLELGDF  VNADATLAAD960
RFNENLSAPA  NLAAPEDATT  PTSIKLTWGK  VDGATSYDLK  VDGTVFAVGD  AAEFTHTDLA1020
YNSKHTYQIR  SRNADGYSAW  SDVLEANSAL  DPWRNVPEAE  EITWEGSLYG  SHKAELAFDH1080
EFQSGDGGFH  SGGNDLGKAL  TVDYGRAYKL  DKLEYYPRDD  AGNGTVTKMR  IETSLDGVHW1140
TTQEVDWERS  AACKTVAFDG  AVAARYVRMT  PLASVGNFFS  ASEIAIYKTD  GTDGFAVGSN1200
LNKATVSDGD  YSNMKNYLGL  ENREPDTSTF  DSQIRAHFAD  LNGNGVYDVY  DYSFTMAALD1260
GGTKQKGKAA  GTLAVIPSKM  EVEAGDEITV  DVYATDAENI  NALGALVNFN  SDQFEYVSES1320
IAQDASTAGM  ENLSREKVQF  TDGHQTINLA  FANRGDAKLF  NSTGVVASFK  LKAKTAGKVE1380
LPSMSWLVGP  ACDALEFVSD  GTVTMPGVPQ  PTKSEYGRDA  FDITMTNDEL  PTDDGTNVEK1440
LIQQKSYDGL  FDNNEGGNDF  EFLWNYGPNW  IDGKMPTYVK  LPATMTFAFK  TPSKLDSVEV1500
VNRNGGNGTV  QKIKSVVTFE  DGSTQEFSGG  SFDSYQARYA  FGLSAENKGK  KATKVEITPL1560
EASGDQMLTL  REINFNYTQG  VEAIEGVELG  KNQTEFFEGD  VTLVDAKATP  ESAGYPYVTV1620
ESSDPSVVSV  SAVQAGDSVS  YYLRANKAGK  ATITVASVLD  PSKTASYEVE  VKAGVDTSAL1680
MAALSKAAGY  SESVYTKDSY  AALTAAVEAA  QKLLDGGQGS  YSKKDVADAT  AKIEKAIDGL1740
KMRPVDEKSL  INTPENRENV  KVTGFSSEAD  YEGSNAVNAL  DGDEQTLWHS  DWAYKAGMPQ1800
YLTVDLGRDY  DLTDVTFLPR  QDGGTNGDIF  EAEVLVSDTA  DGYEMGTATS  MGTFTFDNDG1860
RVLTDRGDWK  QMSFGAARGR  YVTVKVLHAG  GGSVDAYCSM  AELKFYGTAV  PNPVAKDDLT1920
AAIEKVDAEV  AAGDLKASDY  TEASWTAFQN  ALASARAILN  DENATQDEVD  QALKALESAR1980
DALEKTPVTP  VENPTKKQLK  ALVKQAKETD  TRGKTDESVK  ALTDAITYAE  DVLGDENASD2040
IQLQAAYDQL  KLAIESLKDA  DSGNQGGSGN  QGSGNGSNGG  NQAGKPAGDK  AQGSKKSGSA2100
IPNTGDQTAA  TVAAVGGLGA  IIAAIGAFFH  RRSKAK2136
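
The sequence above is shown in blocks of ten residues with a running position counter at the end of each line. A minimal sketch of turning that display format into a plain residue string follows; the helper name is hypothetical, and only two lines of the record are used as input.

```python
import re


def clean_sequence(block: str) -> str:
    """Strip whitespace and position counters, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


# First and last lines of the displayed sequence, copied from this record.
first_line = "MRRNSIFTKR  WVKGSAVLVA  TAATLVFASP  QQAVATPLKG  VESATVSQSA  SNVVDIDFAG60"
last_line = "IPNTGDQTAA  TVAAVGGLGA  IIAAIGAFFH  RRSKAK2136"

print(len(clean_sequence(first_line)))  # residues on the first line
print(len(clean_sequence(last_line)))   # residues on the last line
```

Applied to the whole block, the cleaned string should have the same length as the final counter (2136), which is a quick check that no line was lost in copying.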

Enzyme Prediction

No EC number is predicted for MGYG000002968_00297.

CAZyme Signature Domains

Family  Start  End   E-Value  Family Coverage
GH31    491    744   1.3e-49  0.5386416861826698
CBM32   1067   1184  9.9e-16  0.8870967741935484

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
cd06596 GH31_CPE1046 2.44e-147 276 715 1 334
Clostridium CPE1046-like. CPE1046 is an uncharacterized Clostridium perfringens protein with a glycosyl hydrolase family 31 (GH31) domain. The domain architecture of CPE1046 and its orthologs includes a C-terminal fibronectin type 3 (FN3) domain and a coagulation factor 5/8 type C domain in addition to the GH31 domain. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.
COG1501 YicI 6.94e-56 174 792 153 710
Alpha-glucosidase, glycosyl hydrolase family GH31 [Carbohydrate transport and metabolism].
pfam01055 Glyco_hydro_31 2.09e-47 428 744 159 442
Glycosyl hydrolases family 31. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises enzymes that are, or are similar to, alpha-galactosidases.
cd08759 Type_III_cohesin_like 7.42e-36 1271 1438 1 167
Cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. Two specific calcium-dependent interactions between cohesin and dockerin appear to be essential for cellulosome assembly, type I and type II. This subfamily represents type III cohesins and closely related domains.
cd06589 GH31 1.75e-34 363 614 21 265
glycosyl hydrolase family 31 (GH31). GH31 enzymes occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively.

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
QWT17625.1 0.0 17 2111 21 2140
QNM10857.1 0.0 46 2064 50 2018
BCT46261.1 0.0 46 2059 49 2041
BBK61154.1 0.0 38 2065 39 2075
QUO30799.1 0.0 49 1393 35 1367
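
Because all five hits report an E-value of 0.0, the fraction of the query each alignment spans is a more useful way to compare them. The sketch below computes that fraction from the Query Start and Query End columns and the protein length of 2136; the function name is hypothetical, and the coordinates are assumed to be 1-based and inclusive.

```python
PROTEIN_LENGTH = 2136  # from the Basic Information section of this record

# (hit ID, query start, query end) copied from the CAZyme Hits table.
hits = [
    ("QWT17625.1", 17, 2111),
    ("QNM10857.1", 46, 2064),
    ("BCT46261.1", 46, 2059),
    ("BBK61154.1", 38, 2065),
    ("QUO30799.1", 49, 1393),
]


def query_coverage(q_start: int, q_end: int, length: int = PROTEIN_LENGTH) -> float:
    """Fraction of the query covered, assuming 1-based inclusive coordinates."""
    return (q_end - q_start + 1) / length


for hit_id, q_start, q_end in hits:
    print(f"{hit_id}\t{query_coverage(q_start, q_end):.3f}")
```

This is span coverage only; it ignores gaps inside the alignment, which the table does not report.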

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
6M76_A 2.96e-237 45 1054 42 963
GH31 alpha-N-acetylgalactosaminidase from Enterococcus faecalis [Enterococcus faecalis ATCC 10100]; 6M77_A GH31 alpha-N-acetylgalactosaminidase from Enterococcus faecalis in complex with N-acetylgalactosamine [Enterococcus faecalis ATCC 10100]
7F7R_A 1.57e-236 45 1054 42 963
Chain A, GH31 alpha-N-acetylgalactosaminidase [Enterococcus faecalis ATCC 10100]
7F7Q_A 4.29e-236 45 1054 42 963
Chain A, GH31 alpha-N-acetylgalactosaminidase [Enterococcus faecalis ATCC 10100]
4LPL_A 1.62e-20 1033 1188 23 182
Structure of CBM32-1 from a family 31 glycoside hydrolase from Clostridium perfringens [Clostridium perfringens ATCC 13124]
2XVG_A 8.80e-19 502 792 629 913
Crystal structure of alpha-xylosidase (GH31) from Cellvibrio japonicus [Cellvibrio japonicus]; 2XVK_A crystal structure of alpha-xylosidase (GH31) from Cellvibrio japonicus in complex with 5-fluoro-alpha-D-xylopyranosyl fluoride [Cellvibrio japonicus]; 2XVL_A crystal structure of alpha-xylosidase (GH31) from Cellvibrio japonicus in complex with Pentaerythritol propoxylate (5 4 PO OH) [Cellvibrio japonicus]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
Q9F234 1.10e-19 508 792 461 714
Alpha-glucosidase 2 OS=Bacillus thermoamyloliquefaciens OX=1425 PE=3 SV=1
Q9FN05 3.86e-19 509 793 566 822
Probable glucan 1,3-alpha-glucosidase OS=Arabidopsis thaliana OX=3702 GN=PSL5 PE=1 SV=1
B9F676 1.97e-18 497 745 551 778
Probable glucan 1,3-alpha-glucosidase OS=Oryza sativa subsp. japonica OX=39947 GN=Os03g0216600 PE=3 SV=1
P38138 1.05e-15 490 746 570 811
Glucosidase 2 subunit alpha OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) OX=559292 GN=ROT2 PE=1 SV=1
P79403 5.29e-15 504 747 591 813
Neutral alpha-glucosidase AB OS=Sus scrofa OX=9823 GN=GANAB PE=1 SV=1

SignalP and Lipop Annotations

This protein is predicted to carry a signal peptide (SP); the highest-scoring class is SP_Sec_SPI.

Other     SP_Sec_SPI  LIPO_Sec_SPII  TAT_Tat_SPI  TATLIP_Sec_SPII  PILIN_Sec_SPIII
0.000499  0.998544    0.000243       0.000261     0.000232         0.000198
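
The SP call corresponds to the class with the highest score in the row above. A minimal sketch of that selection follows; it assumes the call is simply the top-scoring class, which is consistent with this record but is not a description of the predictor's internals, and the function name is hypothetical.

```python
# Class scores copied from the SignalP and Lipop table of this record.
scores = {
    "Other": 0.000499,
    "SP_Sec_SPI": 0.998544,
    "LIPO_Sec_SPII": 0.000243,
    "TAT_Tat_SPI": 0.000261,
    "TATLIP_Sec_SPII": 0.000232,
    "PILIN_Sec_SPIII": 0.000198,
}


def top_class(class_scores: dict[str, float]) -> tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    return max(class_scores.items(), key=lambda item: item[1])


label, score = top_class(scores)
print(label, score)
```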

TMHMM Annotations

Start  End
2108   2130