You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000000798_00008



Basic Information

Species: Collinsella sp900542945
Lineage: Bacteria; Actinobacteriota; Coriobacteriia; Coriobacteriales; Coriobacteriaceae; Collinsella; Collinsella sp900542945
CAZyme ID: MGYG000000798_00008
CAZy Family: GH31
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 2137
CGC: MGYG000000798_1|CGC1
Molecular Weight: 229661.04
Isoelectric Point: 4.4126

Genome Property
Genome Assembly ID: MGYG000000798
Genome Size: 2225766
Genome Type: MAG
Country: China
Continent: Asia
Gene Location: Start: 5591; End: 12004; Strand: +
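
As a quick consistency check, the gene coordinates can be compared with the reported protein length. The sketch below assumes the coordinates are 1-based and inclusive and that the annotated span ends with a stop codon; neither convention is stated in this record.

```python
# Values copied from the record above.
start, end = 5591, 12004
protein_length = 2137

# Assumption: coordinates are 1-based and inclusive.
gene_length_nt = end - start + 1
codons, remainder = divmod(gene_length_nt, 3)

# Assumption: the last codon is a stop codon and is not translated.
matches_protein = (remainder == 0) and (codons - 1 == protein_length)

print(gene_length_nt, codons, remainder, matches_protein)
```

Under these assumptions the span divides evenly into codons and agrees with the 2137-residue protein.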

Full Sequence

MRRNSIFTKR  WVKGSAVLVA  TAATLVFANP  QQAIATPLKG  VDSATVSQSA  SNVVDIDFAD60
GIKGRITFLE  DGIFRYNVDP  KGEFSEYATP  RSKSHTAKIQ  AQPDTSDTYS  KPSATVADKG120
DTIEITGGKA  TVILDKATGK  MSIKSGDRVV  VSESASLDLD  KKGTVQTLAK  EKSENFFGGG180
TQNGRFLHTN  QTIAISNTNN  WTDGGVASPN  PFYWTSDGYG  VLRNTFAEGS  YDFGSTDSST240
VTTLHSENEF  DAYYFVATNE  GASNVATEML  RDYYKVTGNP  VLLPEYGFYL  GHLNAYNRDG300
WSTTSGQKKW  ETKGSKSSSE  KGDVKYESGM  STGYKLDGTL  AAETLNGEGP  TVDTENMKAT360
DFDRQFSARQ  VIDDYEDNDM  PLGWFLPNDG  YGAGYGQNGY  YKTGGVNSDG  TSSADRLNAV420
RANVDNLKKF  TEYANSKGVS  TGLWTQSNLE  PDSNSETPWH  LLRDFDAEVK  TGGISTLKTD480
VAWVGSGYSF  GLNGIKTAYD  TVTTGANKRP  NIITLDGWAG  TQRFGGIWTG  DQYGGNWEYI540
RFHIPTYIGQ  SLSGNPNIGS  DMDGIFGGDP  IISARDYQWK  TFTPSMLDMD  GWGTYRKSPM600
THGDPYTGIS  RFYLKLKAQL  MPYIYTSAAS  AANIDTGNGD  AGLPMIRAML  LSDNSEYAAS660
TSTQYQYMFG  ENFLVAPVYQ  DTQADENGND  IRNGIYLPNY  GTDENPTIWI  DYFSGKQYRG720
GQVLNNYEAP  LWKLPLFAKA  NAIVPMYAEN  NNPEAVSDTN  TKGTDRSQRI  VEFFATEGDG780
TFTQYEDDGS  SIENNTTEDD  SYGTIDNISY  GEHVSTVYRS  KAAGGTATFT  AEASQGGYNG840
YDSNRTTTFV  ANVSAKPMSV  VAKNGGSALN  VVEVDSQETF  DTATPEAGQA  VYFYSETPNL900
NHYGSDVDGT  EAERGESFNN  TKITTNPKVY  VKFAKTDVAA  AAQTLELGGF  VNADATLTAD960
RLNENLSAPT  NLAAPEDVTT  PTSIKLTWGK  VDGATSYDLK  VDGTVFAVGD  AAEFTHTDLA1020
YNSKHTYQIR  SRNAEGYSAW  SDVLEANSAL  DPWRNVPDAE  EITWEGSLYG  SHKAELAFDH1080
EFQSGDGGFH  SGGNDLGKAL  TVDYGRAYKL  DKLEYYPRDD  AGNGTVTKMR  VETSLDGVHW1140
TSQEVDWARS  ADCKTVAFDG  TVAARYVRMT  PLASVGNFFS  ASEIAIYKTD  GTDGFAVGSN1200
LNKATISDGD  YSNMKNYLGL  ENREPDTPTF  GSQIRDHFAD  LNDNGVYDVY  DYSFTMAGLD1260
GGTKQKGKVA  GSLAVIPSKT  EVEAGDEITV  DVYAADAKNV  NALGALVNFK  SDQFEFVSES1320
IAQDGSTAGM  ENLSREKVQF  ADRTQTINLA  FANRGDGKLY  NGTGAVASFK  LKAKTAGKVE1380
LPSTSWLIGP  ACDALEFASD  GTVTIPDVPQ  PTVAEYGQDA  FNITMTNDEL  TTDDGTNVEK1440
LIQQKSYNAL  FDNNEGDNGF  EFLWNWSGNW  DSEGKLPSYV  KLPTTMTFAF  KTPSKLDSVE1500
VVNRNGGNGT  VQKIKAVVTF  EDGSTQEFAG  GEYDSFRARY  AFGLSAENKG  KKATKVEVTP1560
LEASGDQMLT  LREVNFNYTQ  GVEAIEGVEL  GKNQTEFFEG  DVTLVDAKAT  PESAGYPYVT1620
VESSDPSVVS  VSAVQAGDSV  SYYLRANKAG  KAKITVASVL  DPSKTASYEV  EVKAGVDTSA1680
LVAALSKAAG  YSESVYTKDS  YAALTTAVEA  AQKLLDGGQG  SYSKKDVADA  TVKIEKAIDG1740
LKMRPVDEKI  LINTPENRKN  VKVARFSSEA  DYEGSNAVNA  LDGDERTLWH  SDWAHGTGMP1800
QYLSVDLGRD  YDLTDVTFLP  RQDGGTNGDI  FEAEILVSDT  ADGYEMGTAA  SMGTFAFDND1860
GRVLTDRGDW  KQMSFGAARG  RYVTVKVIHA  GGGDVDAYCS  MAELKFYGTA  VPDPVAKDDL1920
ATAIEKVDAE  VAAGDLKAGD  YTEASWTALE  NALASARAIL  SDENATQDEV  DHALRALESA1980
RGALEKTPVA  PVENPTKKQL  KALVKQAKET  DTRGKTAESV  KVLTDAITYA  EDVLGDENAS2040
DTQIQAAYGQ  LKLAIESLKD  ADSGNQGGSG  NQGSGNGSNG  GNQAGKPAGD  KAQGSKKNGS2100
AIPNTGDQTA  ATVATVGGLG  AIIAAIGAFF  HRRSKAK2137
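
The sequence display above groups residues in blocks of ten and appends a running position counter to each line. A minimal sketch for recovering a plain residue string from such a display follows; the function name `clean_sequence` is hypothetical, and only the first and last lines are used as sample input.

```python
import re

def clean_sequence(block: str) -> str:
    """Drop position counters and whitespace, keeping residue letters only."""
    return re.sub(r"[^A-Za-z]", "", block).upper()

# First and last lines of the display above, copied verbatim.
first_line = "MRRNSIFTKR  WVKGSAVLVA  TAATLVFANP  QQAIATPLKG  VDSATVSQSA  SNVVDIDFAD60"
last_line = "AIPNTGDQTA  ATVATVGGLG  AIIAAIGAFF  HRRSKAK2137"

first_clean = clean_sequence(first_line)
last_clean = clean_sequence(last_line)

# The counter on each line should equal the cumulative residue count.
print(len(first_clean), len(last_clean))
```

Applying the same function to the whole block and checking that the result has 2137 residues is a simple way to confirm nothing was lost when copying.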

Enzyme Prediction

No EC number prediction in MGYG000000798_00008.

CAZyme Signature Domains

Family Start End E-Value Family_Coverage
GH31 491 744 1.7e-48 0.5386416861826698

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
cd06596 GH31_CPE1046 7.90e-147 276 715 1 334
Clostridium CPE1046-like. CPE1046 is an uncharacterized Clostridium perfringens protein with a glycosyl hydrolase family 31 (GH31) domain. The domain architecture of CPE1046 and its orthologs includes a C-terminal fibronectin type 3 (FN3) domain and a coagulation factor 5/8 type C domain in addition to the GH31 domain. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.
COG1501 YicI 1.64e-55 73 792 62 710
Alpha-glucosidase, glycosyl hydrolase family GH31 [Carbohydrate transport and metabolism].
pfam01055 Glyco_hydro_31 5.94e-46 428 744 159 442
Glycosyl hydrolases family 31. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises enzymes that are, or are similar to, alpha-galactosidases.
cd08759 Type_III_cohesin_like 1.64e-37 1271 1438 1 167
Cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. Two specific calcium-dependent interactions between cohesin and dockerin appear to be essential for cellulosome assembly, type I and type II. This subfamily represents type III cohesins and closely related domains.
cd06589 GH31 1.01e-35 363 614 21 265
Glycosyl hydrolase family 31 (GH31). GH31 enzymes occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose, and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation, and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively.

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
QWT17625.1 0.0 17 2072 21 2114
QNM10857.1 0.0 46 2065 50 2018
BCT46261.1 0.0 46 2060 49 2041
BBK61154.1 0.0 18 2066 18 2075
QUO30799.1 0.0 49 1393 35 1367
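
Because every hit above reports an E-value of 0.0, the aligned query span is one simple way to tell the hits apart. The sketch below computes the fraction of the 2137-residue query covered by each alignment, assuming 1-based inclusive coordinates; the helper name `query_coverage` is hypothetical and the database does not report this quantity itself.

```python
QUERY_LENGTH = 2137  # protein length from Basic Information

# (hit ID, query start, query end), copied from the table above.
hits = [
    ("QWT17625.1", 17, 2072),
    ("QNM10857.1", 46, 2065),
    ("BCT46261.1", 46, 2060),
    ("BBK61154.1", 18, 2066),
    ("QUO30799.1", 49, 1393),
]

def query_coverage(q_start: int, q_end: int, q_len: int = QUERY_LENGTH) -> float:
    # Assumption: coordinates are 1-based and inclusive.
    return (q_end - q_start + 1) / q_len

coverage = {hit_id: query_coverage(s, e) for hit_id, s, e in hits}
for hit_id, cov in coverage.items():
    print(f"{hit_id}\t{cov:.3f}")
```

Four of the five hits span most of the query, while QUO30799.1 aligns only to roughly the first two thirds.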

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
6M76_A 4.05e-238 42 1054 32 963
GH31 alpha-N-acetylgalactosaminidase from Enterococcus faecalis [Enterococcus faecalis ATCC 10100], 6M77_A GH31 alpha-N-acetylgalactosaminidase from Enterococcus faecalis in complex with N-acetylgalactosamine [Enterococcus faecalis ATCC 10100]
7F7R_A 2.16e-237 42 1054 32 963
Chain A, GH31 alpha-N-acetylgalactosaminidase [Enterococcus faecalis ATCC 10100]
7F7Q_A 5.88e-237 42 1054 32 963
Chain A, GH31 alpha-N-acetylgalactosaminidase [Enterococcus faecalis ATCC 10100]
4LPL_A 1.01e-21 1033 1188 23 182
Structure of CBM32-1 from a family 31 glycoside hydrolase from Clostridium perfringens [Clostridium perfringens ATCC 13124]
7KMP_A 1.94e-18 507 789 636 901
Chain A, Alpha-xylosidase [Xanthomonas citri pv. citri str. 306], 7KNC_A Chain A, Alpha-xylosidase [Xanthomonas citri pv. citri str. 306]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
Q9F234 1.23e-20 65 792 48 714
Alpha-glucosidase 2 OS=Bacillus thermoamyloliquefaciens OX=1425 PE=3 SV=1
Q9FN05 8.74e-19 433 745 497 780
Probable glucan 1,3-alpha-glucosidase OS=Arabidopsis thaliana OX=3702 GN=PSL5 PE=1 SV=1
B9F676 1.50e-18 212 745 291 778
Probable glucan 1,3-alpha-glucosidase OS=Oryza sativa subsp. japonica OX=39947 GN=Os03g0216600 PE=3 SV=1
P79403 2.35e-15 504 747 591 813
Neutral alpha-glucosidase AB OS=Sus scrofa OX=9823 GN=GANAB PE=1 SV=1
Q4R4N7 4.03e-15 504 747 591 813
Neutral alpha-glucosidase AB OS=Macaca fascicularis OX=9541 GN=GANAB PE=2 SV=1

SignalP and Lipop Annotations

This protein is predicted to have a signal peptide (SP).

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000460 0.998593 0.000237 0.000257 0.000226 0.000187
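
The "SP" call corresponds to the class with the highest score in the table above. A minimal sketch of that selection follows; picking the maximum is an assumption about how the label is derived, since the record does not describe the decision rule.

```python
# Class scores copied from the table above.
scores = {
    "Other": 0.000460,
    "SP_Sec_SPI": 0.998593,
    "LIPO_Sec_SPII": 0.000237,
    "TAT_Tat_SPI": 0.000257,
    "TATLIP_Sec_SPII": 0.000226,
    "PILIN_Sec_SPIII": 0.000187,
}

# Assumption: the reported label is the highest-scoring class.
best_class = max(scores, key=scores.get)
total = sum(scores.values())

print(best_class, round(total, 6))
```

The scores sum to approximately 1, which is consistent with their being class probabilities.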

TMHMM Annotations

start end
2109 2131
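
To place the predicted helix relative to the rest of the protein, its length and distance from the C-terminus can be worked out from the coordinates above and the protein length of 2137. The sketch assumes 1-based inclusive coordinates.

```python
PROTEIN_LENGTH = 2137       # from Basic Information
tm_start, tm_end = 2109, 2131

# Assumption: coordinates are 1-based and inclusive.
helix_length = tm_end - tm_start + 1
residues_after_helix = PROTEIN_LENGTH - tm_end

print(helix_length, residues_after_helix)
```

The single predicted helix therefore lies at the extreme C-terminal end of the sequence, with only a few residues following it.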