Environment: HUMAN GUT

CAZyme Information: MGYG000000398_01659

Basic Information

Species: (none listed)
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; TF01-11
CAZyme ID: MGYG000000398_01659
CAZy Family: GH5
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length (aa): 806
CGC: MGYG000000398_10|CGC1
Molecular Weight (Da): 89366.16
Isoelectric Point: 8.6566

Genome Property
Genome Assembly ID: MGYG000000398
Genome Size (bp): 2936252
Genome Type: MAG
Country: Sweden
Continent: Europe
Gene Location: Start: 108746; End: 111166; Strand: +
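
The gene coordinates and the reported protein length can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the coordinates are 1-based and inclusive and that the span includes the stop codon, neither of which is stated in this record.

```python
def cds_length_nt(start: int, end: int) -> int:
    """Nucleotide length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(start: int, end: int, includes_stop: bool = True) -> int:
    """Residue count implied by the coordinates (assumption: one stop codon in the span)."""
    nt = cds_length_nt(start, end)
    if nt % 3 != 0:
        raise ValueError(f"span of {nt} nt is not a whole number of codons")
    codons = nt // 3
    return codons - 1 if includes_stop else codons


# Values taken from the Gene Location and Protein Length fields of this record.
print(cds_length_nt(108746, 111166))            # 2421
print(expected_protein_length(108746, 111166))  # 806
```

Under those assumptions the span is 2421 nt, or 807 codons, which agrees with the 806-residue protein plus one stop codon.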

Full Sequence

MRKIIKKLMA  SVLSLAMVVT  SGSGLAVTSG  VKTAKAAGST  PANIGEERTL  SSGFKTKDNG  60
LMRDNMNSAK  FMQEMGLGWN  YGNSLDQAVD  TSKMTEDELA  KVDVNYCETS  ANNIALTQKN  120
VDELKRYGFK  NIRIPVAWSN  LMEISEDKMT  YTINKDYLQR  VEEVINYCLN  DGLYAIVNIH  180
YDGDWWGQFG  DQDQEIRKQA  WARYEQIWTQ  LSERYKDYSD  LLVFESANEE  LGDRLNDDWR  240
NRTTNPKTGV  LTVDEQYKTL  NKINQKFVDI  VRASGGNNEK  RYLLIAGYST  NIERTCDDRF  300
EMPTDRIEGN  GVSKLSVSVH  YYTPWNYCGG  SEYHQGEGDA  PRLFDWGTDE  QIASMHKDLD  360
AMSKFTEQGY  GVIIGEYGVQ  TSAADGIPAY  IREVAKYAIE  KGMVPVLWDN  GTWFDREKSR  420
VAFDNVAQAI  IDVTGSTETF  KQTGVDTGKP  NYTEVKDADA  ENLVPLYTWE  GKWKKNGGDN  480
NSYCEPKLSV  TSNDGINDWV  FHCNSYGYWA  VIYSEALKNV  NHLYLRVTCE  DNDIDSSTLQ  540
IAPAEFKEKY  SPEKGTPLVE  EGEEFAKANK  GALEMCKNGK  ETGIIAEDEW  SGKLFSIDED  600
ILQGKSILWI  SASNKPVFTK  IELFTDKASA  TETPSASPNT  TAPAPTATVA  AQIPTQTAVA  660
AAPAKGDVVK  DKNASYTVSN  VKNKTVEYKA  PASKKKTSAV  IPATVKVKGV  SYKVTSIAKN  720
AFKNNKKLKK  VTIGANITSI  GANAFSGCKV  LKNVVVKSTK  LKTIGKNAFK  GIQKKAVIKV  780
PAKKLKAYKK  LFGSKAGIKK  TMKIKK  806
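
The listing above groups residues in blocks of ten and ends each line with a running position counter, so it cannot be used as a plain sequence string without cleaning. A minimal sketch of one way to recover the bare sequence follows; the helper name `clean_sequence` is hypothetical, and the approach assumes that every letter in the block is a residue and that everything else (spaces, digits, line breaks) is layout.

```python
import re


def clean_sequence(block: str) -> str:
    """Keep residue letters only; drop whitespace and digit counters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


# Last line of the listing; the counter is stripped whether or not
# it is separated from the residues by whitespace.
last_line = "PAKKLKAYKK  LFGSKAGIKK  TMKIKK806"
tail = clean_sequence(last_line)
print(tail)       # PAKKLKAYKKLFGSKAGIKKTMKIKK
print(len(tail))  # 26
```

The 26 residues on the last line follow the 780 counted on the preceding lines, which is consistent with the reported protein length of 806.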

Enzyme Prediction

No EC number is predicted for MGYG000000398_01659.

CAZyme Signature Domains

Family  Start  End  E-Value  Family Coverage
GH5     108    411  1.2e-78  0.9891304347826086

CDD Domains

Cdd ID     Domain     E-Value   qStart  qEnd  sStart  sEnd
pfam00150  Cellulase  4.21e-40  122     411   31      269
COG2730    BglC       1.46e-17  74      414   50      363
sd00036    LRR_3      1.33e-13  713     771   24      81
sd00036    LRR_3      4.91e-13  713     781   47      114
sd00036    LRR_3      5.90e-13  713     781   1       68

Domain Description
pfam00150 (Cellulase): Cellulase (glycosyl hydrolase family 5).
COG2730 (BglC): Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism].
sd00036 (LRR_3, all three hits): Leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.
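
The eleven-residue consensus quoted in the LRR description (LxxLxLxxN/CxL) can be written as a regular expression. The sketch below is only an illustration: the names `LRR_CONSENSUS` and `find_lrr_motifs` are hypothetical, `x` is read as "any residue" and `N/C` as "N or C", and the demo string is synthetic. A strict pattern like this is far less sensitive than the profile search that produced the CDD hits, because real repeats often carry I or V at the leucine positions.

```python
import re

# L, any 2, L, any 1, L, any 2, N or C, any 1, L  (11 residues in total).
# The lookahead lets overlapping candidates be reported as well.
LRR_CONSENSUS = re.compile(r"(?=(L..L.L..[NC].L))")


def find_lrr_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched 11-mer) for each strict consensus match."""
    return [(m.start() + 1, m.group(1)) for m in LRR_CONSENSUS.finditer(seq.upper())]


demo = "AAALKKLDLSGNKLAAA"  # synthetic example, not taken from this protein
print(find_lrr_motifs(demo))  # [(4, 'LKKLDLSGNKL')]
```

To scan the reported LRR_3 region of this protein, one would pass residues 713-781 of the full sequence; no result for that region is claimed here.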

CAZyme Hits

Hit ID      CAZy Family  E-Value   Query Start  Query End  Hit Start  Hit End
CCO05195.1  GH5_4        1.81e-87  53           428        38         391
VEU81114.1  GH5_4        2.35e-86  66           457        32         400
CBK96866.1  GH5_4        4.32e-79  58           463        31         404
AEY66181.1  GH5_4        2.32e-74  62           437        34         384
QQA02031.1  GH5_4        4.07e-73  43           427        21         387

PDB Hits

Hit ID  E-Value   Query Start  Query End  Hit Start  Hit End
2JEP_A  1.46e-54  68           431        35         391
6XRK_A  3.56e-48  72           430        28         374
6WQY_A  2.23e-47  75           433        31         377
4X0V_A  1.89e-46  70           414        42         366
5H4R_A  4.82e-46  70           414        42         366

Description
2JEP_A: Native family 5 xyloglucanase from Paenibacillus pabuli [Paenibacillus pabuli]. Also listed: 2JEP_B (same description) and 2JEQ_A, Family 5 xyloglucanase from Paenibacillus pabuli in complex with ligand [Paenibacillus pabuli].
6XRK_A: GH5-4 broad specificity endoglucanase from an uncultured bovine rumen ciliate [uncultured bovine rumen ciliate]. Also listed: 6XRK_B (same description).
6WQY_A: Chain A, Cellulase [Phocaeicola salanitronis DSM 18170].
4X0V_A: Structure of a GH5 family lichenase from Caldicellulosiruptor sp. F32 [Caldicellulosiruptor sp. F32]. Also listed: 4X0V_B through 4X0V_H (same description).
5H4R_A: The complex of Glycoside Hydrolase 5 Lichenase from Caldicellulosiruptor sp. F32 E188Q mutant and cellotetraose [Caldicellulosiruptor sp. F32].

Swiss-Prot Hits

Hit ID  Entry Name  E-Value   Query Start  Query End  Hit Start  Hit End
O08342  GUNA_PAEBA  4.78e-54  53           431        25         396
P17901  GUNA_RUMCH  8.75e-44  65           439        44         405
P28621  GUNB_CLOC7  3.32e-42  44           451        23         389
P10477  CELE_ACET2  7.87e-40  62           411        52         352
P28623  GUND_CLOC7  1.20e-39  70           411        45         339

Description
O08342: Endoglucanase A OS=Paenibacillus barcinonensis OX=198119 GN=celA PE=1 SV=1
P17901: Endoglucanase A OS=Ruminiclostridium cellulolyticum (strain ATCC 35319 / DSM 5812 / JCM 6584 / H10) OX=394503 GN=celCCA PE=1 SV=1
P28621: Endoglucanase B OS=Clostridium cellulovorans (strain ATCC 35296 / DSM 3052 / OCM 3 / 743B) OX=573061 GN=engB PE=3 SV=1
P10477: Cellulase/esterase CelE OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=celE PE=1 SV=2
P28623: Endoglucanase D OS=Clostridium cellulovorans (strain ATCC 35296 / DSM 3052 / OCM 3 / 743B) OX=573061 GN=engD PE=1 SV=2

SignalP and Lipop Annotations

This protein is predicted to carry a signal peptide (SP); the SP_Sec_SPI class has the highest score (0.998621).

Other     SP_Sec_SPI  LIPO_Sec_SPII  TAT_Tat_SPI  TATLIP_Sec_SPII  PILIN_Sec_SPIII
0.000498  0.998621    0.000349       0.000198     0.000157         0.000144

TMHMM Annotations

No transmembrane helices are predicted in MGYG000000398_01659.