Environment: HUMAN GUT

CAZyme Information: MGYG000000398_01658

Basic Information

Species:
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; TF01-11
CAZyme ID: MGYG000000398_01658
CAZy Family: GH5
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 952
CGC: MGYG000000398_10|CGC1
Molecular Weight: 102874.7
Isoelectric Point: 7.4251

Genome Property
Genome Assembly ID: MGYG000000398
Genome Size: 2936252
Genome Type: MAG
Country: Sweden
Continent: Europe
Gene Location: Start: 105838; End: 108696; Strand: +
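
The gene coordinates and the protein length listed above can be cross-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the annotated span includes a single terminal stop codon; neither assumption is stated on this page.

```python
# Values copied from the Basic Information section above.
gene_start = 105838
gene_end = 108696
protein_length = 952  # amino acids

# Assumption: coordinates are 1-based and inclusive.
gene_length_nt = gene_end - gene_start + 1

# Assumption: the span ends with one stop codon that is not translated.
codons, remainder = divmod(gene_length_nt, 3)
coding_codons = codons - 1

print(gene_length_nt, codons, remainder)
print(coding_codons == protein_length)
```

Under those assumptions the span is a whole number of codons, and the codon count minus the stop codon matches the reported protein length.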

Full Sequence

MRKQWKGLLA  VGLAAAMMVT  SLPALQLTKV  NAAGVDPINV  SGSTTATVKV  PDAPLDFTDL  60
SAEEITKACG  LGWNLGNTLD  AWSWISGTKN  DYATGEGLWG  NVTTTKAIIK  AVHDMGFSTI  120
RIPVTWGANI  TKDYEVDEDW  MSRVQEVVDY  AISQDMYVMI  NMHHDGCRND  APTPHGWFDV  180
AGTDDEFAAV  REKYDGLWKT  IANRFKNYDE  HLMFAAMNEV  FDDANNLGWA  PEGSPNSEQM  240
KLLKTEMERI  NTINQDFVGI  VRKSGGNNDK  RWVVIQPHNT  QIAAVTKDEY  KDLFQMPEDP  300
AKRTMLEVHD  YDAFNASAQM  KEDVSNSYAS  NFKKLKEMFV  DNNIPVVIGE  FGFQGTNGRR  360
FSFEGVGYML  KKYSMIGCVW  DEGYQKGYAL  VNRSELKPYD  NCVAALMRGF  YNVTDTSQVV  420
KGTEIIPMTD  LNVSADAVTV  EAGKSVTVTA  TAAAPADTND  TIRWITEDDT  VATVSNGVIV  480
GKAPGTTTVT  AKALNGEAAK  NIQVTVTAAT  YDNPTTDVVS  DSGAVIMQNG  QEVYLNATTL  540
PENNGADIIY  RSADESVVNV  SCDGRVLAKS  TGETTVTAIS  TDGKSKVIKV  SVVEPAAPSE  600
YQMRLAIHVL  YNFKETADDG  TELHSYYGTE  LSSDIITVNG  DGTYTLKFDC  ATDLSKDAIN  660
NGVKSLNGIG  SLYIKDYDVT  KGNQKKSPEG  DGKLSYTSIK  VDGNELLTAP  TAEQSAMKGG  720
VFDTGNPLNV  WDGSVVPEDK  LDIKKSLNMI  NFKGMDSPQV  IEITFKMDGF  HAQPTEKPAE  780
TPSAVPSATA  PVSAVPAPAT  SPAITAQPEV  KKGDVVQAAS  AKYQVTNASK  KTVAYKAPVK  840
KSAIVSVPNK  VTINGASYKV  TSIAKNAFKG  NKKLKKVTIG  SNVTSVGANA  FSGCKSLKTI  900
VIKSTKLKTV  GKNAFKGINK  KAVIKVPAKK  LKAYKKLFKS  STGFKKSMKI  KK  952
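
Each row of the listing above is split into blocks of ten residues and ends with a running position counter, so the text has to be cleaned before it can be used as a plain sequence. The sketch below is one way to do that; the helper name clean_sequence is hypothetical, and only the first row is included to keep the example short.

```python
import re


def clean_sequence(block: str) -> str:
    """Drop whitespace and digits, keeping only the residue letters."""
    return re.sub(r"[\s\d]+", "", block)


# First row of the sequence listing, position counter included.
first_row = (
    "MRKQWKGLLA  VGLAAAMMVT  SLPALQLTKV  "
    "NAAGVDPINV  SGSTTATVKV  PDAPLDFTDL60"
)

residues = clean_sequence(first_row)
print(len(residues))
print(residues[:10])
```

Applied to the whole listing, the same function should yield a string whose length can be compared with the reported Protein Length of 952.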

Enzyme Prediction

No EC number is predicted for MGYG000000398_01658.

CAZyme Signature Domains

Family  Start  End  E-value  Family coverage
GH5     96     383  2.1e-76  0.9891304347826086

CDD Domains

Cdd ID     Domain     E-Value   qStart  qEnd  sStart  sEnd
pfam00150  Cellulase  2.43e-44  93      380   13      265
  Description: Cellulase (glycosyl hydrolase family 5).
COG2730    BglC       1.10e-18  33      352   11      322
  Description: Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism].
sd00036    LRR_3      8.15e-15  849     927   40      114
  Description: leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.
sd00036    LRR_3      1.15e-14  859     927   1       68
  Description: leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.
pfam13306  LRR_5      7.01e-14  861     919   1       57
  Description: Leucine rich repeats (6 copies). This family includes a number of leucine rich repeats. This family contains a large number of BSPA-like surface antigens from Trichomonas vaginalis.

CAZyme Hits

Hit ID            E-Value    Query Start  Query End  Hit Start  Hit End
AUO18792.1|GH5_4  2.33e-110  54           774        247        948
AUO19859.1|GH5_4  8.05e-81   51           594        103        644
CCO05502.1|GH5_4  3.19e-61   52           419        34         409
QAA35398.1|GH5_4  2.59e-57   44           394        26         386
CCO05659.1|GH5_4  4.91e-56   56           383        46         364

PDB Hits

Hit ID  E-Value   Query Start  Query End  Hit Start  Hit End
2JEP_A  3.07e-54  62           409        35         393
  Description: Native family 5 xyloglucanase from Paenibacillus pabuli [Paenibacillus pabuli], 2JEP_B Native family 5 xyloglucanase from Paenibacillus pabuli [Paenibacillus pabuli], 2JEQ_A Family 5 xyloglucanase from Paenibacillus pabuli in complex with ligand [Paenibacillus pabuli]
6XSU_A  1.13e-49  58           399        9          339
  Description: GH5-4 broad specificity endoglucanase from Ruminococcus flavefaciens [Ruminococcus flavefaciens], 6XSU_B GH5-4 broad specificity endoglucanase from Ruminococcus flavefaciens [Ruminococcus flavefaciens]
6Q1I_A  5.46e-49  61           413        13         356
  Description: GH5-4 broad specificity endoglucanase from Clostridium longisporum [Clostridium longisporum], 6Q1I_B GH5-4 broad specificity endoglucanase from Clostridium longisporum [Clostridium longisporum]
4IM4_A  5.91e-49  59           409        4          334
  Description: Chain A, Endoglucanase E [Acetivibrio thermocellus], 4IM4_B Chain B, Endoglucanase E [Acetivibrio thermocellus], 4IM4_C Chain C, Endoglucanase E [Acetivibrio thermocellus], 4IM4_D Chain D, Endoglucanase E [Acetivibrio thermocellus], 4IM4_E Chain E, Endoglucanase E [Acetivibrio thermocellus], 4IM4_F Chain F, Endoglucanase E [Acetivibrio thermocellus]
6WQP_A  6.00e-48  52           408        6          354
  Description: GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis [Ruminococcus champanellensis], 6WQP_B GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis [Ruminococcus champanellensis], 6WQV_A GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis], 6WQV_B GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis], 6WQV_C GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis], 6WQV_D GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis]

Swiss-Prot Hits

Hit ID                E-Value   Query Start  Query End  Hit Start  Hit End
sp|O08342|GUNA_PAEBA  5.46e-54  56           409        34         398
  Description: Endoglucanase A OS=Paenibacillus barcinonensis OX=198119 GN=celA PE=1 SV=1
sp|P54937|GUNA_CLOLO  1.61e-47  61           413        38         381
  Description: Endoglucanase A OS=Clostridium longisporum OX=1523 GN=celA PE=1 SV=1
sp|Q12647|GUNB_NEOPA  4.85e-46  59           423        22         376
  Description: Endoglucanase B OS=Neocallimastix patriciarum OX=4758 GN=CELB PE=2 SV=1
sp|P10477|CELE_ACET2  9.16e-45  59           409        54         384
  Description: Cellulase/esterase CelE OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=celE PE=1 SV=2
sp|P17901|GUNA_RUMCH  5.36e-43  67           387        52         372
  Description: Endoglucanase A OS=Ruminiclostridium cellulolyticum (strain ATCC 35319 / DSM 5812 / JCM 6584 / H10) OX=394503 GN=celCCA PE=1 SV=1

SignalP and LipoP Annotations

This protein is predicted to carry a signal peptide (SP): the SP_Sec_SPI class has the highest probability, 0.998933.

Other     SP_Sec_SPI  LIPO_Sec_SPII  TAT_Tat_SPI  TATLIP_Sec_SPII  PILIN_Sec_SPIII
0.000330  0.998933    0.000154       0.000237     0.000170         0.000155
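
The SP call follows from the class probabilities in the table above. As a minimal sketch, assuming the call is simply the class with the highest probability (the page does not describe its decision rule), the row can be read like this:

```python
# Class probabilities copied from the table above.
scores = {
    "Other": 0.000330,
    "SP_Sec_SPI": 0.998933,
    "LIPO_Sec_SPII": 0.000154,
    "TAT_Tat_SPI": 0.000237,
    "TATLIP_Sec_SPII": 0.000170,
    "PILIN_Sec_SPIII": 0.000155,
}

# Assumption: the reported call is the highest-probability class.
best_class = max(scores, key=scores.get)

# The six values should add up to roughly 1 if they are probabilities.
total = sum(scores.values())

print(best_class, scores[best_class])
print(round(total, 6))
```

The highest-scoring class here is SP_Sec_SPI, which is consistent with the protein being reported as SP.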

TMHMM Annotations

start end
7 24
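
The predicted segment can be pulled out of the sequence to see which residues it covers. The sketch below uses the first 30 residues from the Full Sequence section and assumes the start and end positions are 1-based and inclusive, which this page does not state.

```python
# First 30 residues of MGYG000000398_01658, from the Full Sequence section.
n_terminus = "MRKQWKGLLAVGLAAAMMVTSLPALQLTKV"

# TMHMM segment boundaries from the table above.
start, end = 7, 24

# Assumption: positions are 1-based and inclusive.
segment = n_terminus[start - 1:end]

print(segment)
print(len(segment))
```

The segment sits at the extreme N-terminus, the same region where a signal peptide is predicted for this protein, so it may correspond to the hydrophobic core of that signal peptide rather than to a membrane-spanning helix.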