You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000000398_01658



Basic Information

Species:
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; TF01-11;
CAZyme ID: MGYG000000398_01658
CAZy Family: GH5
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 952 aa
CGC: MGYG000000398_10|CGC1
Molecular Weight: 102874.7 Da
Isoelectric Point: 7.4251

Genome Property
Genome Assembly ID: MGYG000000398
Genome Size: 2936252 bp
Genome Type: MAG
Country: Sweden
Continent: Europe
Gene Location: Start: 105838; End: 108696; Strand: +
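
As a quick consistency check, the gene coordinates can be compared with the reported protein length. The sketch below assumes the coordinates are 1-based and inclusive and that the gene span ends with a stop codon; neither assumption is stated on this page, and the function name is hypothetical.

```python
def protein_length_from_span(start: int, end: int, has_stop_codon: bool = True) -> int:
    """Return the protein length (residues) implied by a gene span.

    Assumes 1-based inclusive nucleotide coordinates (an assumption,
    not something the record states).
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError(f"span of {n_nt} nt is not a whole number of codons")
    codons = n_nt // 3
    # Drop the terminal stop codon if the span is assumed to include it.
    return codons - 1 if has_stop_codon else codons


# Values taken from the Gene Location and Protein Length fields of this record.
implied = protein_length_from_span(105838, 108696)
print(implied, implied == 952)
```

Under these assumptions the span covers 2859 nt, that is 953 codons, which matches the 952-residue protein plus one stop codon.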

Full Sequence

Enzyme Prediction

No EC number is predicted for MGYG000000398_01658.

CAZyme Signature Domains

Family  Start  End  E-Value  Family Coverage
GH5     96     383  2.1e-76  0.9891304347826086

CDD Domains

Cdd ID     Domain     E-Value   qStart  qEnd  sStart  sEnd
pfam00150  Cellulase  2.43e-44  93      380   13      265
COG2730    BglC       1.10e-18  33      352   11      322
sd00036    LRR_3      8.15e-15  849     927   40      114
sd00036    LRR_3      1.15e-14  859     927   1       68
pfam13306  LRR_5      7.01e-14  861     919   1       57

Domain Description
pfam00150 (Cellulase): Cellulase (glycosyl hydrolase family 5).
COG2730 (BglC): Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism].
sd00036 (LRR_3, both hits): leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.
pfam13306 (LRR_5): Leucine rich repeats (6 copies). This family includes a number of leucine rich repeats. This family contains a large number of BSPA-like surface antigens from Trichomonas vaginalis.
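
To see how the CDD hits relate to the GH5 signature domain (residues 96 to 383 in the CAZyme Signature Domains section), their query intervals can be intersected. This is a minimal sketch that assumes qStart/qEnd and the signature-domain Start/End are 1-based inclusive residue positions on the same protein; the helper name is hypothetical.

```python
def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of residues shared by two 1-based inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# GH5 signature domain span, from the CAZyme Signature Domains section.
GH5_SPAN = (96, 383)

# (Cdd ID, qStart, qEnd), from the CDD Domains table.
CDD_HITS = [
    ("pfam00150", 93, 380),
    ("COG2730", 33, 352),
    ("sd00036", 849, 927),
    ("sd00036", 859, 927),
    ("pfam13306", 861, 919),
]

for cdd_id, q_start, q_end in CDD_HITS:
    shared = overlap(q_start, q_end, *GH5_SPAN)
    print(f"{cdd_id}\t{q_start}-{q_end}\tshared with GH5: {shared}")
```

Under these assumptions the Cellulase and BglC hits fall within the N-terminal GH5 region, while the three leucine-rich repeat hits lie in a separate C-terminal region (residues 849 to 927) and share no residues with it.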

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
AUO18792.1 2.33e-110 54 774 247 948
AUO19859.1 8.05e-81 51 594 103 644
CCO05502.1 3.19e-61 52 419 34 409
QAA35398.1 2.59e-57 44 394 26 386
CCO05659.1 4.91e-56 56 383 46 364

PDB Hits

Hit ID  E-Value   Query Start  Query End  Hit Start  Hit End
2JEP_A  3.07e-54  62           409        35         393
    2JEP_A, 2JEP_B: Native family 5 xyloglucanase from Paenibacillus pabuli [Paenibacillus pabuli]
    2JEQ_A: Family 5 xyloglucanase from Paenibacillus pabuli in complex with ligand [Paenibacillus pabuli]
6XSU_A  1.13e-49  58           399        9          339
    6XSU_A, 6XSU_B: GH5-4 broad specificity endoglucanase from Ruminococcus flavefaciens [Ruminococcus flavefaciens]
6Q1I_A  5.46e-49  61           413        13         356
    6Q1I_A, 6Q1I_B: GH5-4 broad specificity endoglucanase from Clostridium longisporum [Clostridium longisporum]
4IM4_A  5.91e-49  59           409        4          334
    4IM4_A, 4IM4_B, 4IM4_C, 4IM4_D, 4IM4_E, 4IM4_F: Chains A to F, Endoglucanase E [Acetivibrio thermocellus]
6WQP_A  6.00e-48  52           408        6          354
    6WQP_A, 6WQP_B: GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis [Ruminococcus champanellensis]
    6WQV_A, 6WQV_B, 6WQV_C, 6WQV_D: GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
O08342 5.46e-54 56 409 34 398
Endoglucanase A OS=Paenibacillus barcinonensis OX=198119 GN=celA PE=1 SV=1
P54937 1.61e-47 61 413 38 381
Endoglucanase A OS=Clostridium longisporum OX=1523 GN=celA PE=1 SV=1
Q12647 4.85e-46 59 423 22 376
Endoglucanase B OS=Neocallimastix patriciarum OX=4758 GN=CELB PE=2 SV=1
P10477 9.16e-45 59 409 54 384
Cellulase/esterase CelE OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=celE PE=1 SV=2
P17901 5.36e-43 67 387 52 372
Endoglucanase A OS=Ruminiclostridium cellulolyticum (strain ATCC 35319 / DSM 5812 / JCM 6584 / H10) OX=394503 GN=celCCA PE=1 SV=1

SignalP and Lipop Annotations

This protein is predicted to carry a signal peptide (SP).

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000330 0.998933 0.000154 0.000237 0.000170 0.000155
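
The class call can be reproduced from the probabilities above by taking the highest-scoring column. This is only a sketch: it assumes the page's "SP" call corresponds to the top-probability class, which the page does not state, and the function name is hypothetical.

```python
# Column labels and probabilities copied from the table above.
LABELS = ["Other", "SP_Sec_SPI", "LIPO_Sec_SPII", "TAT_Tat_SPI",
          "TATLIP_Sec_SPII", "PILIN_Sec_SPIII"]
PROBS = [0.000330, 0.998933, 0.000154, 0.000237, 0.000170, 0.000155]


def top_class(labels, probs):
    """Return the (label, probability) pair with the highest probability."""
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


label, prob = top_class(LABELS, PROBS)
print(label, prob)
```

Here the top class is SP_Sec_SPI with probability 0.998933, consistent with the SP call.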

TMHMM Annotations

start end
7 24