You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000003826_01153

Basic Information

Species:
Lineage: Bacteria; Firmicutes_A; Clostridia; Oscillospirales; Ruminococcaceae; UMGS1601;
CAZyme ID: MGYG000003826_01153
CAZy Family: GH5
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 857
CGC: MGYG000003826_37|CGC1
Molecular Weight: 95158.72
Isoelectric Point: 3.9728

Genome Property
Genome Assembly ID: MGYG000003826
Genome Size: 1835600
Genome Type: MAG
Country: United States
Continent: North America
Gene Location: Start: 2368; End: 4941; Strand: -
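
As a quick consistency check, the gene coordinates can be compared with the reported protein length. The sketch below assumes the coordinates are 1-based and inclusive and that the gene ends in a single stop codon; neither assumption is stated on this page, and the function name is purely illustrative.

```python
def protein_length_from_coords(start: int, end: int, has_stop_codon: bool = True) -> int:
    """Return the protein length implied by 1-based, inclusive gene coordinates.

    Assumption (not stated in the record): the span covers the whole coding
    sequence, optionally including one terminal stop codon.
    """
    gene_length = end - start + 1          # inclusive span in nucleotides
    if gene_length % 3 != 0:
        raise ValueError(f"gene length {gene_length} is not a multiple of 3")
    codons = gene_length // 3
    return codons - 1 if has_stop_codon else codons


# Coordinates taken from the Gene Location entry of this record.
implied_length = protein_length_from_coords(2368, 4941)
print(implied_length)  # 857
```

Under these assumptions the span is 2574 nt, or 858 codons, which gives 857 residues once the stop codon is excluded. That matches the listed protein length.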

Enzyme Prediction

EC: 3.2.1.151, 3.2.1.4

CAZyme Signature Domains

Family  Start  End  E-Value  Family coverage
GH5     74     390  2.3e-88  0.9927536231884058

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
pfam00150 Cellulase 1.10e-56 67 391 8 270
Cellulase (glycosyl hydrolase family 5).
COG2730 BglC 5.45e-25 34 355 30 326
Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism].
NF033609 MSCRAMM_ClfA 1.90e-04 726 826 782 887
MSCRAMM family adhesin clumping factor ClfA. Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.
NF000535 MSCRAMM_SdrC 0.002 726 831 800 905
MSCRAMM family adhesin SdrC. Features of this protein family include a YSIRK-type signal peptide at the N-terminus and a variable-length C-terminal region of Ser-Asp (SD) repeats followed by an LPXTG motif for surface immobilization by sortase.
NF000535 MSCRAMM_SdrC 0.002 698 825 676 803
MSCRAMM family adhesin SdrC. Features of this protein family include a YSIRK-type signal peptide at the N-terminus and a variable-length C-terminal region of Ser-Asp (SD) repeats followed by an LPXTG motif for surface immobilization by sortase.

CAZyme Hits

Hit ID      E-Value    Query Start  Query End  Hit Start  Hit End
QYR19601.1  6.65e-116  36           430        32         412
QYR21757.1  1.12e-115  1            417        3          399
AAR65336.1  2.95e-115  40           418        33         397
QYR24001.1  3.68e-114  25           430        20         411
ANY70776.1  3.87e-114  40           418        18         382

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
2JEP_A 8.99e-113 49 418 37 392
Native family 5 xyloglucanase from Paenibacillus pabuli [Paenibacillus pabuli], 2JEP_B Native family 5 xyloglucanase from Paenibacillus pabuli [Paenibacillus pabuli], 2JEQ_A Family 5 xyloglucanase from Paenibacillus pabuli in complex with ligand [Paenibacillus pabuli]
6WQY_A 7.37e-61 49 419 26 374
Chain A, Cellulase [Phocaeicola salanitronis DSM 18170]
3NDY_A 8.90e-61 48 418 13 339
The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans [Clostridium cellulovorans], 3NDY_B The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans [Clostridium cellulovorans], 3NDY_C The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans [Clostridium cellulovorans], 3NDY_D The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans [Clostridium cellulovorans], 3NDZ_A The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans bound to cellotriose [Clostridium cellulovorans], 3NDZ_B The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans bound to cellotriose [Clostridium cellulovorans], 3NDZ_C The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans bound to cellotriose [Clostridium cellulovorans], 3NDZ_D The structure of the catalytic and carbohydrate binding domain of endoglucanase D from Clostridium cellulovorans bound to cellotriose [Clostridium cellulovorans]
6WQP_A 3.01e-60 44 418 13 354
GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis [Ruminococcus champanellensis], 6WQP_B GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis [Ruminococcus champanellensis], 6WQV_A GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis], 6WQV_B GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis], 6WQV_C GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis], 6WQV_D GH5-4 broad specificity endoglucanase from Ruminococcus champanellensis with bound cellotriose [Ruminococcus champanellensis]
4IM4_A 6.58e-60 50 417 10 332
Chain A, Endoglucanase E [Acetivibrio thermocellus], 4IM4_B Chain B, Endoglucanase E [Acetivibrio thermocellus], 4IM4_C Chain C, Endoglucanase E [Acetivibrio thermocellus], 4IM4_D Chain D, Endoglucanase E [Acetivibrio thermocellus], 4IM4_E Chain E, Endoglucanase E [Acetivibrio thermocellus], 4IM4_F Chain F, Endoglucanase E [Acetivibrio thermocellus]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
O08342 9.48e-110 49 418 42 397
Endoglucanase A OS=Paenibacillus barcinonensis OX=198119 GN=celA PE=1 SV=1
P28623 1.82e-58 48 418 44 370
Endoglucanase D OS=Clostridium cellulovorans (strain ATCC 35296 / DSM 3052 / OCM 3 / 743B) OX=573061 GN=engD PE=1 SV=2
P10477 4.85e-55 50 417 60 382
Cellulase/esterase CelE OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=celE PE=1 SV=2
P23660 6.09e-55 49 418 29 361
Endoglucanase A OS=Ruminococcus albus OX=1264 GN=celA PE=1 SV=1
P28621 1.05e-53 44 406 39 360
Endoglucanase B OS=Clostridium cellulovorans (strain ATCC 35296 / DSM 3052 / OCM 3 / 743B) OX=573061 GN=engB PE=3 SV=1

SignalP and LipoP Annotations

This protein is predicted as LIPO

Other     SP_Sec_SPI  LIPO_Sec_SPII  TAT_Tat_SPI  TATLIP_Sec_SPII  PILIN_Sec_SPIII
0.051161  0.024006    0.924571       0.000054     0.000059         0.000155
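
The LIPO call is consistent with simply taking the highest-scoring class among the six listed probabilities. The sketch below illustrates that reading; it assumes the label is an argmax over these scores and that "LIPO" corresponds to the LIPO_Sec_SPII column. The page does not state either, so this is not a description of the predictor's actual decision rule, and the names used are illustrative.

```python
# Scores copied from the table above; keys are the column headers.
scores = {
    "Other": 0.051161,
    "SP_Sec_SPI": 0.024006,
    "LIPO_Sec_SPII": 0.924571,
    "TAT_Tat_SPI": 0.000054,
    "TATLIP_Sec_SPII": 0.000059,
    "PILIN_Sec_SPIII": 0.000155,
}


def top_class(class_scores: dict[str, float]) -> str:
    """Return the class name with the highest score (simple argmax)."""
    return max(class_scores, key=class_scores.get)


best = top_class(scores)
print(best, scores[best])  # LIPO_Sec_SPII 0.924571
```

The six scores sum to roughly 1, so they can be read as a probability distribution over the signal-peptide classes.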

TMHMM Annotations

Start  End
9      31
830    852