You are browsing environment: HUMAN GUT
CAZyme Information: MGYG000000227_00078

Basic Information

Species Bacillus sonorensis
Lineage Bacteria; Firmicutes; Bacilli; Bacillales; Bacillaceae; Bacillus; Bacillus sonorensis
CAZyme ID MGYG000000227_00078
CAZy Family GH5
CAZyme Description hypothetical protein
CAZyme Property
Protein Length 559
CGC (not listed)
Molecular Weight 63854.44
Isoelectric Point 6.0167
Genome Property
Genome Assembly ID MGYG000000227
Genome Size 4782975
Genome Type Isolate
Country China
Continent Asia
Gene Location Start: 76957; End: 78636; Strand: -
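
The gene coordinates and the protein length listed above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the final codon is a stop codon; neither convention is stated on this page, and the function name is hypothetical.

```python
def expected_protein_length(start: int, end: int) -> int:
    """Return the protein length implied by gene coordinates.

    Assumes 1-based inclusive coordinates and a terminal stop codon
    (both are assumptions, not stated in the record).
    """
    gene_length_nt = end - start + 1          # inclusive span in nucleotides
    codons, remainder = divmod(gene_length_nt, 3)
    if remainder != 0:
        raise ValueError("gene length is not a multiple of 3")
    return codons - 1                          # drop the stop codon


# Values taken from the Gene Location and Protein Length fields.
implied_length = expected_protein_length(76957, 78636)
print(implied_length)
```

Under these assumptions the span is 1680 nucleotides, or 560 codons, which agrees with the listed protein length of 559 residues.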

Full Sequence

MREKGFTWFI  TSVILLFLLI  GTALSGAIPQ  ASGTGWGTKA  KAQSVAPQKL  LEHMSPGWNL  60
GNTLDAVPTE  GSWNNPPVRE  HTFDDIRAAG  FKSVRIPVTW  DSHIGSAPEY  TIAPEWMNRV  120
EEVTDWALER  DFYVVLNIHH  DSWLWINRMG  NSQQETLEKL  GKVWRQIADR  FKHKSERLLF  180
EIVNEPTGMS  AYQLNLLNRE  MLKVIRSTGG  ENDRRLVIIG  GLEDNKDELL  NTFEPPEDDR  240
IVLTFHYYSP  WDYVSNWWGR  TTWGTAKDIE  DMERDMKPVY  ERFVSKGYPV  IIGEYGTLGA  300
NEKHSKRLYH  ETLVRLSHKY  QMVTMWWDNG  NDQFDRMARK  WRDPVVKDIV  IQAGRGVSNA  360
IIKPADLFIR  KGQTITDQTA  DIELNGNKLT  GIYRQSEPLQ  KGTDYTVDNS  GKTVTIKASY  420
LAKLIGGSPS  FGTKAELAFA  FDKGARQVMD  VILYDTPELE  KHEFTISKSA  IKGDLEIPAS  480
LNGTQLATVK  GVIDSTGRPV  LEEVWSWTPY  MNYDEDFYEK  DGNLYLRERV  LNYLKSDSTF  540
TFEMWPKGVE  TVVKVKVTP  559
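
The sequence block above is laid out in groups of ten residues with a running position counter at the end of each line. A minimal sketch of how such a block might be turned into a plain sequence string follows; the function names are hypothetical, and the regular expression strips the trailing counter whether or not a space separates it from the last residue. The slicing helper assumes domain coordinates are 1-based and inclusive.

```python
import re

RAW_BLOCK = """\
MREKGFTWFI  TSVILLFLLI  GTALSGAIPQ  ASGTGWGTKA  KAQSVAPQKL  LEHMSPGWNL  60
GNTLDAVPTE  GSWNNPPVRE  HTFDDIRAAG  FKSVRIPVTW  DSHIGSAPEY  TIAPEWMNRV  120
EEVTDWALER  DFYVVLNIHH  DSWLWINRMG  NSQQETLEKL  GKVWRQIADR  FKHKSERLLF  180
EIVNEPTGMS  AYQLNLLNRE  MLKVIRSTGG  ENDRRLVIIG  GLEDNKDELL  NTFEPPEDDR  240
IVLTFHYYSP  WDYVSNWWGR  TTWGTAKDIE  DMERDMKPVY  ERFVSKGYPV  IIGEYGTLGA  300
NEKHSKRLYH  ETLVRLSHKY  QMVTMWWDNG  NDQFDRMARK  WRDPVVKDIV  IQAGRGVSNA  360
IIKPADLFIR  KGQTITDQTA  DIELNGNKLT  GIYRQSEPLQ  KGTDYTVDNS  GKTVTIKASY  420
LAKLIGGSPS  FGTKAELAFA  FDKGARQVMD  VILYDTPELE  KHEFTISKSA  IKGDLEIPAS  480
LNGTQLATVK  GVIDSTGRPV  LEEVWSWTPY  MNYDEDFYEK  DGNLYLRERV  LNYLKSDSTF  540
TFEMWPKGVE  TVVKVKVTP  559
"""


def parse_sequence_block(block: str) -> str:
    """Join a counter-annotated sequence block into one residue string."""
    parts = []
    for line in block.splitlines():
        line = re.sub(r"\d+\s*$", "", line)   # drop the trailing position counter
        parts.append(re.sub(r"\s+", "", line))  # drop the spacing between groups
    return "".join(parts)


def slice_domain(seq: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    return seq[start - 1:end]


sequence = parse_sequence_block(RAW_BLOCK)
print(len(sequence))
```

With the sequence in this form, a region such as residues 70 to 330 can be extracted with `slice_domain(sequence, 70, 330)`.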

Enzyme Prediction

EC 3.2.1.4

CAZyme Signature Domains

Family Start End E-Value Family Coverage
GH5 70 330 5.2e-86 0.9891304347826086
CBM46 367 447 2.7e-20 0.8850574712643678

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
pfam00150 Cellulase 2.05e-58 64 332 10 271
Cellulase (glycosyl hydrolase family 5).
COG2730 BglC 8.29e-30 54 331 51 361
Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism].
pfam03442 CBM_X2 4.48e-18 362 451 1 83
Carbohydrate binding domain X2. This domain binds to cellulose and to bacterial cell walls. It is found in glycosyl hydrolases and in scaffolding proteins of cellulosomes (multiprotein glycosyl hydrolase complexes). In the cellulosome it may aid cellulose degradation by anchoring the cellulosome to the bacterial cell wall and by binding it to its substrate. This domain has an Ig-like fold.

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
ASB89577.1 0.0 1 559 1 559
QAT65156.1 0.0 1 559 1 551
SCA85699.1 0.0 1 559 1 551
QAS16156.1 0.0 1 559 1 560
CAJ70720.1 0.0 1 559 1 560

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
4YZP_A 0.0 28 559 4 536
Crystal structure of a tri-modular GH5 (subfamily 4) endo-beta-1,4-glucanase from Bacillus licheniformis [Bacillus licheniformis], 4YZT_A Crystal structure of a tri-modular GH5 (subfamily 4) endo-beta-1,4-glucanase from Bacillus licheniformis complexed with cellotetraose [Bacillus licheniformis]
4V2X_A 8.59e-87 4 545 28 565
Chain A, Endo-beta-1,4-glucanase (cellulase B) [Halalkalibacterium halodurans]
5XRC_A 1.52e-83 25 545 12 550
A Trimodular GH5_4 Subfamily Endoglucanase Structure with Large Unit Cell [Bacillus sp. BG-CS10], 5XRC_B A Trimodular GH5_4 Subfamily Endoglucanase Structure with Large Unit Cell [Bacillus sp. BG-CS10], 5XRC_C A Trimodular GH5_4 Subfamily Endoglucanase Structure with Large Unit Cell [Bacillus sp. BG-CS10]
5E0C_A 3.49e-83 42 545 5 517
Structural Insight of a Trimodular Halophilic Cellulase with a Family 46 Carbohydrate-Binding Module [Bacillus sp. BG-CS10]
5E09_A 9.44e-78 42 545 5 517
Structural Insight of a Trimodular Halophilic Cellulase with a Family 46 Carbohydrate-Binding Module [Bacillus sp. BG-CS10]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
P23550 4.56e-98 11 545 6 545
Endoglucanase B OS=Paenibacillus lautus OX=1401 GN=celB PE=3 SV=1
O08342 3.35e-64 17 330 8 369
Endoglucanase A OS=Paenibacillus barcinonensis OX=198119 GN=celA PE=1 SV=1
P28621 9.44e-64 14 330 12 341
Endoglucanase B OS=Clostridium cellulovorans (strain ATCC 35296 / DSM 3052 / OCM 3 / 743B) OX=573061 GN=engB PE=3 SV=1
P28623 1.53e-62 14 358 11 376
Endoglucanase D OS=Clostridium cellulovorans (strain ATCC 35296 / DSM 3052 / OCM 3 / 743B) OX=573061 GN=engD PE=1 SV=2
P10477 3.54e-62 43 330 53 352
Cellulase/esterase CelE OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=celE PE=1 SV=2

SignalP and Lipop Annotations

This protein is predicted to carry a signal peptide (SP).

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000354 0.999033 0.000166 0.000145 0.000134 0.000133
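
The SP call can be read off the score row as the highest-scoring class. The following is only an illustrative sketch: the dictionary pairs each column label with its score as listed above, the function name is hypothetical, and no assumption is made about how the underlying predictor applies thresholds.

```python
# Column labels and scores copied from the table above.
SCORES = {
    "Other": 0.000354,
    "SP_Sec_SPI": 0.999033,
    "LIPO_Sec_SPII": 0.000166,
    "TAT_Tat_SPI": 0.000145,
    "TATLIP_Sec_SPII": 0.000134,
    "PILIN_Sec_SPIII": 0.000133,
}


def top_prediction(scores: dict[str, float]) -> tuple[str, float]:
    """Return the label with the highest score and that score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


best_label, best_score = top_prediction(SCORES)
print(best_label, best_score)
```

Here the highest-scoring class is `SP_Sec_SPI`, consistent with the SP prediction stated for this protein.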

TMHMM Annotations

start end
5 27