Environment: Human Gut

CAZyme Information: MGYG000000032_04093



Basic Information

Species: Hungatella effluvii
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; Hungatella; Hungatella effluvii
CAZyme ID: MGYG000000032_04093
CAZy Family: GH136
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 1928
CGC: MGYG000000032_12|CGC6
Molecular Weight: 208708.54
Isoelectric Point: 4.2165

Genome Property
Genome Assembly ID: MGYG000000032
Genome Size: 6969476
Genome Type: Isolate
Country: United Kingdom
Continent: Europe
Gene Location: Start: 190126; End: 195912; Strand: -
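
As a quick consistency check, the gene coordinates can be compared with the reported protein length. The sketch below assumes the coordinates are 1-based and inclusive and that the span includes the stop codon; neither is stated in this record, and `coding_length_aa` is a hypothetical helper name.

```python
def coding_length_aa(start: int, end: int, includes_stop: bool = True) -> int:
    """Protein length implied by gene coordinates.

    Assumes 1-based, inclusive coordinates (an assumption, not stated
    in the record).
    """
    span_nt = end - start + 1
    if span_nt % 3 != 0:
        raise ValueError(f"span of {span_nt} nt is not a whole number of codons")
    codons = span_nt // 3
    # Drop one codon when the span is assumed to include the stop codon.
    return codons - 1 if includes_stop else codons


# Coordinates from the Gene Location field of this record.
print(coding_length_aa(190126, 195912))  # 1928
```

Under these assumptions the 5787-nucleotide span gives 1929 codons, or 1928 residues after the stop codon is dropped, which agrees with the Protein Length field.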

Full Sequence

MYSLSRWKTR  MTACIVSGAV  AFSTLFPVPA  AAAGKMAAYD  GSEVVDLAGN  REMKAYGERL  60
VTGGRALVSR  DYGEEAGTGI  TYYLDSEAGN  DENEGTSEDK  PWKSLEMVNN  RTFQPGDRIL  120
LKAGSVWEEE  TLSPKGSGEE  GNPIVIGAYG  DGEKPKLTGN  AAVPEVVSLY  NQQYWELSDL  180
EITNTAAGFT  GAMNDQNGNK  LKDVRGIRIA  GQDGGDLYNY  YLHDLYVHDV  TGVCAWISGD  240
NNGKPGILGK  QGWDKSKRTG  GILFEVLEPA  TDQPTVFHDI  TVERNVVNNN  SFGGIIVKQW  300
KGADTGGEHW  ASREEGKGNA  ANNYYCENWK  PHTDVTIRDN  YLSQENSDYA  CNTIYLTSVQ  360
GAVVEGNISK  EAGTCGIEMY  YADGVTVQRN  EVYKTRVKAG  GADSNAIDPD  KASTNILIQY  420
NYIHDTGDGI  LLCGFIFGTS  VVRYNVIKDA  QKRYINPHGD  KGVNYVYNNI  LYNTIEKSNL  480
PFIESSGGNG  VYLNKSGNMH  YINNNIFYNT  AKTTTSVGIG  EGPSTSYDSN  CYYGKGVQAP  540
EQETRAVCGD  PMFEGGLTGD  EDDGAAELLK  LRLKPESPLL  NAGTQIEDDP  LLSISANPGS  600
DFAGAPLFNG  EPDMGIFEYQ  GEAGSGILNG  YVEDPFGNIM  KGAAVKLKDT  DYAAVTDEQG  660
FFAIAGVKPG  SYTAVISKEI  YLDGETPDIT  VEEKAVTRVQ  LRLGESLSDV  GSVKGCVSNS  720
SGPLQGVTVT  ASLGDLVFTA  KTGNDGCYEM  TDVKIGTGYV  VTASKEGYTT  AATEEVRVMP  780
AAAAEVNLIL  SKEIGKTTYL  IADLFDQYDA  GAFTNSSSGN  NIWKVSDISA  ADASKASLQI  840
KAEGNGNKYL  EMAKTGSYSL  YAYTKKEYDL  SGIITVEARV  KRTAESASAN  QFGMYSFNKA  900
DFKTGDPTGS  SNAIGTFALT  KGNIVTHNKK  GSSNTVNVQG  YTKNEWDIIR  NVINLDTNTF  960
DFYVNDMNVP  KLADQPLRTQ  GKNIDRFEFF  SSGSNVGDLL  IDYFKVCTGP  AMDYDDAGIS  1020
GIRVDGNEAE  WKGGNIYELE  ISSGSSEVKV  QPAASSIFAR  KIMVGDQDAT  KDAVTVPVSG  1080
SGDSLRITVL  AEDGETEESY  QLNLVKEDAA  GLAYLTSLAM  EGVTLTPEFD  FNTMEYEGEV  1140
PSDVSRVTLQ  YETVQASNEV  TIKVNNQYVT  DPVIPLKPGV  NVIEIGVASA  DGTSFADYTV  1200
IVTRDCVIDG  IRIDTLPVKT  VYERGEEPDF  EGLTVGAYCE  GERVRVLDAA  EFAVSMPDTS  1260
ETGTVPVTVT  YVTEDGMTLE  AVFDITVYDR  DTMEPQSIKV  VKPPEQTVYG  TGEAFQPDGM  1320
EVRLLMKATA  SNAAPAEVRI  LDDGEYDAIG  DFTEPGDTKV  AIRYTWTDSE  GEERYLEDSV  1380
AVTVYDEELD  YYETGITVTK  QPKKTVYETG  SVFDPEGMEV  KRTMKASGSN  AVYYKETITD  1440
YDYDADELTK  TGTRKISVSH  EGTDKDGEDR  TFTTGVTVTV  TNREDILKNA  VLEQSVAGAR  1500
EILNKENVAY  VTEEEKAAAA  DAAKEVFLDV  MGAEHTVLTK  EMADRMVELE  ALLKNAYPGI  1560
TIRVEGDSKL  TEGAEVAGAL  LNAPLGQGNL  TIVSRITGTK  LPEDIPETKA  AAAMEIKLLV  1620
NGEELQPDIP  LYMTLKIPEG  INKADLIIYH  VLDDGSIEVI  DPEISGGFLR  FFVRSFSTFV  1680
IANPKEADLT  GIEITAQPKK  TEYRIHEAFD  ASGLVVTAVY  EDGTKAAVTE  YGLAGFDAAT  1740
AGIKTITVSY  KGKTAKFTVT  VKAEDDGSGD  DDNGDNDDGG  GNNGSGSHGS  SGSPVVRSVL  1800
PDTVKGSWKQ  NENGWWFETL  DGGFVKSDWA  RINEKWYYFG  DQGYMAVGWV  LDQNHWYYLG  1860
ADGAMAADAW  ILDRGAWYFL  QSGGAMAADQ  WVQWNGNWYY  LNSDGSMAKD  TVVTVGYRIG  1920
EDGVWRPD  1928
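
The sequence block mixes residues with spacing and running position counters, so it has to be cleaned before it can go into an alignment or search tool. A minimal sketch, assuming every non-letter character in the block is layout rather than sequence; `clean_sequence` and `to_fasta` are hypothetical helper names.

```python
import re


def clean_sequence(block: str) -> str:
    """Remove spaces, line breaks and position counters from a sequence block."""
    # Assumption: anything that is not a letter is layout, not sequence.
    return re.sub(r"[^A-Za-z]", "", block).upper()


def to_fasta(identifier: str, sequence: str, width: int = 60) -> str:
    """Format a cleaned sequence as a FASTA record wrapped at `width` columns."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{identifier}\n" + "\n".join(lines) + "\n"


# First line of the block above, with the counter attached to the last residue.
first_line = (
    "MYSLSRWKTR  MTACIVSGAV  AFSTLFPVPA  AAAGKMAAYD  GSEVVDLAGN  REMKAYGERL60"
)
seq = clean_sequence(first_line)
print(len(seq))  # 60
print(to_fasta("MGYG000000032_04093", seq), end="")
```

Applied to the whole block, the cleaned sequence should be 1928 residues long, matching the final counter and the Protein Length field.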

Enzyme Prediction

No EC number prediction in MGYG000000032_04093.

CAZyme Signature Domains

Family Start End E-Value Family-Coverage
GH136 79 618 4.9e-118 0.9959266802443992
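
The Start and End columns place the GH136 signature domain on the query protein. A minimal sketch of turning such coordinates into a subsequence, assuming they are 1-based and inclusive (the usual convention for protein annotations, but not stated in this record); `region_length` and `extract_region` are hypothetical helper names.

```python
def region_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive interval."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# GH136 domain coordinates from the table above.
print(region_length(79, 618))  # 540

# Toy sequence to show the indexing; not taken from this record.
print(extract_region("ACDEFGHIKL", 3, 5))  # DEF
```

With the full protein sequence as input, `extract_region(sequence, 79, 618)` would return the 540-residue segment annotated as GH136.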

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
NF033838 PspC_subgroup_1 1.74e-24 1801 1908 481 585
pneumococcal surface protein PspC, choline-binding form. The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
NF033840 PspC_relate_1 7.02e-23 1797 1925 504 646
PspC-related protein choline-binding protein 1. Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC.
NF033930 pneumo_PspA 4.04e-22 1801 1908 438 542
pneumococcal surface protein A. The pneumococcal surface proteins, found in Streptococcus pneumoniae, are repetitive, with patterns of localized high sequence identity across pairs of proteins that carry different specific names, so that recombination may be presumed. This protein, PspA, has an N-terminal region that lacks a cross-wall-targeting YSIRK type extended signal peptide, in contrast to the closely related choline-binding protein CbpA which has a similar C-terminus but a YSIRK-containing region at the N-terminus.
NF033930 pneumo_PspA 1.28e-21 1808 1908 485 582
pneumococcal surface protein A. Same domain description as the preceding NF033930 hit.
COG5263 COG5263 3.36e-20 1803 1925 191 312
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism].

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
ASA23882.1 2.63e-169 79 1284 39 1233
CQR57267.1 3.11e-168 78 1202 38 1145
QDH21977.1 7.88e-162 82 1204 53 1155
BBC61707.1 8.61e-155 61 618 60 598
BAL62976.1 8.61e-155 61 618 60 598
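
To compare hits it helps to know how much of the 1928-residue query each alignment spans. The sketch below parses the rows of the table above and computes that fraction, assuming whitespace-separated columns in the order given by the header and 1-based inclusive coordinates; `parse_hits` and the dictionary keys are hypothetical names.

```python
HITS = """\
ASA23882.1 2.63e-169 79 1284 39 1233
CQR57267.1 3.11e-168 78 1202 38 1145
QDH21977.1 7.88e-162 82 1204 53 1155
BBC61707.1 8.61e-155 61 618 60 598
BAL62976.1 8.61e-155 61 618 60 598
"""
QUERY_LENGTH = 1928  # Protein Length from the Basic Information section


def parse_hits(text: str, query_length: int) -> list[dict]:
    """Parse 'Hit ID, E-Value, Query Start/End, Hit Start/End' rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        q_start, q_end, h_start, h_end = (int(x) for x in fields[2:])
        span = q_end - q_start + 1  # 1-based inclusive coordinates assumed
        rows.append({
            "hit_id": fields[0],
            "evalue": float(fields[1]),
            "query_span": span,
            "query_fraction": span / query_length,
        })
    # Lowest E-value first; ties keep their original order.
    return sorted(rows, key=lambda row: row["evalue"])


for row in parse_hits(HITS, QUERY_LENGTH):
    print(f'{row["hit_id"]:<12} {row["evalue"]:.2e} '
          f'{row["query_span"]:>5} {row["query_fraction"]:.2f}')
```

The three longest alignments extend well past the GH136 signature domain (residues 79 to 618), while the last two stop at residue 618.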

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
7V6M_A 5.73e-37 74 618 3 576
Chain A, Fibronectin type III domain-containing protein [Tyzzerella nexilis]
6KQS_A 7.12e-27 79 500 244 663
Crystal structure of GH136 lacto-N-biosidase from Eubacterium ramulus - selenomethionine derivative [Eubacterium ramulus ATCC 29099]
6KQT_A 9.42e-27 79 500 244 663
Crystal structure of GH136 lacto-N-biosidase from Eubacterium ramulus - native protein [Eubacterium ramulus ATCC 29099]
5GQC_A 5.18e-22 69 618 3 596
Crystal structure of lacto-N-biosidase LnbX from Bifidobacterium longum subsp. longum, ligand-free form [Bifidobacterium longum subsp. longum]
Other chains of the same protein carry the same description, differing only in the form:
ligand-free form: 5GQC_B, 5GQC_C, 5GQC_D, 5GQC_E, 5GQC_F, 5GQC_G, 5GQC_H
lacto-N-biose complex: 5GQF_A, 5GQF_B
galacto-N-biose complex: 5GQG_A, 5GQG_B
7V6I_A 5.73e-19 74 618 8 608
Chain A, Lacto-N-biosidase [Bifidobacterium saguini DSM 23967]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
A0A401ETL2 2.75e-08 1670 1763 1362 1453
Exo-beta-1,6-galactobiohydrolase OS=Bifidobacterium longum subsp. longum (strain ATCC 15707 / DSM 20219 / JCM 1217 / NCTC 11818 / E194b) OX=565042 GN=bl1,6Gal PE=1 SV=1

SignalP and Lipop Annotations

This protein is predicted to carry a signal peptide (SP); the highest-scoring class is SP_Sec_SPI.

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000433 0.998698 0.000224 0.000235 0.000208 0.000169
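
The prediction above corresponds to the class with the highest probability in the table. A minimal sketch of that selection, assuming the two rows pair up column by column as shown; `top_class` is a hypothetical helper name.

```python
LABELS = "Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII"
VALUES = "0.000433 0.998698 0.000224 0.000235 0.000208 0.000169"


def top_class(labels: str, values: str) -> tuple[str, float]:
    """Pair each class label with its probability and return the best one."""
    names = labels.split()
    probs = [float(v) for v in values.split()]
    if len(names) != len(probs):
        raise ValueError("label and value rows have different lengths")
    return max(zip(names, probs), key=lambda pair: pair[1])


name, prob = top_class(LABELS, VALUES)
print(name, prob)  # SP_Sec_SPI 0.998698
```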

TMHMM Annotations

start end
12 34