You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000000220_02570

Basic Information

Species: AM07-15 sp003477405
Lineage: Bacteria; Firmicutes_A; Clostridia; Oscillospirales; Butyricicoccaceae; AM07-15; AM07-15 sp003477405
CAZyme ID: MGYG000000220_02570
CAZy Family: CBM54
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length (aa) | CGC | Molecular Weight (Da) | Isoelectric Point
1633 | - | 175922.64 | 4.4803

Genome Property
Genome Assembly ID | Genome Size (bp) | Genome Type | Country | Continent
MGYG000000220 | 3018710 | Isolate | China | Asia
Gene Location: Start 36670; End 41571; Strand +
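
The gene coordinates and the protein length in this record can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the gene span includes the stop codon; neither convention is stated on this page, so treat the check as illustrative.

```python
# Values copied from the record above.
gene_start = 36670        # assumed 1-based, inclusive
gene_end = 41571          # assumed 1-based, inclusive
protein_length = 1633     # residues, from "Protein Length"

# Inclusive span of the gene in nucleotides.
gene_length_nt = gene_end - gene_start + 1

# Number of codons in the span; one of them is assumed to be the stop codon.
codons = gene_length_nt // 3
coding_codons = codons - 1

print("gene length (nt):", gene_length_nt)
print("divisible by 3:", gene_length_nt % 3 == 0)
print("codons excluding stop:", coding_codons)
print("matches protein length:", coding_codons == protein_length)
```

Under these assumptions the span is 4902 nt, that is 1634 codons, which agrees with a 1633-residue protein plus one stop codon.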

Full Sequence

MRPRNALRRV  SALLLTLAML  IGLMPQWSGE  AQAASYMQPY  LDKMVNWGFM  RGDIQGNLRP60
DNNITRAEFV  TIINRAFGYK  KIGTTPFKDV  KESDWYSDDV  AIAYNVGYIN  GTSKSTFSPT120
KEITREEAAV  ILARNLMMQA  SVGESTSFTD  SRELSSWSRG  LVDTAASYKL  ISGYPDGSFR180
PHNPITRGET  AIMVVNAVGT  PVSKAEEYQL  GSVYGNVLIT  ASGATLKNTT  IAGNLYVTAG240
VDLGNVLLEN  VTVLGEIVLS  GGGVSEGGDD  SLVLRNVNGS  KIIVDNLKNQ  QVSLRVEGDG300
TVDVTSVRTD  TYIVDRTASG  YGLKEIQLDG  EDGRVLKLSG  NVKKVVNYTP  KSYIGVESGQ360
VETITIDEKA  TGSTLHIAAG  ATVDQLNLDV  GTTVTGDGDV  ANLTVNADGS  TVSMLPDQII420
IRPGVEANID  GEVMDTEAAA  ESSSDPRMLA  GYPQITDLAP  TSATAKFAAN  KKGTIYWAVT480
SVTDGSVNAQ  DLINPPSYSS  TILKKGSVSV  TGSGQTATAK  ISGLTSDGAY  YLSAVLVDAR540
EDQSPVKVIS  FTTPDNTVPA  FATGYPYLSR  VTNVSAQVTT  MATKTCRLYY  AVLPKGSKAP600
TGEDFKANAV  TGNLGYGSLD  VSKNTPYTFD  VNNVPLEELQ  SYDLYLWLTD  IEGGSSSAVK660
KLTFTTVDKT  PPVFQTEPTV  NSVKETSVGL  YANLNEAGTL  YWVIVPQGTE  YPKPLAGQSG720
KVDLTSDTAK  LQVAAGMNAL  KSGKASMTAG  KDVTFTVSGL  SKETAYDLYY  VAQDKAGNYS780
ASVKMITIHT  LDPNAPTVTQ  EFTRYNADEK  DHPLADTDIR  LVFSEAVQDA  ETNTTLVSLY840
EAVTTASTDA  EKNDARNRMA  NILRNEIVLY  ADTGSGLPQE  VEVATGFVDK  NTADWVIDYR900
YAEIKLEQGK  TVVTFPTKDV  KKESALNLKS  GATYYFEVQG  IADTSESKNV  MGVTKLDPFT960
TVFAQVNLSP  GSNTETEIDT  GSVDDKPELD  LYWRMDPVST  DKVEDGIRWD  MIIWSDTTVE1020
FDLYERTGGS  GTWTKVNADD  KPFTITIDST  TKRSGASVQV  NDKDNTKPNS  PIFQPLKEIK1080
EGTVYEYAVH  FIKVNGEGDR  TTWNGTVQMD  VNAIAGSNDN  LGRMGVGGWE  ESWNNLVGKD1140
LTNIGVPSTL  TLYKPFRDQT  APHFINDRPT  FTPGDSVVRM  DLMLDRVGTI  YWMVAKRGNV1200
RTTGMNGEDY  SPSGTNPGQY  MDLPESGVGK  YTLSIATPSY  DQVVDPGKYM  VSSDVEYGSL1260
TCGASVQSVT  VENLIPNQDY  VAYFVIKGTS  QVYSSVYCYR  FTTGDVTKPY  ITMQELSPNV1320
AFTTSEDSDL  NYVLFAGTEL  PTVFNQKFKQ  YVDSSKLAEF  STAAGEEADT  MTVLDAIMRT1380
GTGGYSWFDL  YAEPPAKEGE  GKKIRDTIEQ  IVLRGQGSGG  SPGATGNAVT  KAGDEIQRDF1440
TKDMDPQSAT  NYYCIAVAKN  VLGGEYAFKA  VSNVHIPDKT  PPVLYAPGAN  GVKKSDGTFS1500
GTLTLNFSEN  LYWIPENGDT  KQICAMLNKN  KTATETDAVL  EDDGTVTPGK  KVLIKHMGGD1560
VDTLTPSSTS  AISSTFTFSY  SGVRIGDTLT  LFSDGYIADA  RGNSTKTKII  LEYQANSTGL1620
NGAIAGGWVI  ISQ1633

Enzyme Prediction

No EC number was predicted for MGYG000000220_02570.

CAZyme Signature Domains

Family | Start | End | E-value | Family coverage
CBM54 | 197 | 309 | 1.7e-19 | 0.9298245614035088

CDD Domains

Cdd ID | Domain | E-Value | qStart | qEnd | sStart | sEnd
pfam00395 | SLH | 5.61e-09 | 148 | 189 | 1 | 41
pfam00395 | SLH | 9.42e-09 | 87 | 128 | 1 | 42
NF033190 | inl_like_NEAT_1 | 1.93e-07 | 72 | 194 | 567 | 687
NF033190 | inl_like_NEAT_1 | 4.48e-07 | 62 | 198 | 616 | 752
NF033190 | inl_like_NEAT_1 | 3.73e-04 | 41 | 132 | 655 | 748

Domain descriptions:
SLH (pfam00395): S-layer homology domain.
inl_like_NEAT_1 (NF033190): NEAT domain-containing leucine-rich repeat protein. Members of this family have an N-terminal NEAT (near transporter) domain often associated with iron transport, followed by a leucine-rich repeat region with significant sequence similarity to the internalins of Listeria monocytogenes. However, since Bacillus cereus (from which this protein was described, in PMID:16978259) is not considered an intracellular pathogen, and the function may be iron transport rather than internalization, applying the name "internalin" to this family probably would be misleading.

CAZyme Hits

Hit ID | E-Value | Query Start | Query End | Hit Start | Hit End
QIA32369.1 | 0.0 | 4 | 1633 | 6 | 1615
QQR07714.1 | 0.0 | 4 | 1633 | 6 | 1615
BCK84769.1 | 0.0 | 8 | 1604 | 17 | 1610
QQR30851.1 | 8.63e-233 | 4 | 1528 | 6 | 1684
ASB41592.1 | 8.63e-233 | 4 | 1528 | 6 | 1684

PDB Hits

Hit ID | E-Value | Query Start | Query End | Hit Start | Hit End | Description
6BT4_A | 4.08e-08 | 87 | 219 | 26 | 165 | Crystal structure of the SLH domain of Sap from Bacillus anthracis in complex with a pyruvylated SCWP unit [Bacillus anthracis]
3PYW_A | 4.14e-08 | 87 | 219 | 5 | 144 | The structure of the SLH domain from B. anthracis surface array protein at 1.8A [Bacillus anthracis]
4AQ1_A | 7.03e-06 | 96 | 197 | 11 | 109 | Structure of the SbsB S-layer protein of Geobacillus stearothermophilus PV72p2 in complex with nanobody KB6 [Geobacillus stearothermophilus]
4AQ1_C | 7.03e-06 | 96 | 197 | 11 | 109 | Structure of the SbsB S-layer protein of Geobacillus stearothermophilus PV72p2 in complex with nanobody KB6 [Geobacillus stearothermophilus]

Swiss-Prot Hits

Hit ID | E-Value | Query Start | Query End | Hit Start | Hit End | Description
P38537 | 6.40e-18 | 34 | 206 | 40 | 219 | Surface-layer 125 kDa protein OS=Lysinibacillus sphaericus OX=1421 PE=3 SV=1
P38536 | 4.38e-16 | 35 | 194 | 1688 | 1851 | Amylopullulanase OS=Thermoanaerobacterium thermosulfurigenes OX=33950 GN=amyB PE=3 SV=2
P38535 | 1.06e-15 | 35 | 194 | 914 | 1077 | Exoglucanase XynX OS=Acetivibrio thermocellus OX=1515 GN=xynX PE=3 SV=1
Q06852 | 1.32e-13 | 49 | 198 | 2103 | 2264 | Cell surface glycoprotein 1 OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=olpB PE=3 SV=2
Q06848 | 6.42e-13 | 49 | 194 | 242 | 394 | Cellulosome-anchoring protein OS=Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) OX=203119 GN=ancA PE=1 SV=1

SignalP and LipoP Annotations

This protein is predicted to carry a signal peptide (SP).

Other | SP_Sec_SPI | LIPO_Sec_SPII | TAT_Tat_SPI | TATLIP_Sec_SPII | PILIN_Sec_SPIII
0.000314 | 0.998948 | 0.000200 | 0.000198 | 0.000165 | 0.000143
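
The SP call is consistent with the highest-scoring class in the probabilities listed above. The following is a minimal sketch that assumes the reported label is simply the class with the highest probability; the decision rule actually used by the predictor is not stated on this page.

```python
# Class labels and probabilities copied from the record above.
labels = [
    "Other", "SP_Sec_SPI", "LIPO_Sec_SPII",
    "TAT_Tat_SPI", "TATLIP_Sec_SPII", "PILIN_Sec_SPIII",
]
probs = [0.000314, 0.998948, 0.000200, 0.000198, 0.000165, 0.000143]

# Pick the (label, probability) pair with the largest probability.
best_label, best_prob = max(zip(labels, probs), key=lambda pair: pair[1])

print(best_label, best_prob)
```

Here the top class is SP_Sec_SPI, which matches the SP label shown for this protein.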

TMHMM Annotations

Start | End
7 | 29