You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000000287_04911



Basic Information

Species Robinsoniella sp900539655
Lineage Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; Robinsoniella; Robinsoniella sp900539655
CAZyme ID MGYG000000287_04911
CAZy Family GH127
CAZyme Description hypothetical protein
CAZyme Property
Protein Length 1983
CGC MGYG000000287_121|CGC1
Molecular Weight 220716.2
Isoelectric Point 4.563
Genome Property
Genome Assembly ID MGYG000000287
Genome Size 7249439
Genome Type MAG
Country Sweden
Continent Europe
Gene Location Start: 43;  End: 5994  Strand: +
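
The gene coordinates can be checked against the reported protein length. The sketch below assumes the coordinates are 1-based and inclusive and that the span includes a 3-nt stop codon; the page states neither, and the function name is hypothetical.

```python
def protein_length_from_span(start: int, end: int, has_stop: bool = True) -> int:
    """Return the residue count encoded by a nucleotide span.

    Assumes 1-based inclusive coordinates (an assumption, not stated on the page).
    """
    span_nt = end - start + 1
    if span_nt % 3 != 0:
        raise ValueError(f"span of {span_nt} nt is not a whole number of codons")
    codons = span_nt // 3
    # Subtract the stop codon if the span is assumed to include it.
    return codons - 1 if has_stop else codons


# Start: 43, End: 5994 -> 5952 nt -> 1984 codons -> 1983 residues plus a stop codon
print(protein_length_from_span(43, 5994))  # 1983
```

Under these assumptions the result agrees with the Protein Length of 1983 listed above.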

Full Sequence

MGAVSAKGWL  LEQLYLQKNG  LTGAINDEYP  LYGPANGWRG  GTGDGWERGA  YYLRGLMSLA60
WVLDDDELKD  KSMEWINFIL  DSQRSNGFMG  PVNDGDGSSD  GWDWWPRMVI  LQVIRDYYEA120
TALQGTPDER  VLPFFDNYFR  YQLNRLPQKP  LNSWAASRGG  DNIEVLLWYY  NHAYNDQNPT180
ESDWIIELAE  ILASQTKSQD  SGLNWNDVFT  KTTVREHVVN  TTQAMKTPAV  LSQLPGREND240
KDSLKEGIFN  MGLDHGRVDK  LANSDEGARD  NLPYRGAELC  SIVESLLSNE  ISISITGESW300
LGDDMERIAY  NSLPAGYAPD  YTGHCYFQAQ  NQVMATHGNH  EFDCEHGDDS  AFGAPTGFEC360
CFPNMHMGWP  KFVQNMWMAT  KDNGLALVMY  GPNTVTAKGA  DGKTAIFDET  TDYPFKDTIG420
LDYSGEEARF  PLNLRLPAWC  ENPVVEVNGK  GVDISQADGF  AVIDRTWKPG  DQVTVTFPME480
VRTSVWYNNS  AAIEYGPLIF  SQRIEEDWRI  ASDDAAREIQ  YDPVGEFDRK  EVYPASDWNY540
GLVIDPEDPE  SSMQIEFADE  IGLQPFTLSN  APITMKVTGQ  KIPQWKLKGN  VVPEPPFSPI600
APDESLQEEI  QLVPYGCTRL  HMTQLPVVGE  PIESGITSKT  EQDSQTYQEN  GERVVEFDNV660
VVPSADDYTL  KIYYTGSGTL  RLNINGKYEE  QMDFNGTEPN  IVENLGSIVP  GTNQYFYFKH720
ENYNNIRFFG  NDNVVITGID  IVPVNPFTQP  EIKEAISNKN  GTSVTLNTNI  NRSSGFYTVN780
YGTESGKYTK  TAENFYSKKA  VITGLNPGET  YYFQISMLVN  GVEKLSEEVK  AESASVQPLT840
FKDDFSDPSV  SKNKWTLYDP  ENVIKFQPGK  MSVGVSKNIK  AMTGMQEWTD  YAVVADVIGT900
GNPDRDFGII  LRATDIRDGS  DGYNGYYVGI  NAVAGGLNIG  FADGGWNGIA  APGGIAYEEN960
KVYQLKTIIA  GERLAVYVDG  VKLYDENISD  MKSGNKTVPY  YAAGSAGLRS  WNQSFDVNSF1020
EVREITAEEY  EDLGIDNNLF  EDDFTNTEES  LNKWTVLDPK  NAVTFENGQI  NVANSDNLKI1080
MAGTGEETWE  DYAIETKLTG  PENPNRDFGV  MFRCTDVSAE  NGDSYKGYYV  GIDAIGNGLN1140
VGYANNGWND  ITKVPAFTYE  PGKIYDLKIL  VNGNMFKVFI  DGAQVYELED  DKFSYGSVGL1200
RSWKQPFTVN  YFKVRNLTKS  EADSFTTEIP  DPEPPIPVTP  DFQDDFEDAI  ASAQKWRLCG1260
SVDKIHIADG  KISMESNNNV  KAVAGDEAWT  DYVAEADVIL  DGKSNQNAGL  MYRVTGAGNN1320
GSDNYSGYFF  GIGNNDDGTG  YYISGYADNK  WHQTERKNLS  AFENGKAYRL  KAVAYQDMVA1380
LYIDNQFITR  FINTRYPNGM  IGLRSYEKPF  SADNIVVREV  TAEDLAPFDE  WTRYIDETFG1440
AHRTVQIKFP  KFSPTKEYKI  VYGTESGVYT  TEVYGLLHSP  GGSKSDKLSI  SVPENNTDYY1500
FRVIALDGNE  EIASSGETVV  HTGERYDITQ  DLDNLTESLT  VAENIDRASY  TQDSLRRLDF1560
AVANANRVLN  NQNSNLIDAR  LAKNCIAVGI  NELRETEKPI  VVLDHITVKP  PVKTDYFTQE1620
EADLKGMEVT  AVYSNQTEKK  LTAEDYRITG  FDTSSPGDKT  ITVTYTEGAI  SKNGTFHITV1680
REAPVPEKPD  STLLNALYQK  LSAIGQGNYT  ADSYRNLQNA  LAAANDVLQK  ENTSQEELNQ1740
ASDNLLLAFT  KLEKTTPVIP  EKVDKEKLQM  IYDTLKNTKN  NNYTEDSWKN  FQTALTAAQR1800
ILADTQATKS  QVSSAFDTLW  KTSQNLAVNP  PKPPAAPSVK  KGSVYTYKGL  KYKVTNAVKG1860
KETVSVIGAL  KKSITSLTIP  STVTLKGVKC  KVTYIKSKAF  KNYSKLTKVT  IGSNISNIGS1920
QAFCDDGKLK  YITIRSSVLT  KTGSKIFRGI  SAAAKIKVPR  SKRTSYKKLL  KNKGQKSSVK1980
IIW1983
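
The sequence above is laid out in blocks of 10 residues, with the running residue position fused onto the end of each line. A minimal sketch for turning that layout back into a plain sequence string, and checking each line against its trailing position counter, might look like this (the function name is hypothetical):

```python
import re


def parse_sequence_block(lines):
    """Join display-formatted sequence lines into one residue string.

    Each line may end with a running position counter (e.g. '...MSLA60');
    when present, it is checked against the cumulative residue count.
    """
    total = 0
    chunks = []
    for line in lines:
        counter = re.search(r"(\d+)\s*$", line)
        residues = re.sub(r"[^A-Za-z]", "", line)  # drop spaces and digits
        chunks.append(residues)
        total += len(residues)
        if counter is not None and int(counter.group(1)) != total:
            raise ValueError(
                f"expected {counter.group(1)} residues so far, found {total}"
            )
    return "".join(chunks)


# First two lines of the sequence shown above
example = [
    "MGAVSAKGWL  LEQLYLQKNG  LTGAINDEYP  LYGPANGWRG  GTGDGWERGA  YYLRGLMSLA60",
    "WVLDDDELKD  KSMEWINFIL  DSQRSNGFMG  PVNDGDGSSD  GWDWWPRMVI  LQVIRDYYEA120",
]
seq = parse_sequence_block(example)
print(len(seq))  # 120
```

Applied to the full block, the final counter (1983) should match the Protein Length reported under Basic Information.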

Enzyme Prediction

No EC number was predicted for MGYG000000287_04911.

CAZyme Signature Domains

Family Start End Evalue family coverage
GH127 277 500 3.4e-29 0.42366412213740456
CBM66 1067 1204 1.7e-20 0.8387096774193549
CBM66 875 1015 4.7e-17 0.8064516129032258
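
The rows above are listed in table order, not in order along the protein. As an illustrative sketch (assuming Start and End are 1-based inclusive residue positions, which the page does not state; the names below are hypothetical), the domains can be sorted into an N-to-C architecture string with their span lengths:

```python
from typing import NamedTuple


class Domain(NamedTuple):
    family: str
    start: int
    end: int


# Values copied from the signature-domain table above
domains = [
    Domain("GH127", 277, 500),
    Domain("CBM66", 1067, 1204),
    Domain("CBM66", 875, 1015),
]


def span_length(d: Domain) -> int:
    """Residues covered by a domain, assuming inclusive coordinates."""
    return d.end - d.start + 1


def architecture(doms) -> str:
    """Return domains ordered by start position, N-terminus first."""
    ordered = sorted(doms, key=lambda d: d.start)
    return "-".join(f"{d.family}({d.start}-{d.end})" for d in ordered)


print(architecture(domains))
# GH127(277-500)-CBM66(875-1015)-CBM66(1067-1204)
print([span_length(d) for d in sorted(domains, key=lambda d: d.start)])
# [224, 141, 138]
```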

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
pfam07944 Glyco_hydro_127 1.07e-44 33 501 50 503
Beta-L-arabinofuranosidase, GH127. One member of this family, from Bifidobacterium longum, UniProtKB:E8MGH8, has been characterized as an unusual beta-L-arabinofuranosidase enzyme, EC:3.2.1.185. It releases l-arabinose from the l-arabinofuranose (Araf)-beta1,2-Araf disaccharide and also transglycosylates 1-alkanols with retention of the anomeric configuration. Terminal beta-l-arabinofuranosyl residues have been found in arabinogalactan proteins from a number of different plant species. beta-l-Arabinofuranosyl linkages with 1-4 arabinofuranosides are also found in the sugar chains of extensin and solanaceous lectins, hydroxyproline (Hyp)-rich glycoproteins that are widely observed in plant cell wall fractions. The critical residue for catalytic activity is Glu-338, in an ET/SCAS sequence context.
COG3533 COG3533 6.74e-29 261 500 271 500
Uncharacterized conserved protein, DUF1680 family [Function unknown].
sd00036 LRR_3 1.39e-11 1872 1981 14 113
leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.
pfam13306 LRR_5 7.65e-11 1873 1961 12 89
Leucine rich repeats (6 copies). This family includes a number of leucine rich repeats. This family contains a large number of BSPA-like surface antigens from Trichomonas vaginalis.
sd00036 LRR_3 3.23e-10 1872 1949 60 127
leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
QJE95537.1 3.84e-148 1 651 45 674
QNF30112.1 5.51e-144 1 697 45 749
ATP55412.1 5.97e-137 1 649 45 672
ALL07583.1 7.96e-137 1 650 45 664
QPH39723.1 6.82e-133 1 631 43 652

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
5MQO_A 2.22e-11 278 500 381 606
Glycoside hydrolase BT_1003 [Bacteroides thetaiotaomicron]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
Q9L7Q2 3.34e-07 1687 1833 414 551
Zinc metalloprotease ZmpB OS=Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4) OX=170187 GN=zmpB PE=3 SV=2

SignalP and Lipop Annotations

This protein is predicted as OTHER

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
1.000041 0.000000 0.000000 0.000000 0.000000 0.000000

TMHMM Annotations

There are no transmembrane helices in MGYG000000287_04911.