Ligand search and data mining of Structural Genomics structures

Abhinav Kumar, Herbert Axelrod, Ashley Deacon

Structure Determination Core, Joint Center for Structural Genomics (JCSG), Stanford Synchrotron Radiation Laboratory, Menlo Park, CA, USA

UCSD & Burnham (Bioinformatics Core)
John Wooley, Adam Godzik, Slawomir Grzechnik, Lukasz Jaroszewski, Dana Weekes, Lian Duan, Sri Krishna Subramanian, Natasha Sefcovic, Piotr Kozbial, Andrew Morse, Prasad Burra, Tamara Astakhova, Josie Alaoen, Cindy Cook

TSRI (NMR Core)
Kurt Wüthrich, Reto Horst, Maggie Johnson, Amaranth Chatterjee, Michael Geralt, Wojtek Augustyniak, Pedro Serrano, Bill Pedrini, William Placzek

Stanford / SSRL (Structure Determination Core)
Keith Hodgson, Ashley Deacon, Mitchell Miller, Debanu Das, Hsiu-Ju (Jessica) Chiu, Kevin Jin, Christopher Rife, Qingping Xu, Silvya Oommachen, Scott Talafuse, Henry van den Bedem, Ronald Reyes, Christine Trame

Scientific Advisory Board
Sir Tom Blundell (Univ. Cambridge)
Robert Stroud (Center for Structure of Membrane Proteins, Membrane Protein Expression Center, UC San Francisco)
Homme Hellinga (Duke University Medical Center)
James Naismith (The Scottish Structural Proteomics Facility, Univ. St. Andrews)
James Paulson (Consortium for Functional Glycomics, The Scripps Research Institute)
Soichi Wakatsuki (Photon Factory, KEK, Japan)
Todd Yeates (UCLA-DOE Inst. for Genomics and Proteomics)
James Wells (UC San Francisco)

The JCSG is supported by the NIH Protein Structure Initiative (PSI) Grant U54 GM074898 from NIGMS (www.nigms.nih.gov). Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory (SSRL). The SSRL is a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the NIH.

GNF & TSRI (Crystallomics Core)
Scott Lesley, Mark Knuth, Dennis Carlton, Thomas Clayton, Kevin D. Murphy, Christina Trout, Marc Deller, Daniel McMullan, Heath Klock, Polat Abdubek, Claire Acosta, Linda M. Columbus, Julie Feuerhelm, Joanna C. Hale, Thamara Janaratne, Hope Johnson, Linda Okach, Edward Nigoghossian, Sebastian Sudek, Aprilfawn White, Bernhard Geierstanger, Glen Spraggon, Ylva Elias, Sanjay Agarwalla, Charlene Cho, Bi-Ying Yeh, Anna Grzechnik, Jessica Canseco, Mimmi Brown

Panel 4: JCSG Ligand Search

Panel 7: Search Results (35 hits)

Ligand Visualization Links

Ligand Depot: ACY ADP AMP BR CA CL EDO FMN GLC GOL IOD MG NCA NI ORO P33 PO4 SO4

HIC-Up: ACY ADP AMP BR CA CL EDO FMN GLC GOL IOD MG NCA NI ORO P33 PO4 SO4

Panel 6

PSI | Ligands | Organism | Description | Accession | PFAM | PDB | Target | N
JCSG | FMN UNL | Archaeoglobus Fulgidus Dsm 4304 | Crystal Structure of Hypothetical Protein (NP_068944.1) from Archaeoglobus Fulgidus at 1.30 Å resolution | NP_068944.1 | PF08981 | 1vp8 | TB0885A | 35
CESG | FMN | Arabidopsis Thaliana | 12-Oxo-Phytodienoate Reductase Isoform 3 | NP_178662.1 | PF00724 | 1q45 | SGT98480 | 34
…
JCSG | FMN GOL SO4 | Jannaschia Sp. Ccs1 | Crystal Structure of Pyridoxamine 5'-phosphate Oxidase-Related FMN-binding (YP_508196.1) from Jannaschia Sp. Ccs1 at 1.60 Å resolution | YP_508196.1 | PF01243 | 2ou5 | FJ9446A | 3
JCSG | EDO FMN SO4 UNL | Clostridium Acetobutylicum | Crystal Structure of NIMC/NIMA Family Protein (NP_349178.1) from Clostridium Acetobutylicum at 1.80 Å resolution | NP_349178.1 | PF01243 | 2ig6 | FH7614A | 2
JCSG | EDO FMN NCA | Pyrococcus Horikoshii Ot3 | Crystal Structure of FMN-binding Protein (NP_142786.1) from Pyrococcus Horikoshii at 1.35 Å resolution | NP_142786.1 | PF01613 | 2r6v | FB10607B | 1
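The kind of lookup behind a result listing like the one above can be sketched in a few lines. This is only an illustration, not the JCSG tool's actual code: the tuple layout and the function names `search_by_ligand` and `group_by_pfam` are assumptions, and the data are just the five rows that are legible in the listing.

```python
from collections import defaultdict

# Rows transcribed from the listing above: (center, ligand codes, PFAM, PDB ID).
# The tuple layout is an assumption made for this sketch.
HITS = [
    ("JCSG", {"FMN", "UNL"}, "PF08981", "1vp8"),
    ("CESG", {"FMN"}, "PF00724", "1q45"),
    ("JCSG", {"FMN", "GOL", "SO4"}, "PF01243", "2ou5"),
    ("JCSG", {"EDO", "FMN", "SO4", "UNL"}, "PF01243", "2ig6"),
    ("JCSG", {"EDO", "FMN", "NCA"}, "PF01613", "2r6v"),
]


def search_by_ligand(hits, code):
    """Return the rows whose ligand set contains the given ligand code."""
    code = code.upper()
    return [row for row in hits if code in row[1]]


def group_by_pfam(rows):
    """Map each PFAM accession to the sorted PDB IDs of the given rows."""
    groups = defaultdict(list)
    for _center, _ligands, pfam, pdb_id in rows:
        groups[pfam].append(pdb_id)
    return {pfam: sorted(ids) for pfam, ids in groups.items()}


fmn_hits = search_by_ligand(HITS, "fmn")
by_pfam = group_by_pfam(fmn_hits)
print(len(fmn_hits), by_pfam["PF01243"])
```

Grouping the hits by PFAM family is what makes the family-level comparison of binding poses possible later on the poster.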

Panel 8: Unique PSI Ligands

PDB | Ligand Name | Ligand | PSI
2A3L | Coformycin 5'-Phosphate | CF5 | CESG
2OU3 | 1H-Indole-3-Carbaldehyde | I3A | JCSG
1VR0 | (2R)-3-Sulfolactic Acid | 3SL | JCSG
2OD6 | 10-Oxohexadecanoic Acid | OHA | JCSG
1X92 | D-Glycero-D-Mannopyranose-7-Phosphate | M7P | MCSG
1O8B | Beta-D-Arabinofuranose-5'-Phosphate | ABF | MCSG
2OSU | 6-Diazenyl-5-Oxo-L-Norleucine | DON | MCSG
1M33 | 3-Hydroxy-Propanoic Acid | 3OH | MCSG
1RTW | (4-Amino-2-Methylpyrimidin-5-Yl)Methyl Dihydrogen Phosphate | MP5 | NESG
2NW9 | 6-Fluoro-L-Tryptophan | FT6 | NESG
1XKL | 2-Amino-4H-1,3-Benzoxathiin-4-Ol | STH | NESG
1LW4 | 3-Hydroxy-2-[(3-Hydroxy-2-Methyl-5-Phosphonooxymethyl-Pyridin-4-Ylmethyl)-Amino]-Butyric Acid | TLP | NYSGXRC
2B4B | N-Ethyl-N-[3-(Propylamino)Propyl]Propane-1,3-Diamine | B33 | NYSGXRC
1TUF | Azelaic Acid | AZ1 | NYSGXRC
2PUZ | N-(Iminomethyl)-L-Glutamic Acid | NIG | NYSGXRC
2Q09 | 3-[(4S)-2,5-Dioxoimidazolidin-4-Yl]Propanoic Acid | DI6 | NYSGXRC
2GVC | 1-Methyl-1,3-Dihydro-2H-Imidazole-2-Thione | MMZ | NYSGXRC
1Y0G | 2-[(2E,6E,10E,14E,18E,22E,26E)-3,7,11,15,19,23,27,31-Octamethyldotriaconta-2,6,10,14,18,22,26,30-Octaenyl]Phenol | 8PP | NYSGXRC
1Z2L | Allantoate Ion | 1AL | NYSGXRC
1Y80 | Co-5-Methoxybenzimidazolylcobamide | B1M | SECSG
1KPH | Didecyl-Dimethyl-Ammonium | 10A | TBSGC
1KPI | Didecyl-Dimethyl-Ammonium | 10A | TBSGC
1N2H | Pantoyl Adenylate | PAJ | TBSGC
1N2I | Pantoyl Adenylate | PAJ | TBSGC
1BVR | Trans-2-Hexadecenoyl-(N-Acetyl-Cysteamine)-Thioester | THT | TBSGC
1QPR | 5-Phosphoribosyl-1-(Beta-Methylene) Pyrophosphate | PPC | TBSGC
1P44 | 5-{[4-(9H-Fluoren-9-Yl)Piperazin-1-Yl]Carbonyl}-1H-Indole | GEQ | TBSGC
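The poster does not define "unique". One plausible reading, assumed here, is a ligand code that occurs in PSI-deposited structures and in no other PDB entry. Under that assumption the selection reduces to a set difference, sketched below. The function name `ligands_unique_to`, the source labels, and the two `hypo-` entries are hypothetical and exist only for illustration; the three real rows are taken from the table above.

```python
def ligands_unique_to(entries, group):
    """Ligand codes seen in entries of `group` and in no entry outside it."""
    inside, outside = set(), set()
    for _pdb_id, source, ligands in entries:
        (inside if source == group else outside).update(ligands)
    return inside - outside


# (PDB ID, source, ligand codes).
# 1KPH, 1KPI and 1N2H come from the table above; the "hypo-" entries
# are hypothetical and only show how a shared ligand gets excluded.
ENTRIES = [
    ("1KPH", "PSI", {"10A"}),
    ("1KPI", "PSI", {"10A"}),
    ("1N2H", "PSI", {"PAJ"}),
    ("hypo-1", "PSI", {"SO4"}),
    ("hypo-2", "non-PSI", {"SO4", "FMN"}),
]

unique = ligands_unique_to(ENTRIES, "PSI")
print(sorted(unique))
```

Note that a ligand seen in two PSI entries (10A in 1KPH and 1KPI) still counts as unique under this reading, which matches the repeated codes in the table.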

Panel 9: Unique Ligands

(R)-2-Hydroxy-3-Sulfopropanoic acid (3SL) bound to the structure of putative 2-phosphosulfolactate phosphatase from Clostridium acetobutylicum (1VR0)

Indole-3-Carboxaldehyde (I3A) bound to the structure of tellurite resistance protein of COG3793 (ZP_00109916.1) from Nostoc punctiforme PCC 73102 (2OU3)

10-Oxohexadecanoic acid (OHA) bound to the structure of Ferredoxin-like Protein (JCVI_PEP_1096682647733) from an environmental metagenome (Unidentified Marine Microbe) (2OD6)

Unknown Ligands (UNL)

FK9436A (2OH1): Acetyltransferase, GNAT family

FB8805A (2Q9K): Unknown protein

Panel 10: Ligands bound to JCSG new folds

Target | PDB | Description | Organism | Ligand
CL6107A | 2ICH | Putative ATTH (NP_841447.1) at 2.00 Å | Nitrosomonas Europaea | NHE
TB0797A | 1VR0 | Putative 2-phosphosulfolactate Phosphatase at 2.6 Å | Clostridium Acetobutylicum | 3SL
TM0160 | 1VJL | Predicted Protein related to Wound Inducive Proteins in Plants at 1.90 Å | Thermotoga Maritima | UNL
TM0449 | 1KQ4 | Thy1-complementing Protein at 2.25 Å | Thermotoga Maritima | FAD
TM0574 | 1VKY | S-adenosylmethionine tRNA Ribosyltransferase at 2.00 Å | Thermotoga Maritima | UNL
TM1394 | 1VQ0 | 33 kDa Chaperonin (Heat Shock Protein 33 Homolog) at 2.20 Å | Thermotoga Maritima | UNL
TM1464 | 1VKM | Conserved Hypothetical Protein Possibly Involved in Carbohydrate Metabolism at 1.90 Å | Thermotoga Maritima Msb8 | UNL
TM1506 | 1VK9 | Hypothetical Protein at 2.70 Å | Thermotoga Maritima | UNL
TM1553 | 1VRM | Hypothetical Protein at 1.58 Å | Thermotoga Maritima Msb8 | UNL

[Figure: structures labeled 2ICH, 1VQ0, 1VR0, 1VJL, 1KQ4, 1VKY, 1VRM, 1VK9, 1VKM]

Panel 2: The JCSG Target Pipeline

Each project moves from target selection through publication along the Target Pipeline.

Panel 3: The Role of the Structure Determination Core in the JCSG

1. Screen Crystals and Collect Data

2. Automatically Process Data: Autoindex, Integrate, Scale, Solve, Trace

3. Refine and Evaluate Structures

4. Disseminate Information*: Publish; Web-based Tools

TOPSAN (www.topsan.org)
Ligand Search (smb.slac.stanford.edu/public/jcsg/cgi/jcsg_ligand_check.pl)

* in collaboration with BIC

Panel 1: The Joint Center for Structural Genomics (JCSG)

The JCSG (www.jcsg.org) is one of the four large-scale structural genomics centers funded by NIGMS as part of the production phase of the Protein Structure Initiative (PSI). As of 2007, the PSI centers have deposited more than 2600 structures into the PDB, of which the JCSG has contributed over 500.

Panel 11: Binding Modes of Ligands

There are over 340 structures in the PDB with the co-factor Flavin Mononucleotide (FMN) bound to the protein.

The binding poses of FMN display considerable variations due to the torsional flexibility in the molecule.

However, unique binding poses can be observed in proteins belonging to specific PFAM families.
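Torsional flexibility is usually quantified with torsion (dihedral) angles defined by four consecutive atoms, so two poses can be compared by listing the same torsions in each. The sketch below computes one such angle with the standard atan2 formula. It is a generic illustration, not the analysis used for the poster, and the coordinates are made-up test points, not FMN atoms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: the central bond runs along z.
cis = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
gauche = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, gauche, trans)
```

With real data, the four points would be the coordinates of four bonded ligand atoms read from each PDB entry.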

Number of Structures

PFAM | PSI | Non-PSI | Total
PF01243 | 7 | 14 | 21
PF01613 | 2 | 8 | 10
PF04289 | 3 | 0 | 3

[Figure: FMN binding poses in PF01243 (Pyridoxamine 5'-phosphate oxidase), PF01613 (Flavin reductase like domain), and PF04289 (Unknown Function DUF447); structure label 1y30]

Panel 5: Summary of Ligands (1606 structures)

Ligands (269 structures; 140 different ligands): UNL(70), UNX(22), LLP(6), SIN(6), NDP(6), MA7(6), NAG(5), PLM(4), UNK(4), GUN(3), APC(3), SUC(3), BAL(3), GLC(3), PAF(3), APR(2), GAL(2), NCN(2), CSD(2), SAI(2), CEI(2), BIO(2), HMH(2), SAP(2), GNP(2), 144(2), NCA(2), G4P(2), MPO(2), SRT(2), ANP(2), PCP(2), BGC(2), PAJ(2), NIG(1), PRP(1), NIO(1), ABF(1), IPR(1), MTA(1), CP(1), MLT(1), DI6(1), MED(1), MLZ(1), 5GP(1), CSO(1), CDP(1), I3A(1), 2PL(1), HED(1), G1P(1), NBZ(1), CSY(1), FRU(1), PLG(1), THF(1), B1M(1), ACP(1), DU(1), MMZ(1), OHA(1), 16A(1), THT(1), M7P(1), 3GC(1), CF5(1), PEO(1), CTZ(1), ADE(1), FT6(1), KEG(1), LUM(1), XLS(1), BAM(1), ADN(1), PMP(1), ADQ(1), B33(1), DGI(1), G3H(1), OXG(1), NDS(1), SAL(1), 3SL(1), SIB(1), STH(1), FEO(1), G3P(1), OXN(1), FES(1), TYD(1), DGT(1), 8PP(1), CO2(1), MP5(1), NTM(1), PNS(1), AES(1), APK(1), UVW(1), TRE(1), PYR(1), NAI(1), TCL(1), NMN(1), MAN(1), BFD(1), HHP(1), RIP(1), RBF(1), ORO(1), SNN(1), DTP(1), ZID(1), DEP(1), UPG(1), HXA(1), AAT(1), DTY(1), DON(1), NPO(1), C2E(1), AGC(1), BDF(1), PHT(1), OSB(1), NVA(1), CRO(1), BDN(1), TNE(1), SOG(1), AGS(1), TLP(1), 1PS(1), DUT(1), CXS(1), GEQ(1), MRD(1), G6P(1)

Co-factors (211 structures; 21 different co-factors): FMN(36), NAD(29), COA(18), NAP(17), PLP(15), ADP(15), FAD(15), SAM(14), ATP(9), SAH(9), AMP(9), HEM(8), ACO(7), GDP(4), FS4(3), U5P(2), MLC(1), COD(1), CNC(1), UTP(1), CTP(1)

Metal Ions (647 structures; 30 different metal ions): MG(177), ZN(174), NA(102), CA(83), NI(40), MN(31), FE(26), K(16), FE2(9), CD(8), PT(8), HG(7), CO(5), SM(2), WO4(2), PR(2), AU(2), BA(1), CS(1), MW2(1), SE(1), ARS(1), ZN3(1), O4M(1), YT3(1), LI(1), MO2(1), MO3(1), VO4(1), MO6(1)

Non-metal Ions (692 structures; 22 different non-metal ions): SO4(324), CL(243), PO4(118), NO3(11), IOD(10), BR(10), SCN(8), CO3(4), CAC(4), POP(3), AZI(3), SUL(2), BCT(2), ALF(2), OXL(2), PER(1), SO3(1), MLI(1), PO3(1), THJ(1), 1AL(1), NH4(1)

Organics (90 structures; 26 different organics): IPA(14), EOH(13), BME(9), BEZ(5), TLA(5), SEO(5), AKG(5), ETX(4), TAR(4), PGO(4), DTT(4), OAA(2), ACE(2), DMS(2), MLA(1), DOX(1), XYL(1), MOH(1), 3OH(1), AZ1(1), PPI(1), IOH(1), FOR(1), MYR(1), GTT(1), LMT(1)

Buffers (240 structures; 15 different buffers): ACT(86), ACY(47), FMT(37), CIT(27), TRS(16), EPE(15), MES(12), IMD(8), TMN(2), 10A(2), BTB(2), ICT(1), CPS(1), FLC(1), NHE(1)

Precipitants (98 structures; 13 different precipitants): PEG(38), PG4(28), PGE(16), 1PE(8), P6G(7), 2PE(3), PE4(3), P33(3), PE5(2), PEF(1), BU3(1), 1PG(1), PE8(1)

Salts (3 structures; 3 different salts): DPO(1), AF3(1), PPC(1)

Detergents (2 structures; 1 detergent): BOG(2)

Cryos (502 structures; 5 different cryos): GOL(244), EDO(241), MPD(32), EGL(3), CRY(2)
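Every category line above follows the same CODE(count) pattern, so the tallies are easy to re-check mechanically. The sketch below parses the co-factor line as an example; the helper name `parse_counts` is an assumption, and the string is copied from the co-factor entry above.

```python
import re

# Copied from the "Co-factors" line of the summary above.
COFACTORS = (
    "FMN(36), NAD(29), COA(18), NAP(17), PLP(15), ADP(15), FAD(15), "
    "SAM(14), ATP(9), SAH(9), AMP(9), HEM(8), ACO(7), GDP(4), FS4(3), "
    "U5P(2), MLC(1), COD(1), CNC(1), UTP(1), CTP(1)"
)


def parse_counts(text):
    """Parse 'CODE(n), CODE(n), ...' into a dict of code -> count."""
    return {code: int(n) for code, n in re.findall(r"([A-Z0-9]+)\((\d+)\)", text)}


counts = parse_counts(COFACTORS)
most_common = max(counts, key=counts.get)
print(len(counts), sum(counts.values()), most_common)
```

The parsed line gives 21 distinct codes, matching the heading. The counts add up to 215, slightly more than the 211 structures quoted, which would be consistent with some structures containing more than one co-factor, although the poster does not say so.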
