PRAGMA 10 Biosciences Working Group Update
Habibah Wahab, Ph.D.
Wilfred W. Li, Ph.D.
On behalf of Karpjoo Jeong, Ph.D.
Key Activities
Bioinformatics: mpiBLAST-G2, iGAP/Gfarm/CSF4, Avian Flu Project, Metagenomics Annotation
Computational Chemistry
Biosciences Portal: M*Grid, NCHC Portal, My WorkSphere, Telescience, AMEXg, APAC portals
Education and Training
PRIME: CNIC (Kai Nan, Zhong-hua Lu), hosting 2 students from UCSD on Avian flu projects; University of Zurich
PRIUS: Osaka University (Kohei Ichikawa, Susumu Date)
Summer Internship Program: Jilin University (Zhaohui Ding, Xiaohui Wei)
Publications
[1] X. Wei, J. Jiang, W. W. Li, O. Tatebe, G. Xu, L. Hu, and J. Ju, "Implementing Data Aware Scheduling and Data Management in Gfarm using LSF™ Scheduler Plugin Mechanism," Future Generation Computer Systems, submitted, 2006.
[2] X. Wei, Z. Ding, W. W. Li, O. Tatebe, J. Jiang, L. Hu, and P. W. Arzberger, "Grid Infrastructure for Bioinformatics Applications Based on CSF4," Future Generation Computer Systems, submitted, 2006.
[3] W. W. Li, S. Krishnan, K. Mueller, K. Ichikawa, S. Date, S. Dallakyan, M. Sanner, C. Misleh, Z. Ding, X. Wei, O. Tatebe, and P. W. Arzberger, "Building cyberinfrastructure for bioinformatics using service oriented architecture," CCGrid 2006, Singapore, 2006.
[4] D. Abramson, A. Lynch, H. Takemiya, Y. Tanimura, S. Date, H. Nakamura, K. Jeong, S. Hwang, J. Zhu, Z.-h. Lu, C. Amoreira, K. K. Baldridge, H.-C. Lee, C.-W. Wang, H.-L. Shih, T. Molina, W. W. Li, and P. W. Arzberger, "Deploying Scientific Applications to the PRAGMA Grid testbed: Strategies and Lessons," CCGrid 2006, Singapore, 2006.
mpiBLAST-G2
Integrative Genome Annotation Pipeline (iGAP)
[Diagram: a pipeline labelled Step 1 to Step 6 that combines structure and sequence information]
Input: protein sequences
Stages shown:
• Prediction of signal peptides (SignalP, PSORT), transmembrane regions (TMHMM, PSORT), coiled coils (COILS) and low-complexity regions (SEG)
• Structural assignment of domains by PSI-BLAST profiles on FOLDLIB
• Structural assignment of domains by 123D on FOLDLIB
• Structural assignment of domains by WU-BLAST
• Functional assignment by PFAM, NR assignments
• Domain location prediction by sequence
Output: Data Warehouse
Building FOLDLIB:
• Sources: PDB chains, SCOP domains, PDP domains, CE matches (PDB vs. SCOP)
• Criteria: 90% sequence non-identical; minimum size 25 aa; coverage (90%, gaps <30, ends <30)
Databases: NR, PFAM, SCOP, PDB
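The FOLDLIB criteria in the diagram (a minimum size of 25 aa and a 90% sequence non-identity cutoff) amount to a redundancy filter over candidate chains and domains. The following is a minimal sketch of such a filter, assuming a greedy strategy that keeps the longest representative first. The `identity` function uses Python's `difflib` as a toy stand-in for a real alignment-based identity, and the function names are hypothetical, not part of iGAP.

```python
from difflib import SequenceMatcher

MIN_LENGTH = 25       # minimum domain size in amino acids (from the slide)
MAX_IDENTITY = 0.90   # entries at or above 90% identity count as redundant


def identity(a: str, b: str) -> float:
    """Toy stand-in for alignment-based sequence identity (assumption)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def build_library(candidates: dict[str, str]) -> dict[str, str]:
    """Greedily keep sequences that are long enough and non-redundant."""
    kept: dict[str, str] = {}
    # Visit longer sequences first so the longest representative is kept.
    for name, seq in sorted(candidates.items(), key=lambda kv: -len(kv[1])):
        if len(seq) < MIN_LENGTH:
            continue  # too short to be a library entry
        if all(identity(seq, other) < MAX_IDENTITY for other in kept.values()):
            kept[name] = seq
    return kept
```

The coverage criterion (90%, gaps <30, ends <30) is left out of the sketch because it depends on alignment details the slide does not give.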
Distributed analysis in a virtual filesystem
• The Gfarm virtual filesystem lets existing applications use distributed compute and data resources transparently and efficiently.
• Applications such as iGAP, and the input data they require, may be replicated automatically to each node on demand.
• Transparent distributed data access and file affinity-based application scheduling
• From cluster-wide to grid-wide environment
[Diagram: Gfarm File System virtual directory tree, showing /gfarm/eol/apps and the entries apps, dbs, igap, psiblast, Foldlib and NR]
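File affinity-based scheduling can be pictured as sending each job to the node that already holds the most of its input files. The following is a minimal sketch, assuming a replica catalog that maps each file to the nodes holding a copy. The function name, node names and file paths are hypothetical, and this is not the Gfarm or CSF4 API.

```python
from collections import Counter


def pick_node(inputs: list[str], replicas: dict[str, set[str]],
              free_nodes: set[str]) -> str:
    """Return the free node that holds the most of the job's input files."""
    if not free_nodes:
        raise ValueError("no free nodes available")
    local_files = Counter({node: 0 for node in free_nodes})
    for path in inputs:
        for node in replicas.get(path, set()):
            if node in free_nodes:
                local_files[node] += 1
    # Break ties by node name so the choice is deterministic.
    return min(free_nodes, key=lambda n: (-local_files[n], n))


# Hypothetical replica catalog: file path -> nodes holding a replica.
replicas = {
    "/gfarm/eol/apps/NR": {"node-a", "node-b"},
    "/gfarm/eol/apps/Foldlib": {"node-b"},
}
chosen = pick_node(["/gfarm/eol/apps/NR", "/gfarm/eol/apps/Foldlib"],
                   replicas, {"node-a", "node-b", "node-c"})
print(chosen)  # prints "node-b", the only node holding both files
```

Any input that is still missing on the chosen node would then be replicated on demand, as the bullets above describe.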
PRAGMA Gfarm Testbed
Taiwan: NCHC, Academia Sinica
USA: NCSA, SDSC, NBCR
Japan: AIST, Titech
Korea: KISTI
China: CNIC, JLU
CSF4
CSF4 integration with Gfarm
Gfarm security: shared secret key, GSI authentication
[Diagram: user certificate, delegated proxy certificate and user credentials; frontend and scheduler at sites A and B; mutual authentication with GFS]
Opal: Web Service Wrapper
Opal WSRF Operation Provider
M*Grid and e-Glycoconjugates portal
Reusable components to support a large community
Comprehensive environment for molecular simulation studies
Computational Chemistry
Use of Nimrod/G
Workflow built with web services
Gemstone, led by Baldridge
[Diagram: portal architecture]
User interface: upload files and submit jobs; download and view results
Resources: Hawk, Rocks-52, ASCC, Aurora, IOIT-HCM
Links to each resource: Gsiftp for inputs and results; Globus to submit jobs
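Each link between the user interface and a resource in the diagram pairs a GridFTP transfer with a Globus job submission. The following is a minimal sketch of the command lines a portal might build for one job, assuming the standard Globus Toolkit clients `globus-url-copy` and `globus-job-submit` are installed. The host name, file paths and helper names are hypothetical, and the commands are only constructed here, not run.

```python
def stage_in(local_path: str, host: str, remote_path: str) -> list[str]:
    """Command that copies an input file to a resource over GridFTP."""
    return ["globus-url-copy",
            f"file://{local_path}",
            f"gsiftp://{host}{remote_path}"]


def submit(host: str, executable: str, *args: str) -> list[str]:
    """Command that submits a batch job to the resource's gatekeeper."""
    return ["globus-job-submit", host, executable, *args]


def stage_out(host: str, remote_path: str, local_path: str) -> list[str]:
    """Command that copies a result file back from the resource."""
    return ["globus-url-copy",
            f"gsiftp://{host}{remote_path}",
            f"file://{local_path}"]


# Hypothetical host and paths, for illustration only.
host = "hawk.example.org"
for cmd in (stage_in("/tmp/ligand.pdb", host, "/scratch/ligand.pdb"),
            submit(host, "/usr/local/bin/dock", "/scratch/ligand.pdb"),
            stage_out(host, "/scratch/result.log", "/tmp/result.log")):
    print(" ".join(cmd))
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` without shell quoting.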
Real Science Applications
• Rational Drug Discovery of Novel Dengue Therapeutics.
• Characterisation of drug binding site(s) on the DNA.
• Elucidating isoniazid resistance using Molecular Modelling Techniques.
• Structure and function of PHA synthase.
• Drug receptor database.
• Binding mode of andrographolide to Renin, HIV-1 Protease and Tyrosine kinase enzymes.
• Binding of erythromycin and its relatives to ribosome. Molecular Docking and Molecular Dynamics Simulation Study.
• Investigation of the Binding Properties of Some Flavonoids to Calcium using Molecular Modelling Techniques.
• Molecular Modelling of Cytochrome P450 2D6. Effects of Allelic variation on the enzyme activity.
• Structure based drug design of compounds derived from marine natural products.
• Chemical Reactivity as a Tool to Study Carcinogenicity: Reaction between Estradiol and Estrone 3,4-Quinones Ultimate Carcinogens and Guanine.
New Collaboration in the Fight Against Avian Flu
• AIST (Japan), CNIC (China), Konkuk/KISTI (Korea), UCSD/SDSC (USA), JLU (China), CGPBRI (Univ. Hawaii), USM (Malaysia)
• IBM World Community Grid
• Avian Flu Proteome Annotation and Analysis: iGAP, Rosetta, MEME
• Involve students and postdocs
• Solving real problems using bioinformatics, molecular simulation and grid tools: AutoDock, Amber, Gromacs, GAMESS, CHARMM, NAMD
Pictures
Participating Institutions
SDSC/UCSD: Wilfred Li, Tomas Molina, Cindy Zheng, Peter Arzberger
AIST: Osamu Tatebe, Hiroshi Takemiya, Yusuke Tanimura, Satoshi Sekiguchi
Jilin University: Zhaohui Ding, Xiaohui Wei
APAC: Rajesh Chhabra
Osaka University: Susumu Date, Kohei Ichikawa, Shinji Shimojo
Konkuk: Karpjoo Jeong, Taehoon Kim
Kookmin: Suntae Hwang, Daeyong Heo
KISTI: Jae-Hyuck Kwak, Young-Chul Hwang
Participating Institutions
USM: Habibah Wahab, Amin Malik Shah, Chan Huah Yong
BII: Santosh Mishra, Arun Krishnan
Academia Sinica: Hurng-Chun Lee, Chi-Wei Wang, Horng-Liang Shih
University of Zurich/SDSC: Kim Baldridge
NCHC: Fang-Pang Lin, Whey-Fone Tsai, Weicheng Huang
CNIC: Zhong-Hua Lu, Kai Nan, Bao Ping Yan
University of Wisconsin: Katherine (Trina) McMahon
Other Working Groups: Mason Katz, Yoshio Tanaka, Shinji Shimojo
Breakout Session Participants
USM: Habibah Wahab, Ahmad Yussof Hassan, Amin Malik Shah Abdul Majid (drug and DNA interactions, drug design)
UCSD/SDSC: Wilfred Li (Gfarm, additional applications); Kim Baldridge (GAMESS, Gemstone, GAMESS/APBS hybrid pipeline)
CNIC: Xiaoming Zhang
ASCC: Hsin-Yen Chen (Bioportal, MPICH-G2, LCG, docking, EGEE)
Mimos: Mashkuri Yaacob, Irdawah Ab. Rahman
Osaka University: Kohei Ichikawa (web services); Susumu Date (TDW)
APAC: Rajesh Chhabra (grid portals)
Portals
Biosciences portal
• Wiki already set up
• PRAGMA wiki: http://auriga.qut.edu.au/pragma (set up a PRAGMA portal and wiki)
• For users to try
AMEXg
• One way to install
• Link to all sites with available applications
Tiled Display
• Much detail in VMD (KB) that could not be seen before
Gfarm testbed
Other technologies
Communications
Biosciences mailing list: [email protected]
• msn, skype
• Contact info listed.
Application stack: APBS, Autodock, Amber, GAMESS
• Pipelines
• Applications compiled for different architectures, with examples
• Central site
• Complaints about heterogeneity of resources
• Shared installations
Avian flu Analysis
Two projects planned for PRIME students at CNIC: epitope identification and host selectivity
Need synopsis to refine collaborations and subprojects
Scientific discussion during breakout session
PRAGMA 11: discuss results, project coordination
Metagenomics Annotation
Sequencing of genomes from native environmental samples
• Shared software stack
• Routine analysis
• Use Gfarm/CSF4 for scheduling and data replication
• Data services
• Portal (shared infrastructure)
Supercomputing Demonstrations
Potential Topics
• Tiled display using VMD: Kim Baldridge
• BioPortal: Grid application portal
• CNIC demonstration: CNGrid
• GridSphere portal to Gfarm/CSF4
• Biosciences Portal
Booth Location
• SC04 -- KISTI
Other Activities
Summer interns: Australia (visa required), US (J-1 visa)
Grant applications: applications from own funding agencies
Intellectual properties: international collaborations, standard nondisclosure agreements
World community grid: philanthropic activities
ISGC 2006
1~4 May, Taipei
EGEE Workshop: Its purpose is to introduce the EGEE project, including its goals, infrastructure, middleware and operations.
Symposium: It focuses on Grid core technology, Grid architecture, and applications in various domains such as High Energy Physics, Bio/Medical, Digital Archive, and Atmospherics. World-wide Grid application development, infrastructure interoperation, and collaboration will also be discussed.
http://www.twgrid.org