20120220 Tri-Con Cloud Computing Symposium
BioGPS and mygene.info: Consuming and Providing Cloud Computing Resources
Molecular Med Tri-Con, February 20, 2012
Andrew Su, Ph.D.
http://sulab.org
@andrewsu
+Andrew Su
High-throughput molecular profiling is powerful
[Figure labels: gene/protein list, m/z, testable hypothesis]
20 million papers; 900,000 new papers / year
Gene databases are numerous and overlapping
… and hundreds more …
http://biogps.org
Community extensibility and user customizability
Crowdsourcing depends on positive feedback
[Figure: cycle linking Utility, Users, and Contributors]
Utility: A simple and universal plugin interface
Total of 389 gene-centric online databases registered as BioGPS plugins
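The slides do not show what a plugin registration looks like. As a minimal sketch, assume a plugin is nothing more than a URL template with a placeholder for a gene identifier. The `{{FieldName}}` placeholder syntax, the `render_plugin_url` helper, the field names, and the example.org address are all hypothetical, chosen only to illustrate why such an interface would be simple for contributors.

```python
import re
from urllib.parse import quote

# Hypothetical placeholder syntax: {{FieldName}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_plugin_url(template, gene):
    """Fill each {{FieldName}} placeholder with the matching gene field."""

    def substitute(match):
        field = match.group(1)
        if field not in gene:
            raise KeyError(f"gene record has no field {field!r}")
        # Percent-encode the value so it is safe inside a URL.
        return quote(str(gene[field]), safe="")

    return PLACEHOLDER.sub(substitute, template)


# Illustrative gene record; the field names are assumptions.
cdk2 = {"Symbol": "CDK2", "EntrezGene": 1017}
print(render_plugin_url("http://example.org/gene?id={{EntrezGene}}", cdk2))
```

Under this assumption, registering a new database would amount to supplying one template string, with no code required on the third-party site.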
Users: BioGPS has critical mass
• > 4100 registered users
• 4000 unique visitors per week
• 40,000 page views per week
Top 10 organizations
1. Harvard
2. NIH
3. UCSD
4. Scripps
5. MIT
6. Cambridge
7. U Penn
8. Stanford
9. Wash U
10. UNC
[Figure: daily pageviews]
Contributors: Explicit and implicit knowledge
389 plugins registered (65% publicly shared) by over 75 users, spanning 150+ domains
BioGPS architecture
http://mygene.info
mygene.info architecture
http://mygene.info
NGINX
BioGPS as a cloud computing consumer
[Diagram labels: NGINX, one EC2 Small instance, three EC2 Micro instances]
Total monthly cost: ~$100
BioGPS as a cloud computing provider
Use case: Create web application to display custom Affymetrix data
[Figure labels: “204252_at”, “CDK2”, data set samples, expression]
Gene Annotation as a Service (GAaaS)
[Diagram labels: Users, Developers]
Gene query web service
http://mygene.info/query?q=cdk2
http://mygene.info/query?q=cdk?
http://mygene.info/query?q=GO:0000307
http://mygene.info/query?q=P24941
http://mygene.info/query?q=204252_at
http://mygene.info/query?q=cdk*
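The query URLs above all follow one pattern: a single `q` parameter on the `/query` endpoint. A minimal sketch of building them programmatically follows; the `build_query_url` helper name is an assumption. Note that `urlencode` percent-encodes characters such as `:`, `?`, and `*` (for example `GO%3A0000307`), which is the standard escaped form of the same query string rather than a different request.

```python
from urllib.parse import urlencode

# Endpoint address as shown on the slide.
QUERY_ENDPOINT = "http://mygene.info/query"


def build_query_url(term):
    """Return the query URL for one search term, percent-encoding it."""
    return f"{QUERY_ENDPOINT}?{urlencode({'q': term})}"


# Terms taken from the slide: a gene symbol, two patterns that appear to be
# wildcards, a GO identifier, an accession, and an Affymetrix probe set ID.
for term in ["cdk2", "cdk?", "GO:0000307", "P24941", "204252_at", "cdk*"]:
    print(build_query_url(term))
```

Encoding the term rather than concatenating it by hand matters most for the `cdk?` case, where a literal `?` inside a query string is easy to mishandle.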
Gene annotation web service
http://mygene.info/gene/1017
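To show how a web application might consume the annotation service, here is a minimal sketch that requests the record at `/gene/<id>` and decodes it. That the response body is JSON is an assumption (the slides do not show a response), as are the helper names `gene_url` and `fetch_gene`. The `opener` parameter exists only so the function can be exercised without network access.

```python
import json
from urllib.request import urlopen

# Base address as shown on the slide.
BASE_URL = "http://mygene.info"


def gene_url(gene_id):
    """Return the annotation URL for one gene ID, e.g. /gene/1017."""
    return f"{BASE_URL}/gene/{gene_id}"


def fetch_gene(gene_id, opener=urlopen, timeout=10):
    """Fetch one gene record and decode it, assuming a JSON response body."""
    with opener(gene_url(gene_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_gene(1017)` would request the URL shown on the slide. The fields of the returned record are not described here, so the sketch returns the decoded object as-is rather than picking out particular keys.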
Optimized for performance in web apps
[Figure: time (s), log scale from 0.01 to 10, versus # of query terms, log scale from 10 to 100,000]
More documentation (paging, sorting, filtering, etc.) plus code snippets at http://mygene.info.
The future of BioGPS
Third party content providers
Semantic interpretation, change detection, etc.
Group members
Erik Clarke, Ben Good, Salvatore Loguercio, Ian Macleod, Chunlei Wu
Funding and Support
(BioGPS: GM83924, Gene Wiki: GM089820)
Contact
http://[email protected]
@andrewsu
+Andrew Su