20120220 Tri-Con Cloud Computing Symposium

Post on 10-May-2015


BioGPS and mygene.info: Consuming and Providing Cloud Computing Resources

Molecular Med Tri-Con, February 20, 2012

Andrew Su, Ph.D.

http://sulab.org

@andrewsu

+Andrew Su

asu@scripps.edu

High-throughput molecular profiling is powerful

[Figure labels: m/z, gene/protein list, testable hypothesis]

20 million papers; 900,000 new papers / year

Gene databases are numerous and overlapping

… and hundreds more …

http://biogps.org

Community extensibility and user customizability

Crowdsourcing depends on positive feedback

[Figure: feedback diagram with nodes Utility, Users, and Contributors]

Utility: A simple and universal plugin interface


Total of 389 gene-centric online databases registered as BioGPS plugins
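
The slides do not spell out how a plugin is defined. One simple way to register a gene-centric database is as a URL template with placeholders that are filled in from the current gene record. The sketch below illustrates that idea only. The `{{Key}}` placeholder syntax, the function name `render_plugin_url`, the record keys, and the `example.org` address are all hypothetical and are not taken from the talk.

```python
import re


def render_plugin_url(template: str, gene: dict) -> str:
    """Fill {{Key}} placeholders in a URL template from a gene record.

    The placeholder syntax and record keys are illustrative assumptions,
    not a documented BioGPS interface.
    """
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in gene:
            # Fail loudly rather than emit a half-filled URL.
            raise KeyError(f"gene record has no field {key!r}")
        return str(gene[key])

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


if __name__ == "__main__":
    # Hypothetical gene record and hypothetical third-party site.
    gene = {"EntrezGene": 1017, "Symbol": "CDK2"}
    print(render_plugin_url("http://example.org/gene/{{EntrezGene}}", gene))
    print(render_plugin_url("http://example.org/search?q={{Symbol}}", gene))
```

With this kind of design, adding a new database needs only a template string and no code on the database's side, which would keep the cost of contributing a plugin low.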

Users: BioGPS has critical mass

• > 4100 registered users
• 4000 unique visitors per week
• 40,000 page views per week

Top 10 organizations

1. Harvard
2. NIH
3. UCSD
4. Scripps
5. MIT
6. Cambridge
7. U Penn
8. Stanford
9. Wash U
10. UNC

[Figure: daily pageviews]

Contributors: Explicit and implicit knowledge

389 plugins registered (65% publicly shared)

by over 75 users

spanning 150+ domains


BioGPS architecture

http://mygene.info

mygene.info architecture

http://mygene.info

NGINX

BioGPS as a cloud computing consumer

EC2 Micro

EC2 Small

NGINX

EC2 Micro

EC2 Micro

Total monthly cost: ~$100

BioGPS as a cloud computing provider

Use case: Create web application to display custom Affymetrix data

“204252_at”

“CDK2”

[Figure: expression of 204252_at across data set samples (y-axis: Expression)]

Gene Annotation as a Service (GAaaS)

Users

Developers


http://mygene.info/query?q=cdk2

Gene query web service

http://mygene.info/query?q=cdk?

http://mygene.info/query?q=GO:0000307

http://mygene.info/query?q=P24941

http://mygene.info/query?q=204252_at

http://mygene.info/query?q=cdk*
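
The query URLs above all share one shape: a single `q` parameter carrying a symbol, a wildcard pattern, a GO term, a protein accession, or a probe set ID. A minimal Python sketch of building and calling such a URL follows. The helper names `build_query_url` and `run_query` are hypothetical, the assumption that the service answers with JSON is not stated on the slides, and the URL layout is the one shown in this 2012 talk, which may differ from the service's current API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address as shown on the slides (2012); the live service may differ.
BASE_URL = "http://mygene.info"


def build_query_url(term: str) -> str:
    """Return the gene query URL for one search term.

    ':' , '*' and '?' are left unescaped so the result matches the
    URLs printed on the slides.
    """
    return f"{BASE_URL}/query?" + urlencode({"q": term}, safe=":*?")


def run_query(term: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(build_query_url(term), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the URLs only; call run_query(term) to hit the network.
    for term in ["cdk2", "cdk?", "cdk*", "GO:0000307", "P24941", "204252_at"]:
        print(build_query_url(term))
```

For the Affymetrix use case, a web application could pass the probe set ID `204252_at` to `run_query` and read the gene annotation out of the decoded result, whose exact fields are not described in the slides.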

Gene annotation web service

http://mygene.info/gene/1017
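
The annotation service addresses one gene per URL, with the gene ID as the last path segment (1017 in the example above). A minimal sketch of building that URL and fetching the record is shown below. The helper names `build_gene_url` and `fetch_gene` are hypothetical, and the JSON response format is an assumption, since the slides show only the URL.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base address as shown on the slides (2012); the live service may differ.
BASE_URL = "http://mygene.info"


def build_gene_url(gene_id) -> str:
    """Return the annotation URL for one gene ID, e.g. 1017."""
    cleaned = str(gene_id).strip()
    if not cleaned:
        raise ValueError("gene_id must not be empty")
    # Escape everything that is not URL-safe, including '/'.
    return f"{BASE_URL}/gene/{quote(cleaned, safe='')}"


def fetch_gene(gene_id, timeout: float = 10.0):
    """Fetch one annotation record, assuming the body is JSON."""
    with urlopen(build_gene_url(gene_id), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the URL only; call fetch_gene(1017) to hit the network.
    print(build_gene_url(1017))
```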

Optimized for performance in web apps

[Figure: query time in seconds (log scale, 0.01 to 10) versus # of query terms (log scale, 10 to 100000)]

More documentation (paging, sorting, filtering, etc.) plus code snippets at http://mygene.info.

The future of BioGPS

Third party content providers

Semantic interpretation, change detection, etc.

Group members

Erik Clarke
Ben Good
Salvatore Loguercio
Ian Macleod
Chunlei Wu

Funding and Support

(BioGPS: GM83924, Gene Wiki: GM089820)

Contact

http://sulab.org

asu@scripps.edu

@andrewsu

+Andrew Su