Kevin Jorissen - eResearch Australasia...


Page 1: Kevin Jorissen - eResearch Australasia Conferenceconference.eresearch.edu.au/wp-content/uploads/... · 1. The long-term trends in Scientific Computing? How can we • Democratize

© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential and Trademark

Kevin Jorissen, Seattle

Kevin has 10 years of experience in computational science, and holds a Ph.D. in Physics. He developed codes solving the quantum physics equations for light absorption by materials, taught workshops to scientists worldwide, and wrote about high performance computing in the cloud before it was fashionable. He worked as a postdoctoral researcher in Antwerp, Lausanne, Seattle, and Zurich. He contributed to the WIEN2k code (Density Functional Theory calculations of material properties, www.wien2k.at) and the FEFF code (X-ray and Electron absorption spectra, www.feffproject.org).

Kevin joined Amazon in 2015 to help accelerate the adoption of cloud computing in the scientific community globally.

BIO


What research & HPC has been successful in the cloud, and why


1. The long-term trends in Scientific Computing

How can we
• Democratize research computing so everybody can use it? (no longer just HPC experts)
• Meet the need for a variety of hardware platforms? (no longer just CPU based)
• Support diverse applications and frameworks? (no longer just Fortran+MPI physics codes)

Additional challenges
• Data gravity: massive volumes of data
• Cross-disciplinary research
• Research Data Management
• Data compliance and security
• Reproducibility and reusability
• New methods, e.g. serverless computing; ML; domain platforms (e.g. Cromwell, Pangeo, …)

Scientific computing will have to evolve to solve these challenges.
The public cloud (e.g. AWS) has the right characteristics (because it evolved under similar constraints).


2. Key Strengths of AWS for Scientific Discovery

Improve time to discovery
• Resources are available when needed
• Experiment fast (‘agility’)
• Avoid undifferentiated work by using advanced managed services

Collaboration
• Store massive data sets
• Share them with your collaborators
• With compute/analytics/ML tools available
• In a highly secure and compliant way


https://aws.amazon.com/blogs/aws/saving-koalas-using-genomics-research-and-cloud-computing/

Availability of resources: (We’re off to a cute start …)


Moving quickly with managed services: Jupiter Intelligence


Collaborating on scientific data in the cloud

Athena, Glue


Collaborating on scientific data in the cloud

NOAA - NEXRAD on AWS S3: usage increased 2.3x

greater scientific impact


NIH - Strides

http://www.cancergenomicscloud.org

Funded projects to create collaborative environments on cloud
Tens of PB of cancer data coming to the cloud


Do go to aws.amazon.com/earth/


3. Institutional goals

Can cloud make the university’s research data more reusable?
Can cloud make students more employable after graduation?
Can cloud shorten average time-to-discovery and boost impact?
Can cloud raise the university’s profile for research (inter)nationally?
Can cloud help make competitive faculty hires? (Extra resources allow the competitive new hire to stay on top of the field.)
Can cloud help new faculty build impact faster? (Put cloud $ in every startup package and see citations build up faster.)
Can cloud democratize compute/analytics/ML/AI across all departments?
Can cloud help grad students finish up faster?
Can cloud boost the approval rate of grant applications?


A University Cloud Journey Quadrant

3 areas of work:
• Education
• Research
• Operations

Toolset:
• 150-odd services
• Learning paths
• Programs (Educate, Open Data, Egress Waiver, Academy, Catalyst, …)
• 1000’s of 3rd party solutions

4 Horsemen:
• Capability (Can I?)
• Compliance (May I?)
• Cost (How much?)
• Complexity (How the …?)

Timeline: Champions to Institution to Ecosystem


A guided tour of cool people and things in the cloud

you be the judge

The University of Sydney

1. A student: meet Parice (U Sydney)

De novo Transcriptome Assembly

Current HPC Limitations
• Data since 30th July
• Compute requirements highly variable across samples
• Temp storage issues - Unable to run on node with other jobs
• Strict versioning of dependencies
• Often many failures/errors despite using same code with new samples
• Parallelization achieved on NCI thanks to SIH

RONIN Summary
• Setup Time = <1hr
• Multiple species and tissues run in parallel using auto-scaling cluster
• 5 assemblies complete in <1 week + QC in <24 hr
• Cost: <$500 per assembly


2. A group: meet JGI

• Plant genomes: large data & compute needs
• Need 4TB servers that aren’t available in-house
• Use Cromwell on AWS (https://docs.opendata.aws/genomics-workflows/)

3. A group: meet Clemson U Analytics & Visualization Institute

HPC in the cloud: 550,000 cores for Natural Language Processing (Machine Learning)

“I am absolutely thrilled with the outcome of this experiment. The graduate students on the project […] used resources from AWS and Omnibond and developed a new software infrastructure to perform research at a scale and time-to-completion not possible with only campus resources.” – Prof. Amy Apon, Co-Director of the Complex Systems, Analytics and Visualization Institute

“spot market”: cheap AWS computing, a good fit for research

https://aws.amazon.com/blogs/aws/natural-language-processing-at-clemson-university-1-1-million-vcpus-ec2-spot-instances/


4. A collaboration: meet Joint Center for Satellite Data Assimilation (JCSDA)

• Joint Effort for Data Integration (JEDI) is a next-generation data assimilation (DA) system for numerical weather prediction (NWP) that is capable and flexible enough to use for both research and operations. Run the FV3GFS global model on Amazon Web Services, at full resolution and with the pre-operational configuration.
• 48-node (1,728-core) compute clusters on AWS.


5. A collaboration: meet RISElab @UCBerkeley

Collaborative 5-year effort (2017-2021) between UC Berkeley, the National Science Foundation, and industry partners. AWS is a founding partner. https://riselab.cs.berkeley.edu

• Students and researchers at RISELab use AWS to rapidly prototype and develop new systems at a scale and speed not possible before.

• Previously built Apache Spark, developed on AWS, and integrated with AWS core services.

GOAL:


6. A collaboration: meet Pangeo

• Science gateways are the future: http://pangeo.io/architecture.html
• Reproducible/reusable research: https://medium.com/pangeo/cesm-lens-on-aws-4e2a996397a1


8. An institution: meet Emory University

https://edscoop.com/emory-university-research-aws-cloud-rich-mendola


9. An institution: institutional support for researchers @ PNNL

18 hours
205,000 materials analyzed

156,314 AWS Spot cores at peak
2.3M core-hours

Total spending: $33K (under 1.5 cents per core-hour)

https://www.youtube.com/watch?v=hcnhdwnSY94
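
The per-unit figures implied by these headline numbers can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the original talk: the function name `summarize_run` and its field names are made up for illustration, the slide's figures are rounded (2.3M, $33K), and it assumes "18 hours" is the wall-clock duration of the whole run.

```python
def summarize_run(core_hours, total_cost_usd, wall_hours, peak_cores, items):
    """Derive per-unit figures from the headline numbers of a burst run.

    Hypothetical helper for illustration only; the inputs are the rounded
    values quoted on the slide, so the outputs are approximate.
    """
    # Average number of cores in use, assuming wall_hours covers the whole run.
    avg_cores = core_hours / wall_hours
    return {
        "cents_per_core_hour": 100 * total_cost_usd / core_hours,
        "avg_cores": avg_cores,
        "avg_to_peak_ratio": avg_cores / peak_cores,
        "usd_per_item": total_cost_usd / items,
        "core_hours_per_item": core_hours / items,
    }


pnnl = summarize_run(
    core_hours=2.3e6,       # "2.3M core-hours"
    total_cost_usd=33_000,  # "Total spending: $33K"
    wall_hours=18,          # "18 hours" (assumed to be wall-clock time)
    peak_cores=156_314,     # "156,314 AWS Spot cores at peak"
    items=205_000,          # "205,000 materials analyzed"
)

for name, value in pnnl.items():
    print(f"{name}: {value:,.3f}")
```

With the rounded inputs this gives roughly 1.43 cents per core-hour, consistent with the "under 1.5 cents" on the slide, and about 16 cents per material analyzed.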


The Cloud Champion team:
- Research software engineers (RSE)
- Cloud architects (SA)
- Consulting/ProServ role: trusted advisor for researchers, maybe build or help build pipelines
- Can be embedded in central HPC/IT
- Training workshops for end users (e.g. we can come do a SageMaker workshop for you)


10. An institution: NIH - Strides

http://www.cancergenomicscloud.org

Funded projects to create collaborative environments on cloud
Tens of PB of cancer data coming to the cloud


11. A country: meet Chile, leader in astronomy


Don’t miss

• Workshop this Friday: Machine Learning on AWS (with SageMaker)
• Workshop this Friday: HPC on AWS (with Ronin)
• Late November: Supercomputing ’19 in Denver
• Early December: re:Invent ’19 in Vegas (lots of new AWS services)


The Amazon AI/ML Stack

APPLICATION SERVICES: Rekognition, Transcribe, Translate, Polly, Comprehend, Lex

PLATFORM SERVICES: Amazon SageMaker, AWS DeepLens

FRAMEWORKS & INTERFACES: AWS Deep Learning AMIs (Caffe2, CNTK, Apache MXNet, PyTorch, TensorFlow, Torch, Keras, Gluon)

INFRASTRUCTURE: CPU, GPU (P3), IoT & Edge, Mobile


HPC in the cloud is serious: Seismic modeling at PFLOP scale

Created a big CLUSTER in the AWS cloud.


HPC in the cloud is serious: a surprise TOP500 run

#136

https://medium.com/descarteslabs-team/thunder-from-the-cloud-40-000-cores-running-in-concert-on-aws-bf1610679978


1. Register for the Researchers Handbook to AWS: aws.amazon.com/rcp
2. Go play with an Open Dataset: registry.opendata.aws
3. Go play with ML on AWS: https://github.com/wleepang/sagemaker4research-workshop

Thank you. [email protected]