INCITE Proposal Writing Webinar
April 25, 2013
James Osborn ALCF Catalyst Team
Argonne National Laboratory
Matt Norman OLCF Scientific Computing Group
Oak Ridge National Laboratory
and Julia C. White, INCITE Manager
2
Overview
• Allocation programs [3]
• INCITE mission and recent stats [4 – 12]
• Titan and Mira [13 – 17]
• Tips for applicants [18 – 36]
– Common oversights
– Requesting a startup account
– Benchmarking data
• Q&A [37, open discussion]
• Conclusions [38 – 45]
– Submittal, review, and awards decisions
– Contact links
3
Three primary ways for access to LCF
Distribution of allocable hours
• 60% INCITE (4.7 billion core-hours in CY2013)
• Up to 30% ASCR Leadership Computing Challenge
• 10% Director’s Discretionary
Leadership-class computing
DOE/SC capability computing
4
What is INCITE?
INCITE promotes transformational advances in science and technology through large allocations of computer time, supporting resources, and data storage at the Argonne and Oak Ridge Leadership Computing Facilities (LCFs) for computationally intensive, large-scale research projects.
Innovative and Novel Computational Impact on Theory and Experiment
5
INCITE criteria
Access on a competitive, merit-reviewed basis*
1 Merit criterion
• Research campaign with the potential for significant domain and/or community impact
2 Computational leadership criterion
• Computationally intensive runs that cannot be done anywhere else: capability, architectural needs
3 Eligibility criterion
• Grant allocations regardless of funding source*
• Non-US-based researchers are welcome to apply
*DOE High-End Computing Revitalization Act of 2004: Public Law 108-423
6
Twofold review process

Peer review: INCITE panels
– New proposal assessment
  • Scientific and/or technical merit
  • Appropriateness of proposal method, milestones given
  • Team qualifications
  • Reasonableness of requested resources
– Renewal assessment
  • Change in scope
  • Met milestones
  • On track to meet future milestones
  • Scientific and/or technical merit

Computational readiness review: LCF centers
– New proposal assessment
  • Technical readiness
  • Appropriateness for requested resources
– Renewal assessment
  • Met technical/computational milestones
  • On track to meet future milestones

Award Decisions
• INCITE Awards Committee composed of LCF directors, INCITE program manager, LCF directors of science, senior management
7
2013 INCITE statistics
• Request for Information helped attract new projects
• Call closed June 27th, 2012
• Total requests ~15 billion core-hours, about 3x the 5 billion core-hours requested the previous year
• Number of proposals submitted increased nearly 20%
• Awards of ~5 billion core-hours for CY 2013
• 61 projects awarded of which 20 are renewals
Acceptance rates 33% of nonrenewal submittals and 100% of renewals
8
2013 award statistics, by system

                   Jaguar        Titan         Mira          Intrepid      Intrepid
                   2012 INCITE   2013 INCITE   2013 INCITE   2012 INCITE   2013 INCITE
Number projects*   35            32            27            31            27
Average Project    27M           58M           78M           24M           27M
Median Project     23M           49.5M         45M           20M           25M

                               Titan    Mira     Intrepid
Total Awards (Hrs in CY2013)   1.84B    2.11B    0.721B

* Totals of 32 projects at the OLCF, 37 projects at the ALCF (many of the ALCF projects received time on both Mira and Intrepid)
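As a quick consistency check on the 2013 figures, the average award per system should be close to the total award divided by the number of projects. The sketch below uses only the numbers reported on this slide; the function and variable names are illustrative.

```python
# Totals (core-hours in CY2013), project counts, and reported averages
# are taken from the slide; names here are illustrative only.
awards_2013 = {
    "Titan": {"total_hours": 1.84e9, "projects": 32, "reported_avg": 58e6},
    "Mira": {"total_hours": 2.11e9, "projects": 27, "reported_avg": 78e6},
    "Intrepid": {"total_hours": 0.721e9, "projects": 27, "reported_avg": 27e6},
}


def mean_award(total_hours: float, projects: int) -> float:
    """Average core-hours per project for one system."""
    return total_hours / projects


for system, row in awards_2013.items():
    computed = mean_award(row["total_hours"], row["projects"])
    print(
        f"{system}: computed {computed / 1e6:.1f}M "
        f"vs. reported {row['reported_avg'] / 1e6:.0f}M core-hours"
    )
```

The computed means agree with the reported averages to within rounding, which suggests the per-system totals and project counts were read correctly from the table.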
9
New PIs in INCITE
• A “new” PI has never previously led an INCITE submittal
• 32% of the nonrenewal projects are led by new PIs
– 41 new projects awarded, 13 led by new PIs
INCITE actively engages with new research teams through outreach such as workshops, email distributions, and individual networking.
10
2013 INCITE panel peer reviewers
83 science experts participated in the 2013 INCITE panel review.
• > 50% (i.e., more than 40) of the reviewers are:
– Society fellows (AAAS, APS, SIAM, IEEE, etc.)
– Agency awardees (e.g., NSF Early Career)
– Laboratory fellows
– National Academy members
– National Society presidents
• 41% participated in the 2012 INCITE review
11
INCITE seeks high-impact research campaigns
Examples of previous successful INCITE applications that advance the state of the art across a broad range of topics and different mission priorities
• Glimpse into dark matter • Supernovae ignition • Protein structure • Creation of biofuels • Replicating enzyme functions • Global climate • Accelerator design • Carbon sequestration • Turbulent flow • Propulsor systems
• Membrane channels • Protein folding • Chemical catalyst design • Plasma physics • Algorithm development • Nano-devices • Batteries • Solar cells • Reactor design • Nuclear structure
12
INCITE breakthroughs since inception
A few of the many science and engineering advances

Year    Hours allocated   Projects
2004    4.9M              3
2005    6.5M              3
2006    18.2M             15
2007    95M               45
2008    268M              55
2009    889M              66
2010    1.6B              69
2011    1.7B              57
2012    1.7B              60
2013    5B                61

Hours requested vs. allocated: ~2X per year ~3X per year

• Unprecedented simulation of magnitude-8 earthquake over 125-square miles, Proceedings, SC10
• World’s first continuous simulation of 21,000 years of Earth’s climate history. Science (2009)
• Largest-ever LES of a full-sized commercial combustion chamber used in an existing helicopter turbine
• Largest simulation of a galaxy’s worth of dark matter, showed for the first time the fractal-like appearance of dark matter substructures. Nature (2008), Science (2009)
• OMEN breaks the petascale barrier using more than 220,000 cores, Proceedings SC10
• NIST proposes new standard reference materials from LCF concrete simulations
• New method to rapidly determine protein structure, with limited experimental data. Science (2010), Nature (2011)
• Researchers solved the 2D Hubbard model and presented evidence that it predicts HTSC behavior. Phys. Rev. Lett. (2005)
• Modeling of molecular basis of Parkinson’s disease named #1 computational accomplishment, Breakthroughs 2008
• Calculation of the number of bound nuclei in nature, Nature (2012)
13
INCITE resources: Mira at ALCF
• Mira - Blue Gene/Q System
– Node: 64-bit PowerPC A2, 1.6 GHz
  • 16 cores, 4 HW threads/core
  • 16 GB memory
  • 205 GFLOPS peak
– 48 racks / 48k nodes / 768k cores
– 768 TB of memory
– Peak flop rate: 10 PF
– 35 PB of disk + large tape storage system
• New Visualization Systems
– Initial system available now
– Advanced visualization system in 2015
14
Overview of Blue Gene/Q
Design Parameters          BG/P       BG/Q       Improvement
Cores / Node               4          16         4x
HW threads / Core          1          4          4x
Clock Speed (GHz)          0.85       1.6        1.9x
Flop / Clock / Core        4          8          2x
Nodes / Rack               1,024      1,024      --
RAM / core (GB)            0.5        1          2x
Flops / Node (GF)          13.6       204.8      15x
Mem. BW/Node (GB/sec)      13.6       42.6       3x
Network Interconnect       3D torus   5D torus   Smaller diameter
Concurrency / Rack         4,096      65,536     16x
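The per-node and full-system peak numbers follow directly from the design parameters in this table and the Mira rack count given on the previous slide. A minimal sketch of that arithmetic (the function names are illustrative, not part of any ALCF tool):

```python
def node_peak_gflops(cores: int, clock_ghz: float, flops_per_clock: int) -> float:
    """Peak GFLOPS of one node: cores x clock (GHz) x flops per clock per core."""
    return cores * clock_ghz * flops_per_clock


def system_peak_pflops(node_gflops: float, nodes_per_rack: int, racks: int) -> float:
    """Peak PFLOPS of the whole machine (1 PFLOPS = 1e6 GFLOPS)."""
    return node_gflops * nodes_per_rack * racks / 1e6


# Values from the BG/P vs. BG/Q table and the Mira slide (48 racks).
bgp_node = node_peak_gflops(cores=4, clock_ghz=0.85, flops_per_clock=4)
bgq_node = node_peak_gflops(cores=16, clock_ghz=1.6, flops_per_clock=8)
mira_peak = system_peak_pflops(bgq_node, nodes_per_rack=1024, racks=48)

print(f"BG/P node: {bgp_node:.1f} GF, BG/Q node: {bgq_node:.1f} GF")
print(f"Node-level improvement: {bgq_node / bgp_node:.1f}x")
print(f"Mira peak: {mira_peak:.2f} PF")
```

The same style of back-of-the-envelope calculation is useful when stating in a proposal what fraction of peak a code actually achieves.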
15
Scaling workshop at ALCF
Don’t miss the ALCF's scaling workshop that has produced Gordon Bell finalists for four years running. ALL ARE WELCOME!
• Hit your highest performance numbers in preparation for INCITE 2014
• Try your code out on full-system Mira reservations
• Work one-on-one with ALCF experts to optimize your code’s performance
Register today! Go to www.alcf.anl.gov for details, or www.alcf.anl.gov/workshops/mira-performance-boot-camp-2013
16
Titan at OLCF
• Cray Linux Environment operating system
• Gemini interconnect
– 3-D Torus
– Globally addressable memory
– Advanced synchronization features
• AMD Opteron 6200 processor (Interlagos)
• New accelerated node design using NVIDIA multi-core accelerators
– NVIDIA “Kepler” (K20) GPUs
• 27 PFlops peak performance
• 584 TB DDR3 + 110 TB GDDR5 memory

Titan Specs
Compute Nodes             18,688
Login & I/O Nodes         512
Memory per node           32 GB + 6 GB
NVIDIA “Kepler” (2012)    1.31 TFlops
Opteron                   2.2 GHz
Opteron performance       141 GFlops
Total Opteron Flops       2.6 PFlops
Disk Bandwidth            ~ 1 TB/s
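The headline Titan figures can be reproduced from the per-node specs. The sketch below uses only numbers from this slide; the function names are illustrative. Note one assumption: the quoted memory totals match when 1 TB is taken as 1024 GB.

```python
NODES = 18_688        # compute nodes
GPU_TFLOPS = 1.31     # per-node Kepler peak
CPU_GFLOPS = 141.0    # per-node Opteron peak
HOST_MEM_GB = 32      # DDR3 per node
GPU_MEM_GB = 6        # GDDR5 per node


def peak_pflops(nodes: int, gpu_tflops: float, cpu_gflops: float):
    """Return (GPU, CPU, total) peak in PFLOPS for the full machine."""
    gpu_pf = nodes * gpu_tflops / 1e3
    cpu_pf = nodes * cpu_gflops / 1e6
    return gpu_pf, cpu_pf, gpu_pf + cpu_pf


def total_memory_tb(nodes: int, gb_per_node: int) -> float:
    """Aggregate memory in TB, assuming 1 TB = 1024 GB."""
    return nodes * gb_per_node / 1024


gpu_pf, cpu_pf, total_pf = peak_pflops(NODES, GPU_TFLOPS, CPU_GFLOPS)
print(f"GPU {gpu_pf:.1f} PF + CPU {cpu_pf:.1f} PF = {total_pf:.1f} PF peak")
print(f"DDR3: {total_memory_tb(NODES, HOST_MEM_GB):.0f} TB, "
      f"GDDR5: {total_memory_tb(NODES, GPU_MEM_GB):.1f} TB")
```

Roughly 90% of the peak comes from the GPUs, which is why a proposal targeting Titan is expected to say how the code will use them.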
17
Cray XK7 compute node
XK7 Compute Node Characteristics
• AMD Opteron 6200 “Interlagos” 16 core processor @ 2.2GHz
• Tesla K20 “Kepler” @ 1.31 TF with 6GB GDDR5 memory
• Memory
– Host: 32GB @ 1600 MHz DDR3
– GPU: 6GB @ 5200 MHz GDDR5
• Gemini High Speed Interconnect
• Four compute nodes per XK7 blade. 24 blades per rack
18
Key questions to ask yourself
• Are both the scale of the runs and the time demands of the problem of LCF scale?
– Yes, I can’t get the amount of time I need anywhere else.
– Yes, my simulations are too large to run on other systems.
• Do you need specific LCF hardware?
– Yes, the memory and I/O available here are necessary for my work.
Do answer these questions in the proposal. This is especially helpful for the computational readiness reviewers.
TIPS
19
Key questions to ask yourself (cont.)
• Do you have the people ready to do this work?
– No, I’m waiting to hire a postdoc.
– Yes, I have commitments from the major participants.
• Do you have a workflow?
• Do you have a post-processing strategy?
• Do you use ensemble runs and need LCF resources?
– My ensembles can run under the direction of a large job, with I/O scaling on a parallel file system. -> possibly yes
– My ensemble expects to run millions of serial jobs on nodes with local disk available. -> probably no
Some of these characteristics are negotiable, so make sure to discuss atypical requirements with the centers
20
Some limitations on what can be done
• Laws regulate what can be done on these systems
– LCF systems have cyber security plans that bound the types of data that can be used and stored on them
• Some kinds of information we cannot have
– Personally Identifiable Information (PII)
– Classified Information or National Security Information
– Unclassified Controlled Nuclear Information (UCNI)
– Naval Nuclear Propulsion Information (NNPI)
– Information about development of nuclear, biological or chemical weapons, or weapons of mass destruction
Inquire if you are unsure or have questions
21
Proposal form: Outline
1 Principal investigator and co-principal investigators
2 Project title (80 characters)
3 Research category
4 Project summary (50 words)
5 Computational resources requested
6 Funding sources
7 Other high-performance computing support for this project
8 Project narrative, other materials
  (A) Executive summary (1 page)
  (B) Project narrative including impact of the work, objectives, benchmarking (15 pages)
  (C) Personnel justification & management plan
  (D) Milestone table
  (E) Publications resulting from INCITE Awards (*new*)
  (F) Request for Information – Data Management Plan (*new*)
9 Application packages
10 Proprietary and sensitive information
11 Export control
12 Monitor information
22
Getting started: Know your audience
• Remember, INCITE is very broad in scope
– Reviewers are computational-science-savvy senior scientists/engineers drawn from around the world, from national labs, universities, and industry
– They will be assessing potential impact of this work versus other proposals submitted
Don’t assume that your audience is familiar with your work through other review programs (e.g., funding agencies). INCITE is very broad in scope and you may be competing against a diverse set of proposals.
TIPS
23
Narrative: Impact of the work
• This is the principal determinant of a successful submittal
– What is the scientific challenge, and its significance
– Impact of a successful computational campaign: the big picture
– Reasons this work needs to be done now, on the resources requested
Do give a compelling picture of the impact of this work, both in the context of your field and, where appropriate, beyond. Do explain why this work cannot be done elsewhere. Reviewers scrutinize whether another allocation program may be a better fit.
TIPS
24
Narrative: Objectives and milestones
• Successful submittals must also very clearly
– Describe approach to solving the problem, its challenging aspects, preliminary results
– Tie your key objectives, key simulations, and the project milestones in your milestone table to the resources requested
Do clearly articulate your project’s milestones for each year. Reviewers have downgraded proposals that don’t show that the PI has a well-thought-out plan for using the allocation. Do bear in mind that the average INCITE award of time for a single project is equivalent to several million dollars. Spend your time on the proposal accordingly.
TIPS
25
Narrative: Computational approach
Provide the basic foundation
• Describe the underlying formulation
– Don’t assume reviewers know all the codes
– Do show that the code you plan to employ is the correct tool for your research plan
– Do explain the differences if you plan to use a private version of a well-known code
• List programming languages, libraries and tools used
– Check that what you need is available on the system
26
Narrative: Computational campaign The details are important!
• Describe the kind of runs you plan with your allocation
– L exploratory runs using M nodes for N hours
– X big runs using Y nodes for Z hours
– P analysis runs using Q nodes for R hours
• Big runs often have big output and/or big I/O
– Show you can deal with it and understand the bottlenecks
– Understand the size of results, where you will analyze them, and how you will get the data there
Do clearly emphasize the relationship between the proposed runs and the major milestones. This helps the Awards Committee maximize your milestones, if they can’t grant the full award requested.
TIPS
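One way to make the "L runs using M nodes for N hours" breakdown concrete is to tabulate the run classes and sum them into the core-hour request. This is only a sketch: the run counts, node counts, and durations below are hypothetical placeholders, and `cores_per_node=16` is an assumption matching the Mira node described earlier. Check each center's own charging policy before converting node-hours to core-hours.

```python
from dataclasses import dataclass


@dataclass
class RunClass:
    """One line of the campaign plan: `count` runs on `nodes` nodes for `hours` each."""
    name: str
    count: int
    nodes: int
    hours: float

    def node_hours(self) -> float:
        return self.count * self.nodes * self.hours


def core_hours(plan: list[RunClass], cores_per_node: int) -> float:
    """Total core-hours for the campaign under a simple cores-per-node charge."""
    return sum(run.node_hours() for run in plan) * cores_per_node


# Hypothetical campaign; replace with the runs tied to your milestones.
plan = [
    RunClass("exploratory", count=20, nodes=512, hours=2.0),
    RunClass("production", count=5, nodes=8192, hours=12.0),
    RunClass("analysis", count=10, nodes=256, hours=4.0),
]

for run in plan:
    print(f"{run.name:12s} {run.node_hours():>10,.0f} node-hours")
print(f"Total request: {core_hours(plan, cores_per_node=16):,.0f} core-hours")
```

Keeping the request as an itemized table like this also makes it easier to show which runs belong to which milestone if only part of the request can be awarded.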
27
Narrative: Personnel & Management plan
• Experience and credibility
– List the scientific and technical members and their experience as related to the proposed scientific or technical goals
– Successful proposal teams demonstrate a clear understanding of petascale computing and can optimally use these resources to accomplish the stated scientific/technical goals
• Transparent use of time
– Projects involving multiple teams or different thrust areas should clearly state how the allocation will be distributed and managed
Do include in “Personnel Justification” a brief description of the role of each team member. Although not a requirement, proposals with application developers or clear connections to development teams are favorably viewed by readiness reviewers.
TIPS
28
Narrative: New for 2014 Call for Proposals
• Publications resulting from INCITE awards
– To show impact of the INCITE program, we ask authors to list the publications from previous INCITE awards to this project team for work related to the proposal under consideration
– Include only publications with INCITE acknowledgements
• Request for Information – Data Management Plan (DMP)
– We plan to implement in future solicitations a requirement for a formal DMP as part of the proposal. Submit a short document, not to exceed one page, which describes your anticipated future data management strategies and needs. [Note: this is for INCITE management and will not be included in the materials sent to reviewers.]
29
Are you ready to apply now?
• Port your code before submitting the proposal
– Check to see if someone else has already ported it
– Request a startup account if needed (see next slide)
• Provide compelling benchmark data
– Prove application scalability in your proposal
– Run example cases at full scale
– If you cannot show proof of runs at full scale, then provide a very tight story about how you will succeed
Do make the benchmark examples as similar to your production runs as possible, or, make it clear why another benchmark example is valid for your proposed work.
TIPS
30
Request a startup account now
• Director’s Discretionary Proposals considered year-round
• Award up to millions of hours
• Allocated by LCF center directors
• Director’s Discretionary (DD) requests can be submitted anytime
• DD may be used for porting, tuning, scaling in preparation for an INCITE submittal
• Submit applications at least 2 months before INCITE Call for Proposals closes
Argonne DD Program: http://www.alcf.anl.gov/getting-started/apply-for-dd
Oak Ridge DD Program: www.olcf.ornl.gov/support/getting-started/olcf-director-discretion-project-application/
31
Computational approach
Use of next-generation systems
Do provide a development plan articulating a strategy for maximizing node-level parallelism.
TIPS
• Use as much of the resources on a node as possible
• Strategies to consider
– Hybridization utilizing OpenMP or Pthreads to expose thread-level or SMP-like parallelism (for multiple cores/HW threads)
– Make use of available accelerators (GPUs) using, e.g., CUDA, OpenCL, compiler directives, etc.
– Algorithmic improvements or design to maximize data locality and memory hierarchy usage
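When planning a hybrid MPI + threads layout, it helps to list the ways a node's hardware threads can be divided between MPI ranks and threads per rank, then benchmark a few of them. A minimal sketch, using the Blue Gene/Q node from the earlier slide (16 cores, 4 hardware threads per core) as the example; the function name is illustrative and the best layout is application dependent.

```python
def node_layouts(cores_per_node: int, hw_threads_per_core: int) -> list[tuple[int, int]]:
    """List (ranks_per_node, threads_per_rank) pairs that use every hardware thread."""
    total_threads = cores_per_node * hw_threads_per_core
    return [
        (ranks, total_threads // ranks)
        for ranks in range(1, total_threads + 1)
        if total_threads % ranks == 0
    ]


# Blue Gene/Q node: 16 cores x 4 HW threads = 64 hardware threads.
layouts = node_layouts(cores_per_node=16, hw_threads_per_core=4)
for ranks, threads in layouts:
    print(f"{ranks:2d} MPI ranks/node x {threads:2d} threads/rank")
```

A development plan can then name the layouts that will be tested and explain which one the production runs are expected to use.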
32
Code performance overview
Do provide performance data in the requested format. Do provide performance of the scaling baseline, not just scaling efficiency
TIPS
• Performance data should support the required scale
– Use similar problems to what you will be running
– Show that you can get to the range of processors required
– Best to run on the same machine, but similar size runs on other machines can be useful
– Be clear about the number of nodes, MPI ranks, threads and GPUs (if applicable) being used in runs
– Include production style I/O in benchmarks (checkpoint/restart, analysis)
– Describe how you will address any scaling deficiencies
33
Parallel performance: Direct evidence
Pick the approach(es) relevant to your work and show results
WEAK SCALING DATA: Increase problem size as resources are increased
STRONG SCALING DATA: Increase resources (nodes) while doing the same computation
[Figure: Weak Scaling Example. Time to solution (m) vs. number of processors (2400 to 76800), actual vs. ideal.]
[Figure: Strong Scaling Example. Time to solution (s) vs. number of processors (2400 to 76800), actual vs. ideal.]
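Both kinds of scaling data reduce to a simple efficiency number relative to a baseline run. The sketch below shows the usual definitions; the timings are hypothetical placeholders, not measurements from any real code, and the function names are illustrative.

```python
def strong_scaling_efficiency(base_procs: int, base_time: float,
                              procs: int, time: float) -> float:
    """Fixed problem size: ideal time falls as 1/procs, so efficiency is
    (base_procs * base_time) / (procs * time)."""
    return (base_procs * base_time) / (procs * time)


def weak_scaling_efficiency(base_time: float, time: float) -> float:
    """Problem size grows with procs: ideal time stays constant, so
    efficiency is base_time / time."""
    return base_time / time


# Hypothetical timings keyed by processor count (seconds for strong, minutes for weak).
strong_times = {2400: 5600.0, 4800: 2900.0, 9600: 1550.0, 19200: 850.0}
weak_times = {2400: 10.0, 4800: 10.1, 9600: 10.3, 19200: 10.6}

base_procs = min(strong_times)
for procs, t in sorted(strong_times.items()):
    eff = strong_scaling_efficiency(base_procs, strong_times[base_procs], procs, t)
    print(f"strong {procs:6d} procs: {t:7.1f} s, efficiency {eff:.2f}")

for procs, t in sorted(weak_times.items()):
    eff = weak_scaling_efficiency(weak_times[base_procs], t)
    print(f"weak   {procs:6d} procs: {t:7.1f} m, efficiency {eff:.2f}")
```

Efficiency alone hides a slow baseline, so report the absolute baseline time alongside it, together with the nodes, MPI ranks, threads, and GPUs used for each point.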
34
More about our ensemble policy or, “Can I meet the computationally intensive criterion by loosely coupling my jobs?”
• Possibly “yes”,
– If you require large numbers of discrete or loosely coupled simulations where time-to-solution is an untenable pacing issue, and
– If a software workflow solution (e.g., pre- and post-processing scripts that automate run management and analysis) is provided to facilitate this volume of work.
• Probably “no”,
– If by decoupling the simulations the work could be effectively carried out on a smaller resource within a reasonable time-to-solution.
Do examine the Frequently Asked Questions for these and other topics at http://hpc.science.doe.gov/allocations/incite/faq.do
TIPS
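For an ensemble that runs under the direction of one large job, a first sizing step is to work out how many members fit in the job at once and how many back-to-back waves are needed. The sketch below is illustrative only: the member count, nodes per member, and job size are hypothetical, and a real workflow tool would also handle failures, I/O, and analysis.

```python
import math


def ensemble_waves(members: int, nodes_per_member: int, job_nodes: int) -> tuple[int, int]:
    """Return (members running concurrently, number of waves) for one large job."""
    concurrent = job_nodes // nodes_per_member
    if concurrent == 0:
        raise ValueError("job is smaller than a single ensemble member")
    waves = math.ceil(members / concurrent)
    return concurrent, waves


# Hypothetical ensemble: 1000 members of 512 nodes each inside an 8192-node job.
concurrent, waves = ensemble_waves(members=1000, nodes_per_member=512, job_nodes=8192)
print(f"{concurrent} members at a time, {waves} waves")
```

If the same calculation shows the ensemble would finish in a reasonable time on a much smaller machine, that is a sign the work falls on the "probably no" side of the policy.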
35
Proposal form: Final check
1 Principal investigator and co-principal investigators
2 Project title (80 characters)
3 Research category
4 Project summary (50 words)
5 Computational resources requested
6 Funding sources
7 Other high-performance computing support for this project
8 Project narrative, other materials
  (A) Executive summary (1 page)
  (B) Project narrative including impact of the work, objectives, benchmarking (15 pages)
  (C) Personnel justification & management plan
  (D) Milestone table
  (E) Publications resulting from INCITE Awards (*new*)
  (F) Request for Information – Data Management Plan (*new*)
9 Application packages
10 Proprietary and sensitive information
11 Export control
12 Monitor information
36
Renewal form: Outline
1 Principal investigator and co-principal investigators
2 Research category
3 Project status summary (1 page)
4 Renewal computational resources requested
5 Project achievements and plans
  (A) Project achievements (1 page)
  (B) Project plans (15 pages)
• Project achievements
– Accomplishments, publications, allocation use, parallel performance, anticipated data storage needs
• Project plans for next year
– What you expect to accomplish, anticipated production-to-development job time, benchmarking data if new codes to be used
38
Submitting your proposal or renewal
• You may save your proposal at any time without having the entire form complete
• Your Co-PIs may also log in and edit your proposal
• Required fields must be completed for the form to be successfully submitted
– An incomplete form may be saved for later revisions
• After submitting your proposal, you will not be able to edit it
Submit
39
INCITE awards committee decisions
• The INCITE Awards Committee is composed of the LCF center directors, INCITE program manager, LCF directors of science, and senior management.
• The committee identifies the top-ranked proposals by a) peer-review panel ratings, rankings, and reports; and b) additional considerations, such as the desire to promote use of HPC resources by underrepresented communities.
• Computational readiness review is used to identify whether the top-ranked proposals are “ready” for the requested system.
40
INCITE awards committee decisions (cont.)
• A balance is struck to ensure
– each awarded project has sufficient allocation to enable all or part of the proposed scientific or technical achievements
– a robust support model for each INCITE project
• When the centers are oversubscribed, each potential project is assessed to determine the amount of time that may be awarded to allow the researchers to accomplish significant scientific goals.
• Requests for appeals can be submitted to the INCITE manager or LCF center directors. If an error has occurred in the decision-making process (e.g. procedural, clerical), consideration is given by the INCITE management and an award may be granted.
41
2014 INCITE award announcements
• Awards will be announced by INCITE Manager, Julia White, in November 2013
• Welcome and startup information from centers
– Agreements to sign: Start this process as soon as possible!
– Getting started materials: Work closely with the center
• Centers provide expert-to-expert assistance to help you get the most from your allocation
– Scientific “Liaisons” and “Catalysts” (OLCF / ALCF)
42
PI responsibilities
• Provide quarterly status updates (on supplied template)
– Milestone reports
– Publications, awards, journal covers, presentations, etc., related to the work
• Provide highlights on significant science/engineering accomplishments as they occur
• Submit annual renewal request
• Complete annual surveys
• Encourage your team to be good citizens on the computers
• Use the resources for the proposed work
Let us know your achievements and challenges
43
It is a small world…
• Let the science agency that funds your work know how significant the INCITE program and the Leadership Computing Facilities will be to your work
• Be sure to include the appropriate acknowledgements • Contact us if you have questions: we want to hear from you
44
Relevant links
INCITE General Information www.doeleadershipcomputing.org/
INCITE Proposal Site proposals.doeleadershipcomputing.org/
Argonne Discretionary Program www.alcf.anl.gov/getting-started/apply-for-dd
Oak Ridge Discretionary Program www.olcf.ornl.gov/support/getting-started/olcf-director-discretion-project-application
Contact the center if you’d like to request Discretionary time for benchmarking
45
Contacts
For details about the INCITE program:
www.doeleadershipcomputing.org – General information proposals.doeleadershipcomputing.org – Proposal site [email protected]
For details about the centers:
www.olcf.ornl.gov [email protected], 865-241-6536
www.alcf.anl.gov [email protected], 866-508-9181
47
Innovative and Novel Computational Impact on Theory and Experiment
Call for Proposals
Contact information Julia C. White, INCITE Manager
The INCITE program seeks proposals for high-impact science and technology research challenges that require the power of the leadership-class systems. Allocations will be for calendar year 2014.
April 15 – June 28, 2013
INCITE is an annual, peer-review allocation program that provides unprecedented computational and data science resources
• 5 billion core-hours to be awarded for 2014 on the 27-petaflops Cray XK7 “Titan” and the 10-petaflops IBM BG/Q “Mira”
• Average award: 50+ million core-hours
• Individual awards will be up to several hundred million core-hours
• INCITE is open to any science domain
• INCITE seeks computationally intensive, large-scale research campaigns
48
Allocation Programs at the LCFs

Mission
– INCITE: High-risk, high-payoff science that requires LCF-scale resources*
– ALCC: High-risk, high-payoff science aligned with DOE mission
– Director’s Discretionary: Strategic LCF goals
Call
– INCITE: 1x/year (closes June)
– ALCC: 1x/year (closes February)
– Director’s Discretionary: Rolling
Duration
– INCITE: 1-3 years, yearly renewal
– ALCC: 1 year
– Director’s Discretionary: 3m, 6m, 1 year
Typical Size
– INCITE: 50 – 70 projects, 50M – 100’s M core-hours/yr.
– ALCC: 10 – 20 projects, 1M – 75M core-hours/yr.
– Director’s Discretionary: 100s of projects, 10K – 1M core-hours
Review Process
– INCITE: Scientific Peer-Review, Computational Readiness
– ALCC: Scientific Peer-Review, Computational Readiness
– Director’s Discretionary: Strategic impact and feasibility
Managed By
– INCITE: INCITE management committee (ALCF & OLCF)
– ALCC: DOE Office of Science
– Director’s Discretionary: LCF management
Availability
– Open to all scientific researchers and organizations
Capability >20% of cores
Distribution of allocable hours
– INCITE: 60%
– ALCC: 30%
– Director’s Discretionary: 10%
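The "Capability >20% of cores" note can be turned into a quick check on whether a planned job counts as a capability run. This sketch assumes the note means a single job using more than 20% of a machine's CPU cores, and that GPUs are not counted; confirm the exact definition with each center. The function and constant names are illustrative.

```python
def is_capability_job(job_cores: int, total_cores: int, fraction: float = 0.20) -> bool:
    """True if the job uses more than `fraction` of the machine's cores."""
    return job_cores > total_cores * fraction


# Core counts derived from the Mira and Titan slides.
MIRA_CORES = 48 * 1024 * 16   # racks x nodes/rack x cores/node
TITAN_CORES = 18_688 * 16     # compute nodes x Opteron cores/node

print(f"Mira threshold:  > {MIRA_CORES * 0.20:,.0f} cores")
print(f"Titan threshold: > {TITAN_CORES * 0.20:,.0f} cores")

# Example: an 8192-node and a 16384-node job on Mira.
for nodes in (8192, 16384):
    flag = is_capability_job(nodes * 16, MIRA_CORES)
    print(f"{nodes} Mira nodes -> capability job: {flag}")
```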
49
A sample of codes with local expertise available at Argonne and Oak Ridge

Application    Field                   ALCF / OLCF
FLASH          Astrophysics            ✓ ✓
MILC,CPS       LQCD                    ✓ ✓
Nek5000        Nuclear energy          ✓
Rosetta        Protein structure       ✓
DCA++          Materials science       ✓
ANGFMC         Nuclear structure       ✓
NUCCOR         Nuclear structure       ✓
Qbox           Chemistry               ✓ ✓
LAMMPS         Molecular dynamics      ✓ ✓
NWChem         Chemistry               ✓ ✓
GAMESS         Chemistry               ✓ ✓
MADNESS        Chemistry               ✓ ✓
CHARMM         Molecular dynamics      ✓ ✓
NAMD           Molecular dynamics      ✓ ✓
AVBP           Combustion              ✓
GTC            Fusion                  ✓ ✓
Allstar        Life science            ✓
CPMD, CP2K     Molecular dynamics      ✓ ✓
CESM           Climate                 ✓ ✓
CAM-SE         Climate                 ✓ ✓
WRF            Climate                 ✓ ✓
Amber          Molecular dynamics      ✓ ✓
enzo           Astrophysics            ✓ ✓
Falkon         Computer science/HTC    ✓ ✓
s3d            Combustion              ✓
DENOVO         Nuclear energy          ✓
LSMS           Materials science       ✓
GPAW           Materials science       ✓