FutureGrid Training, Education and Outreach

Bloomington, Indiana
January 17 2010

Presented by Renato Figueiredo (renato@acis.ufl.edu)
Associate Professor, University of Florida

https://portal.futuregrid.org

Overview

• Traditional ways of delivering hands-on training and education in parallel/distributed computing have non-trivial dependences on the environment

• Difficult to replicate same environment on different resources (e.g. HPC clusters, desktops)

• Difficult to cope with changes in the environment (e.g. software upgrades)

• Virtualization technologies remove key software dependences through a layer of indirection

Overview

• FutureGrid enables new approaches to education and training and opportunities to engage in outreach
  – Cloud, virtualization and dynamic provisioning: the environment can adapt to the user, rather than expecting the user to adapt to the environment
• Focus of FutureGrid TEO is on leveraging the unique capabilities of the infrastructure and its software to:
  – Reduce barriers to entry and engage new users
  – Use encapsulated environments (“appliances”) as a primary delivery mechanism of education/training modules, promoting reuse, replication, and sharing

Summary of activities (1)

• Focus activities in the first year
  – Infrastructure supporting TEO activities
    • Documentation, integration of educational materials, input/recommendations for portal and computing infrastructure
    • Development of hands-on tutorials tailored to FutureGrid technologies and resources
    • Development, integration, testing of educational virtual appliances

Summary of activities (2)

• Focus activities in the first year
  – Education activities
    • Working with early adopters in class environments
    • Understand requirements, opportunities, challenges
  – Outreach activities
    • Demonstrations and presentations highlighting FutureGrid’s unique capabilities in conferences, workshops
    • Engaging with minority serving institutions

TEO Infrastructure - guiding principles

• Fidelity: TEO activities should use full-fledged, executable software: education/training modules
  – Learn using the proper tools
• Reproducibility: Creators of content should be able to install, configure, and test their modules once, and be assured of the same functional behavior regardless of where the module is deployed
  – Incentive to invest effort in developing, testing and documenting new modules

TEO Infrastructure - guiding principles

• Deployability: Students and users should be able to deploy modules in a simple manner, and in a variety of resources
  – Reduce barriers to entry; avoid dependences upon a particular infrastructure
• Community-oriented: Modules should be simple to share, discover, reuse, and expand
  – Create conditions for “viral” growth

Towards this vision in FutureGrid

• Executable modules: virtual appliances
  – Deployable on FutureGrid resources
  – Deployable on other cloud platforms, as well as virtualized desktops
• Community sharing: Web 2.0 portal, appliance image repositories
  – An aggregation hub for executable modules and documentation

Educational appliances

• A flexible, extensible platform for hands-on, lab-oriented education on FutureGrid
• Need to support clustering of resources
• Virtual machines + social/virtual networking to create sandboxed modules
  – Virtual “Grid” appliances: self-contained, pre-packaged execution environments
  – Group VPNs: simple management of virtual clusters by students and educators

Virtual appliance example

• Linux, Java, Hadoop, configuration scripts

[Figure: a Hadoop image is copied and instantiated on the virtualization layer to create a Hadoop worker; repeat to create another Hadoop worker.]

Virtual Networking

• A single appliance encapsulates software and configuration

• Cluster/Grid/Cloud computing
  – Middleware expects a collection of machines, typically on a LAN (Local Area Network)
  – Appliances need to communicate and coordinate with each other
  – Each worker needs an IP address, uses TCP/IP sockets
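The requirement that each worker has an IP address and communicates over TCP/IP sockets can be illustrated with a minimal sketch. This is not FutureGrid or Hadoop code: both endpoints run in one process on the loopback address 127.0.0.1, and the names `run_worker`, `ask_worker` and `demo` are hypothetical. On a virtual cluster the address would be the worker's virtual IP instead.

```python
import socket
import threading


def run_worker(server_sock):
    """Accept one connection and answer a single request (hypothetical worker)."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"done:{request}".encode("utf-8"))


def ask_worker(host, port, task):
    """Connect to a worker by IP address and port, send a task, return the reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(task.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo():
    # Assumption: loopback stands in for a worker's virtual IP address.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)           # do not block forever if no client connects
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    host, port = server.getsockname()
    worker = threading.Thread(target=run_worker, args=(server,))
    worker.start()
    try:
        return ask_worker(host, port, "task-1")
    finally:
        worker.join()
        server.close()


if __name__ == "__main__":
    print(demo())
```

Middleware such as Hadoop or Condor relies on this kind of addressability between every pair of nodes, which is why appliances spread across sites need a virtual network.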

Virtual cluster appliances

• Virtual appliance + virtual network

[Figure: a "Hadoop + virtual network" image is copied and instantiated as a virtual machine to create a Hadoop worker; repeat to create another Hadoop worker. The workers are connected by a virtual network.]

Support for clustering

• Network virtualization software on FutureGrid includes ViNe and GroupVPN

• Nimbus has support for contextualization of one-click virtual clusters
  – Within a LAN, or coupled with ViNe

• Grid appliances use peer-to-peer overlay for discovery and configuration of virtual addresses (DHCP) and cluster middleware

GroupVPN Overview

[Figure: Alice, Bob and Carol join a group through a social network Web interface (e.g. XMPP, group site). A social network API and a messaging layer/information system carry Alice's, Bob's and Carol's public keys. Their appliances are connected by a virtual network with addresses 10.10.0.2, 10.10.0.3 and 10.10.0.4.]

• Bootstrapping private links through Web 2.0 interfaces and IP-over-P2P overlay tunneling
• Private IP address spaces, DHCP
• Appliances perceive virtual LAN
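The idea of a private IP address space with DHCP-style assignment can be sketched as follows. This is an illustrative model only, not GroupVPN's implementation: the class `VirtualAddressPool` is hypothetical, and reserving the first host address (so that leases start at 10.10.0.2, as in the figure) is an assumption.

```python
import ipaddress


class VirtualAddressPool:
    """Hands out addresses from a private subnet, one per appliance (illustrative)."""

    def __init__(self, cidr):
        self._network = ipaddress.ip_network(cidr)
        self._free = iter(self._network.hosts())  # usable host addresses, in order
        next(self._free)  # assumption: the first host address is reserved
        self._leases = {}

    def lease(self, appliance_id):
        """Return the appliance's address, allocating a new one on first request."""
        if appliance_id not in self._leases:
            # Raises StopIteration when the subnet is exhausted.
            self._leases[appliance_id] = next(self._free)
        return self._leases[appliance_id]


if __name__ == "__main__":
    pool = VirtualAddressPool("10.10.0.0/24")
    for user in ("alice", "bob", "carol"):
        print(user, pool.lease(user))
```

Because the address space is private to the group, the same range can be reused by unrelated groups without conflict.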

Deploying virtual clusters

• Same image, different VPNs

[Figure: a "Hadoop + virtual network" image is copied and instantiated as virtual machines. Each instance loads GroupVPN credentials (from Web site) and joins the GroupVPN, obtaining a virtual IP through DHCP: 10.10.1.1 for one Hadoop worker and 10.10.1.2 for another. Repeat for more workers.]
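"Same image, different VPNs" can be modeled in a few lines: every appliance is a copy of one image, and only the GroupVPN credentials it loads decide which virtual cluster it joins. The sketch below is a toy model under that reading; the names `Appliance`, `peers`, `build_fleet`, the image name and the group names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    name: str
    image: str      # the shared appliance image
    vpn_group: str  # taken from the GroupVPN credentials loaded at boot


def peers(appliance, all_appliances):
    """Appliances visible to `appliance`: same VPN group, excluding itself."""
    return [a for a in all_appliances
            if a.vpn_group == appliance.vpn_group and a != appliance]


def build_fleet():
    image = "hadoop-appliance"  # hypothetical image name
    return [
        Appliance("w1", image, "class-A"),
        Appliance("w2", image, "class-A"),
        Appliance("w3", image, "class-B"),
    ]


if __name__ == "__main__":
    fleet = build_fleet()
    for node in fleet:
        print(node.name, "sees", [p.name for p in peers(node, fleet)])
```

Keeping the image identical and moving all group-specific state into the credentials is what lets one tested appliance serve many independent classes.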

FutureGrid example

• Deploying a Condor virtual appliance cluster on FutureGrid or desktop resources

Nimbus: cloud-client.sh --run --name grid-appliance-amd64.tar.gz

Eucalyptus: euca-run-instances ami-fd4aa494 --instance-type m1.large -k keypair

VMware Player: double-click Grid-appliance.vmx

Upload GroupVPN configuration file to appliances
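A small helper can make the per-platform launch step scriptable. The sketch below only builds argument lists from the two command lines shown on the slide and does not run anything; the function name `launch_argv` is hypothetical, and a real deployment may need further options that the slide does not show.

```python
import shlex

# Command lines copied verbatim from the slide; nothing is executed here.
LAUNCH_COMMANDS = {
    "nimbus": "cloud-client.sh --run --name grid-appliance-amd64.tar.gz",
    "eucalyptus": "euca-run-instances ami-fd4aa494 --instance-type m1.large -k keypair",
}


def launch_argv(platform):
    """Return the launch command for a platform as an argument list."""
    key = platform.lower()
    if key not in LAUNCH_COMMANDS:
        raise ValueError(f"no launch command recorded for platform {platform!r}")
    return shlex.split(LAUNCH_COMMANDS[key])


if __name__ == "__main__":
    for name in LAUNCH_COMMANDS:
        print(name, "->", launch_argv(name))
```

The resulting list could be handed to `subprocess.run` on a machine where the corresponding client tools are installed and configured.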

FG appliances - Status

[Figure labels: Nimbus, Eucalyptus; appliance image; FutureGrid resources; appliance images (Condor, Hadoop), tutorials; GroupVPN portal, image downloads, bootstrap routers]

Use of FutureGrid in classes

• First-year ramp-up of hardware and software
  – Training and education emphasis has been use in classes, tutorials with early adopters
• Highlights:
  – Cloud computing class at Indiana University
  – Distributed Scientific Computing class at Louisiana State University (LSU)
  – Big data summer school at IU
  – Nimbus tutorial at CloudCom conference

Big Data for Science

July 26-30, 2010 NCSA Summer School Workshop
http://salsahpc.indiana.edu/tutorial

300+ Students (200 on sites from 10 institutes; 100 online). IU MapReduce and UF Virtual Appliance technologies are supported by FutureGrid.

[Figure: map of sites: University of Arkansas, Indiana University, University of California at Los Angeles, Penn State, Iowa State, Univ. Illinois at Chicago, University of Minnesota, Michigan State, Notre Dame, University of Texas at El Paso, IBM Almaden Research Center, Washington University, San Diego Supercomputer Center, University of Florida, Johns Hopkins]

(Slide courtesy of Judy Qiu)

Cloud computing class at IU

• Graduate-level “Cloud computing for Data-Intensive Sciences” (Judy Qiu, Fall 2010)
  – Virtualization technologies and tools
  – Infrastructure as a service
  – Parallel programming (MPI, Hadoop)
  – FutureGrid provided a set of software options that made it possible for students to work on different projects along the system stack.

Term Projects

[Figure: term projects placed along the system stack. Stack layers: Higher Level Languages; Cloud Platform; Cloud Infrastructure; Hypervisor/Virtualization]

• Dryad/DryadLINQ: #1 Matrix Multiplication (Swapnil, Amit, Pradnay); #2 PhyloD (Ratul, Adrija, Chengming)
• Iterative MapReduce: #3 LDA (Changsi, Yang); #4 MemCache (Saliya, Yiming, Jerome); #5 Avro (Yuduo, Yuan, patanachai); #6 PageRank (Shuo-Huan, Parag)
• Cloud Infrastructure: #7 Nimbus, Eucalyptus (Stephen, Sonali, Shakeela)
• Cloud Storage: #8 Cloud Storage Survey (Xiaoming, Nixiaogang)
• Virtualization: #9 Hypervisor Performance Analysis Project (James, Andrew)

(Slide courtesy of Judy Qiu)

Distributed Scientific Computing class at LSU

• FutureGrid supported activities in a new semester-long class offered Fall 2010 at LSU (Gabrielle Allen, Shantenu Jha)

• A practical and comprehensive graduate course preparing students for research involving scientific computing
  – Module E (Distributed Scientific Computing) taught by Shantenu Jha
  – Topics where FutureGrid was used:
    • Introduction to the practice of distributed computing
    • Cloud computing and master-worker pattern
    • Distributed application case studies

• Approximately half of a lecture provided an overview of FutureGrid and the process to get accounts and get started

• As part of the homework assignment associated with lecture E0, each student had to confirm access and successful login to FG-Sierra and FG-India

Distributed Scientific Computing class at LSU

• FutureGrid (FG) was used by students to:
  (i) compile, deploy and execute basic SAGA commands
  (ii) learn the basics of remote job submission and elementary Master-Worker based distributed applications (such as MapReduce and computing the Mandelbrot Set) using FG-India and FG-Sierra nodes
  (iii) get hands-on training with IaaS Clouds, namely stand up virtual machines using Eucalyptus and deploy software and/or applications from (i) and (ii)
• Students also used Eucalyptus on FG-India and FG-Sierra to do their Module E projects, which included:
  – (a) Clouds as accelerators for Cactus-based applications
  – (b) calculating pi using distributed tasks
  – (c) extending the calculation of the Mandelbrot Set to "new" backends on FutureGrid (in addition to the "default" remote/ssh backends)
  – (d) executing workers on bare metal as well as Clouds concurrently (i.e., hybrid Grid-Cloud infrastructure) for master-worker applications
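The master-worker pattern behind projects such as calculating pi with distributed tasks can be sketched in a few lines. This is not the course's SAGA code: local threads stand in for remote workers, the integration method (midpoint rule on 4/(1+x^2) over [0, 1]) is one common choice picked for illustration, and the names `partial_pi` and `estimate_pi` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_pi(task):
    """Worker: midpoint-rule integral of 4 / (1 + x^2) over steps [start, stop)."""
    start, stop, total_steps = task
    width = 1.0 / total_steps
    total = sum(4.0 / (1.0 + ((i + 0.5) * width) ** 2) for i in range(start, stop))
    return total * width


def estimate_pi(total_steps=100_000, num_tasks=8, num_workers=4):
    """Master: split the range into tasks, farm them out, and add up the results."""
    bounds = [total_steps * k // num_tasks for k in range(num_tasks + 1)]
    tasks = [(bounds[k], bounds[k + 1], total_steps) for k in range(num_tasks)]
    # Assumption: a local thread pool stands in for workers on remote nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(partial_pi, tasks))


if __name__ == "__main__":
    print(estimate_pi(), math.pi)
```

Because the tasks are independent, the same master logic applies whether workers run on bare-metal nodes, on cloud virtual machines, or on a mix of both; only the mechanism that dispatches a task changes.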

Images

• IMAGE emi-8D2A13F7 smaddi2-saga-bucket/saga153-ubuntu.manifest.xml smaddi2 available public x86_64 machine eri-5BB61255 eki-78EF12D2

• IMAGE emi-DBD61078 ubuntu-0904-saga-1.5.2/image.manifest.xml luckow available public x86_64 machine eri-5BB61255 eki-78EF12D2

• IMAGE emi-0E0E165E ajyounge/ubuntu-twister-memcached.img.manifest.xml ajyounge available public x86_64 machine eri-5BB61255 eki-78EF12D2
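Image listings like the ones above are whitespace-separated records, so they are easy to parse when scripting against a set of registered images. The sketch below is a minimal parser; the field names in `ImageRecord` are assumptions inferred from the values on the slide (for example, that the `eri-` identifier is a ramdisk and the `eki-` identifier is a kernel), and `parse_image_line` is a hypothetical helper.

```python
from collections import namedtuple

# Field names are assumptions inferred from the values shown on the slide.
ImageRecord = namedtuple(
    "ImageRecord",
    "image_id manifest owner state visibility arch image_type ramdisk_id kernel_id",
)


def parse_image_line(line):
    """Parse one whitespace-separated `IMAGE ...` line into an ImageRecord."""
    fields = line.split()
    if len(fields) != 10 or fields[0] != "IMAGE":
        raise ValueError(f"unexpected image line: {line!r}")
    return ImageRecord(*fields[1:])


if __name__ == "__main__":
    sample = ("IMAGE emi-DBD61078 ubuntu-0904-saga-1.5.2/image.manifest.xml "
              "luckow available public x86_64 machine eri-5BB61255 eki-78EF12D2")
    record = parse_image_line(sample)
    print(record.image_id, record.owner, record.arch)
```

A course could use such records to select, for instance, all public images owned by an instructor before launching student instances.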

Nimbus tutorial at CloudCom

• Half-day (3-hour) presentation + hands-on activities
  – 30 attendees used their own computers to instantiate virtual machines on FutureGrid resources
  – Template for a self-learning tutorial for new users and prospective users

FutureGrid tutorials

• Tutorial topic 1: Cloud Provisioning Platforms
  – Using Nimbus on FutureGrid
  – Nimbus One-click Cluster Guide
  – Using the Grid Appliances to run FutureGrid Cloud Clients
  – Using Eucalyptus on FutureGrid
• Tutorial topic 2: Cloud Run-time Platforms
  – Introduction to Hadoop using the Grid Appliance
  – Running Hadoop on FG using Eucalyptus (.ppt)
  – Running Hadoop on Eucalyptus
• Tutorial topic 3: Educational Virtual Appliances
  – Introduction to the Grid Appliance
  – Creating Grid Appliance Clusters
  – Building an educational appliance from Ubuntu 10.04
  – Deploying Grid Appliances using Nimbus
  – Deploying Grid Appliances using Eucalyptus
  – Customizing and registering Grid Appliance images using Eucalyptus
  – MPI Virtual Clusters with the Grid Appliances and MPICH2
• Tutorial topic 4: High Performance Computing
  – Performance Analysis with Vampir
  – Instrumentation and tracing with VampirTrace

Year-1 Outreach activities

• Demonstrations, presentations, booths at major events
  – SuperComputing, TeraGrid Conference, OGF (Open Grid Forum), CloudCom, CCGrid, Grid’5000 meeting, Vampir workshop

[Figure: 1114 CPU cores (457 VMs) distributed over 3 sites in FutureGrid and 3 sites in Grid’5000 (P. Riteau et al., OGF-29 demo, Chicago, IL, June 2010).]

Outreach activities

• At IU, working with dean for diversity and education to organize outreach and pursue REU funding to bring MSI students to IU for summer internships and to coordinate education and training workshops

• Involvement of students from Historically Black Colleges and Universities (HBCUs)
  – REU supplement for FutureGrid this year funded 2 HBCU students in summer 2010; will apply each year

Planned TEO activities

• Plan to engage MSIs with which IU has already established formal collaborative agreements
  – MSI Cyberinfrastructure Empowerment Coalition (MSI-CIEC). Primary theme: “teach the teachers” at MSIs so that they can incorporate cyberinfrastructure into their research and involve students and staff at their home institutions.
  – MSI-CIEC’s principal activity: Cyberinfrastructure Days, daylong workshops that feature prominent speakers who discuss the application of cyberinfrastructure to research and education

Planned TEO activities

• With Elizabeth City State University
  – Planning summer school on cloud computing for ADMI (Association of Computer/Information Sciences and Engineering Departments at Minority Institutions) faculty and students
• Leverage Indiana University’s STEM Initiative
  – Provides travel, housing, and support for HBCU students to intern at Indiana University during the summer

Planned TEO activities

• Coordinate Web tutorials and documentation; emphasis to support short tutorials that can be given by partners at conferences, and self-guided learning by new or prospective users

• Continuously provide recommendations and guidance, Web portal, user accounts

• Engage with potential early adopters in computer science and engineering classes

• Leverage existing MSI contacts, and use of FutureGrid in workshops, summer schools, and internships