Transcript of: Powering “Clouds” – the Value Proposition
NCDG/V2c/Sep-09 1
Powering “Clouds”- the Value Proposition -
Mladen A. Vouk
Professor and Department Head of Computer Science, and
Associate Vice-Provost for Information Technology
North Carolina State University, Raleigh, NC 27695
NCDG/V2c/Sep-09 2
About the Speaker
• Mladen A. Vouk received his Ph.D. from King's College, University of London, U.K. He has extensive experience in both commercial software production and academic computing. He is the author or co-author of over 300 publications. His research and development interests include software engineering, scientific computing and workflows, information technology (IT) assisted education, and high-performance computing and networks. Dr. Vouk has extensive professional visibility through organization of professional meetings, membership on professional journal editorial boards, and professional consulting. He is a member of the IFIP Working Group 2.5 on Numerical Software and a recipient of the IFIP Silver Core award. He is an IEEE Fellow, and a member of several IEEE societies, ASQ, ACM, and Sigma Xi.
Department Head and Professor of Computer Science, and Associate Vice-Provost for Information Technology at N.C. State University, Raleigh, N.C., [email protected]
NCDG/V2c/Sep-09 3
A seamless component-based architecture that can deliver an integrated, orchestrated and rich suite of both loosely and tightly coupled on-demand information technology functions and services, and significantly reduce overhead and total cost of ownership and services.

Server consolidation, hardware abstraction via virtualization, resource management, reliability and availability, security, cost reduction, ...
[Chart: Google Trends (9/2/09) for the search term “Clouds”.]
NCDG/V2c/Sep-09 4
Disruptive Information Technologies
[Chart: number of CSC students enrolled at NC State (Ugrad, Grad, CSC-T), 1975–2015, overlaid with disruptive information technologies: Apple, GUI, PC, TCP/IP; the Internet grows; the Web; the Dot.Com crash; personal devices; access to the Net and Net-based services grows; integrated & utility devices; “Clouds” – the progressive enablement of end-users.]
NCDG/V2c/Sep-09 5
Brief History
• “Cloud” computing builds on decades of research in virtualization, distributed computing, utility computing, grids, and more recently networking, web and software services.
  – Virtualization (since 1960s)
  – Distributed Computing (1988-1990)
  – Web (1989-1993)
  – Service Oriented Architectures (1995-2005)
  – Grids (1996-1999)
  – Virtual Computing Laboratory – Aug 2004
  – Amazon Elastic Compute Cloud – Aug 2006
  – Hadoop/MapReduce (cca 2007)
  – IBM/Google Cloud (Oct 2007)
  – IBM Blue Cloud (Nov 2007)
  – Many other “Clouds”
NCDG/V2c/Sep-09 6
“Cloud Architecture”
[Diagram: a client (end-user) reaches (virtualized) resources & services through portal access, content, and services, supported by authentication/authorization/accounting, provenance meta-data, fault tolerance, privacy & security, and other attributes.]
NCDG/V2c/Sep-09 7
An Implementation – NC State University Virtual Computing Laboratory (VCL)
VCL is Open Source – developed by NCSU OIT, COE and CSC
http://incubator.apache.org/projects/vcl.html
Partnerships with IBM, Intel, NetApp, Cisco, SAS, UNCGA, State of NC, NCCCS, MCNC, Friday Institute, SOSI labs, and others.
Bootstrapping reference: http://vcl.ncsu.edu/news/awards-and-recognition/apache-vcl-ncsu-featured-ieee-computer-magazine
http://vcl.ncsu.edu
NCDG/V2c/Sep-09 8
VCL Research and Development Team
• Core NCSU team: Sam Averitt, Michael Bugaev, Patrick Dreher (RENCI), Andy Kurth, Marc Hoit, Aaron Peeler, Henry Shaffer, Eric Sills, Sarah Stein, Josh Thompson, Mladen Vouk, Brian Bouterse, John Bass, Shawn VanHulst, …
• Many others at NCSU (faculty, students, staff)
• Many others at other sites and other organizations
NCDG/V2c/Sep-09 9
VCloud Community
[Map: VCL community connectivity – OC12 (622 Mbps circuit), OC48 (2.4 Gbps circuit), DWDM (10 Gbps Ethernet). Sites at various stages (research; production/pilots/users; interest/plans) include U. South Carolina, Clemson, VTech, ODU, UMBC, BC, WFU, NCA&T, the NC Community College System, NC K-12, Toronto, Queens, Waterloo, Carleton, and in India Amrita U., U. Hyderabad, and HBTI-UPTU.]
NCDG/V2c/Sep-09 10
NC State Computational “Cloud” is powered by VCL
[Diagram: clients reach the VCL Manager & Scheduler over the Internet, backed by the VCL database and an authentication service. The manager drives node managers #1..#n, each with an image repository and storage, which control virtual or real differentiated resources (z-Series, TeraGrid, university labs) as well as virtual or real undifferentiated resources.]
NCDG/V2c/Sep-09 11
NC State Cloud Services
[Diagram: service types – Single Seat (VCL-Desktop), Multiple Synced Seats (VCL-Class), Servers (VCL-Server), Aggregates (VCL-Cloud), and HPC Clusters (VCL-HPC) – delivered as actual sole-use bare-metal based, or virtual HaaS, IaaS, PaaS, AaaS, [SaaS, CaaS]. A VCL agent mediates access to undifferentiated resources and storage, and to differentiated resources such as supercomputers, System z (mainframes), and others.]
NCDG/V2c/Sep-09 12
[Diagram: the VCL “application” image stack. Undifferentiated local or distributed hardware resources (blades, servers, desktops, storage, …) carry either a native OS (Windows, Linux, other) or a virtual layer (e.g., VMware, XEN, MSVS2500, …) hosting guest OSs; above these sit applications (e.g., WebSphere), middleware (e.g., LSF), workflow services, visualization services, and other services, reached by end-users via RDP, VNC, X-Win clients, etc. The VCL Manager (xCAT, VCL code, IBM TM, web server, database, etc.) maps users to “images” and images to resources.
Differentiator: user-to-image-to-resource mapping, management & provenance. Reliability, component-based design, scalability, economy. Images & environments.]
NCDG/V2c/Sep-09 13
Business Model
Current VCL (at NC State University):
1. cca 2,000 blades
2. open to 30,000+ students and faculty
3. cca 500 to 700 in non-HPC mode, the rest in HPC mode
4. Environment base-lines are typically Windows and Linux with a variety of applications. Depending on how demanding an application is, service may be virtualized (VMware) or bare-metal.
5. Currently cca 600 images, cca 120 in use per semester.
6. About 80,000-100,000 image reservations per semester.
7. Most of the “individual seat” requests are on-demand “Now” reservations: cca 90% of requests.
8. System availability: exceeds 99%.
NCDG/V2c/Sep-09 14
Cost Factors
• Utilization (real-time + batch mix), operational profile
• Lab spaces (25:1) – in 2008/09 cca 160,000 non-HPC reservations (real-time), cca 7 million HPC CPU hrs (batch)
• Refresh cycle (yearly), resource lifetime (cca 5 years) – yearly down-migration of resources
• Power savings (Blades)
• Architectural savings (e.g., NCCCS)
• Reduced administration and maintenance costs (1-2 FTEs for about 2,000 blades)
• One stop shopping (augmentation)
• Distributed burden of image creation (600+ images)
• “Green”
• Other …
NCDG/V2c/Sep-09 15
VCL Use
[Chart: VCL Reservations by Day, 9/1/2004 – 3/1/2009; number of reservations per day, scale 0–2,000. 600 images available; cca 100,000 reservations per year.]
NCDG/V2c/Sep-09 16
Capacity Planning
[Charts: concurrent VCL reservations, 9/1/2004 – 3/1/2009 (scale 0–800); and daily total vs. concurrent reservations for November 2008 (scale 0–1,600).]
NCDG/V2c/Sep-09 17
Green & Cost-Effective
[Charts: average number of reservations by time of day (24 hr clock) for November 2008 – average daily active reservations, scale 0–140; and monthly high-performance computing CPU hours, Mar-08 through Feb-09, scale 0–800,000.]
NCDG/V2c/Sep-09 18
Economics
• In 2008, about 7,200,000 CPU hours (about 6.9 million on HPC and about 300,000 on non-HPC) on about 1,500 blades (cca 3,000 processors) – up to about 1,000 in HPC mode.
• About 70-80% utilization on the average, but lower on the non-HPC side (over-provisioned to handle peak loads), high on the HPC side.
• About $2 million annually (refresh, management and maintenance, improvements, personnel, …).
• About 27 cents or less per CPU hour (cca 3 cents HPC, 24 cents or less non-HPC).
• This can come down to 10 to 15 cents per CPU hour – and lower – with scale-up, large-scale virtualization, and new hardware (moving to quad-core processors).
NCDG/V2c/Sep-09 19
Case Study: Wake Tech Community College
• 60,000+ students
• Pilot project with cca 800 students
  – Several introductory class laboratories.
  – Using VCL with about 60 blades, no bare-metal loads (virtualization using VMware)
• Lab cost savings: cca 50%
• NCCCS ramping up VCL to 14+ Community Colleges
NCDG/V2c/Sep-09 22
VCL Usage 2004-2008
Non-HPC:
  Total Reservations: 352,488
  "Now" Reservations: 338,245
  "Later" Reservations: 24,876
  Unavailable or failed: 10,633
  Failed: 5,080
  Reliability: 0.969 – 0.985
Non-HPC reservations by duration:
  0 - 30 Min: 132,052
  30 Min - 1 Hour: 77,023
  1 Hour - 2 Hours: 75,809
  2 Hours - 4 Hours: 54,922
  > 4 Hours: 23,315
[Charts: HPC vs. non-HPC usage split.]
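The reliability range quoted above can be reproduced from the counts on the slide, assuming the lower bound treats every unavailable-or-failed reservation as a failure and the upper bound counts only outright failures (my reading of the slide, sketched below):

```python
# Reliability bounds for non-HPC reservations, 2004-2008 (slide figures).
total = 352_488
unavailable_or_failed = 10_633
failed = 5_080

low = 1 - unavailable_or_failed / total   # pessimistic: any unavailability counts
high = 1 - failed / total                 # optimistic: only outright failures count
print(f"reliability: {low:.4f} - {high:.4f}")
# ≈ 0.9698 – 0.9856; the slide truncates this to 0.969 – 0.985
```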
NCDG/V2c/Sep-09 23
VCL Usage 1-Jul-07 to 30-Jun-08
Non-HPC:
  Total Reservations: 130,800
  Total Hours Used: 198,583
  "Now" Reservations: 125,278
  "Later" Reservations: 11,436
  Unavailable + Failed: 5,914
  Failed: 1,611
  Reliability: 0.955 – 0.988
  Load times < 2 minutes: 109,223
  Load times >= 2 minutes: 21,577
Non-HPC reservations by duration:
  0 - 30 Min: 48,614
  30 Min - 1 Hour: 31,014
  1 Hour - 2 Hours: 27,421
  2 Hours - 4 Hours: 22,222
  > 4 Hours: 7,443
[Charts: HPC vs. non-HPC usage split.]
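A derived figure worth noting: dividing the total hours by the total reservations on this slide gives the average session length, which matches the short-duration skew of the reservation histogram:

```python
# Average non-HPC reservation length, 1-Jul-07 to 30-Jun-08 (slide figures).
total_reservations = 130_800
total_hours = 198_583

avg = total_hours / total_reservations
print(f"average reservation length: {avg:.2f} hours")  # about 1.52 hours
```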
NCDG/V2c/Sep-09 24
November 2008, Non-HPC (cca 500 blades):
  Total Reservations: 20,686
  Total Hours Used: 31,853
  "Now" Reservations: 19,770
  "Later" Reservations: 1,933
  Unavailable + Failed: 1,017
  Failed: 429
  Reliability: 0.950 - 0.979
  Load times < 2 minutes: 17,013
  Load times >= 2 minutes: 3,673
  Total Unique Users: 4,095
Reservations by duration:
  0 - 30 Min: 5,959
  30 Min - 1 Hour: 5,069
  1 Hour - 2 Hours: 5,604
  2 Hours - 4 Hours: 3,224
  > 4 Hours: 1,847
[Charts: daily total and concurrent reservations; average daily figures.]
NCDG/V2c/Sep-09 25
Awards
• “NC State Cloud Computing Services” received the 2009 “Laureate Medal” from the Computerworld Honors Program, Computerworld Information Technology Awards Foundation.
• “Virtual Computing Laboratory (VCL)” received the 2007 “Laureate Medal” from the Computerworld Honors Program, Computerworld Information Technology Awards Foundation.
• Finalist in the 2007 Best Practices in Infrastructure Management – Computerworld – Infrastructure Management World.
NCDG/V2c/Sep-09 26
VCL Configurations
NCDG/V2c/Sep-09 27
VCL Components
• Web Interface/Scheduler
• Database
• Management node
• Servers

LAMP (Linux/Apache/MySQL/php/perl) server: VCL scheduler code and DB schema
xCAT & VCL management node code
Servers – physical and/or virtual – to be managed by VCL
NCDG/V2c/Sep-09 28
Small VCL Configuration
• 1 BladeCenter E chassis
  – 2 Ethernet Switch Modules (BNT Layer 2/3 copper)
  – Power supplies 3&4 (for 7 or more blades)
  – Chassis network module to connect management node to storage
    • Fiber Channel – optical pass-through
    • iSCSI – copper pass-through
• 2-14 HSxy blades
  – At least one blade configured to attach to external storage for the Image Library (FC, iSCSI, …)
  – Server for scheduler, database, and management node
  – Server(s) to deliver VCL services
• Storage for images
  – FC or iSCSI storage array (few TB)
NCDG/V2c/Sep-09 29
Small VCL Configuration
[Diagram: a single BladeCenter chassis with two Ethernet Switch Modules (ESM), an Optical Pass-thru Module (OPM), and a Management Module (MM).]
NCDG/V2c/Sep-09 30
Scaling BladeCenter VCL Configuration
• Network switch
  – Cisco 6509e (or equivalent in your favorite network vendor flavor)
  – 3 separate networks (at least):
    • Network connected to the Internet for user access
    • Private network connected to the VCL management node (for loading and managing images)
    • Private management network (connecting BladeCenter Management Modules and the VCL management node – controls power on/off, reboot, …)
• VCL management nodes
  – One management node for every ~100 blades
  – Physical connection to storage array – shared file system (GFS, GPFS) for multiple management nodes at one site
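The one-management-node-per-~100-blades rule of thumb makes rough sizing easy to automate. A minimal sketch (the function name is mine; the ~84 servers/rack default is the BladeCenter/iDataPlex density quoted later in this deck):

```python
import math

def vcl_sizing(blades: int, blades_per_mgmt_node: int = 100,
               servers_per_rack: int = 84) -> dict:
    """Back-of-the-envelope sizing for a VCL deployment.

    blades_per_mgmt_node follows the ~100-blade rule of thumb on this
    slide; servers_per_rack is the ~84 physical servers/rack figure
    given for BladeCenter/iDataPlex hardware.
    """
    return {
        "management_nodes": math.ceil(blades / blades_per_mgmt_node),
        "racks": math.ceil(blades / servers_per_rack),
    }

# NC State's cca 2,000-blade installation would need on the order of:
print(vcl_sizing(2000))  # {'management_nodes': 20, 'racks': 24}
```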
NCDG/V2c/Sep-09 31
Scaling VCL
[Diagram: multiple BladeCenter chassis connected through GigE switches to three networks: the public network, a private network, and a private management network.]
NCDG/V2c/Sep-09 32
HPC Cluster in VCL
• Network switch
  – Add another private network for message-passing traffic – use the NIC that would otherwise serve public-network user access
• BladeCenter chassis
  – Configure two VLANs in one chassis switch module: one for public Internet access and one for the private message-passing interface
• VCL management node
  – Configures the blade VLAN based on image metadata
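The last point – the management node choosing a blade's VLAN from image metadata – might be sketched as follows. This is a hypothetical illustration: the metadata field name and VLAN IDs are invented here, and the real Apache VCL provisioning code (written in Perl) differs in detail:

```python
# Hypothetical sketch of VLAN selection from image metadata.
# Field names and VLAN IDs are illustrative only.

PUBLIC_VLAN = 100   # Internet-facing user-access network
MPI_VLAN = 200      # private message-passing network

def vlan_for_image(image_metadata: dict) -> int:
    """Pick the VLAN a blade should join when this image is deployed."""
    if image_metadata.get("environment") == "hpc-compute":
        return MPI_VLAN    # HPC compute nodes talk only on the MPI network
    return PUBLIC_VLAN     # everything else gets public Internet access

print(vlan_for_image({"environment": "hpc-compute"}))  # 200
print(vlan_for_image({"environment": "desktop"}))      # 100
```

The point of the design is that network placement is a property of the image, not of the blade, so the same undifferentiated hardware can flip between desktop and HPC roles on each reservation.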
NCDG/V2c/Sep-09 33
HPC Cluster in VCL
[Diagram: as in the scaling configuration – GigE switches linking chassis to the public network, private network, and private management network – with an added message-passing network and HPC storage servers.]
NCDG/V2c/Sep-09 34
Adding a Low-Latency Interconnect for HPC Workloads
• BladeCenter chassis (not the chassis housing management nodes)
  – Chassis network module for low-latency interconnect
    • Optical pass-through (Myrinet, InfiniBand)
    • IB switch
• Blade servers
  – Daughtercard for low-latency interconnect (Myrinet, InfiniBand)
NCDG/V2c/Sep-09 35
Large Scale VCL Deployment
• IBM BladeCenters or iDataPlex – ~84 physical servers/rack; Dell and HP equipment can also be used
• LAMP & management node servers
• Network switch(es)
  – Possibly 1 less network – no separate management network port (combined with one of two GbE ports and/or 10 Gbps ports)
  – Server switches in the iDataPlex rack, if iDataPlex is used
  – High-security version requires VPN and VLANs to individual VMs
• Storage
NCDG/V2c/Sep-09 36
Shades of Things to Come
NCDG/V2c/Sep-09 37
Plans
• Virtualization variety (VMware, XEN, KVM, …)
• Pro-active and speculative scheduling
• Automated image construction
• Government and military-level security options
• UNC build-out
• Community Colleges and K-12
• Increased performance
• Seamless resource sharing
• Modularization
• Other ...
NCDG/V2c/Sep-09 38
Desktop, Cloud, HPC
Write-ups: http://vcl.ncsu.edu/papers-publications
NCDG/V2c/Sep-09 39
Typical Student Computing, Desktop Augmentation, Use of VCL
[Diagram: a user reaches the web interface/scheduler over the Internet; the scheduler consults the VCL DB and directs a management node to load a service environment (image) from the image library onto a server.]
NCDG/V2c/Sep-09 40
Typical HPC Use of VCL
[Diagram: a user submits a job over the Internet through a login node; the HPC scheduler dispatches it to compute nodes backed by HPC storage.]
NCDG/V2c/Sep-09 41
Typical “Cloud” Use of VCL
[Diagram: over the Internet, a cloud controller with cloud storage manages a pool of cloud members.]
On-demand construction and reservation of clusters of homogeneous or non-homogeneous resources, operating systems and apps.