CEG7380 Cloud Computing Lecture 1
![Page 1: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/1.jpg)
CEG7380 Cloud Computing Lecture 1
Keke Chen
![Page 2: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/2.jpg)
Outline
• Syllabus
  • Scope of this course
  • Tentative schedule
  • Prerequisites
  • Resources
  • Assignments
• Introduction
![Page 3: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/3.jpg)
Scope of this course
• Understand the basic ideas of cloud computing
• Get familiar with
  • Tools
  • Systems
• Get exposed to some research topics
![Page 4: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/4.jpg)
Two major parts:
• Processing large data with the cloud
• Scaling web applications up/down with the cloud
Note: some programming parts need self-study
![Page 5: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/5.jpg)
Prerequisites
• Some programming skills
  • Java, Python, shell
  • Comfortable with learning new programming frameworks
• Sufficient knowledge of
  • Data structures and databases
  • Operating systems
  • Distributed systems
![Page 6: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/6.jpg)
Assignments and Grading
• Reading papers (~3) (10%)
• Some miniprojects (4~5) (60%)
  • Help you master the concepts
  • Learn to use tools and systems
  • Self-motivated research projects are strongly encouraged!
• Final exam (20%)
• Class attendance and discussion (10%)
![Page 7: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/7.jpg)
Resources
• Updated reference list
• In-house Hadoop cluster
• AWS access
  • Coupon code for each student
• Pilot
  • Submitting reading assignments and projects
![Page 8: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/8.jpg)
Tentative Schedule
• Parallel data processing
  • Distributed file systems (GFS, HDFS)
  • MapReduce
  • High-level distributed data management
• Cloud infrastructures
  • Virtualization
  • AWS and Eucalyptus
  • Interactive front-end: Google App Engine
• Cloud security and privacy
• Research topics
![Page 9: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/9.jpg)
In projects, we will learn to use
• Hadoop MapReduce, Pig Latin
• AWS
• Google App Engine
![Page 10: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/10.jpg)
Cloud Computing Lecture 1-2
Some slides are borrowed from UC Berkeley RAD Lab
Keke Chen
![Page 11: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/11.jpg)
Outline
• What is cloud computing?
• Why now?
• Cloud killer applications
• Cloud economics
• Challenges and opportunities
  • “Above the Clouds”
  • “Claremont Report”
![Page 12: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/12.jpg)
What is Cloud Computing?
• Old idea: Software as a Service (SaaS)
  • Def: delivering applications over the Internet
• Recently: “[Hardware, Infrastructure, Platform] as a service”
• Utility Computing: pay-as-you-use computing
  • Illusion of infinite resources
  • No up-front cost
  • Fine-grained billing (e.g. hourly)
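The fine-grained billing idea can be sketched in a few lines. This is only an illustration: the rate and the run times are made-up numbers, not any provider's prices, and rounding each partial hour up is an assumed billing policy.

```python
import math

def hourly_bill(usage_hours, rate_per_hour):
    """Cost of several instance runs under hourly pay-as-you-use billing.

    Assumption: every started hour of a run is billed as a full hour.
    """
    billed_hours = sum(math.ceil(h) for h in usage_hours)
    return billed_hours * rate_per_hour

# Hypothetical usage: three instance runs of 0.5, 2.0 and 3.25 hours.
runs = [0.5, 2.0, 3.25]
cost = hourly_bill(runs, rate_per_hour=0.10)  # hypothetical $0.10 per instance-hour
print(f"billed ${cost:.2f}")
```

There is no up-front cost term in the function: the bill depends only on what was actually used.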
![Page 13: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/13.jpg)
Cloud computing vs. grid computing
• Cloud computing = virtualization + grid + services + utility computing
• Grid computing: resource provisioning, load balancing, parallel processing
Views of different users
• System admin/Hadoop users: grid
• Application owners/service users: service, utility
![Page 14: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/14.jpg)
Users and cloud providers
![Page 15: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/15.jpg)
Why Now?
• Experience with very large datacenters
  • Profitable for cloud providers: economies of scale
• Pervasive broadband Internet
• Fast x86 virtualization
• Pay-as-you-go billing model
• Large user base
  • Online payment
  • Online ads
  • Content distribution
• Web 2.0 lowers the entry point to e-business
  • More small e-business owners
  • Large user base of clouds
![Page 16: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/16.jpg)
Spectrum of Clouds
• Instruction Set VM (Amazon EC2, 3Tera)
• Bytecode VM (Microsoft Azure)
• Framework VM
  • Google AppEngine, Force.com
[Figure: spectrum from "Lower-level, less management" to "Higher-level, more management": EC2, Azure, AppEngine, Force.com]
![Page 17: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/17.jpg)
Cloud Killer Apps
• Mobile and web applications
• Batch processing / MapReduce
• Data analytics (big data)
  • E.g., OLAP, data mining, machine learning
• Extensions of desktop software
  • Matlab, Mathematica
![Page 18: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/18.jpg)
Cloud Economics
• Pay by use instead of provisioning for peak
[Figure: demand and capacity over time in two panels, "Static data center" and "Data center in the cloud"; annotation: "Unused resources"]
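The difference between provisioning for peak and paying by use can be worked through with a small calculation. The demand series below is invented for illustration. The sketch also assumes the same unit cost in both cases, which is a simplification.

```python
def peak_provisioned_cost(demand, unit_cost):
    """Static data center: capacity is fixed at peak demand in every period."""
    return max(demand) * len(demand) * unit_cost

def pay_by_use_cost(demand, unit_cost):
    """Cloud: pay only for the capacity actually used in each period."""
    return sum(demand) * unit_cost

# Hypothetical demand (servers needed) over 12 periods with a midday peak.
demand = [2, 2, 3, 5, 9, 10, 9, 5, 3, 2, 2, 2]
static = peak_provisioned_cost(demand, unit_cost=1)
cloud = pay_by_use_cost(demand, unit_cost=1)
unused = static - cloud          # server-periods paid for but left idle
utilization = cloud / static     # share of the static capacity actually used
print(static, cloud, unused, round(utilization, 2))
```

Under these assumptions the static data center pays for more than twice the capacity it uses. The gap is the "unused resources" area in the figure.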
![Page 19: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/19.jpg)
Economics of Cloud Users
• Risk of over-provisioning: underutilization
[Figure: demand and capacity over time for a static data center; annotation: "Unused resources"]
![Page 20: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/20.jpg)
Economics of Cloud Users
• Heavy penalty for under-provisioning
[Figure: three plots of demand and capacity over time (days 1 to 3); annotations: "Lost revenue", "Lost users"]
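A toy simulation can illustrate the two penalties named above. Everything here is hypothetical: the demand numbers, the fixed capacity, and the `churn` parameter, which assumes that a fraction of the users turned away in one period never return.

```python
def simulate_underprovisioning(demand, capacity, churn=0.5):
    """Return (served, lost) demand per period for a fixed capacity.

    Assumption: a fraction `churn` of the demand turned away in a period
    leaves for good and lowers demand in all later periods.
    """
    served, lost = [], []
    departed = 0.0
    for d in demand:
        effective = max(0.0, d - departed)   # demand left after earlier churn
        s = min(effective, capacity)
        served.append(s)
        lost.append(effective - s)           # unserved demand: lost revenue
        departed += churn * (effective - s)  # users who never come back
    return served, lost

# Hypothetical demand over three days against a capacity of 8.
served, lost = simulate_underprovisioning([6, 12, 12], capacity=8, churn=0.5)
print(served, lost)
```

Unserved demand in a period stands for lost revenue, and the growing `departed` total stands for lost users, since it shrinks demand in every later period.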
![Page 21: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/21.jpg)
Economics of Cloud Providers
• 5-7x economies of scale [Hamilton 2008]
• Extra benefits
  • Amazon: utilize off-peak capacity
  • Microsoft: sell .NET tools
  • Google: reuse existing infrastructure

| Resource | Cost in Medium DC | Cost in Very Large DC | Ratio |
|---|---|---|---|
| Network | $95 / Mbps / month | $13 / Mbps / month | 7.1x |
| Storage | $2.20 / GB / month | $0.40 / GB / month | 5.7x |
| Administration | ≈140 servers/admin | >1000 servers/admin | 7.1x |

![Page 22: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/22.jpg)
Adoption Challenges

| Challenge | Opportunity |
|---|---|
| Availability | Multiple providers & DCs |
| Data lock-in | Standardization |
| Data confidentiality, auditability, and privacy | Encryption, VLANs, firewalls; geographical data storage; privacy-preserving data outsourcing |

![Page 23: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/23.jpg)
Growth Challenges

| Challenge | Opportunity |
|---|---|
| Data transfer bottlenecks | FedEx-ing disks, data backup/archival |
| Performance unpredictability | Improved VM support, flash memory, scheduling VMs |
| Scalable storage | Invent scalable store |
| Bugs in large distributed systems | Invent debugger that relies on distributed VMs |
| Scaling quickly | Invent auto-scaler that relies on ML; snapshots |

![Page 24: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/24.jpg)
Policy and Business Challenges

| Challenge | Opportunity |
|---|---|
| Reputation fate sharing | Offer reputation-guarding services like those for email |
| Software licensing | Pay-for-use licenses; bulk use sales |

![Page 25: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/25.jpg)
Research Challenges Mentioned by the Database Community (Claremont Report)
![Page 26: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/26.jpg)
Functionality and operational cost
• Background: compare massive-scale data-intensive computing systems with today’s DBMS
• Limited functionality
  • Simple APIs (e.g. MapReduce)
  • Pushes more burden on developers
• Benefits
  • Easier to manage
  • Lower operational cost
  • Service Level Agreement (SLA) that is hard to provide for a SQL DBMS
P.S. DB systems are notorious for their expenses in installation and maintenance.
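To show what a "simple API" such as MapReduce looks like from the developer's side, here is a minimal single-process word count written in the map/reduce style. It is not the Hadoop API. The names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and a real framework would distribute the map, shuffle, and reduce steps across machines.

```python
from collections import defaultdict

def map_fn(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    """Reduce step: combine all values emitted for one key."""
    return word, sum(counts)

def run_mapreduce(lines, map_fn, reduce_fn):
    """Toy driver: group mapped pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)   # stands in for the shuffle phase
    return dict(reduce_fn(k, v) for k, v in groups.items())

result = run_mapreduce(["the cloud", "The grid and the cloud"], map_fn, reduce_fn)
print(result)
```

The developer supplies only the two small functions. Anything a SQL query would express declaratively, such as joins or multi-step aggregation, has to be hand-coded as further map and reduce stages, which is the extra burden the slide refers to.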
![Page 27: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/27.jpg)
Manageability
• Features of cloud systems
  • Limited human intervention
  • High-variance workloads
  • A variety of shared infrastructures
  • No DBAs or administrators to assist developers
• Systems need to do work automatically
  • Self-managing
  • Adaptive (autonomous) computing
![Page 28: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/28.jpg)
Data security and privacy
• Users sharing physical resources in a cloud
  • Protect from each other (security)
  • Protect from curious cloud providers (privacy)
• Successes may depend on specific target usage scenarios
• Examples
  • Query-based services
  • Mining-based services
![Page 29: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/29.jpg)
Datasets over multiple clouds
• Interesting datasets might be available in different clouds
  • Different cloud providers
  • Private or public clouds
• Services mashing up datasets
  • Inevitably crossing clouds
• Federated cloud architectures
![Page 30: CEG7380 Cloud Computing Lecture 1](https://reader035.fdocuments.us/reader035/viewer/2022062409/568147a4550346895db4dd99/html5/thumbnails/30.jpg)
Algorithms on Big Data
• Working on “Big Data”
  • Data mining
  • Machine learning
  • Visualization
• Traditionally assume data is in flat files or relational databases
• Distributed data organization poses new challenges
  • Redesign algorithms
  • Redesign frameworks