
Infrastructure for Cloud Computing

Dahai Li

2008/06/12

Agenda

• About Cloud Computing

• Tools for Cloud Computing in Google

• Google’s partnerships with universities

What’s new?


Advantages

• Data safety and reliability

• Data synchronization between different devices

• Low requirements on the end device

• Unlimited potential of the cloud

Google Cloud: cloud for the end user

Google Cloud: cloud for the web developer (APIs)

Example: Earthquake map based on the Maps API


Agenda

• About Cloud Computing

• Tools for Cloud Computing in Google

• Google’s partnerships with universities

google.stanford.edu (circa 1997)

google.com (1999)

Google Data Center (circa 2000)

Google File System (GFS)


Why GFS?

• Google has unusual requirements

• Unfair advantage

• Fun and challenging to build large-scale systems


GFS Architecture

[Diagram: many GFS clients contact the GFS master (with replica masters) for metadata, and exchange chunk data directly with chunkservers; each chunk (C0, C1, C2, C3, C5, …) is replicated across several chunkservers (Chunkserver 1, Chunkserver 2, …, Chunkserver N).]

Master

• Maintain Metadata:

– File namespace

– Access control info

– Maps files to chunks

• Control system activities:

– Monitor state of chunkservers

– Chunk allocation and placement

– Initiate chunk recovery and rebalancing

– Garbage collect dead chunks

– Collect and display stats; admin functions
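As a rough picture of what that metadata might look like in memory, here is a minimal Python sketch; the class and field names (ChunkInfo, FileInfo, MasterState) are hypothetical illustrations, not the actual GFS data structures.

from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of a GFS-like master's in-memory metadata.
@dataclass
class ChunkInfo:
    handle: int                                             # globally unique chunk handle
    locations: List[str] = field(default_factory=list)      # chunkservers holding replicas
    version: int = 0                                         # helps detect stale replicas

@dataclass
class FileInfo:
    owner: str                                               # access-control info (simplified)
    chunk_handles: List[int] = field(default_factory=list)  # file -> ordered list of chunks

@dataclass
class MasterState:
    namespace: Dict[str, FileInfo] = field(default_factory=dict)  # path -> file metadata
    chunks: Dict[int, ChunkInfo] = field(default_factory=dict)    # handle -> chunk metadata

    def chunk_for(self, path: str, chunk_index: int) -> ChunkInfo:
        """Map (file, chunk index) to chunk metadata, as the read protocol requires."""
        handle = self.namespace[path].chunk_handles[chunk_index]
        return self.chunks[handle]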

Client

• Protocol implemented by the client library

• Read protocol (a sketch follows below)
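To make the read path concrete, here is a minimal Python sketch of how a GFS-style read could proceed, matching the architecture above; the master.lookup RPC, pick_closest helper, and read_chunk call are assumptions for illustration, not the real client library.

CHUNK_SIZE = 64 * 1024 * 1024   # GFS files are split into fixed-size 64 MB chunks

def gfs_read(master, filename, offset, length):
    """Sketch of a GFS-style read (single chunk, no caching or retries)."""
    chunk_index = offset // CHUNK_SIZE
    # 1. Ask the master which chunk this is and where its replicas live.
    #    master.lookup is a hypothetical RPC; real clients cache its result.
    handle, replicas = master.lookup(filename, chunk_index)
    # 2. Fetch the bytes directly from one replica; the master never
    #    sits on the data path.
    chunkserver = pick_closest(replicas)
    return chunkserver.read_chunk(handle, offset % CHUNK_SIZE, length)

def pick_closest(replicas):
    # Placeholder policy: a real client prefers a nearby replica
    # (same rack or switch); here we simply take the first one.
    return replicas[0]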


GFS Usage in Google Cloud

• 50+ clusters

• Filesystem clusters of up to 1000+ machines

• Pools of 1000+ clients

• 10+ GB/s read/write load

– in the presence of frequent hardware failures


MapReduce


What’s MapReduce

• A simple programming model that applies to many large-scale computing problems

• Hide messy details in MapReduce runtime library


Typical problem solved by MapReduce

• Read a lot of data

• Map: extract something you care about from each record

• Shuffle and Sort

• Reduce: aggregate, summarize, filter, or transform

• Write the results


More specifically…

• Programmer specifies two primary methods:

– map(k, v) → <k', v'>*

– reduce(k', <v'>*) → <k', v'>*

• All v' with the same k' are reduced together, in order.
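Written as rough Python type signatures (an illustration only, assuming keys and values are strings as in the word-count example that follows), the two methods look like this:

from typing import Callable, Iterable, Iterator, Tuple

# map(k, v) -> <k', v'>*: one input record in, zero or more intermediate pairs out.
MapFn = Callable[[str, str], Iterator[Tuple[str, str]]]

# reduce(k', <v'>*) -> <k', v'>*: all intermediate values for one key in,
# zero or more output values out (usually exactly one), re-paired with the key.
ReduceFn = Callable[[str, Iterable[str]], Iterator[str]]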


Example: Word Frequencies in Web Pages

• Input is files with one document per record

• Specify a map function that takes a key/value pair

– key = document URL

– value = document contents

• Output of map function is (potentially many) key/value pairs.

– In our case, output (word, “1”) once per word in the document


– Example input record: <“网页1”, “是也不是”> (“web page 1”, whose contents are “是也不是”)

– Map output: <“是”, “1”> <“也”, “1”> <“不”, “1”> …

Continued: Word Frequencies in Web Pages

• The MapReduce library gathers together all pairs with the same key (shuffle/sort):

– key = “是”, values = “1”, “1”

– key = “也”, values = “1”

– key = “不”, values = “1”

• The reduce function combines the values for a key; in our case, it computes the sum:

– <“是”, “2”> <“也”, “1”> <“不”, “1”>

• Output of reduce (usually 0 or 1 value) is paired with the key and saved

Example: Pseudo-code

Map(String input_key, String input_value):
  // input_key: document name
  // input_value: document contents
  for each word w in input_value:
    EmitIntermediate(w, "1");

Reduce(String key, Iterator intermediate_values):
  // key: a word, same for input and output
  // intermediate_values: a list of counts
  int result = 0;
  for each v in intermediate_values:
    result += ParseInt(v);
  Emit(AsString(result));
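For readers who want to run the word-count example end to end, here is a small self-contained Python sketch that mimics the map, shuffle/sort, and reduce phases in memory; it illustrates the programming model only and is not Google's MapReduce library.

from collections import defaultdict

def map_fn(doc_name, contents):
    """map(k, v) -> <k', v'>*: emit (word, "1") once per word in the document."""
    for word in contents.split():
        yield (word, "1")

def reduce_fn(word, counts):
    """reduce(k', <v'>*) -> <k', v'>*: sum the counts for a single word."""
    yield (word, str(sum(int(c) for c in counts)))

def run_mapreduce(documents):
    # Map phase: apply map_fn to every (doc_name, contents) record.
    intermediate = defaultdict(list)
    for doc_name, contents in documents:
        for key, value in map_fn(doc_name, contents):
            intermediate[key].append(value)      # shuffle: group values by key
    # Reduce phase: feed each key's value list to reduce_fn, in key order.
    results = []
    for key in sorted(intermediate):
        results.extend(reduce_fn(key, intermediate[key]))
    return results

print(run_mapreduce([("doc1", "to be or not to be")]))
# -> [('be', '2'), ('not', '1'), ('or', '1'), ('to', '2')]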

Conclusion to MapReduce

• MapReduce has proven to be a remarkably-useful abstraction

• Greatly simplifies large-scale computations at Google

• Fun to use: focus on problem, let library deal with messy details

• Many thousands of parallel programs written by hundreds of different programmers in the last few years

– Many had no prior parallel or distributed programming experience


BigTable


Overview

• Structured data storage, not a database

• Wide applicability

• Scalability

• High performance

• High availability


Basic Data Model

• Distributed multi-dimensional sparse map

(row, column, timestamp) → cell contents

[Diagram: rows (e.g. “www.cnn.com”) and columns (e.g. “contents”) index cells, each holding multiple timestamped versions (t1, t2, t3) of contents such as “<html>…”.]

• Good match for most of our applications
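One minimal way to picture the model is a nested dictionary keyed by row, then column, then timestamp; the Python sketch below is purely illustrative and implies nothing about how BigTable actually stores or distributes data.

from collections import defaultdict

# (row, column, timestamp) -> cell contents, modeled as nested dicts.
# Missing cells simply have no entry, which is what makes the map sparse.
table = defaultdict(lambda: defaultdict(dict))

table["www.cnn.com"]["contents"][1] = "<html>..."        # version written at timestamp t1
table["www.cnn.com"]["contents"][3] = "<html>...(new)"   # newer version at timestamp t3

def lookup(row, column, timestamp=None):
    """Return the cell value at the given timestamp, or the newest version."""
    versions = table[row][column]
    ts = timestamp if timestamp is not None else max(versions)
    return versions[ts]

print(lookup("www.cnn.com", "contents"))   # -> "<html>...(new)"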

BigTable API

• Metadata operations

– Create/delete tables, column families, change metadata

• Writes (atomic)

– Set(): write cells in a row

– DeleteCells(): delete cells in a row

– DeleteRow(): delete all cells in a row

• Reads

– Scanner: read arbitrary cells in a bigtable
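To illustrate how client code might exercise the operations named above, here is a hypothetical in-memory Python stand-in; the real BigTable client API is not Python and its signatures differ, so treat the names below only as a sketch of the call pattern.

# Hypothetical stand-in for a BigTable-like API; the method names mirror the
# operations listed above but are not the real client library.
class Table:
    def __init__(self):
        self.rows = {}                                   # row key -> {column -> value}

    def set(self, row, column, value):
        """Set(): write a cell in a row (writes within one row are atomic)."""
        self.rows.setdefault(row, {})[column] = value

    def delete_cells(self, row, columns):
        """DeleteCells(): delete selected cells in a row."""
        for column in columns:
            self.rows.get(row, {}).pop(column, None)

    def delete_row(self, row):
        """DeleteRow(): delete all cells in a row."""
        self.rows.pop(row, None)

    def scan(self, row_prefix=""):
        """Scanner: iterate over rows (and their cells) in sorted row order."""
        for row in sorted(self.rows):
            if row.startswith(row_prefix):
                yield row, self.rows[row]

t = Table()
t.set("www.cnn.com", "contents", "<html>...")
t.set("www.cnn.com", "anchor:cnnsi.com", "CNN")
print(list(t.scan("www.cnn")))
t.delete_row("www.cnn.com")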


System Structure

• Bigtable client, using the Bigtable client library: calls Open() on a table, then reads and writes data through the tablet servers

• Bigtable cell, consisting of:

– Bigtable master: performs metadata ops and load balancing

– Bigtable tablet servers: serve data (many per cell)

• Cluster Scheduling Master: handles failover and monitoring

• GFS: holds tablet data and logs

• Lock service: holds metadata and handles master election

Current status of BigTable

• Design/initial implementation started at the beginning of 2004

• Currently ~100 BigTable cells

• Production use or active development for many projects:

– Google Print

– My Search History

– Orkut

– Crawling/indexing pipeline

– Google Maps/Google Earth

– Blogger

– …

• Largest bigtable cell manages ~200TB of data spread over several thousand machines (larger cells planned)


Typical Cluster

[Diagram: a typical cluster runs cluster-wide scheduling masters, a GFS master, and a lock service; each machine (Machine 1, Machine 2, …, Machine N) runs Linux with a GFS chunkserver, a scheduler slave, and several user apps (user app1, app2, app3) sharing the hardware.]

Agenda

• About Cloud Computing

• Tools for Cloud Computing in Google

• Google’s partnerships with universities

ACCI in Oct. 2007

• Stands for Academic Cloud Computing Initiative

• An IBM and Google partnership

• Helps universities teach distributed-systems programming skills

• Started at the University of Washington and is scaling to many other universities


Google’s ACCI activities in Greater China

• Google Greater China helped create a cloud computing course at Tsinghua in the summer of 2007

• Now scaling to other universities in mainland China and Taiwan

Example: THU MR Course, Fall 2007

• “Massive Data Processing” course based on Google Cloud technology

• Google employees gave lectures during the course offering

• Got interesting results from the smart students

• http://hpc.cs.tsinghua.edu.cn/dpcourse/

Continued: THU MR Course, Fall 2007

Students presenting the course project “Simulating the operation of the solar system based on MapReduce technology” at the Google office

Massive data processing to simulate the operation of the solar system

THANK YOU

More info on http://code.google.com/intl/zh-CN/