Bigtable: A Distributed Storage System for Structured Data 1.


Page 1: Bigtable: A Distributed Storage System for Structured Data 1.

The name of Allah

Bigtable: A Distributed Storage System for Structured Data

PRESENTER:

BAHAREH HABIBI SHILAN HABIBI

SAMAN FOROUZANDEH


Page 2: Bigtable: A Distributed Storage System for Structured Data 1.

Before we begin …

Bigtable

Sawzall

MapReduce

Bloom Filters


Page 3: Bigtable: A Distributed Storage System for Structured Data 1.

Table of Contents

Introduction

Data Model

API

Building Blocks

Implementation

Refinements

Performance Evaluation

Real Applications

Lessons

Related Work

Conclusions


Page 4: Bigtable: A Distributed Storage System for Structured Data 1.

Introduction

What is Bigtable?

A distributed storage system for managing structured data at Google

Used by > 60 Google products

Google Analytics

Google Reader

Personalized Search

Orkut


Page 5: Bigtable: A Distributed Storage System for Structured Data 1.

Introduction

Goals

Wide applicability

Scalability

High performance

High availability

Bigtable and Database

Bigtable does not support a full relational data model


Page 6: Bigtable: A Distributed Storage System for Structured Data 1.

Data Model

A Bigtable is a sparse, distributed, persistent, multi-dimensional sorted map

The map is indexed by row key, column key, and timestamp: (row, column, timestamp) → cell contents

Example: Webtable

Page 7: Bigtable: A Distributed Storage System for Structured Data 1.

Data Model

Rows

row keys are arbitrary strings up to 64 KB

every read or write of data in a single row is atomic (regardless of the number of columns)

row ranges are dynamically partitioned into tablets

Page 8: Bigtable: A Distributed Storage System for Structured Data 1.

Data Model

Column Families

column keys are grouped into sets called column families

data stored in a column family is usually of the same type

number of column families should be small

number of columns is unbounded

access control is at the column family level


Page 9: Bigtable: A Distributed Storage System for Structured Data 1.

Data Model

Timestamps

Each cell in a Bigtable can contain multiple versions of the same data

Versions are indexed by 64-bit integer timestamps

Garbage-collection settings per column family:

only the last n versions of a cell are kept, or

only new-enough versions are kept
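
The two garbage-collection policies can be sketched as follows. This is a minimal illustration, not Bigtable's implementation: the function names `keep_last_n` and `keep_newer_than`, and the representation of a cell as a list of (timestamp, value) pairs, are assumptions made for the example.

```python
# Hypothetical sketch: a cell is a list of (timestamp, value) versions.
# Names and representation are illustrative assumptions, not Bigtable's API.

def keep_last_n(versions, n):
    """Keep only the n most recent versions of a cell."""
    newest_first = sorted(versions, key=lambda v: v[0], reverse=True)
    return newest_first[:n]


def keep_newer_than(versions, cutoff):
    """Keep only versions whose timestamp is at or after the cutoff."""
    newest_first = sorted(versions, key=lambda v: v[0], reverse=True)
    return [v for v in newest_first if v[0] >= cutoff]


cell = [(100, "v1"), (250, "v2"), (300, "v3"), (420, "v4")]
last_two = keep_last_n(cell, 2)        # the two newest versions
recent = keep_newer_than(cell, 250)    # versions written at time 250 or later
```

Both helpers return versions newest first, so a reader sees the most recent value at the head of the list.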


Page 10: Bigtable: A Distributed Storage System for Structured Data 1.

Data Model


Rows

Columns

Timestamps
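
To make the three dimensions concrete, here is a small in-memory sketch of a (row, column, timestamp) → value map. The class `ToyTable` and its methods are hypothetical names invented for this example. The sample rows follow the Webtable example from the Bigtable paper, and nothing here reflects how Bigtable stores data on disk.

```python
# Hypothetical in-memory model of the (row, column, timestamp) -> value map.
# The class and method names are assumptions for illustration only.
from collections import defaultdict


class ToyTable:
    def __init__(self):
        # row key -> column key ("family:qualifier") -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, end_row):
        """Yield (row, columns) in sorted row-key order for start <= row < end."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                yield row, {c: dict(v) for c, v in self._rows[row].items()}


table = ToyTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
table.put("org.example", "contents:", 1, "<html>")

latest = table.get("com.cnn.www", "contents:")               # newest version
older = table.get("com.cnn.www", "contents:", timestamp=4)   # version as of t=4
rows_in_range = [row for row, _ in table.scan("com", "org")]
```

The map is sparse because a row only holds the columns that were actually written, and the sorted scan mirrors how row ranges can be read in key order.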

Page 11: Bigtable: A Distributed Storage System for Structured Data 1.

API

Metadata operations

Create/delete tables or column families

Change metadata

Writes (atomic)

Bigtable does not support general transactions across row keys

Client-supplied Sawzall scripts cannot write to Bigtable, but support filtering, summarization, and transformation

Bigtable can be used with MapReduce


Page 12: Bigtable: A Distributed Storage System for Structured Data 1.

Building Blocks

Google File System (GFS): used to store log and data files

Cluster management system (scheduler): used to manage jobs and resources

SSTable file format: used internally to store Bigtable data

Chubby distributed lock service: highly available, with five active replicas

0.0047% unavailability across 14 Bigtable clusters

0.0326% unavailability for the most affected cluster


Page 13: Bigtable: A Distributed Storage System for Structured Data 1.

Implementation

What is a tablet?

A Bigtable cluster stores a number of tables

Each table consists of a set of tablets

Each tablet is managed by a specific tablet server

As a table grows, it is automatically split into multiple tablets, each 100-200 MB in size by default

Tablet servers handle read/write requests for their tablets


Page 14: Bigtable: A Distributed Storage System for Structured Data 1.

Implementation

Bigtable: Servers

The master manages the assignment of tablets to tablet servers

[Figure: a Bigtable master with tablet server 1 and tablet server 2, each tablet server holding tablets]

Page 15: Bigtable: A Distributed Storage System for Structured Data 1.

Implementation

Tablet Location

A three-level hierarchy of tablets is used to store tablet locations

The root tablet is never split
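
A lookup through such a hierarchy can be sketched as below. The level layout (a Chubby file pointing at the root tablet, the root tablet pointing at METADATA tablets, and METADATA tablets pointing at user tablets) follows the Bigtable paper. Everything else is an assumption for illustration: the helper names, the server names, and the use of "\xff" as a stand-in for "end of key space".

```python
# Hypothetical sketch of a three-level tablet-location lookup.
# Each "tablet" here is just a sorted list of (end_row_key, target) pairs;
# all names and server addresses are made up for the example.
import bisect


def find(entries, key):
    """Return the target of the first entry whose end key is >= key."""
    end_keys = [end for end, _ in entries]
    i = bisect.bisect_left(end_keys, key)
    if i == len(entries):
        raise KeyError(key)
    return entries[i][1]


# Level 3: METADATA tablets map row-key ranges to tablet servers.
metadata_a = [("g", "tabletserver-1"), ("n", "tabletserver-2")]
metadata_b = [("t", "tabletserver-3"), ("\xff", "tabletserver-4")]

# Level 2: the root tablet (never split) maps key ranges to METADATA tablets.
root_tablet = [("n", metadata_a), ("\xff", metadata_b)]

# Level 1: a single well-known pointer to the root tablet
# (stored in a Chubby file in the paper's design).
chubby_file = root_tablet


def locate(row_key):
    """Walk root tablet -> METADATA tablet -> tablet server for a row key."""
    metadata_tablet = find(chubby_file, row_key)
    return find(metadata_tablet, row_key)
```

Because the root tablet is never split, the walk always takes the same number of steps regardless of how many tablets exist.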


Page 16: Bigtable: A Distributed Storage System for Structured Data 1.

Implementation

Tablet Assignment

A master server is responsible for assigning tablets to tablet servers

The master server also:

detects addition and expiration of tablet servers

balances tablet server loads

initiates garbage collection of files in GFS

reassigns tablets when a tablet server is lost

If the master server dies, a new master server is started


Page 17: Bigtable: A Distributed Storage System for Structured Data 1.

Implementation

Tablet Serving

The persistent state of a tablet is stored in GFS

[Figure: tablet representation. Write operations go to the tablet log (in GFS) and the memtable (in memory); read operations see a merged view of the memtable and the SSTable files (in GFS)]

Page 18: Bigtable: A Distributed Storage System for Structured Data 1.

Implementation

Compactions

Minor Compactions

memtable size reaches a threshold

memtable is frozen

new memtable is created

frozen memtable is converted into a new SSTable

Merging Compactions
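
The minor-compaction steps above, together with the write and read paths of a tablet, can be sketched in a few lines. This is a toy model under stated assumptions: `ToyTablet`, its fields, and the tiny `memtable_limit` are hypothetical, plain Python structures stand in for the commit log and SSTable files, and merging compactions are not modelled.

```python
# Hypothetical sketch of a tablet's write path and a minor compaction.
# Plain Python structures stand in for the commit log, memtable and SSTables.

class ToyTablet:
    def __init__(self, memtable_limit=2):
        self.log = []             # commit log (would live in GFS)
        self.memtable = {}        # recent writes, held in memory
        self.sstables = []        # immutable sorted files, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.log.append((key, value))      # 1. record the mutation in the log
        self.memtable[key] = value         # 2. apply it to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        frozen = self.memtable             # freeze the current memtable
        self.memtable = {}                 # new memtable takes further writes
        # Convert the frozen memtable into an immutable, sorted SSTable.
        self.sstables.append(tuple(sorted(frozen.items())))

    def read(self, key):
        # Reads see a merged view: memtable first, then newest SSTable first.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None


tablet = ToyTablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)   # reaches the threshold, triggers a minor compaction
tablet.write("a", 3)   # newer value stays in the new memtable
```

Checking the memtable before the SSTables is what lets the newer value of "a" shadow the older one that was already written out.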


Page 19: Bigtable: A Distributed Storage System for Structured Data 1.

Refinements

A number of refinements were required for Bigtable implementations to achieve high:

performance

availability

reliability


Page 20: Bigtable: A Distributed Storage System for Structured Data 1.

Refinements

Locality groups

Clients can group multiple column families together into a locality group

A separate SSTable is generated for each locality group

Segregating column families which are not typically accessed together enables more efficient reads


Page 21: Bigtable: A Distributed Storage System for Structured Data 1.

Refinements

Compression

Clients can control whether compression is used on a locality group

Many clients use a two-pass compression scheme

The first pass uses Bentley and McIlroy's scheme


Page 22: Bigtable: A Distributed Storage System for Structured Data 1.

Refinements

Caching & Bloom Filters

Tablet servers use two levels of caching to improve read performance

Scan caching is useful for data which tends to be read repeatedly

Block caching is useful when reads tend to be close to data that was recently read

Bloom filters reduce disk seeks by answering whether an SSTable might contain any data for a given row/column pair
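
For readers unfamiliar with Bloom filters, a minimal sketch follows. It is a generic Bloom filter, not Bigtable's implementation: the class name, the bit-array size, the number of hash functions, the use of SHA-256, and the "row/column" key format are all choices made for this example.

```python
# Minimal Bloom filter sketch; sizes and hashing scheme are illustrative choices.
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)   # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("com.cnn.www/contents:")
```

A negative answer is always correct, so a lookup for a row/column pair that an SSTable does not hold can usually skip that SSTable without touching disk. A positive answer may be a false positive and still requires reading the file.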


Page 23: Bigtable: A Distributed Storage System for Structured Data 1.

Refinements

Speeding Up Tablet Recovery

When a tablet is moved to another tablet server:

A minor compaction is performed

The tablet server stops serving the tablet

Another minor compaction is performed (usually very fast)

Then the tablet is moved without requiring any log entry recovery


Page 24: Bigtable: A Distributed Storage System for Structured Data 1.

Refinements

Exploiting Immutability

Because SSTables are immutable, various parts of the Bigtable system have been simplified:

no synchronization of file system accesses is needed when reading SSTables

permanently removing deleted data is handled entirely through garbage collection

splitting tablets is efficient because child tablets can share the SSTables of the parent tablet


Page 25: Bigtable: A Distributed Storage System for Structured Data 1.

Performance Evaluation

Google set up a Bigtable cluster with N tablet servers to measure performance and scalability as N is varied.

each tablet server was configured to use 1 GB of memory

each machine had two 400 GB IDE hard drives, two dual-core 2 GHz chips, and a single gigabit Ethernet link

N client machines generated the Bigtable load used for tests

Every machine ran a GFS server.


Page 26: Bigtable: A Distributed Storage System for Structured Data 1.

Performance Evaluation

Single tablet-server performance

Number of 1000-byte values read/written per second, per tablet server:

Experiment            # of Tablet Servers
                      1      50     250    500
Random Reads          1212   593    479    241
Random Reads (mem)    10811  8511   8000   6250
Random Writes         8850   3745   3425   2000
Sequential Reads      4425   2463   2625   2469
Sequential Writes     8547   3623   2451   1905
Scans                 15385  10526  9524   7843

Page 27: Bigtable: A Distributed Storage System for Structured Data 1.

Performance Evaluation

Scaling: Aggregate throughput increases by over a factor of 100 as the number of tablet servers is increased from 1 to 500.
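
The scaling factor can be checked from the per-tablet-server rates in the performance table for 1 and 500 tablet servers: aggregate throughput is the per-server rate multiplied by the number of servers. The short calculation below is only a worked example. The dictionary layout and the function name `scaling_factor` are assumptions for the example.

```python
# Per-tablet-server rates (1000-byte values per second) for 1 and 500 servers,
# copied from the single tablet-server performance table.
per_server = {
    "random reads":       {1: 1212,  500: 241},
    "random reads (mem)": {1: 10811, 500: 6250},
    "random writes":      {1: 8850,  500: 2000},
    "sequential reads":   {1: 4425,  500: 2469},
    "sequential writes":  {1: 8547,  500: 1905},
    "scans":              {1: 15385, 500: 7843},
}


def scaling_factor(rates, n=500):
    """Aggregate throughput with n servers divided by throughput with 1."""
    return rates[n] * n / rates[1]


factors = {name: scaling_factor(r) for name, r in per_server.items()}
for name, factor in factors.items():
    print(f"{name}: {factor:.0f}x")
```

With these numbers, random reads from memory scale by roughly 289x, while disk-backed random reads come out at about 99x, the weakest of the six benchmarks.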


Page 28: Bigtable: A Distributed Storage System for Structured Data 1.

Real Applications

As of August 2006:

388 non-test Bigtable clusters

about 24,500 tablet servers in total

# of Tablet Servers    # of Clusters
0 .. 19                259
20 .. 49               47
50 .. 99               20
100 .. 499             50
> 500                  12

Page 29: Bigtable: A Distributed Storage System for Structured Data 1.

Real Applications


This table provides some data about a few of the tables currently in use

Table size (measured before compression) and # Cells indicate approximate sizes

Page 30: Bigtable: A Distributed Storage System for Structured Data 1.

Real Applications

Google Analytics

Google Analytics is supported by 2 Bigtables

200 TB raw click table

20 TB summary table


Page 31: Bigtable: A Distributed Storage System for Structured Data 1.

Real Applications

Google Earth

Google Earth is supported by 2 Bigtables

70 TB images table, compression turned off

500 GB index table


Page 32: Bigtable: A Distributed Storage System for Structured Data 1.

Real Applications

Personalized Search

Personalized Search is supported by 1 Bigtable

one row per user id

separate column family for each type of action


Page 33: Bigtable: A Distributed Storage System for Structured Data 1.

Lessons learned

Large distributed systems are vulnerable to many types of failures

memory and network corruption

hung machines

extended and asymmetric network partitions

bugs in other systems (e.g. Chubby)

overflow of GFS quotas

planned and unplanned hardware maintenance

To address the problems experienced:

some protocols have been changed

some assumptions have been modified


Page 34: Bigtable: A Distributed Storage System for Structured Data 1.

Lessons learned

It is important to delay adding new features until it is clear how the new features will be used

It is important to support system-level monitoring

it allows detection and fixing of many issues

it also enables tracking clusters to answer common questions


Page 35: Bigtable: A Distributed Storage System for Structured Data 1.

Related Work

The Boxwood project's goal is to provide infrastructure for building higher-level services such as file systems or databases

while the goal of Bigtable is to directly support client applications that wish to store data


Page 36: Bigtable: A Distributed Storage System for Structured Data 1.

Related Work

C-Store and Bigtable share many characteristics

shared-nothing architecture

two different data structures (one for recent writes, one for long-lived data)

however, these two systems differ significantly in their APIs and performance optimizations


Page 37: Bigtable: A Distributed Storage System for Structured Data 1.

Conclusions

Bigtable is a distributed system for storing structured data at Google

in production since April 2005

about seven person-years to design and implement

used by more than 60 projects as of August 2006

users like the performance and high availability

Users can scale their applications' capacity by simply adding more machines to the system


Page 38: Bigtable: A Distributed Storage System for Structured Data 1.

Conclusions

Google has begun deploying Bigtable as a service to product groups

Google has gained significant advantages by building its own storage solution

has control over implementation and infrastructure

can remove bottlenecks and inefficiencies as they arise


Page 39: Bigtable: A Distributed Storage System for Structured Data 1.

Strengths

Implemented and usable

Optimization

Performance Evaluation

Used by > 60 Google products


Page 40: Bigtable: A Distributed Storage System for Structured Data 1.

Weaknesses

Complexity

Chubby

Master

Network


Page 41: Bigtable: A Distributed Storage System for Structured Data 1.

Thanks for your attention
