Post on 02-Jan-2016
The name of Allah
Bigtable: A Distributed Storage System for Structured Data
PRESENTER:
BAHAREH HABIBI SHILAN HABIBI
SAMAN FOROUZANDEH
Bigtable: A Distributed Storage System for Structured Data 1
Before we begin …
BigTable
Sawzall
MapReduce
Bloom Filters
Table of Contents
Introduction
Data Model
API
Building Blocks
Implementation
Refinements
Performance Evaluation
Real Applications
Lessons
Related Work
Conclusions
Introduction
What is Bigtable?
A distributed storage system for managing structured data at Google
Used by > 60 Google products
Google Analytics
Google Reader
Personalized Search
Orkut
Introduction
Goals
Wide applicability
Scalability
High performance
High availability
Bigtable and Database
Bigtable does not support a full relational data model
Data Model
A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map
The map is indexed by row key, column key, and timestamp:
(row, column, timestamp) -> cell contents
Example: Webtable
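A minimal Python sketch of this map, assuming an in-memory dict keyed by (row, column, timestamp) tuples. The `TinyTable` class and its methods are hypothetical illustrations, not Bigtable's API; the row and column names follow the Webtable example in the Bigtable paper.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str, int]  # (row key, column key, timestamp)


class TinyTable:
    """Toy model of the (row, column, timestamp) -> cell contents map."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str) -> Optional[bytes]:
        """Return the most recent version of a cell, or None if absent."""
        versions = [(ts, value) for (r, c, ts), value in self._cells.items()
                    if r == row and c == column]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]

    def scan(self) -> List[Key]:
        """Keys in sorted order: by row, then column, then newest first."""
        return sorted(self._cells, key=lambda k: (k[0], k[1], -k[2]))


table = TinyTable()
table.put("com.cnn.www", "contents:", 3, b"<html>v3")
table.put("com.cnn.www", "contents:", 5, b"<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, b"CNN")
latest = table.get("com.cnn.www", "contents:")
ordered = table.scan()
```

The map is sparse in the sense that only cells that were written occupy an entry; a row with no value for a column simply has no key for it.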
Rows
row keys are arbitrary strings up to 64KB
every read or write of data in a single row is atomic (regardless
of the number of columns involved)
row ranges are dynamically partitioned into tablets
Data Model
Column Families
column keys are grouped into sets called column families
data stored in a column family is usually of the same type
the number of column families should be small
the number of columns is unbounded
access control is at the column family level
Data Model
Timestamps
Each cell in a Bigtable can contain multiple versions of the
same data
Versions are indexed by 64-bit integer timestamps
Garbage-collection settings are per column family:
keep only the last n versions of a cell, or
keep only new-enough versions
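The two garbage-collection settings can be sketched as a filter over the versions of one cell. This is an illustration only: the function name `collect_versions` and its parameters `keep_last_n` and `min_timestamp` are assumptions, not Bigtable's actual configuration interface.

```python
from typing import Dict, Optional


def collect_versions(versions: Dict[int, bytes],
                     keep_last_n: Optional[int] = None,
                     min_timestamp: Optional[int] = None) -> Dict[int, bytes]:
    """Return the versions of one cell that survive garbage collection."""
    kept = sorted(versions, reverse=True)  # newest timestamp first
    if min_timestamp is not None:
        # "only new-enough versions": drop anything older than the cutoff
        kept = [ts for ts in kept if ts >= min_timestamp]
    if keep_last_n is not None:
        # "only the last n versions": keep the n newest
        kept = kept[:keep_last_n]
    return {ts: versions[ts] for ts in kept}


cell = {10: b"a", 20: b"b", 30: b"c", 40: b"d"}
last_two = collect_versions(cell, keep_last_n=2)
recent = collect_versions(cell, min_timestamp=15)
```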
Data Model
[Figure: the three dimensions of the data model: rows, columns, and timestamps]
API
Metadata operations
Create/delete tables or column families
Change metadata
Writes (atomic)
Bigtable does not support general transactions across row keys
Client-supplied Sawzall scripts
do not support writing to Bigtable
support filtering, summarization, and transformation
Bigtable can be used with MapReduce
Building Blocks
Google File System (GFS): used to store log and data files
Cluster management system (scheduler): used to manage jobs and resources
SSTable file format: used internally to store Bigtable data
Chubby distributed lock service: highly available, with five active replicas
0.0047% average unavailability across 14 Bigtable clusters
0.0326% unavailability for the most affected cluster
Implementation
What is a tablet?
A Bigtable cluster stores a number of tables
Each table consists of a set of tablets
Each tablet is managed by a specific tablet server
As a table grows, it is automatically split into multiple
tablets, each approximately 100-200 MB in size by default
Tablet servers handle read/write requests for their tablets
Implementation
Bigtable: Servers
The master manages the assignment of tablets to tablet servers
[Figure: a Bigtable master and two tablet servers, each serving its own tablets]
Implementation
Tablet Location
A three-level hierarchy of tablets is used to store tablet locations
The root tablet is never split
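A lookup through such a hierarchy can be sketched with sorted indexes and binary search. This is a simplified illustration under stated assumptions: the `Index` class, the `locate` function, the server names, and the `"\uffff"` sentinel for "last tablet" are all hypothetical, and the level that points at the root tablet itself is omitted.

```python
import bisect
from typing import List, Tuple, Union


class Index:
    """Sorted (end_row, target) entries; each entry covers keys <= end_row."""

    def __init__(self, entries: List[Tuple[str, Union[str, "Index"]]]) -> None:
        self.entries = sorted(entries, key=lambda entry: entry[0])
        self.end_rows = [entry[0] for entry in self.entries]

    def lookup(self, row: str) -> Union[str, "Index"]:
        # First entry whose end row is >= the requested row key
        i = bisect.bisect_left(self.end_rows, row)
        if i == len(self.entries):
            raise KeyError(row)
        return self.entries[i][1]


# Metadata tablets map row ranges to (hypothetical) tablet servers.
meta1 = Index([("g", "tabletserver-1"), ("m", "tabletserver-2")])
meta2 = Index([("t", "tabletserver-3"), ("\uffff", "tabletserver-1")])
# The root tablet maps row ranges to metadata tablets and is never split.
root = Index([("m", meta1), ("\uffff", meta2)])


def locate(root_tablet: Index, row: str) -> str:
    """Walk root tablet -> metadata tablet -> tablet server for a row key."""
    metadata_tablet = root_tablet.lookup(row)
    assert isinstance(metadata_tablet, Index)
    server = metadata_tablet.lookup(row)
    assert isinstance(server, str)
    return server


apple_server = locate(root, "apple")
zebra_server = locate(root, "zebra")
```

Because the root is never split, the depth of the walk stays fixed no matter how many metadata tablets exist.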
Implementation
Tablet Assignment
A master server is responsible for assigning tablets to tablet
servers
The master server also:
detects addition and expiration of tablet servers
balances tablet server loads
initiates garbage collection of files in GFS
reassigns tablets when a tablet server is lost
If the master server dies, a new master server is started
Implementation
Tablet Serving
The persistent state of a tablet is stored in GFS
[Figure: tablet representation, showing the memtable in memory, the tablet log and SSTable files in GFS, and the read and write operations]
Implementation
Compactions
Minor Compactions
memtable size reaches a threshold
memtable is frozen
new memtable is created
frozen memtable is converted into a new SSTable
Merging Compactions
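The compaction steps above can be sketched as follows. This is a toy model under loud assumptions: the `Tablet` class is hypothetical, the threshold counts entries rather than bytes, SSTables are plain in-memory tuples instead of GFS files, the tablet log is omitted, and the merging compaction here merges only SSTables.

```python
from typing import Dict, List, Optional, Tuple

SSTable = Tuple[Tuple[str, bytes], ...]  # immutable, sorted (key, value) pairs


class Tablet:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit
        self.memtable: Dict[str, bytes] = {}
        self.sstables: List[SSTable] = []  # oldest first, newest last

    def write(self, key: str, value: bytes) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self) -> None:
        frozen = self.memtable   # the memtable is frozen
        self.memtable = {}       # a new memtable is created
        # the frozen memtable is converted into a new SSTable
        self.sstables.append(tuple(sorted(frozen.items())))

    def read(self, key: str) -> Optional[bytes]:
        """Merged view: memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None

    def merging_compaction(self) -> None:
        """Combine all SSTables into one; newer values win."""
        merged: Dict[str, bytes] = {}
        for sstable in self.sstables:  # oldest first, so later updates override
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]


tablet = Tablet(memtable_limit=2)
tablet.write("a", b"1")
tablet.write("b", b"2")   # reaches the threshold, triggers a minor compaction
tablet.write("a", b"3")   # newer value lives in the fresh memtable
tablet.write("c", b"4")   # second minor compaction
tablet.merging_compaction()
```

Bounding the number of SSTables this way keeps reads from having to consult an ever-growing list of files.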
Refinements
A number of refinements were required for the Bigtable implementation to achieve high:
performance
availability
reliability
Refinements
Locality groups
Clients can group multiple column families together into a
locality group
A separate SSTable is generated for each locality group
Segregating column families which are not typically accessed
together enables more efficient reads
Refinements
Compression
Clients can control whether compression is used on a locality
group
Many clients use a two-pass compression scheme
the first pass uses Bentley and McIlroy's scheme
the second pass uses a fast compression algorithm over a small window
Refinements
Caching & Bloom Filters
Tablet servers use two levels of caching to improve read performance
Scan caching is useful for data which tends to be read
repeatedly
Block caching is useful for when read data tends to be close to
data recently read
Bloom filters reduce disk seeks by making it possible to ask
whether an SSTable might contain any data for a given row/column pair
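A minimal Bloom filter sketch shows why this saves seeks: a negative answer is definite, so the SSTable need not be read at all, while a positive answer only means "maybe". The `BloomFilter` class, its default sizes, and the SHA-256-based hashing are illustrative assumptions, not the scheme Bigtable uses.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("com.cnn.www/anchor:cnnsi.com")
present = sstable_filter.might_contain("com.cnn.www/anchor:cnnsi.com")
```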
Refinements
Speeding Up Tablet Recovery
When a tablet is moved to another tablet server:
The source tablet server first performs a minor compaction
The tablet server stops serving the tablet
It performs another minor compaction (usually very fast)
Then the tablet is moved without requiring any log entry recovery
Refinements
Exploiting Immutability
Because SSTables are immutable, various parts of the Bigtable
system have been simplified:
no synchronization of file system accesses is needed when reading from SSTables
permanently removing deleted data is handled entirely through garbage
collection of obsolete SSTables
splitting tablets is efficient because child tablets can share the SSTables of
the parent tablet
Performance Evaluation
Google set up a Bigtable cluster with N tablet servers to measure
performance and scalability as N is varied.
tablet servers were configured to use 1 GB of memory
each machine had two 400 GB IDE hard drives, two dual-core 2 GHz chips, and a single
gigabit Ethernet link
N client machines generated the Bigtable load used for the tests
Every machine ran a GFS server.
Performance Evaluation
Single tablet-server performance
Number of 1000-byte values read or written per second, per tablet server

Experiment            # of Tablet Servers
                          1      50     250     500
Random Reads           1212     593     479     241
Random Reads (mem)    10811    8511    8000    6250
Random Writes          8850    3745    3425    2000
Sequential Reads       4425    2463    2625    2469
Sequential Writes      8547    3623    2451    1905
Scans                 15385   10526    9524    7843
Performance Evaluation
Scaling: Aggregate throughput increases by over a factor of
100 as the number of tablet servers is increased from 1 to 500.
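The scaling factor can be checked with a short calculation from the per-tablet-server rates in the table above: aggregate throughput is the per-server rate times the number of servers. The dictionary layout and the `scaling_factor` helper are illustrative assumptions; the numbers are the 1-server and 500-server columns of the table.

```python
from typing import Dict

# Per-tablet-server rates at 1 and at 500 tablet servers
per_server: Dict[str, Dict[int, int]] = {
    "random reads": {1: 1212, 500: 241},
    "random reads (mem)": {1: 10811, 500: 6250},
    "random writes": {1: 8850, 500: 2000},
    "sequential reads": {1: 4425, 500: 2469},
    "sequential writes": {1: 8547, 500: 1905},
    "scans": {1: 15385, 500: 7843},
}


def scaling_factor(rates: Dict[int, int], small: int = 1, large: int = 500) -> float:
    """Ratio of aggregate throughput at `large` servers to that at `small`."""
    return (rates[large] * large) / (rates[small] * small)


factors = {name: round(scaling_factor(rates), 1)
           for name, rates in per_server.items()}

for name, factor in factors.items():
    print(f"{name}: x{factor}")
```

With these figures, five of the six benchmarks scale by more than a factor of 100, up to roughly 289 for in-memory random reads, while random reads from disk scale by about 99, the weakest of the six.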
Real Applications
As of August 2006:
388 non-test Bigtable clusters
about 24,500 tablet servers
# of Tablet Servers    # of Clusters
0 .. 19                259
20 .. 49               47
50 .. 99               20
100 .. 499             50
> 500                  12
Real Applications
This table provides some data about a few of the tables currently in use
Table size (measured before compression) and # Cells indicate approximate sizes
Real Applications
Google Analytics
Google Analytics is supported by 2 Bigtables
200 TB raw click table
20 TB summary table
Real Applications
Google Earth
Google Earth is supported by 2 Bigtables
70 TB images table, compression turned off
500 GB index table
Real Applications
Personalized Search
Personalized Search is supported by 1 Bigtable
one row per user id
separate column family for each type of action
Lessons learned
Large distributed systems are vulnerable to many types of failures
memory and network corruption
hung machines
extended and asymmetric network partitions
bugs in other systems (e.g. Chubby)
overflow of GFS quotas
planned and unplanned hardware maintenance
To address the problems experienced
some protocols have been changed
some assumptions have been modified
Lessons learned
It is important to delay adding new features until it is clear
how the new features will be used
It is important to support system-level monitoring
it allowed many issues to be detected and fixed
it also enables tracking of clusters to answer common questions
Related Work
The Boxwood project's goal is to provide infrastructure for
building higher-level services such as file systems or
databases
while the goal of Bigtable is to directly support client
applications that wish to store data
Related Work
C-Store and Bigtable share many characteristics
shared-nothing architecture
two different data structures (one for recent writes, one for long-lived data)
however, the two systems differ significantly in their APIs and performance
optimization
Conclusions
Bigtable is a distributed system for storing structured data at Google
in production since April 2005
seven person-years to design and implement
used by more than 60 projects as of August 2006
users like its performance and high availability
Users can scale their applications' capacity by simply adding more machines to their system
Conclusions
Google has begun deploying Bigtable as a service to product groups
Google has gained significant advantages by building its own storage solution
it has control over implementation and infrastructure
it can remove bottlenecks and inefficiencies as they arise
Strengths
Implementation and usability
Optimization
Performance Evaluation
Used by > 60 Google products
Weaknesses
Complexity
Chubby
Master
Network
Thanks for your attention