C-Store: A Column-oriented DBMS

32
1 C-Store: A Column-oriented DBMS By New England Database Group

description

C-Store: A Column-oriented DBMS. By New England Database Group. Current DBMS Gold Standard. Store fields in one record contiguously on disk Use B-tree indexing Use small (e.g. 4K) disk blocks Align fields on byte or word boundaries - PowerPoint PPT Presentation

Transcript of C-Store: A Column-oriented DBMS

Page 1: C-Store: A Column-oriented DBMS

1

C-Store: A Column-oriented DBMS

By

New England Database Group

Page 2: C-Store: A Column-oriented DBMS

2

M.I.T

Current DBMS Gold StandardCurrent DBMS Gold Standard Current DBMS Gold StandardCurrent DBMS Gold Standard

Store fields in one record contiguously on diskUse B-tree indexingUse small (e.g. 4K) disk blocksAlign fields on byte or word boundariesConventional (row-oriented) query optimizer

and executor (technology from 1979)Aries-style transactions

Store fields in one record contiguously on diskUse B-tree indexingUse small (e.g. 4K) disk blocksAlign fields on byte or word boundariesConventional (row-oriented) query optimizer

and executor (technology from 1979)Aries-style transactions

Page 3: C-Store: A Column-oriented DBMS

3

M.I.T

Terminology -- “Row Store”

Record 2

Record 4

Record 1

Record 3

E.g. DB2, Oracle, Sybase, SQLServer, …

Page 4: C-Store: A Column-oriented DBMS

4

M.I.T

Row Stores are Row Stores are Write OptimizedWrite Optimized Row Stores are Row Stores are Write OptimizedWrite Optimized

Can insert and delete a record in one physical

writeGood for on-line transaction processing (OLTP)But not for read mostly applications

Data warehousesCRM

Can insert and delete a record in one physical

writeGood for on-line transaction processing (OLTP)But not for read mostly applications

Data warehousesCRM

Page 5: C-Store: A Column-oriented DBMS

5

M.I.T

Elephants Have Extended Row StoresElephants Have Extended Row StoresElephants Have Extended Row StoresElephants Have Extended Row Stores

With Bitmap indicesBetter sequential readIntegration of “datacube” productsMaterialized views

With Bitmap indicesBetter sequential readIntegration of “datacube” productsMaterialized views

But there may be a better idea…….

Page 6: C-Store: A Column-oriented DBMS

6

M.I.T

Column StoresColumn Stores

Page 7: C-Store: A Column-oriented DBMS

7

M.I.T

At 100K Feet…. At 100K Feet….

Ad-hoc queries read 2 columns out of 20In a very large warehouse, Fact table is

rarely clustered correctlyColumn store reads 10% of what a row

store reads

Ad-hoc queries read 2 columns out of 20In a very large warehouse, Fact table is

rarely clustered correctlyColumn store reads 10% of what a row

store reads

Page 8: C-Store: A Column-oriented DBMS

8

M.I.T

C-Store (Column Store) ProjectC-Store (Column Store) Project

Brandeis/Brown/MIT/UMass-Boston

projectUsual suspects participatingEnough coded to get performance

numbers for some queriesComplete status later

Brandeis/Brown/MIT/UMass-Boston

projectUsual suspects participatingEnough coded to get performance

numbers for some queriesComplete status later

Page 9: C-Store: A Column-oriented DBMS

9

M.I.T

We Build on Previous Pioneering We Build on Previous Pioneering Work….Work….

Sybase IQ (early ’90s)Monet (see CIDR ’05 for the most recent

description)

Sybase IQ (early ’90s)Monet (see CIDR ’05 for the most recent

description)

Page 10: C-Store: A Column-oriented DBMS

10

M.I.T

C-Store Technical IdeasC-Store Technical Ideas

Code the columns to save spaceNo alignmentBig disk blocksOnly materialized views (perhaps many)Focus on Sorting not indexingAutomatic physical DBMS design

Code the columns to save spaceNo alignmentBig disk blocksOnly materialized views (perhaps many)Focus on Sorting not indexingAutomatic physical DBMS design

Page 11: C-Store: A Column-oriented DBMS

11

M.I.T

C-store (Column Store) Technical C-store (Column Store) Technical IdeasIdeas

Optimize for grid computingInnovative redundancyXacts – but no need for MohanData ordered on anything, Not just timeColumn optimizer and executor

Optimize for grid computingInnovative redundancyXacts – but no need for MohanData ordered on anything, Not just timeColumn optimizer and executor

Page 12: C-Store: A Column-oriented DBMS

12

M.I.T

How to Evaluate This Paper….How to Evaluate This Paper….

None of the ideas in isolation merit

publication Judge the complete system by its

(hopefully intelligent) choice ofSmall collection of inter-related

powerful ideasThat together put performance in a

new sandbox

None of the ideas in isolation merit

publication Judge the complete system by its

(hopefully intelligent) choice ofSmall collection of inter-related

powerful ideasThat together put performance in a

new sandbox

Page 13: C-Store: A Column-oriented DBMS

13

M.I.T

Code the ColumnsCode the Columns

Work hard to shrink spaceUse extra space for multiple orders

Fundamentally easier than in a row storeE.g. RLE works well

Work hard to shrink spaceUse extra space for multiple orders

Fundamentally easier than in a row storeE.g. RLE works well

Page 14: C-Store: A Column-oriented DBMS

14

M.I.T

No AlignmentNo Alignment

Densepack columnsE.g. a 5 bit field takes 5 bits

Current CPU speed going up faster than

disk bandwidthFaster to shift data in CPU than to

waste disk bandwidth

Densepack columnsE.g. a 5 bit field takes 5 bits

Current CPU speed going up faster than

disk bandwidthFaster to shift data in CPU than to

waste disk bandwidth

Page 15: C-Store: A Column-oriented DBMS

15

M.I.T

Big Disk BlocksBig Disk Blocks

TunableBig (minimum size is 64K)

TunableBig (minimum size is 64K)

Page 16: C-Store: A Column-oriented DBMS

16

M.I.T

Only Materialized ViewsOnly Materialized Views

Projection (materialized view) is some

number of columns from a fact tablePlus columns in a dimension table – with

a 1-n join between Fact and Dimension

tableStored in order of a storage key(s)Several may be stored!!!!!With a permutation, if necessary, to map

between them

Projection (materialized view) is some

number of columns from a fact tablePlus columns in a dimension table – with

a 1-n join between Fact and Dimension

tableStored in order of a storage key(s)Several may be stored!!!!!With a permutation, if necessary, to map

between them

Page 17: C-Store: A Column-oriented DBMS

17

M.I.T

Only Materialized ViewsOnly Materialized Views

Table (as the user specified it and sees

it) is not stored!No secondary indexes (they are a one

column sorted MV plus a permutation, if

you really want one)

Table (as the user specified it and sees

it) is not stored!No secondary indexes (they are a one

column sorted MV plus a permutation, if

you really want one)

Page 18: C-Store: A Column-oriented DBMS

18

M.I.T

Example

User view:

EMP (name, age, salary, dept)

Dept (dname, floor)

Possible set of MVs:MV-1 (name, dept, floor) in floor order

MV-2 (salary, age) in age order

MV-3 (dname, salary, name) in salary order

Page 19: C-Store: A Column-oriented DBMS

19

M.I.T

Different Indexing

Few values Many values

Sequential

RLE encoded

Conventional B-tree at

the value level

Delta encoded

Conventional B-tree at

the block level

Non sequential Bitmap per value

Conventional Gzip

Conventional B-tree at

the block level

Page 20: C-Store: A Column-oriented DBMS

20

M.I.T

Automatic Physical DBMS DesignAutomatic Physical DBMS Design

Not enough 4-star wizards to go aroundAccept a “training set” of queries and a

space budgetChoose the MVs auto-magicallyRe-optimize periodically based on a log

of the interactions

Not enough 4-star wizards to go aroundAccept a “training set” of queries and a

space budgetChoose the MVs auto-magicallyRe-optimize periodically based on a log

of the interactions

Page 21: C-Store: A Column-oriented DBMS

21

M.I.T

Optimize for Grid ComputingOptimize for Grid Computing

I.e. shared-nothingDewitt (Gamma) was right

Horizontal partitioning and intra-query

parallelism as in Gamma

I.e. shared-nothingDewitt (Gamma) was right

Horizontal partitioning and intra-query

parallelism as in Gamma

Page 22: C-Store: A Column-oriented DBMS

22

M.I.T

Innovative RedundancyInnovative Redundancy

Hardly any warehouse is recovered by a

redo from the logTakes too long!

Store enough MVs at enough places to

ensure K-safetyRebuild dead objects from elsewhere in

the networkK-safety is a DBMS-design problem!

Hardly any warehouse is recovered by a

redo from the logTakes too long!

Store enough MVs at enough places to

ensure K-safetyRebuild dead objects from elsewhere in

the networkK-safety is a DBMS-design problem!

Page 23: C-Store: A Column-oriented DBMS

23

M.I.T

XACTS – No MohanXACTS – No Mohan

Undo from a log (that does not need to

be persistent)Redo by rebuild from elsewhere in the

network

Undo from a log (that does not need to

be persistent)Redo by rebuild from elsewhere in the

network

Page 24: C-Store: A Column-oriented DBMS

24

M.I.T

XACTS – No MohanXACTS – No Mohan

Snapshot isolation (run queries as of a

tunable time in the recent past)To solve read-write conflicts

Distributed XactsWithout a prepare message (no 2

phase commit)

Snapshot isolation (run queries as of a

tunable time in the recent past)To solve read-write conflicts

Distributed XactsWithout a prepare message (no 2

phase commit)

Page 25: C-Store: A Column-oriented DBMS

25

M.I.T

Storage (sort) Key(s) is not Storage (sort) Key(s) is not Necessarily TimeNecessarily Time

That would be too limitingSo how to do fast updates to densepack

column storage that is not in entry

sequence?

That would be too limitingSo how to do fast updates to densepack

column storage that is not in entry

sequence?

Page 26: C-Store: A Column-oriented DBMS

26

M.I.T

Solution – a Hybrid StoreSolution – a Hybrid Store

Read-optimized

Column store

Write-optimized

Column store

Tuple mover

(Much like Monet)

(What we have been

talking about so far)

(Batch rebuilder)

Page 27: C-Store: A Column-oriented DBMS

27

M.I.T

Column ExecutorColumn Executor

Column operations – not row operationsColumns remain coded – if possibleLate materialization of columns

Column operations – not row operationsColumns remain coded – if possibleLate materialization of columns

Page 28: C-Store: A Column-oriented DBMS

28

M.I.T

Column OptimizerColumn Optimizer

Chooses MVs on which to run the queryMost important task

Build in snowflake schemas Which are simple to optimize without

exhaustive searchLooking at extensions

Chooses MVs on which to run the queryMost important task

Build in snowflake schemas Which are simple to optimize without

exhaustive searchLooking at extensions

Page 29: C-Store: A Column-oriented DBMS

29

M.I.T

Current PerformanceCurrent Performance

100X popular row store in 40% of the

space10X popular column store in 70% of the

space7X popular row store in 1/6th of the spaceCode available with BSD license

100X popular row store in 40% of the

space10X popular column store in 70% of the

space7X popular row store in 1/6th of the spaceCode available with BSD license

Page 30: C-Store: A Column-oriented DBMS

30

M.I.T

Structure Going ForwardStructure Going Forward

Vertica Very well financed start-up to

commercialize C-storeDoing the heavy lifting

University ResearchFunded by Vertica

Vertica Very well financed start-up to

commercialize C-storeDoing the heavy lifting

University ResearchFunded by Vertica

Page 31: C-Store: A Column-oriented DBMS

31

M.I.T

VerticaVertica

Complete alpha system in December ‘05Everything, including DBMS designerWith current performance!Looking for early customers to work with

(see me if you are interested)

Complete alpha system in December ‘05Everything, including DBMS designerWith current performance!Looking for early customers to work with

(see me if you are interested)

Page 32: C-Store: A Column-oriented DBMS

32

M.I.T

University ResearchUniversity Research

Extension of algorithms to non-snowflake

schemasStudy of L2 cache performanceStudy of coding strategiesStudy of executor optionsStudy of recovery tacticsNon-cursor interfaceStudy of optimizer primitives

Extension of algorithms to non-snowflake

schemasStudy of L2 cache performanceStudy of coding strategiesStudy of executor optionsStudy of recovery tacticsNon-cursor interfaceStudy of optimizer primitives