Consistent Join Queries for Cloud Data Stores

Post on 25-Feb-2016



Zhou Wei 1,2, Guillaume Pierre 1, Chi-Hung Chi 2

1 VU University Amsterdam
2 Tsinghua University Beijing

Background

• Cloud Computing Platforms for Web Apps
  – Unlimited resources
  – Scalable NoSQL data stores
  – Pay-as-you-go

Problems of Cloud data stores

• Problem 1: No join queries
  – Only support queries within a single table
  – No queries across tables

• Problem 2: Relaxed data consistency
  – Single-row transactions only
  – No consistency guarantee across multiple items

• Web apps need consistent join queries

Why do Web apps need joins?

• Data normalization leads to foreign-key equi-joins

PK  Title   Comment
1   IPhone  Cool
2   IPhone  Nice
3   IPhone  Expensive

Post
PK  Title
1   IPhone

Comment
PK  Post_id  Comment
1   1        Cool
2   1        Nice
3   1        Expensive

Why consistent?

• Application correctness is not optional
• Hard to maintain correctness with an inconsistent data store
  – Requires skillful programmers
  – Error-prone
  – Costs more time

Alternatives

• SQL databases (MySQL, Oracle, …)
  – Centralized
  – Full replication
  – NOT scalable to update workload
  – Complicated queries
  – ACID transactions

• NoSQL data stores (Google Bigtable, Amazon SimpleDB, Cassandra, …)
  – Distributed
  – Partitioned
  – Scalable to update workload
  – No join queries
  – Single-row transactions

Our Position

• Scalable NoSQL data stores supporting equi-join queries for Web applications

• Join query as an ACID transaction
• Queries of Web applications access only a small portion of data
  – Also true for join queries
  – All of them are foreign-key equi-joins


How does it work?

[Figure: architecture. Labels: Web Application, CloudTPS (several TMs), Cloud Data Store; arrows: Transactions, Join Queries, Load Data, Checkpoint.]

Outline

• Join Algorithm
• Transaction Protocol
• Performance Evaluation
• Conclusion

Join Algorithm

• Web applications are response-time sensitive
  – Table scan is not an option

• The basic idea
  – Link records by FK relationships in advance
  – Each join query has well-identified “root records”
  – Begin by accessing each “root record”
  – Recursively obtain its related records

Link records for forward join

Comment
PK  FK
1   2
2   3
3   2

Post
PK
1
2
3

Given the referring row (FK), identify the referenced row (PK).

Example: Comment (1, 3) → Post 2

Link records for backward join

Comment
PK  FK
1   2
2   3
3   2

Post
PK
1
2
3

[Figure: the Comment table, the Post table, and the Index that links them]

Given the referenced row (PK), identify the referring rows (FK).
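The two kinds of links on the slides above can be sketched in a few lines of Python. This is only an illustration built on the (PK, FK) values shown for the Comment table; the dictionary names `forward` and `backward` are assumptions and say nothing about how CloudTPS stores its links.

```python
from collections import defaultdict

# Comment rows from the slide, as (PK, FK) pairs; FK refers to Post.PK.
comments = [(1, 2), (2, 3), (3, 2)]

# Forward link: referring row (Comment PK) -> referenced row (Post PK).
forward = {pk: fk for pk, fk in comments}

# Backward link: referenced row (Post PK) -> list of referring Comment PKs.
backward = defaultdict(list)
for pk, fk in comments:
    backward[fk].append(pk)

print(forward[1])      # the post referenced by comment 1
print(dict(backward))  # the comments referring to each post
```

The forward link needs no extra structure because the FK is already stored in the referring row; the backward link is the part that needs an index.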

Joining two tables

SELECT * FROM post, comment WHERE post.id = 1 AND comment.post_id = post.id

[Figure: query plan. The root record is post with id=1; comment is joined on post.id = comment.post_id.]
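A minimal sketch of how the query above could be answered without a table scan, assuming an in-memory stand-in for the two tables and a precomputed index from each post to its comments. The names `posts`, `comments`, `post_to_comments`, and `join_post_comments` are hypothetical.

```python
# Hypothetical in-memory tables, keyed by primary key.
posts = {1: {"id": 1, "title": "IPhone"}}
comments = {
    1: {"id": 1, "post_id": 1, "comment": "Cool"},
    2: {"id": 2, "post_id": 1, "comment": "Nice"},
    3: {"id": 3, "post_id": 1, "comment": "Expensive"},
}

# Precomputed backward index: post PK -> PKs of the comments that refer to it.
post_to_comments = {1: [1, 2, 3]}

def join_post_comments(post_id):
    """Start from the root record and follow its links to the comment rows."""
    post = posts.get(post_id)
    if post is None:
        return []
    return [(post, comments[cid]) for cid in post_to_comments.get(post_id, [])]

for post, comment in join_post_comments(1):
    print(post["title"], comment["comment"])
```

Only the root record and the rows it links to are read, which is what keeps the cost proportional to the result size rather than the table size.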

Multi-Joins

• The table of the “root record” is the root

• Recursively join child tables

• Length of critical execution path = 2

[Figure: join tree over the tables Post, Comment, User, and Revision, with root record Id=1]
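The recursive step can be sketched as below. The shape of the join tree (post at the root, comment and revision as its children, user under comment), the sample rows, and the names `join` and `critical_path` are all assumptions made for illustration; the slide does not spell out the actual tree.

```python
# Hypothetical data: table name -> {primary key -> row}.
tables = {
    "post": {1: {"title": "IPhone"}},
    "comment": {1: {"text": "Cool"}, 2: {"text": "Nice"}},
    "user": {7: {"name": "alice"}},
    "revision": {5: {"note": "first draft"}},
}

# Precomputed links: (parent table, parent PK, child table) -> child PKs.
links = {
    ("post", 1, "comment"): [1, 2],
    ("post", 1, "revision"): [5],
    ("comment", 1, "user"): [7],
    ("comment", 2, "user"): [7],
}

# Assumed join tree: each table maps to the child tables joined below it.
join_tree = {"post": ["comment", "revision"], "comment": ["user"]}

def join(table, pk):
    """Return the record and, recursively, all records linked to it."""
    result = {"table": table, "pk": pk, "row": tables[table][pk], "children": []}
    for child_table in join_tree.get(table, []):
        for child_pk in links.get((table, pk, child_table), []):
            result["children"].append(join(child_table, child_pk))
    return result

def critical_path(table):
    """Number of join steps on the longest root-to-leaf path of the join tree."""
    children = join_tree.get(table, [])
    return 0 if not children else 1 + max(critical_path(c) for c in children)

result = join("post", 1)
print(len(result["children"]), critical_path("post"))
```

Sibling branches of the tree are independent of each other, so the depth of the tree, not the number of tables, bounds the number of sequential steps.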

Outline

• Join Algorithm
• Transaction Protocol
• Performance Evaluation
• Conclusion

Distributed Transaction

[Figure: a transaction is split into sub-transactions SubT(1), SubT(2), and SubT(3), one for each TM holding Data1, Data2, and Data3; the RW-TC coordinates them with 2 Phase Commit (2PC).]

Problem

• 2PC requires all records to be identified in advance
• Join queries and index management identify accessed records on the fly
• Solution: extend 2PC to allow adding records dynamically

The Original 2PC

[Figure: three panels showing the TC with Data1 and Data2: 1. Submit, 2. Vote, 3. Commit/Abort]
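The three steps above correspond to the textbook two-phase commit. The sketch below is a single-process illustration under that assumption; the `Participant` class and its fields are hypothetical, and real participants would also log their state durably before voting.

```python
class Participant:
    """A data item holder that can vote on and then finish a transaction."""

    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.state = "idle"

    def vote(self):
        # Phase 1: decide locally whether the sub-transaction can commit.
        self.state = "prepared" if self.will_commit else "refused"
        return self.will_commit

    def finish(self, commit):
        # Phase 2: apply the coordinator's global decision.
        self.state = "committed" if commit else "aborted"

def two_phase_commit(participants):
    """Commit only if every participant votes to commit; otherwise abort all."""
    votes = [p.vote() for p in participants]   # 1. Submit, 2. Vote
    decision = all(votes)
    for p in participants:                     # 3. Commit/Abort
        p.finish(decision)
    return decision

data1, data2 = Participant("Data1"), Participant("Data2")
print(two_phase_commit([data1, data2]), data1.state, data2.state)
```

Note that the participant list is fixed before the vote starts, which is exactly the limitation the extended protocol removes.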

Extended 2PC

• Dynamically add data items to the transaction

[Figure sequence: the TC with Data1 and Data2, later joined by Data3]
1. Submit
2. Vote (votes shown: “Commit” and “Add Data3”)
3. Derive (Data3 joins the transaction)
4. Vote (Data3; vote shown: “Commit”)
5. Commit (Data1, Data2, Data3)
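One way to read the five steps above is as a voting loop in which a participant may name further data items that must join the transaction before the final decision. The sketch below follows that reading; the function `extended_2pc`, the `vote_fn` callback, and the example votes are assumptions for illustration, not the protocol as implemented in CloudTPS.

```python
def extended_2pc(initial_items, vote_fn):
    """Run the voting phase, adding data items named by participants on the fly.

    vote_fn(item) returns (vote, extra_items): vote is True for "commit",
    and extra_items lists data items that must also join the transaction.
    """
    participants = []
    pending = list(initial_items)
    seen = set(pending)
    all_commit = True
    while pending:
        item = pending.pop(0)
        participants.append(item)
        vote, extra_items = vote_fn(item)      # Submit, then Vote
        all_commit = all_commit and vote
        for extra in extra_items:              # Derive newly named items
            if extra not in seen:
                seen.add(extra)
                pending.append(extra)
    decision = "COMMIT" if all_commit else "ABORT"
    return decision, participants              # Commit/Abort goes to all

def example_votes(item):
    # Hypothetical votes mirroring the slides: Data2 asks to add Data3.
    table = {
        "Data1": (True, []),
        "Data2": (True, ["Data3"]),
        "Data3": (True, []),
    }
    return table[item]

print(extended_2pc(["Data1", "Data2"], example_votes))
```

The final decision is sent to every participant collected during voting, so an item added late is treated the same as one known from the start.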

Outline

• Join Algorithm
• Transaction Protocol
• Performance Evaluation
• Conclusion

Experiment Setup

• Implement a prototype of CloudTPS
  – Java Servlet
  – Deployed in Tomcat v6.0

• Environments
  – 1. DAS-3 + HBase + Tomcat (99% of requests < 100 ms)
  – 2. EC2 + SimpleDB + Tomcat (90% of requests < 100 ms)
    • High-CPU medium instances

Micro-benchmarks

• Generate specific workloads
  – Containing only join queries or only RW transactions
  – Changing the number of accessed data items
  – Changing the length of the critical execution path
  – Switching between updating and not updating indexes

• All configured with 10 CloudTPS nodes
• Measuring maximum sustainable throughput
  – Under a response time constraint

Two-table join evaluation

[Figure: two panels, Transactions Per Second and Records Per Second, for an increasing number of accessed records in a transaction]

Multi-Joins Evaluation

[Figure: throughput for an increasing height of the join-query tree. All join queries access 12 records.]

RW transactions evaluation

[Figure: two panels, Transactions Per Second and Records Per Second, for an increasing number of accessed records in a transaction]

Scalability Evaluation

• Macro-benchmark
  – Derived from TPC-W, simulating an online book store
  – Our workload includes only the join queries and RW transactions
  – No simple queries

Scalability Results

Linear scalability: any increase in workload can be addressed by adding more machines

CloudTPS vs. PostgreSQL

PostgreSQL: binary replication

Fault Tolerance

• Recovers from 3 network partitions and 2 machine failures

Conclusion

• We demonstrate support for consistent join queries over partitioned and distributed data

• Join queries from Web applications access only a small portion of the data

• CloudTPS scales linearly even under an intensive workload of only join queries and read-write transactions

Questions?

Thank you!

Details of the macro-benchmark workload

[Figure: the join queries and the RW transactions of the workload]

Secondary-key Queries

• SELECT * FROM Table1 WHERE SK = 'B'
• Create a separate index table for the SK
• Translate into a join query

Table 1
PK  SK
1   A
2   B
3   B

Index Table
PK'(SK)  T1
A        1
B        2,3
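The translation can be sketched with the values from the two tables above. The dictionaries and the function `select_by_sk` are hypothetical stand-ins: the index table is treated as the root table of a join, and its row for the requested SK links to the matching rows of Table 1.

```python
# Table 1 from the slide: PK -> row.
table1 = {1: {"SK": "A"}, 2: {"SK": "B"}, 3: {"SK": "B"}}

# Index table from the slide: its primary key is the SK value,
# and column T1 lists the PKs of the matching rows in Table 1.
index_table = {"A": [1], "B": [2, 3]}

def select_by_sk(sk):
    """Answer SELECT * FROM Table1 WHERE SK = sk as a join rooted at the index row."""
    return [(pk, table1[pk]) for pk in index_table.get(sk, [])]

print(select_by_sk("B"))
```

With this layout a secondary-key lookup costs one read of the index row plus one read per matching record, with no scan of Table 1.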

Index Management

• Indexes link the related records

post                 Index
PK  Title            T1
1   Hello            1,2
2   Good             3
3   Well             -

comment
PK  FK
1   1
2   1
3   2

Index Management (cont.)

• Case: a new comment is inserted
• Indexes are updated atomically

post                 Index
PK  Title            T1
1   Hello            1,2
2   Good             3, 4
3   Well             -

comment
PK  FK
1   1
2   1
3   2
4   2
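A small sketch of what "atomically" means for this case, using the values on the slide. The function `insert_comment` is hypothetical: it builds the new comment table and the new index together and returns both, so a caller either gets both changes or, if an error is raised, neither. In CloudTPS this guarantee would come from running both updates in one transaction, which this in-memory stand-in only imitates.

```python
# State before the insert, taken from the slide.
comments = {1: 1, 2: 1, 3: 2}              # comment PK -> FK (post PK)
post_index = {1: [1, 2], 2: [3], 3: []}    # post PK -> referring comment PKs

def insert_comment(comments, post_index, pk, fk):
    """Return (new_comments, new_index) with the comment and its index entry added."""
    if pk in comments:
        raise KeyError(f"comment {pk} already exists")
    if fk not in post_index:
        raise KeyError(f"post {fk} does not exist")
    # Work on copies so the old state stays intact if anything fails.
    new_comments = dict(comments)
    new_index = {post: list(refs) for post, refs in post_index.items()}
    new_comments[pk] = fk
    new_index[fk].append(pk)
    return new_comments, new_index

comments, post_index = insert_comment(comments, post_index, 4, 2)
print(comments, post_index)
```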

Index Management (cont.)

• Consistency violated, case 1: comment 4 exists, but the index of post 2 does not list it (“?” on the slide)

post                 Index
PK  Title            T1
1   Hello            1,2
2   Good             3
3   Well             -

comment
PK  FK
1   1
2   1
3   2
4   2

Index Management (cont.)

• Consistency violated, case 2: the index of post 2 lists comment 4, but no such comment exists (“?” on the slide)

post                 Index
PK  Title            T1
1   Hello            1,2
2   Good             3, 4
3   OK               -

comment
PK  FK
1   1
2   1
3   2
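Both violated cases can be detected mechanically by comparing the comment table with the index in both directions. The checker below is a hypothetical illustration using the values from the two slides; the function name and the violation labels are assumptions.

```python
def find_violations(comments, post_index):
    """Compare comment FKs with the post index and report mismatches.

    comments maps comment PK -> FK; post_index maps post PK -> comment PKs.
    """
    problems = []
    # Case 1: a comment exists but its post's index does not list it.
    for pk, fk in comments.items():
        if pk not in post_index.get(fk, []):
            problems.append(("missing index entry", fk, pk))
    # Case 2: the index lists a comment that does not exist or refers elsewhere.
    for post_pk, comment_pks in post_index.items():
        for pk in comment_pks:
            if comments.get(pk) != post_pk:
                problems.append(("dangling index entry", post_pk, pk))
    return problems

case1 = find_violations({1: 1, 2: 1, 3: 2, 4: 2}, {1: [1, 2], 2: [3], 3: []})
case2 = find_violations({1: 1, 2: 1, 3: 2}, {1: [1, 2], 2: [3, 4], 3: []})
print(case1)
print(case2)
```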

Fault tolerance

• Maintain ACID during server failure
  – Transactions that reach an agreement of “COMMIT” should commit
  – Abort other transactions

• Replication to N transaction managers
  – Values of data items
  – States of transactions (vote result and “redo logs”)

• Failover of a failed server
  – Transfer ownership of data items
  – Transfer leadership of transactions being coordinated

Optimization for Join Queries

• A join query is a read-only transaction
  – Remove the final Commit phase
  – Sub-transactions are removed immediately after local execution
  – RO transactions will not block other transactions


Concurrency Control

[Figure: timestamp ordering. Sub-transactions SubT(1) on Data1 and SubT(2) on Data2, with timestamps 5 and 6.]
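The slide names timestamp ordering without giving details, so the following is only a generic sketch of the idea: every data item executes its queued sub-transactions in global timestamp order, so two conflicting transactions are applied in the same order everywhere. The `DataItem` class and the transaction labels are hypothetical.

```python
import heapq

class DataItem:
    """Holds a queue of sub-transactions and runs them in timestamp order."""

    def __init__(self, name):
        self.name = name
        self.queue = []   # min-heap of (timestamp, transaction id)
        self.log = []     # order in which sub-transactions were executed

    def submit(self, timestamp, txn_id):
        heapq.heappush(self.queue, (timestamp, txn_id))

    def run_all(self):
        while self.queue:
            _, txn_id = heapq.heappop(self.queue)
            self.log.append(txn_id)

data1, data2 = DataItem("Data1"), DataItem("Data2")
# The two transactions arrive in different orders at the two items.
data1.submit(6, "T6")
data1.submit(5, "T5")
data2.submit(5, "T5")
data2.submit(6, "T6")
data1.run_all()
data2.run_all()
print(data1.log, data2.log)
```

A real implementation would also have to decide how long to wait for a sub-transaction with a smaller timestamp that has not arrived yet; the sketch sidesteps this by running the queue only after all submissions.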

Read-Only Transaction

[Figure: labels include TM, RW-TC, RO-TC, Join(Data1), Data1, Data2, Data3, ROSubT(1), SubT(1), SubT(2), and timestamps 5, 6, 7]

Read-Only Transaction (cont.)

[Figure: labels include TM, RW-TC, RO-TC, Join(Data1), Data1, Data2, Data3, ROSubT(1) with timestamp 6, and SubT(1) with timestamp 7]

Read-Only Transaction (cont.)

[Figure: labels include TM, RW-TC, RO-TC, Data1, Data2, Data3, ROSubT(2) with timestamp 6, SubT(1) with timestamp 7, None, and Result]