Synthesizing Representative I/O Workloads for TPC-H J. Zhang*, A. Sivasubramaniam*, H. Franke, N. Gautam*, Y. Zhang, S. Nagar * Pennsylvania State University IBM T.J. Watson Rutgers University

Synthesizing Representative I/O Workloads for TPC-H

J. Zhang*, A. Sivasubramaniam*, H. Franke, N. Gautam*, Y. Zhang, S. Nagar

* Pennsylvania State University
IBM T.J. Watson
Rutgers University

Outline

• Motivation
• Related Work
• Methodology
  – Arrival Time
  – Access Pattern
  – Request Sizes
• Accuracy of synthetic traces
• Concluding Remarks

Motivation

• I/O subsystems are critical for commercial services and in production environments.

• Real applications are essential for system design and evaluation.

• TPC-H is a decision-support workload for business enterprises.

Disadvantages of Traces

• Not easily obtainable
• Can be very large
• Difficult to get statistical confidence
• Very difficult to change workload behavior
• Do not isolate the influence of one parameter
• On the other hand, a deeper understanding of the workload can:
  – Help generate a synthetic workload
  – Help in system design itself

What do we need to synthesize?

• Inter-arrival times (temporal behavior) of disk block requests.

• Access pattern (spatial behavior) of blocks being referenced

• Size (volume) of each I/O request.

Related work

• Scientific Application I/O behavior
  – Time-series models for arrivals
  – Sequentiality/Markov models for access pattern
• Commercial/production workloads
  – Self-similar arrival patterns
  – Sequentiality in TPC-H/TPC-D

• No prior complete synthesis of all three attributes for TPC-H

Our TPC-H Workload

• Trace Collection Platform
  – IBM Netfinity 8-way SMP with 2.5GB memory and 15 disks
  – Linux 2.4.17
  – DB2 UDB EE V7.2
• TPC-H Configuration
  – Power Run of 22 queries
  – Partitioning tables across the disks
  – 30 GB dataset

Validation

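[Diagram: characteristics are identified from the original I/O traces and used to generate synthetic traces; original and synthetic traces are run through Disksim 2.0 and compared by the CDF of response time.]

Metrics

• RMS: root-mean-square error of differences between two CDF curves
• nRMS: RMS/m, where m is the average response time for the original trace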
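A minimal sketch of the two metrics, in plain Python. The slide does not say how the difference between the two CDF curves is measured; this sketch assumes it is taken horizontally, that is, as the difference in response time at matching percentiles, which makes RMS/m dimensionless. The function names `quantile` and `nrms` and the number of comparison points are illustrative choices, not part of the original work.

```python
import math

def quantile(sorted_vals, q):
    """Linearly interpolated quantile of pre-sorted data, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def nrms(original, synthetic, n_points=100):
    """RMS gap between two response-time CDFs, divided by the mean
    response time m of the original trace.

    Assumption: the gap is measured horizontally (in time units) at
    n_points evenly spaced percentiles.
    """
    a, b = sorted(original), sorted(synthetic)
    qs = [(i + 0.5) / n_points for i in range(n_points)]
    squared = [(quantile(a, q) - quantile(b, q)) ** 2 for q in qs]
    rms = math.sqrt(sum(squared) / n_points)
    m = sum(original) / len(original)
    return rms / m
```

Identical traces give an nRMS of 0; larger values mean the synthetic trace reproduces the response-time distribution less faithfully.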

Overall Methodology

• Arrival pattern characteristics
  – Investigate correlations
    • Time series
    • Self-similar
    • iid distributions
• Access pattern characteristics
  – Sequentiality/pseudo-sequentiality/randomness
  – Size characteristics
• Investigating correlations between time, space and volume to get final synthesis

Arrival pattern

• Statistical analysis
  – Auto-correlation function (ACF) plots
• Shows the correlation between current inter-arrival time and one that is x-steps away
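As an illustration of what an ACF plot displays, the sample autocorrelation of an inter-arrival time series at each lag can be computed as below. This is a generic textbook estimator written in plain Python, not the tool used in the study; the function name `acf` is an assumption.

```python
def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag.

    Requires max_lag < len(series) and a non-constant series
    (a constant series has zero variance).
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]
```

The value at lag 0 is always 1. Values near 0 at every other lag are what one would expect if successive inter-arrival times are uncorrelated.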

– Correlations seem very weak (<0.15 for 12 queries, and <0.30 for the rest)
  • Errors with time series models (AR/MA/ARIMA/ARFIMA) are high
  • No indication of self-similarity either
– Perhaps iid (independent and identically distributed) is not a bad assumption.

• Fitting distributions
  – Tried hyper-exponential/normal/pareto
  – Used Maximum Likelihood Estimator (normal/pareto) and Expectation Maximization (hyper-exponential) to estimate distribution parameters
  – Use K-S test to measure goodness-of-fit
  – Maximum distance between fitted distribution and original CDF was ensured to be less than 0.1
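The fit-then-check loop can be sketched for one of the candidate distributions. The example below fits a Pareto distribution by maximum likelihood and computes the K-S distance (the maximum gap between the fitted CDF and the empirical CDF), which can then be compared with the 0.1 threshold. It is a plain-Python sketch; the function names are illustrative, and the EM fit for the hyper-exponential case is not shown.

```python
import math

def fit_pareto_mle(samples):
    """Maximum-likelihood Pareto fit: returns (scale xm, shape alpha).

    Requires positive samples that are not all equal.
    """
    xm = min(samples)
    alpha = len(samples) / sum(math.log(x / xm) for x in samples)
    return xm, alpha

def pareto_cdf(x, xm, alpha):
    """CDF of a Pareto distribution with scale xm and shape alpha."""
    return 0.0 if x < xm else 1.0 - (xm / x) ** alpha

def ks_distance(samples, cdf):
    """Maximum vertical distance between `cdf` and the empirical CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d
```

A fit would be accepted under the criterion above when `ks_distance(data, lambda x: pareto_cdf(x, xm, alpha))` is below 0.1.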

Comparing CDF of fitted distribution and data

Access Pattern (Location + Size)

• Most studies use sequentiality to describe TPC-H

• However, this is not always the case.

[Figure: three scatter plots of location (y-axis) against arrival time (x-axis), one per category]

Cat1: Q10, Q4, Q14
Cat2: Q12, Q1, Q3, Q5, Q7, Q8, Q15, Q18, Q19, Q21
Cat3: Q20, Q9, Q17

Category 1: Intermingling sequential streams

• Consider the following:
  – Run: A strictly sequential set of I/O requests
  – Stream: A pseudo-sequential set of I/O requests that could be interrupted by another stream.
  – i.e. a stream could have several runs that are interrupted by runs of other streams.

Run and Stream

An example run of 5 requests:
1-4 5-8 9-10 11-14 15-18

A stream (pseudo-sequential) of 4 requests:
1-4 7-8 9-12 11-14

An example trace:
Stream A: 1-4 7-8 9-12 11-14
Stream B: 100-104 105-108 109-112
Trace: the requests of Stream A interleaved with the requests of Stream B
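The definition of a run can be made concrete with a short sketch that splits a sequence of requests into strictly sequential runs. Requests are modeled here as hypothetical `(first_block, last_block)` pairs with inclusive ends, and a request is taken to continue a run only when it starts exactly one block after the previous request ended; both choices are assumptions for illustration.

```python
def split_into_runs(requests):
    """Group requests (first_block, last_block), given in arrival
    order, into strictly sequential runs."""
    runs = []
    for req in requests:
        # Continue the current run only if this request begins
        # right after the last block of the previous request.
        if runs and req[0] == runs[-1][-1][1] + 1:
            runs[-1].append(req)
        else:
            runs.append([req])
    return runs
```

With block ranges like those in the examples, taken in ascending order, `[(1, 4), (5, 8), (9, 10), (11, 14), (15, 18)]` is a single run of 5 requests, while the pseudo-sequential sequence `[(1, 4), (7, 8), (9, 12), (11, 14)]` breaks into three runs.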

Secondary Attributes

• Run Length: # of requests in a run
• Run Start location: start sector of run
• Stream Length: # of requests in a stream
• Inter-stream Jump Distance: spatial separation between start of run and previous request
• Intra-stream Jump Distance: spatial separation between successive requests within a stream
• Number of active streams (at any instant)
• Interference Distance: number of requests between 2 successive requests in a stream
• Derive empirical distributions for these from the trace
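Deriving an empirical distribution for any of these attributes amounts to counting how often each observed value occurs and then sampling in proportion to those counts. A minimal sketch follows; the class name `EmpiricalDistribution` is illustrative and not taken from the original work.

```python
import random
from collections import Counter

class EmpiricalDistribution:
    """Discrete distribution built from values observed in a trace."""

    def __init__(self, observations):
        counts = Counter(observations)
        total = sum(counts.values())
        self.values = sorted(counts)
        self.probs = [counts[v] / total for v in self.values]

    def sample(self, rng, k=1):
        """Draw k values with probability proportional to frequency."""
        return rng.choices(self.values, weights=self.probs, k=k)
```

For example, `EmpiricalDistribution([1, 1, 2, 5])` built from observed run lengths assigns probability 0.5 to a run length of 1 and 0.25 each to 2 and 5.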

Location Synthesis - Q10 (Time and size from trace)

LocIID: locations are i.i.d.

LocRUN: incorporate run length distribution and run start location distribution.

LocSTREAM: combine all stream and run statistics.
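The difference between the first two schemes can be sketched as follows. Each distribution is represented by a list of values observed in the trace, so a uniform pick from the list samples the empirical distribution. The function names, the argument layout, and the rule that each request in a run starts where the previous one ended are assumptions made for this illustration.

```python
import random

def loc_iid(observed_locations, n, rng):
    """LocIID: every start location is drawn independently."""
    return [rng.choice(observed_locations) for _ in range(n)]

def loc_run(run_starts, run_lengths, sizes, rng):
    """LocRUN: draw a run start and a run length, then lay the
    requests of that run out back to back.

    `sizes` holds the request sizes in trace order (taken from the
    trace, not synthesized). Run lengths must be >= 1.
    """
    locations = []
    while len(locations) < len(sizes):
        loc = rng.choice(run_starts)
        for _ in range(rng.choice(run_lengths)):
            if len(locations) == len(sizes):
                break
            locations.append(loc)
            # The next request in the run starts where this one ends.
            loc += sizes[len(locations) - 1]
    return locations
```

LocIID loses all sequentiality, while LocRUN keeps sequential runs but places them without any notion of interleaved streams.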

Request Size

• Request sizes are one of

– 64, 128, 192, 256, 320, 384, 448, 512 blocks

• But the attributes (location, size, time) are not independent!

Correlations between size and location

Fraction of requests, by request size (blocks)

Size         64    128   192   256   320   384   448   512
All req.     .716  .009  .010  .009  .009  .011  .011  .225
Run start    .577  .012  .013  .012  .013  .015  .016  .342
Within run   .916  .004  .004  .004  .004  .005  .005  .057
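One way to use these fractions is to draw a request size conditioned on whether the request starts a run or falls within one. The sketch below assumes the three groups of eight values on the slide correspond, in order, to the rows "All req.", "Run start" and "Within run"; the names `SIZES`, `SIZE_PROBS` and `sample_size` are illustrative.

```python
import random

SIZES = [64, 128, 192, 256, 320, 384, 448, 512]  # blocks

# Fractions read from the slide (row assignment is an assumption).
SIZE_PROBS = {
    "run_start":  [.577, .012, .013, .012, .013, .015, .016, .342],
    "within_run": [.916, .004, .004, .004, .004, .005, .005, .057],
}

def sample_size(kind, rng):
    """Draw a request size given the request kind.

    random.choices accepts weights that do not sum exactly to 1,
    so the rounded fractions can be used directly.
    """
    return rng.choices(SIZES, weights=SIZE_PROBS[kind])[0]
```

Under these numbers a run start is far more likely to be a 512-block request than a request inside a run, which is the dependence between size and location that the table points to.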

Correlations between size and time

[Figure: stacked bar chart of size frequency (0% to 100%) for each inter-arrival time interval (1 to 29), with series for sizes 512, 128-448, and 64]

Correlations between location and time

Final Synthesis Methodology (Category 1)

Location: use LocSTREAM to generate start locations. Two kinds of requests: a run start request or a request within a run.

Time: use Pr(inter-arrival time | run start requests) and Pr(inter-arrival time | within a run requests) to generate times.

Size:
1) For run start requests, use Pr(size | inter-arrival times of run start requests) to generate sizes.
2) For within a run requests, use Pr(size | within a run requests) to generate sizes.
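The time and size steps can be chained as in the sketch below, which assumes the request kinds have already been decided by the location step. Each conditional distribution is represented by a list of observed values, and `bin_of` is a hypothetical helper that maps an inter-arrival time to the interval used to condition the size; none of these names come from the original work.

```python
import random

def synthesize_time_and_size(kinds, iat_given_kind, size_given_iat_bin,
                             size_within_run, bin_of, rng):
    """Return (arrival_time, kind, size) for each request.

    kinds:               "run_start" or "within_run" per request
    iat_given_kind:      kind -> observed inter-arrival times
    size_given_iat_bin:  interval id -> observed run-start sizes
    size_within_run:     observed sizes of within-run requests
    """
    t = 0.0
    out = []
    for kind in kinds:
        iat = rng.choice(iat_given_kind[kind])  # Pr(time | kind)
        t += iat
        if kind == "run_start":
            # Pr(size | inter-arrival time of a run start request)
            size = rng.choice(size_given_iat_bin[bin_of(iat)])
        else:
            # Pr(size | within a run request)
            size = rng.choice(size_within_run)
        out.append((t, kind, size))
    return out
```

Conditioning in this order (kind, then time, then size) is what lets the synthetic trace keep the dependence between the three attributes instead of sampling each one independently.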

• Can be easily adapted for Category 2 (strictly sequential) and Category 3 (random) queries.

• Validation: Compare the response time characteristics of the synthesized and real traces.

Validation of CDF of response times (Category 1)

Validation of CDF of response times (Category 2)

Validation of CDF of response times (Category 3)

Storage Requirements

Query                       Q1    Q3    Q4    Q5    Q6    Q7    Q8    Q9    Q10
Storage Fraction (x0.001)   3.46  3.64  2.76  3.43  3.46  3.47  3.66  .004  2.79
nRMS                        0.10  0.09  0.20  0.07  0.01  0.04  0.05  0.15  0.16

Query                       Q12   Q14   Q15   Q17   Q18   Q19   Q20   Q21
Storage Fraction (x0.001)   3.73  6.49  3.46  2.03  3.54  3.44  4.57  2.95
nRMS                        0.06  0.19  0.01  0.05  0.06  0.03  0.10  0.07

Contributions

• A synthesis methodology that
  – Captures intermingling streams of requests
  – Exploits correlations between request attributes
• An application of this methodology to TPC-H
• Along the way (for TPC-H),
  – iid can capture arrival time characteristics
  – Strict sequentiality is not always the case

Backup slides

Validating arrival time synthesis

LocSTREAM

1. Use Pr(stream length) to generate stream lengths.

2. Use Pr(run length | stream length) to generate run lengths for each stream length.

3. Generate start location for each run:
   a) Use Pr(inter-stream jump dist.) to generate the start location of the first run in the stream.
   b) Use Pr(intra-stream jump distance | this stream) to generate other runs' start location in this stream.

4. Use Pr(interference distance) to interleave all streams.
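The four steps can be sketched in simplified form. Several simplifications are assumptions of this sketch rather than the authors' procedure: distributions are lists of observed values, every request has the same size `req_size`, the intra-stream jump is applied between the end of one run and the start of the next, and interleaving is approximated by giving each request a slot position and sorting. All function and parameter names are hypothetical.

```python
import random

def gen_stream(rng, stream_lengths, run_lengths_by_stream_len,
               inter_jumps, intra_jumps, prev_loc, req_size):
    """Steps 1-3: start locations of one stream's requests.
    Run lengths must be >= 1."""
    n = rng.choice(stream_lengths)                 # step 1
    locs = []
    loc = prev_loc + rng.choice(inter_jumps)       # step 3a
    while len(locs) < n:
        run_len = rng.choice(run_lengths_by_stream_len[n])  # step 2
        for _ in range(min(run_len, n - len(locs))):
            locs.append(loc)
            loc += req_size        # strictly sequential within a run
        loc += rng.choice(intra_jumps)             # step 3b
    return locs

def interleave(rng, streams, interference_distances):
    """Step 4 (approximation): place each stream's requests at
    positions separated by sampled interference distances, then
    merge all streams by position."""
    slots = []
    for sid, locs in enumerate(streams):
        pos = sid  # stagger stream starts (assumption)
        for loc in locs:
            slots.append((pos, sid, loc))
            pos += 1 + rng.choice(interference_distances)
    slots.sort()
    return [(sid, loc) for _, sid, loc in slots]
```

Because positions are only sorted, not packed, the interference distances seen in the merged trace match the sampled ones only approximately; per-stream request order is always preserved.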