Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf ·...

32
Efficient Computation of Parsimonious Temporal Aggregation Giovanni Mahlknecht , Anton Dign¨ os, Johann Gamper Free University of Bozen-Bolzano, Italy ADBIS 2015 September 8-11, 2015 - Futuroscope, Poitiers, France ADBIS 2015 1/23 G. Mahlknecht et al.

Transcript of Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf ·...

Page 1: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Efficient Computation of Parsimonious TemporalAggregation

Giovanni Mahlknecht, Anton Dignos, Johann Gamper

Free University of Bozen-Bolzano, Italy

ADBIS 2015September 8-11, 2015 - Futuroscope, Poitiers, France

ADBIS 2015 1/23 G. Mahlknecht et al.

Page 2: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Outline

Introduction

Diagonal Pruning

Split Point Graph

Experimental Evaluation

ADBIS 2015 2/23 G. Mahlknecht et al.

Page 3: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Instant Temporal Aggregation (ITA)

Patient treatment periods with daily cost

P C Tr1 Bob 600 [1,4]r2 Mary 400 [1,2]r3 Eve 40 [3,3]r4 Eric 310 [3,4]r5 Joe 30 [6,6]r6 John 300 [6,9]r7 Alex 100 [9,9]

.

days1 2 3 4 5 6 7 8 9

Eve 40 Joe 30 Alex 100

Eric 310 John 300

Mary 400

Bob 600

ITA: at each timepoint SUM(C)Val T

s1 1000 [1,2]s2 950 [3,3]s3 910 [4,4]s4 330 [6,6]s5 300 [7,8]s6 400 [9,9]

.

days1 2 3 4 5 6 7 8 9

1000 950 910 300330 400

ADBIS 2015 3/23 G. Mahlknecht et al.

Page 4: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Instant Temporal Aggregation (ITA)

Patient treatment periods with daily cost

P C Tr1 Bob 600 [1,4]r2 Mary 400 [1,2]r3 Eve 40 [3,3]r4 Eric 310 [3,4]r5 Joe 30 [6,6]r6 John 300 [6,9]r7 Alex 100 [9,9]

.

days1 2 3 4 5 6 7 8 9

Eve 40 Joe 30 Alex 100

Eric 310 John 300

Mary 400

Bob 600

ITA: at each timepoint SUM(C)Val T

s1 1000 [1,2]s2 950 [3,3]s3 910 [4,4]s4 330 [6,6]s5 300 [7,8]s6 400 [9,9]

.

days1 2 3 4 5 6 7 8 9

1000 950 910 300330 400

ADBIS 2015 3/23 G. Mahlknecht et al.

Page 5: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Parsimonious Temporal Aggregation (PTA)

Input: ITA resultOutput: merged tuples to size c with minimum error

Rules:

I Merge only adjacent tuples

I Merged values are weighted mean

I Error is Squared Sum Error

I Result is of size c

.

days1 2 3 4 5 6 7 8 9

1000

950

910 300

330 400

value = 983.33, error = 1667 value = 333.33, error = 6667

ADBIS 2015 4/23 G. Mahlknecht et al.

Page 6: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Parsimonious Temporal Aggregation (PTA)

Input: ITA resultOutput: merged tuples to size c with minimum error

Rules:

I Merge only adjacent tuples

I Merged values are weighted mean

I Error is Squared Sum Error

I Result is of size c

.

days1 2 3 4 5 6 7 8 9

1000

950

910 300

330 400

value = 332.5, error = 6675

ADBIS 2015 4/23 G. Mahlknecht et al.

Page 7: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

PTA Optimal Solution

Result of ITA for SUM(C) (size n = 6)

.

days1 2 3 4 5 6 7 8 9

1000

950

910 300

330 400

error = 800 error = 600

Result PTA (c = 4) total error 1,400 (optimal solution)

.

days1 2 3 4 5 6 7 8 9

1, 000

930 310

400

ADBIS 2015 5/23 G. Mahlknecht et al.

Page 8: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Split Points / Split PathI PTA computes a split path (sequence of split points)I Tuples between split points are merged

.

days1 2 3 4 5 6 7 8 9

1000

950

910 300

330 400

error = 800 error = 600

split 5split 3split 1

Split Path: [1, 3, 5]

.

days1 2 3 4 5 6 7 8 9

1, 000

930 310

400

ADBIS 2015 6/23 G. Mahlknecht et al.

Page 9: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

PTA Existing Algorithm

Dynamic Programming Algorithm

Error Matrix Ei=1 2 3 4 5 6

k=1 0 1667 5700 ∞ ∞ ∞2 - 0 800 5700 6300 12375

3 - - 0 800 1400 6300

4 - - - 0 600 1400

Ei,k minimum error in reducing the firsti tuples to size k

Split Point Matrix Ji=1 2 3 4 5 6

k=1 0 0 0 0 0 0

2 0 1 1 3 3 3

3 0 0 2 3 3 5

4 0 0 0 3 3 5

Ji,k optimum split pointwhen reducing the first ituples to size k

only the last two rows are used whole matrix

ADBIS 2015 7/23 G. Mahlknecht et al.

Page 10: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Problem and Contribution

Problem

I Runtime and space requirements of existing algorithm notscalable

Contribution

I Diagonal Pruning: Reduces the computational complexity byavoiding unnecessary computations

I Split Point Graph: Reduces the space complexity

I Result remains optimal

ADBIS 2015 8/23 G. Mahlknecht et al.

Page 11: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Introduction

Diagonal Pruning

Split Point Graph

Experimental Evaluation

ADBIS 2015 9/23 G. Mahlknecht et al.

Page 12: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Diagonal Pruning

Lemma (Diagonal Pruning)

For the computation of the error matrix E and split point matrix Jthere exists an upper bound for variable i.

i=1 2 3 4 5 6

k=1 0 0 0 0 0 02 - 1 1 3 3 33 - - 2 3 3 54 - - - 3 3 5

I Red cells can be avoided, reduces runtime

I allows to eliminate parts of the matrices, reduces memory

ADBIS 2015 10/23 G. Mahlknecht et al.

Page 13: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Introduction

Diagonal Pruning

Split Point Graph

Experimental Evaluation

ADBIS 2015 11/23 G. Mahlknecht et al.

Page 14: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Split Point Graph

Challenge

Substitution of Split Point Matrix J by alternative structure toreduce memory consumption

Ideaunnecessary nodes are not stored

Split Point Graph

I Only necessary nodes are inserted

I Nodes are removed when they become obsolete (PathPruning)

ADBIS 2015 12/23 G. Mahlknecht et al.

Page 15: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 16: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 17: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 18: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 19: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 20: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 21: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 22: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Graph Evolution

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

1 2 3 4 5 6

2 diagonal pruned node

1 active path pruned nodes

1 path pruned nodes

Total number of nodes: 24

Not computed nodes: 12

Path pruned nodes: 4

ADBIS 2015 13/23 G. Mahlknecht et al.

Page 23: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Introduction

Diagonal Pruning

Split Point Graph

Experimental Evaluation

ADBIS 2015 14/23 G. Mahlknecht et al.

Page 24: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Experimental Configuration

Synthetic Datasets

I SYNTH: random distributed values

I ETDS: evolution of employees in a company

Algorithm Comparisons

I PTA: original Algorithm

I DP: PTA with diagonal pruning

I SGP: Split point graph with diagonal and path pruning

ADBIS 2015 15/23 G. Mahlknecht et al.

Page 25: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Runtime: PTA vs Diagonal Pruning

0 1 2 3 4 50

5

10

Reduction size [k]

Ru

nti

me

[sec

]

ETDS

PTA

DP

0 1 2 3 4 50

50

100

150

Reduction size [k]

Ru

nti

me

[sec

]

SYNTH

PTA

DP

Diagonal pruning substantially reduces runtime

ADBIS 2015 16/23 G. Mahlknecht et al.

Page 26: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Runtime: Split Point Graph vs PTA with Diagonal Pruning

0 1 2 3 4 50

2

4

6

Reduction size [k]

Ru

nti

me

[sec

]

ETDS

SPG

DP

0 1 2 3 4 50

50

100

150

Reduction size [k]

Ru

nti

me

[sec

]

SYNTH

SPG

DP

The overhead of the dynamic graph structure and pathpruning is very small

ADBIS 2015 17/23 G. Mahlknecht et al.

Page 27: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Space Efficiency: PTA vs SPG (compression to 10%)

0 1 2 3 4 5 6 7 8 9100

20

40

60

80

Input cardinality n [k]

Mem

ory

[MB

]

ETDS

PTA, c=10%

SPG, c=10%

0 1 2 3 4 5 6 7 8 9100

20

40

60

80

Input cardinality n [k]

Mem

ory

[MB

]

SYNTH

PTA, c=10%

SPG, c=10%

Graph Implementation with Diagonal Pruning and PathPruning substantially reduces space consumption

ADBIS 2015 18/23 G. Mahlknecht et al.

Page 28: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Space Efficiency: PTA vs SPG (compression to 1%)

0 1 2 3 4 5 6 7 8 9100

2

4

6

8

Input cardinality n [k]

Mem

ory

[MB

]

ETDS

PTA, c=1%

SPG, c=1%

0 1 2 3 4 5 6 7 8 9100

2

4

6

8

Input cardinality n [k]

Mem

ory

[MB

]

SYNTH

PTA, c=1%

SPG, c=1%

Graph Implementation with Diagonal Pruning and PathPruning substantially reduces space consumption

ADBIS 2015 19/23 G. Mahlknecht et al.

Page 29: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Space Efficiency: Effect of Path Pruning

0 1 2 3 4 50

50

100

150

200

Reduction size c [103]

Mem

ory

[MB

]ETDS

SPG without Path Pruning

PTA

SPG

0 1 2 3 4 50

50

100

150

200

Reduction size c [103]

Mem

ory

[MB

]

SYNTH

SPG without Path Pruning

PTA

SPG

Path Pruning has a huge pruning effect. It prunes about 2/3of the graph

ADBIS 2015 20/23 G. Mahlknecht et al.

Page 30: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Related Work

I Tuma, P.: Implementing Historical Aggregates in TempIS.Ph.D. thesis, Wayne State University, Detroit, Michigan(1992)

I Kline, N., Snodgrass, R.T.: Computing temporal aggregates.In: ICDE. pp. 222-231 (1995)

I Moon, B., Vega Lopez, I.F., Immanuel, V.: Efficientalgorithms for large-scale temporal aggregation. IEEE Trans.Knowl. Data Eng. 15(3), 744-759 (2003)

I Tao, Y., Papadias, D., Faloutsos, C.: Approximate temporalaggregation. In: ICDE. pp. 190-201 (2004)

I Gordevicius, J., Gamper, J., Bohlen, M.H.: Parsimonioustemporal aggregation. VLDB J. 21(3), 309-332 (2012)

ADBIS 2015 21/23 G. Mahlknecht et al.

Page 31: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Conclusion

I Diagonal Pruning reduces the runtime of the computationby reducing the search space of the DP scheme adopted byPTA

I Split Point Graph in combination with Path Pruning reducesmemory consumption

I Experiments showed that the two optimizations reducememory requirements to one third of the original PTAimplementation

ADBIS 2015 22/23 G. Mahlknecht et al.

Page 32: Efficient Computation of Parsimonious Temporal Aggregationdignoes/res/adbis2015-slides.pdf · Instant Temporal Aggregation (ITA) Patient treatment periods with daily cost P C T r

Future Work

I Generalization of split point graph for some classes of DPproblems

I Computation of PTA queries while the ITA result is computedavoiding two step computation

ADBIS 2015 23/23 G. Mahlknecht et al.