Multi-resolution Resource Behavior Queries Using Wavelets
Jason Skicewicz
Peter A. Dinda
Jennifer M. Schopf
Northwestern University
2
The Tension
Sensor
Video App
Network
Coarse-grain measurement
Resource-appropriate measurement
Fine-grain measurement
Grid App
…
Resource Signal (periodic sampling)
Example: host load
3
Video Scheduling
Sensor
Network
Video App
Fine-grain measurements needed
4
Grid Scheduling
Network
Grid App
Sensor
Coarse-grain measurements sufficient
5
Interval Averages
Sensor
Network
Application
Ideal Result Adequate Result
Average over interval
Average over interval
6
Contributions / Outline
• Application-sensor tension
• Query model to address tension
• Wavelets as basis for query model
• Promising early results
• Delay conundrum
7
Schematic Representation of Query Model
Network
Application
Sensor
Measurements at $f_s$ samples/second
Desired rate at $f_q$ samples/second
Lower bandwidth used
The desired-rate signal is an estimate: error $= x - \hat{x}$
8
Query
Stream + Error
Application
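$\mathrm{StreamQuery}(f_q) \rightarrow (\hat{x}_i, \hat{\sigma}_e^2)$
$\Delta = 1/f_s$, $\Delta_q = 1/f_q$, $f_q \le f_s$
Figure: the sensor signal $x$ versus $t$ with sample spacing $\Delta$, and the returned stream $\hat{x}$ versus $t$ with sample spacing $\Delta_q$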
Sensor
9
Query
Average + CI
Application
$\mathrm{IntervalQuery}(N, c, f_q) \rightarrow (\mathrm{avg}_N, \mathrm{avg}_N^{l}, \mathrm{avg}_N^{h})$
Sensor
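Figure: sensor signal $x$ versus $t$; estimate $\hat{x}$ versus $t$
Application gets average over this interval
Interval endpoints: $t_{now} = i_{now}$ and $(i_{now} - N + 1)$
Application wants average over this interval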
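The interval query above (an average plus a confidence interval over the most recent N samples) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' method: the name `interval_query`, the normal-approximation confidence interval, and the sample values are all hypothetical, and the computation runs on raw samples rather than on wavelet coefficients.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def interval_query(samples, n, c):
    """Average of the most recent n samples with a confidence interval.

    Hypothetical helper: returns (avg, low, high), where [low, high] is a
    normal-approximation interval at confidence c (0 < c < 1). It treats
    the samples as independent, which real host load usually is not.
    """
    if not 2 <= n <= len(samples):
        raise ValueError("need 2 <= n <= len(samples)")
    window = samples[-n:]
    avg = fmean(window)
    z = NormalDist().inv_cdf(0.5 + c / 2.0)   # two-sided critical value
    half_width = z * stdev(window) / sqrt(n)
    return avg, avg - half_width, avg + half_width


load = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2, 0.7, 1.0]   # made-up host load samples
avg, low, high = interval_query(load, n=4, c=0.95)
print(f"avg={avg:.3f}  CI=[{low:.3f}, {high:.3f}]")
```

Because host load is strongly correlated in time, an interval computed this way is likely to be too narrow; it is shown only to make the shape of the query result concrete.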
10
Contributions / Outline
• Application-sensor tension
• Query model to address tension
• Wavelets as basis for query model
• Promising early results
• Delay conundrum
11
Wavelets As Basis for Query Model
• Natural time/frequency decomposition
– Provides a multi-resolution view of a resource
• Well-known mathematical tool
– Invented in the ’80s, hot in the ’90s and today
– Linear complexity
– Non-stationarity, other “normal” behaviors acceptable
– Burrus, Gopinath, and Guo, Introduction to Wavelets and Wavelet Transforms: A Primer
• Analytic enabler
– Prediction on different resolutions
– Compression of measurement streams
– …
Queries over wavelet domain representation of signal
12
Multi-resolution Views
13
High Level View of a 4-level Wavelet Decomposition
• Resource signal is decomposed into levels
• Samples at each level are at a different rate
• Each level captures different frequency content
• Corresponding inverse transform
Figure: Sensor → Wavelet Transform → wavelet coefficients at Levels 0, 1, 2, and 3
14
4-level Wavelet Decomposition: Time-frequency Localization
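Figure: time-frequency tiling. The input $x[n]$, sampled every $\Delta$ seconds ($f_s = 1/\Delta$), spans $[0, f_s/2]$. Levels 0 to 3 split this range into the bands $[0, f_s/16]$, $[f_s/16, f_s/8]$, $[f_s/8, f_s/4]$, and $[f_s/4, f_s/2]$.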
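The band edges in the tiling above follow a simple halving rule, which the sketch below reproduces for any number of levels. The function name `level_bands` is hypothetical, and the assignment of bands to level numbers is an assumption: level 0 is taken to be the final low-pass output, which agrees with the one-stage transform slide where Level 0 comes from the LPF.

```python
from fractions import Fraction


def level_bands(num_levels):
    """Map each level to its (low, high) band edges as fractions of fs.

    Assumes level 0 is the final low-pass output and higher levels hold
    progressively higher frequencies.
    """
    if num_levels < 2:
        raise ValueError("need at least two levels")
    bands = {0: (Fraction(0), Fraction(1, 2 ** num_levels))}
    for level in range(1, num_levels):
        high = Fraction(1, 2 ** (num_levels - level))
        bands[level] = (high / 2, high)   # each band is one octave wide
    return bands


for level, (low, high) in level_bands(4).items():
    print(f"Level {level}: [{low} fs, {high} fs]")
```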
15
Example Decomposition of Host Load
Lossless representation of resource signal
16
Computing Wavelet Coefficients
• Streaming operation
– Number of levels, $M$, chosen arbitrarily
– Amortized work per sample: $O(1)$
– $O(n)$ for $n$ samples
• Block by block operation
– Block of samples, $n = 2^k$
– Levels, $M = \lg(n) + 1$
– Circular convolution over block, $O(n)$
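The streaming bullets above (a freely chosen number of levels and amortized O(1) work per sample) can be illustrated with the Haar wavelet, the shortest wavelet filter. This is a sketch under assumptions, not the system described in the slides: the class name `StreamingHaar` and its `work` counter are hypothetical, and Haar is swapped in for the longer Daubechies filters to keep the code short.

```python
from math import sqrt


class StreamingHaar:
    """Haar wavelet transform that consumes one sample at a time.

    Illustrative only. Level 0 is the coarse approximation and level
    `stages` holds the finest detail coefficients.
    """

    def __init__(self, stages):
        self.stages = stages
        self.pending = [None] * stages   # one waiting value per stage
        self.work = 0                    # stage visits, to check the cost

    def push(self, sample):
        """Feed one sample; return the (level, coefficient) pairs it completes."""
        emitted = []
        value = float(sample)
        for stage in range(self.stages):
            self.work += 1
            if self.pending[stage] is None:
                self.pending[stage] = value      # wait for the pair's partner
                return emitted
            first, self.pending[stage] = self.pending[stage], None
            emitted.append((self.stages - stage, (first - value) / sqrt(2)))
            value = (first + value) / sqrt(2)    # passed on to the next stage
        emitted.append((0, value))
        return emitted


transform = StreamingHaar(stages=3)
coefficients = []
for sample in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]:
    coefficients.extend(transform.push(sample))
print(len(coefficients), "coefficients,", transform.work, "stage visits")
```

Stage 0 is visited on every sample, stage 1 on every second sample, and so on, so the total number of stage visits stays below 2n for n samples, which is where the amortized constant cost comes from.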
17
Proposed System
Figure: Sensor → Wavelet Transform (Level 0, …, Level M-1, Level M) → Network (Level 0, …, Level L) → Inverse Wavelet Transform → Application
Application receives levels based on its needs
Stream Interval
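A minimal sketch of the idea that the application receives only the levels it needs: transform a block, keep levels 0 through a chosen top level, and invert. The helper names (`haar_forward`, `haar_inverse`, `reconstruct_from_levels`) and the sample values are hypothetical, and the Haar wavelet is used as a stand-in for the filters in the slides.

```python
from math import sqrt


def haar_forward(block):
    """Return levels[0..M-1]: coarse approximation first, finest detail last."""
    if not block or len(block) & (len(block) - 1):
        raise ValueError("block length must be a power of two")
    approx, details = [float(v) for v in block], []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / sqrt(2) for a, b in pairs])
        approx = [(a + b) / sqrt(2) for a, b in pairs]
    return [approx] + details[::-1]


def haar_inverse(levels):
    """Invert haar_forward."""
    signal = list(levels[0])
    for detail in levels[1:]:
        rebuilt = []
        for a, d in zip(signal, detail):
            rebuilt += [(a + d) / sqrt(2), (a - d) / sqrt(2)]
        signal = rebuilt
    return signal


def reconstruct_from_levels(levels, top_level):
    """Rebuild the signal from levels 0..top_level, zeroing the rest."""
    kept = [lvl if i <= top_level else [0.0] * len(lvl)
            for i, lvl in enumerate(levels)]
    return haar_inverse(kept)


load = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # made-up samples
levels = haar_forward(load)
exact = reconstruct_from_levels(levels, top_level=3)    # all four levels
coarse = reconstruct_from_levels(levels, top_level=1)   # two coefficients only
print(exact)
print(coarse)
```

With all levels the block is recovered exactly; with levels 0 and 1 only, two of the eight coefficients cross the network and the application sees the average of each half of the block.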
18
Multi-resolution Views Using 14 Levels
19
Wavelet Compression Gains, 14 Levels
Typical appropriate number of levels for host load, error < 20%
20
Contributions / Outline
• Application-sensor tension
• Query model to address tension
• Wavelets as basis for query model
• Promising early results
• Delay conundrum
21
Offline Analysis System
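Figure: offline analysis pipeline
• Host load traces are cut into 8192-second segments (Seg. 1, Seg. 2, …, Seg. P)
• A 14-level wavelet transform of each segment $x_{n,k}$ yields coefficients $c_{n,k}$
• A subset of $L$ levels is chosen, and reconstruction from those levels gives $\hat{x}_{n,k}$
• Streaming error: $e_{n,k} = x_{n,k} - \hat{x}_{n,k}$
• Interval error: $e_N = E_N[x] - E_N[\hat{x}]$, where $E_N$ is the average over an interval of length $N = 2, 8, 32, \ldots, 8192$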
22
Load Traces
• DEC Unix 5-second exponential average
– 1 Hz sample rate
– Traces collected in August 1997
– AXP0-PSC: interactive machine with high load
– AXP7-PSC: batch machine
– Sahara-CMU: large-memory compute server
– Themis-CMU: desktop workstation
• Windows 2000 percentage of CPU
– 1 Hz sample rate
– Trace collected in May 2001
– Tlab-03-NU: desktop, teaching lab machine
23
Testcases
• Stream queries
– One million samples per trace
• Interval queries
– 2, 8, 32, 128, 512, 2048, 8192 second intervals
– 1000 randomized queries per interval length per trace
24
Performance Evaluation
• Streaming query metrics
– Error variance
– Error histograms
– Error mean
– Energy in error auto-covariance
• Interval query metrics
– Error variance
– Error histograms
– Error mean
Error mean ~ 0 for all evaluations
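The metrics listed above can be computed directly from a signal and its reconstruction. The sketch below is an assumption-laden illustration: the function names are hypothetical, "relative" error variance is taken here to mean error variance divided by signal variance, and the slides do not say how the auto-covariance values are combined into a single energy figure, so only the per-lag values are returned.

```python
from statistics import fmean, pvariance


def autocovariance(values, lag):
    """Biased sample autocovariance of `values` at a non-negative lag."""
    mu = fmean(values)
    n = len(values)
    return sum((values[i] - mu) * (values[i + lag] - mu)
               for i in range(n - lag)) / n


def error_metrics(x, x_hat, max_lag=3):
    """Summary statistics of the error e = x - x_hat (hypothetical helper)."""
    errors = [a - b for a, b in zip(x, x_hat)]
    return {
        "mean": fmean(errors),
        "variance": pvariance(errors),
        # Assumed definition: error variance relative to signal variance.
        "relative_variance": pvariance(errors) / pvariance(x),
        "autocovariance": [autocovariance(errors, k)
                           for k in range(1, max_lag + 1)],
    }


x = [1.0, 2.0, 3.0, 4.0]          # made-up signal
x_hat = [1.5, 1.5, 3.5, 3.5]      # pairwise averages, a coarser view of x
metrics = error_metrics(x, x_hat, max_lag=2)
print(metrics)
```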
25
Streaming Queries, Relative Error Variance
Fewer than 1% of coefficients, error < 20%
26
Streaming Queries, Error Histogram at Level 6
Errors follow a near-Gaussian distribution
27
Interval Queries, Error Variance
Error variance approaches zero as interval increases
28
Interval Queries, Error Histograms at Level 5
Distributions not always Gaussian
29
Contributions / Outline
• Application-sensor tension
• Query model to address tension
• Wavelets as basis for query model
• Promising early results
• Delay conundrum
30
Block By Block System Delay
Figure: $x[n]$ → Block ($n$ samples) → Wavelet Transform → $M$ levels → Inverse Wavelet Transform → Block ($n$ samples) → $\hat{x}_r[n]$; the timeline shows sample acquisitions, then the wavelet transform, then the inverse transform
Samples delayed by block size
31
Streaming System Delay, Example with Length 4 Wavelets (D4), 4 Levels
High levels delayed waiting for low frequency computations, output delayed by high order filter
Figure: $x[n]$ feeds four analysis branches with equivalent filter lengths 22, 22, 10, and 4 for Levels 0, 1, 2, and 3; delays K1 and K2 align the branches; four synthesis filters of the same lengths produce the delayed output $x_r[n-d]$
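The filter lengths 22, 22, 10, and 4 in the figure follow from collapsing each branch of the filter tree into one equivalent filter. Cascading j stages of a length-N filter, each followed by downsampling by 2, is equivalent to a single filter of length (2^j − 1)(N − 1) + 1 followed by downsampling by 2^j. The short sketch below evaluates that formula; the function name is hypothetical and level 0 is assumed to be the coarsest output.

```python
def equivalent_filter_lengths(taps, stages):
    """Length of the equivalent single filter feeding each level.

    Returns a list indexed by level, assuming level 0 is the coarsest
    output and level `stages` the finest.
    """
    per_stage = [(2 ** j - 1) * (taps - 1) + 1 for j in range(1, stages + 1)]
    # The final low-pass branch is as deep as the deepest detail branch.
    finest_first = per_stage + [per_stage[-1]]
    return finest_first[::-1]


lengths = equivalent_filter_lengths(taps=4, stages=3)
print(lengths)   # [22, 22, 10, 4]
```

The longer filters on the coarse levels are why the fine levels have to be held back: the output cannot be assembled until the slowest branch has produced its coefficient.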
32
Delay Conclusions
• System implementation
– Delay must be taken into account
– Prediction may help reduce streaming delay
• Application scheduling
– Fine-grain apps more sensitive to delay
– Coarse-grain apps less sensitive to delay
• Suggestions?
We are working on a solution!
33
Related Work
• Database queries over wavelet coefficients
– Shahabi, et al. [SSDBM 2000]
– Chakrabarti, et al. [VLDB 2000]
– Vitter, et al. [CIKM ’98, SIGMOD ’99]
• Network traffic analysis and modeling
– Ribeiro, et al. [IEEE INFOCOM 2000]
– Riedi, et al. [IEEE DSPCS ’99]
– Feldmann, et al. [SIGCOMM ’98]
• Wavelet theory
– Daubechies [Ten Lectures on Wavelets, SIAM, ’92]
– Mallat [IEEE Trans. on Pattern Analysis and Machine Intelligence, ’89]
34
Conclusions
• Application-sensor tension
• Query model to address tension
• Wavelets as basis for query model
• Promising early results
• Delay conundrum
35
Future Work
• Wavelets are an enabler of other techniques
– Prediction over wavelet coefficients
  • Possibility of better results
  • Can reduce system delay
– Further compression through processing
– Adaptive decompositions based on resource
• Looking at other resource streams
• RPS implementation
36
Contact Information
• Webpage
– http://www.cs.northwestern.edu/~jskitz
• Email address
– [email protected]
• Load traces and tools
– http://www.cs.northwestern.edu/~pdinda/LoadTraces
• Matlab scripts
– Available by request ([email protected])
37
Frequency Information Vs. Rate
Figure: spectrum of the input signal $x[n]$ over $[0, f_s/2]$, and its decomposition into Levels 0 to 3 with band edges at $f_s/16$, $f_s/8$, $f_s/4$, and $f_s/2$
• Frequency information retained = $f_s/2$
• Measurement rate, $f_s$
Q: Why is this true?
A: The Nyquist criterion (sampling theory)
38
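Wavelet Transform, 1 Stage
Figure: $x[n]$ feeds a low-pass filter (LPF) and a high-pass filter (HPF), and each output is downsampled by 2. The LPF branch $y_l[n]$ gives Level 0 and the HPF branch $y_h[n]$ gives Level 1.
LPF, HPF: FIR filters
$y[n] = \sum_{k=0}^{N} h[k]\, x[n-k]$
Downsampler (↓2): $c[k] = y[2k]$, for all $k$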
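One analysis stage, as described above, is an FIR filter followed by a downsampler on each of the two branches. The sketch below spells that out in plain Python. The function names are hypothetical, samples before the start of the signal are assumed to be zero, and the two-tap Haar filters stand in for the longer filters used in the slides.

```python
from math import sqrt


def fir_filter(h, x):
    """Causal FIR filter: y[n] = sum_k h[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def downsample_by_2(y):
    """Keep every second sample: c[k] = y[2k]."""
    return y[0::2]


def analysis_stage(x, lpf, hpf):
    """One wavelet stage: returns (level 0, level 1) coefficients."""
    return (downsample_by_2(fir_filter(lpf, x)),
            downsample_by_2(fir_filter(hpf, x)))


# Haar filters, used as a stand-in for the longer filters in the slides.
LPF = [1 / sqrt(2), 1 / sqrt(2)]
HPF = [1 / sqrt(2), -1 / sqrt(2)]

x = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # a constant (purely low-frequency) signal
level0, level1 = analysis_stage(x, LPF, HPF)
print("Level 0:", level0)
print("Level 1:", level1)
```

For a constant input, every Level 1 coefficient after the first is zero; the first one is nonzero only because of the zero-padding assumption at the start of the signal.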
39
Increasing Stages, Mallat’s Tree Algorithm
Figure: $x[n]$ passes through a tree of HPF/LPF stages that produces Level 0, Level 1, …, Level M-1, Level M
Stages can be arbitrarily increased
40
Frequency Response
• Filters must be even order for perfect reconstruction (PR)
• Other special properties are needed to retain PR
• The filters shown are order N = 8 (D8 wavelet)
Figure: frequency responses of the HPF and LPF
41
Reconstruction From the Wavelet Coefficients, 1 Stage
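Upsampler (↑2): $y[n] = \begin{cases} c[n/2], & n \text{ a multiple of } 2 \\ 0, & \text{else} \end{cases}$
LPF, HPF: time-reversed filters, same response
Figure: Level 0 and Level 1 are each upsampled by 2, filtered by the LPF and HPF respectively, and summed to give $\hat{x}_r[n]$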
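A one-stage round trip shows both perfect reconstruction and where the delay comes from. The sketch below analyzes a signal with causal Haar filters, then upsamples, filters with the time-reversed filters, and sums. All names and sample values are hypothetical, and Haar is a stand-in for the filters in the slides; with it, the output is the input delayed by one sample.

```python
from math import sqrt


def fir_filter(h, x):
    """Causal FIR filter: y[n] = sum_k h[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def upsample_by_2(c):
    """y[n] = c[n/2] when n is a multiple of 2, else 0."""
    y = []
    for value in c:
        y += [value, 0.0]
    return y


LPF = [1 / sqrt(2), 1 / sqrt(2)]     # Haar low-pass
HPF = [1 / sqrt(2), -1 / sqrt(2)]    # Haar high-pass


def analyze(x):
    """Filter and downsample by 2: returns (level 0, level 1)."""
    return fir_filter(LPF, x)[0::2], fir_filter(HPF, x)[0::2]


def synthesize(level0, level1):
    """Upsample, filter with the time-reversed filters, and add."""
    low = fir_filter(LPF[::-1], upsample_by_2(level0))
    high = fir_filter(HPF[::-1], upsample_by_2(level1))
    return [a + b for a, b in zip(low, high)]


x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # made-up samples
x_r = synthesize(*analyze(x))
print(x_r)   # x delayed by one sample, with a leading zero
```

Longer filters and more stages increase this delay, which is the effect discussed on the streaming delay slide.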
42
Reconstruction From Multiple Stages, The Inverse Wavelet Transform
Figure: Level 0, Level 1, …, Level M-1, Level M are recombined through successive HPF/LPF synthesis stages, each followed by a sum, to give $\hat{x}_r[n]$
Reconstructed signal is exactly the resource
43
Q: How is the number of levels determined?
Answers:
• Determined by accuracy constraints
• Determined by what levels are available
• Determined by the rate ($f_q$) at which measurements are requested:
$\frac{f_s}{2^{M-L+1}} \le f_q \le \frac{f_s}{2^{M-L}}$
44
Example, Choosing Levels
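Figure: frequency axis with Levels 0 to 3 and band edges at $f_s/16$, $f_s/8$, $f_s/4$, and $f_s/2$
$M = 4$ levels, $f_q = f_s/6$
$\frac{f_s}{2^{4-L+1}} \le \frac{f_s}{6} \le \frac{f_s}{2^{4-L}}$
Solution: $L = 2$:
$\frac{f_s}{2^{3}} \le \frac{f_s}{6} \le \frac{f_s}{2^{2}}$
Equation satisfied!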
Levels 0, 1 and 2 coefficients returned
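The level choice in this example can be written as a small search. The sketch reads the slide's condition as f_s / 2^(M−L+1) ≤ f_q ≤ f_s / 2^(M−L), which is an interpretation that matches the worked numbers here (f_s/8 ≤ f_s/6 ≤ f_s/4 for M = 4, L = 2). The function name and the clamping for rates outside every band are assumptions.

```python
from fractions import Fraction


def choose_top_level(fs, fq, num_levels):
    """Return L such that fs / 2**(M - L + 1) <= fq <= fs / 2**(M - L).

    Hypothetical helper. When fq sits exactly on a band edge, the lower
    level is returned.
    """
    fs, fq = Fraction(fs), Fraction(fq)
    for level in range(num_levels):
        low = fs / 2 ** (num_levels - level + 1)
        high = fs / 2 ** (num_levels - level)
        if low <= fq <= high:
            return level
    # Outside every band: clamp (an assumption, the slides do not cover it).
    return 0 if fq < fs / 2 ** (num_levels + 1) else num_levels - 1


top = choose_top_level(fs=1, fq=Fraction(1, 6), num_levels=4)
print("Return levels 0 through", top)   # Return levels 0 through 2
```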
45
Streaming Query Tradeoffs
• Measurement rate, $f_q$ high
– Lower error variance
– Higher communication costs
• Measurement rate, $f_q$ low
– Higher error variance
– Very low communication costs
Wavelet approach yields accuracy at low rates
46
Interval Query Tradeoffs
• Interval length N long
– Less dynamic rate
– Tighter confidence intervals
• Interval length N short
– More dynamic rate
– Wider confidence intervals
• Rate, $f_q$ high
– Shorter interval length
– Tighter confidence intervals
• Rate, $f_q$ low
– Longer interval length
– Wider confidence intervals
Confidence interval (c) provides flexibility
47
Streaming Queries, Energy in Auto-covariance
Figure: energy in error auto-covariance (0 to 0.18) plotted against Level (1 to 13) and against Compression and Peak Frequency (0.001 to 1)
Error becomes uncorrelated as levels added
48
Interval Queries, Error Mean (32 seconds)
Figure: error mean (-0.002 to 0.003) plotted against Level (1 to 13) and against Compression and Peak Frequency (0.001 to 1)
Error mean is zero at 8 levels, 3% of coefficients
49
Interval Queries, Error Mean (512 seconds ~ 8½ minutes)
Figure: error mean (-0.0005 to 0.0045) plotted against Level (1 to 13) and against Compression and Peak Frequency (0.001 to 1)
As interval increases, need fewer levels