Statistical Data Reduction for Efficient Application Performance Monitoring
Lingyun Yang, Jennifer M. Schopf, Catalin L. Dumitrescu, Ian Foster
University of Chicago
Argonne National Laboratory
Introduction
In distributed and shared systems:
– The performance of resources changes dynamically
– Variability in resource performance can have a major influence on application performance
To deliver dependable and sustained performance to applications:
– Performance monitoring and anomaly diagnosis are necessary
What is the problem?
A system can be characterized by a set of system metrics:
– M = (m1, m2, …, mn)
– Example: (CPU load, bandwidth, free memory size, number of open files, …)
Application performance can be described quantitatively by a performance metric Y:
– Example: number of computations finished per unit time
Goal: monitor the performance of system components (the values of M) so that, if an anomaly appears in application performance (the value of Y), we can diagnose its cause.
Solution Challenges
– Computer systems and applications continue to increase in complexity and size
– Interactions among components are poorly understood
– Instrumentation produces tremendous volumes of data
> Results in complexity for data analysis and anomaly diagnosis
This requires a data reduction strategy that:
– Reduces the number of system metrics that a monitoring system must manage (necessary)
– Retains the interesting characteristics of the performance data (sufficient)
Outline
Problems
> Data Reduction Strategy
– Two observations
– Redundant system metrics reduction
– Statistical variable selection
Experiments
Conclusion
Two Observations
Some system metrics may capture the same or similar information:
– They are correlated with each other
– Only one is necessary; the others are redundant
Not all system metrics are related to a particular application's performance:
– Some system metrics are unrelated to the performance of the application, and are therefore unnecessary
A two-step data reduction strategy

Redundant system metrics reduction
Clustering-based method:
– Use the correlation coefficient (r) to measure the degree of correlation between two system metrics
– Group metrics with high correlation coefficients into clusters
– Eliminate all but one of the metrics in each cluster
Two questions:
– What threshold value t to use (determined experimentally)
– How to compare r against t
How to compare
Traditional method: direct mathematical comparison
– Is r > t?
Problems:
– Only a limited number of sample data points are available
– r may change across data collected during different runs
– We may group uncorrelated metrics together purely by chance
[Figure: sample correlation coefficient between the number of transfers issued per second and the number of memory pages cached per second, for 20 runs of the Cactus application]
Z-test
Reduces false errors given a limited number of samples, and avoids grouping uncorrelated metrics into one cluster.
Z-test:
– A statistical method
– Determines whether an observed correlation coefficient is statistically significantly larger than the threshold value (95% confidence in this work)
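A common way to carry out such a test is the Fisher z-transformation, under which the transformed correlation coefficient is approximately normal with variance 1/(n − 3). The sketch below is illustrative rather than the authors' exact procedure; the function name, the clamping detail, and the hardcoded one-sided critical value are my assumptions:

```python
import math

def z_test_corr_greater(r, t, n):
    """One-sided Fisher z-test: is |r| significantly larger than t?

    r: sample correlation coefficient between two metrics
    t: threshold correlation (e.g. 0.95)
    n: number of samples (must be > 3)
    Returns True when the correlation exceeds t at 95% confidence.
    """
    # The Fisher transformation makes the sampling distribution of r
    # approximately normal with variance 1 / (n - 3).
    z_r = math.atanh(min(abs(r), 0.999999))  # clamp to avoid atanh(1)
    z_t = math.atanh(t)
    z_stat = (z_r - z_t) * math.sqrt(n - 3)
    # One-sided critical value for alpha = 0.05 is about 1.645.
    return z_stat > 1.645
```

With only 20 samples, even an observed r = 0.96 is not significantly above a threshold of 0.95, which illustrates why a direct r > t comparison over-groups metrics.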
Redundant metrics reduction algorithm
Given a set of samples, we proceed as follows:
– Perform the Z-test on the correlation coefficient between every pair of system metrics.
– Group two metrics into one cluster only when the absolute value of their correlation coefficient is statistically significantly larger than the threshold value.
– The result of this computation is a set of system metric clusters.
– The system metrics in each cluster are strongly correlated, so one metric from each cluster is kept as the representative of the cluster while the others are deleted as redundant.
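The steps above can be sketched as follows. The transcript does not specify the exact clustering procedure, so this sketch merges significantly correlated pairs with a union-find structure and keeps the first metric of each cluster as its representative; those details, and the helper names, are assumptions:

```python
import math
from itertools import combinations

def reduce_redundant_metrics(names, corr, n_samples, t=0.95):
    """Cluster metrics whose pairwise |r| passes a one-sided Fisher
    z-test against threshold t, then keep one metric per cluster.

    names: list of metric names
    corr: dict mapping (i, j) index pairs to the sample correlation r
    n_samples: number of observations used to estimate each r
    Returns the representative metric names, sorted.
    """
    parent = list(range(len(names)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(names)), 2):
        r = abs(corr.get((i, j), 0.0))
        z_r = math.atanh(min(r, 0.999999))
        z_stat = (z_r - math.atanh(t)) * math.sqrt(n_samples - 3)
        if z_stat > 1.645:             # significantly correlated above t
            parent[find(i)] = find(j)  # merge into one cluster

    # Keep the first metric seen in each cluster as its representative.
    reps = {}
    for i, name in enumerate(names):
        reps.setdefault(find(i), name)
    return sorted(reps.values())
```

Union-find makes the grouping transitive: if a~b and b~c both pass the test, all three land in one cluster and two of them are dropped.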
Outline
Problems
Data Reduction Strategy
– Two observations
– Redundant system metrics reduction
> – Statistical variable selection
Experiments
Conclusion
Statistical Variable Selection
Some of these system metrics may not be related to our chosen performance metric.
We identify the subset of all system metrics that is necessary to capture the performance metric.
This form of data reduction is also known as variable selection.
We use the Backward Elimination (BE) stepwise regression method to select the system metrics.
BE stepwise regression method
The system metrics concerned: X = (x1, x2, …, xn)
The application performance metric: Y
Steps:
1. Fit the full model Y = β0 + β1x1 + β2x2 + … + βnxn
2. Which xi is the most useless in this model? Calculate the F value of each xi; the F value captures its contribution to the model.
3. Is the smallest F value below the predefined significance value? If yes, delete the corresponding xi and go to step 1.
4. All metrics left are useful for capturing the variation of Y.
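A minimal sketch of the BE loop, assuming ordinary least-squares fits and a partial-F statistic computed by comparing the residual sum of squares of the full model against each one-variable-smaller model. The cutoff f_crit = 4.0 is illustrative; the slides only say a predefined significance value is used:

```python
import numpy as np

def backward_elimination(X, y, f_crit=4.0):
    """Backward-elimination stepwise regression (a sketch).

    X: (n_samples, n_metrics) matrix of system-metric values
    y: (n_samples,) application performance metric
    f_crit: partial-F cutoff; a metric whose F value falls below it
            is judged useless and dropped (value is illustrative).
    Returns the column indices of the metrics retained.
    """
    def rss(cols):
        # Residual sum of squares of an OLS fit with an intercept.
        A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        return float(resid @ resid)

    kept = list(range(X.shape[1]))
    while len(kept) > 1:
        rss_full = rss(kept)
        dof = len(y) - len(kept) - 1   # residual degrees of freedom
        # Partial F: how much the RSS grows when one metric is removed.
        f_vals = [(rss([c for c in kept if c != j]) - rss_full)
                  / (rss_full / dof) for j in kept]
        worst = int(np.argmin(f_vals))
        if f_vals[worst] >= f_crit:
            break                      # every remaining metric is useful
        kept.pop(worst)                # drop the least significant metric
    return kept
```

Each iteration refits the model without the weakest metric, exactly as in steps 1–3 above, and stops when even the smallest F value clears the cutoff.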
Outline
Problems
Data Reduction Strategy
> Experiments
– Application and data collection
– Two criteria
– Experiment methodology
– Results
Conclusion
Application and Data Collection
Application: Cactus
Testbed: six Linux machines at UCSD
Data collected at 0.033 Hz (one sample every ~30 seconds) for 24 hours
Every data point includes 600+ system metric values and 1 application performance value
System metrics are collected on each machine using three utilities:
– (1) The sar command of the SYSSTAT tool set
– (2) Network Weather Service (NWS) sensors
– (3) The Unix command ping
Two criteria
Reduction degree (RD) -- necessary:
– Total percentage of system metrics eliminated
Coefficient of determination (R2) -- sufficient:
– A statistical measurement
– Indicates the fraction of the total variability in application performance that can be explained by the selected system metrics
– A larger R2 value means the selected system metrics better capture the variation in application performance
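For a linear model, R2 is computed as 1 − SS_res/SS_tot: the fraction of the total variability in y explained by the fit. A minimal sketch (the function name and the least-squares fit with an intercept are my assumptions):

```python
import numpy as np

def r_squared(X_sel, y):
    """Coefficient of determination R^2 for a linear model of y
    on the selected system metrics.

    X_sel: (n_samples, k) values of the selected metrics
    y: (n_samples,) observed application performance
    """
    A = np.column_stack([np.ones(len(y)), X_sel])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)                 # unexplained variability
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total variability
    return 1.0 - ss_res / ss_tot
```

R2 = 1 means the selected metrics explain the performance variation perfectly; R2 near 0 means they explain almost none of it.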
Experiment methodology
The 24-hour data set is partitioned into 12 equal-sized chunks. The first chunk is used as training data; the remaining 11 chunks are used as verification data.
Two-step experiment:
Data Reduction
– Use the training data to select system metrics
Verification: Are these system metrics sufficient? Is the result stable? How does this method compare with other strategies?
– RAND randomly picks a subset of system metrics equal in number to those selected by our strategy
– MAIN uses a subset of 75 system metrics that other work commonly uses to model application performance
Data reduction using training data
As the threshold increases, RD decreases, since fewer system metrics are grouped into clusters and removed as redundant.
As the threshold increases, R2 increases, since more information is available to model the application performance.
RD = 0.78 and R2 = 0.98 when the threshold value t = 0.95: a total of 141 of the original 628 system metrics were selected (RD = (628 − 141)/628 ≈ 0.78).
System metrics selected on one machine
Name        Measurement
wtps        Total number of write requests per second issued to the physical disk
activepg    Number of active (recently touched) pages in memory
proc/s      Total number of processes created per second
rxpck/s     Total number of packets received per second
txpck/s     Total number of packets transmitted per second
coll/s      Number of collisions that happened per second while transmitting packets
kbbuffers   Amount of memory used as buffers by the kernel, in kilobytes
ip-frag     Number of IP fragments currently in use
runq-sz     Run queue length (number of processes waiting for run time)
ldavg-5     System load average for the past 5 minutes
ldavg-15    System load average for the past 15 minutes
campg/s     Number of additional memory pages cached by the system per second
dentunusd   Number of unused cache entries in the directory cache
file-sz     Number of used file handles
rtsig-sz    Number of queued RT signals
cswch/s     Number of context switches per second
Latency     Amount of time required to transmit a TCP message to a target machine
bandwidth   Speed with which data can be sent to a target machine per second
AvailCPU    Fraction of CPU available to a newly-started process
FreeMem     Amount of unused space in memory
Verification
[Figure: R2 values of SDR, MAIN, and RAND across the verification data]
SDR exhibited an average R2 value of 0.907, which is 55.0% and 98.5% higher than those of RAND and MAIN, respectively.
The system metrics selected by SDR are significantly more efficient than the alternatives for capturing Cactus performance.
Verification Results Analysis
The system metrics selected by our strategy are:
– Sufficient to capture the variation in application performance (average R2 value of 0.907)
– Stable (high R2 values over a long period: 24 hours)
– Better than the other two strategies considered
Conclusion
Statistical data reduction strategy:
– Reduces redundant system metrics that convey the same information
> Clustering-based method + Z-test
– Reduces unnecessary system metrics that are unrelated to application performance
> BE stepwise regression method
Identifies system metrics that are:
– Necessary (high reduction degree value)
– Sufficient to capture application behavior (higher R2 value than the other strategies)
Contact
Lingyun Yang: [email protected]
Jennifer M. Schopf: [email protected]
Catalin L. Dumitrescu: [email protected]
Ian Foster: [email protected]