Using Statistical Tests in a Trust Model
10/04/23
Using Statistical Tests in a Trust Model
Fco. Javier Nieto ([email protected])
SERVICE COMPUTATION 2011
10/04/23 | Fco. Javier Nieto | Service Computation 2011
Outline
▶ Introduction
▶ State of the Art
▶ Motivation and Objectives
▶ Agreements Fulfilment
▶ Release Improvement
▶ Opinions and Feedback
▶ Conclusions and Future Work
Introduction
▶ Services are a key piece in business processes
  ▶ Functionalities are provided by third parties
  ▶ They are expected to fulfil certain QoS levels
  ▶ Will they perform as expected? Determine trust: belief in the reliability, truth and capability of the service
▶ How to do it?
  ▶ Users’ feedback (reputation)
  ▶ Monitoring tools
  ▶ Other platforms
  ▶ Exploit all this information together!
▶ Statistical hypothesis tests are a useful tool for exploiting all the gathered data!
State of the Art
▶ Current approaches only take into account:
  ▶ Feedback received
  ▶ Some QoS aspects (response time, availability, etc.)
  ▶ SLAs
▶ Applied approaches
  ▶ Averages
  ▶ Probability distributions
  ▶ Semantics
  ▶ Fuzzy logic
▶ Go beyond! Calculate more aspects and exploit all the data
  ▶ General characteristics of the service
  ▶ Service description analysis
  ▶ External data sources
▶ Apply statistical hypothesis tests to some new aspects
Motivation and Objectives
▶ Why use statistical tests?
  ▶ More aspects could be taken into account: trust models should be as accurate as possible
  ▶ Trust models may be attacked: robustness is limited to filtering inputs. What about the data as a whole?
  ▶ Statistics may be quite useful when used correctly
  ▶ Past and current experiences must be taken into account
▶ Main objectives
  ▶ Calculate aspects not available in other models, to increase the accuracy of the model
  ▶ Improve the robustness of the model: more data from different sources is exploited and analysed in depth, so the model is harder to attack
Agreements Fulfilment
▶ How is the service behaving with respect to its contractual commitments?
  ▶ Compare measured values and agreed values
  ▶ Determine whether QoS and other parameters are as expected
  ▶ Failure in agreement fulfilment is always negative
▶ Parameters taken into account
  ▶ Service stability: does the behaviour of the service vary too much?
  ▶ Agreements fulfilment: are the measured values (from monitoring) the expected ones?
  ▶ Count how many parameters are under the agreed boundaries
▶ Apply two statistical tests
  ▶ One test for the variance
  ▶ One test for the average of a set of values
Agreements Fulfilment – Stability Analysis
▶ Apply a chi-square test: determine whether the variance of the population exceeds a given one
  ▶ H0: σ² ≤ σ₀²
  ▶ H1: σ² > σ₀²
  ▶ For σ₀² we take 5% of the value in the SLA for a parameter
Expected Response Time = 4, so σ₀² = 0.2
RT: 4 4 3.5 4.5 5 4.5 3.5 4 4 4
p = 0.05; 9 degrees of freedom; critical X² = 16.92
Obtained X² = 19.57: H0 is rejected at the 5% significance level
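The stability check can be sketched in a few lines of Python. This is a minimal illustration that assumes the textbook statistic (n - 1)·s²/σ₀² with the unbiased sample variance; the function name and the hard-coded critical value are choices made for this example, not part of the original model. With that formula the listed samples give X² = 9.5; the transcript does not show how the slide's 19.57 was derived, so the sketch illustrates the procedure rather than reproducing that figure.

```python
from statistics import variance


def chi_square_variance_statistic(samples, sigma0_sq):
    """Return ((n - 1) * s^2 / sigma0^2, degrees of freedom)."""
    n = len(samples)
    s_sq = variance(samples)  # unbiased sample variance (divides by n - 1)
    return (n - 1) * s_sq / sigma0_sq, n - 1


# Response-time samples and reference variance taken from the slide.
rt = [4, 4, 3.5, 4.5, 5, 4.5, 3.5, 4, 4, 4]
sigma0_sq = 0.2

# Tabulated critical value for p = 0.05 and 9 degrees of freedom (from the slide).
CRITICAL_9DF_P05 = 16.92

stat, dof = chi_square_variance_statistic(rt, sigma0_sq)
reject_h0 = stat > CRITICAL_9DF_P05  # one-sided test: H1 is "variance too large"
print(f"X2 = {stat:.2f}, dof = {dof}, reject H0: {reject_h0}")
```

H0 is rejected only when the statistic exceeds the critical value, which is how the service would be flagged as unstable.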
Agreements Fulfilment – Fulfilment Analysis
▶ Apply a two-tailed Z test: determine whether the average of the population is equal to the agreed value
  ▶ H0: µ = µ₀, where µ₀ is the agreed value for the parameter
  ▶ H1: µ ≠ µ₀
  ▶ For µ₀ we take the agreed value of the parameter in the SLA
Agreed Response Time = 4
RT: 4 4 3.5 4.5 5 4.5 3.5 4 4 4
Significance level is p = 0.05
Obtained z = 1.66; in the normal table, row 1.6 / column 0.06 gives a tail area of 0.0485
Since |z| = 1.66 is below the two-tailed critical value of 1.96 (two-tailed p-value 2 × 0.0485 = 0.097 > 0.05), H0 is not rejected
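The table lookup for the Z test can be replaced by a direct computation. The sketch below uses only the Python standard library; `z_statistic` assumes a known population standard deviation `sigma`, which the slide does not state, so the decision is shown starting from the slide's reported z = 1.66. All function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_statistic(samples, mu0, sigma):
    """z = (sample mean - mu0) / (sigma / sqrt(n)); sigma is assumed known."""
    return (mean(samples) - mu0) / (sigma / sqrt(len(samples)))


def two_tailed_p_value(z):
    """Probability of a value at least as extreme as |z| under N(0, 1)."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * upper_tail


z_slide = 1.66  # value reported on the slide
alpha = 0.05    # significance level from the slide

p_value = two_tailed_p_value(z_slide)
reject_h0 = p_value < alpha
print(f"two-tailed p = {p_value:.4f}, reject H0: {reject_h0}")
```

The two-tailed p-value is about 0.097, above 0.05, so H0 (the average equals the agreed value) is not rejected.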
Release Improvement
▶ How is the service maintained? Are new releases successful?
  ▶ Properly maintaining a service means it will remain updated and working
  ▶ We expect improvements in each release
▶ Parameters taken into account
  ▶ Release periodicity: is it updated too often, almost never, or properly?
  ▶ Release successfulness: has a new release really improved the service?
  ▶ Problems solved: how many problems were solved?
  ▶ Functionalities improved: how many functionalities have been added or improved?
▶ Apply one statistical test: compare averages before and after the release in monitored parameters (to check successfulness). McNemar’s test could be used as well, in systems with many non-functional parameters, to check previous and current fulfilment.
Release Improvement – Release Successfulness
▶ Apply a Student’s t test: an unpaired t-test for detecting whether a parameter has varied
  ▶ H0: There are no significant differences between the average values of the compared groups
  ▶ H1: There are significant differences between the average values of the compared groups
For r = 10 samples per group and p = 0.05
BR (before release): 4 4 3.5 4.5 5 4.5 3.5 4 4 4
AR (after release): 3 3 3.5 2.5 2.5 3 3 3 3.5 3.5
p = 0.05; 18 degrees of freedom; critical t = 2.10 (3.92 with p = 0.001)
Obtained t = 5.78: H0 is rejected, since t exceeds the critical value even at p = 0.001
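A minimal sketch of the unpaired comparison, assuming the classic pooled-variance form of Student's t (equal variances in both groups); the function name is illustrative. With this formula the slide's samples give t of roughly 5.63, slightly different from the reported 5.78, but still well above both critical values, so the conclusion is the same.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t_statistic(group_a, group_b):
    """Student's t with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / dof
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, dof


# Response times before (BR) and after (AR) the release, from the slide.
before = [4, 4, 3.5, 4.5, 5, 4.5, 3.5, 4, 4, 4]
after = [3, 3, 3.5, 2.5, 2.5, 3, 3, 3, 3.5, 3.5]

# Tabulated two-tailed critical values for 18 degrees of freedom (from the slide).
CRITICAL_P05 = 2.10
CRITICAL_P001 = 3.92

t, dof = unpaired_t_statistic(before, after)
print(f"t = {t:.2f}, dof = {dof}")
print("reject H0 at p = 0.05:", abs(t) > CRITICAL_P05)
print("reject H0 at p = 0.001:", abs(t) > CRITICAL_P001)
```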
Opinions and Feedback
▶ What do others think about a service? What is its reputation?
  ▶ Users may provide feedback about their experience with a service
  ▶ In federated environments, platforms may exchange information about a service’s behaviour
▶ Parameters taken into account
  ▶ In COIN, there is a weighted average of evaluations
  ▶ The credibility of each party providing information is determined: if the received information is very different, something wrong is going on
▶ Apply one statistical test: Cohen’s Kappa will determine the degree of agreement between the feedback received and the measures taken. This time, data is normalized to categories
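Normalizing data to categories could look like the sketch below. It assumes scores already scaled to the range 0 to 1 and five ordered categories (VL, L, M, H, VH) with equal-width thresholds; the thresholds, function names and sample scores are all hypothetical, since the slides do not specify them.

```python
from bisect import bisect_right

# Ordered from lowest to highest; thresholds are hypothetical equal-width cuts.
CATEGORIES = ["VL", "L", "M", "H", "VH"]
THRESHOLDS = [0.2, 0.4, 0.6, 0.8]


def categorize(score):
    """Map a score in [0, 1] to one of the five categories."""
    return CATEGORIES[bisect_right(THRESHOLDS, score)]


def agreement_matrix(measured, received):
    """Count (measured category, received category) pairs in a square matrix."""
    size = len(CATEGORIES)
    matrix = [[0] * size for _ in range(size)]
    for m, r in zip(measured, received):
        row = CATEGORIES.index(categorize(m))
        col = CATEGORIES.index(categorize(r))
        matrix[row][col] += 1
    return matrix


# Hypothetical monitored values and values reported by another party.
measured = [0.95, 0.70, 0.50, 0.10]
received = [0.90, 0.85, 0.45, 0.15]

for row_label, row in zip(CATEGORIES, agreement_matrix(measured, received)):
    print(row_label, row)
```

Counts on the diagonal are cases where both sources fall in the same category, which is the input Cohen's Kappa works on.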
Opinions and Feedback (II)
▶ Apply Cohen’s Kappa to determine the level of agreement
  ▶ Establish categories for continuous data: very low, low, medium, high, very high
  ▶ Compare measured values with received values
VH H M L VL
VH 5 1
H 1 2
M 2
L 1 1
VL 3
Pa = 0.8125
Pr = 0.2418
k = 0.7527

K value      Agreement
< 0          Disagreement
0 – 0.2      Slight agreement
0.2 – 0.4    Fair agreement
0.4 – 0.6    Moderate agreement
0.6 – 0.8    Good agreement
0.8 – 1      Very good agreement
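The Kappa computation can be sketched as follows. The flattened table in the transcript does not show which columns the off-diagonal counts belong to, so the matrix below is an assumption: the diagonal (5, 2, 2, 1, 3) is fixed by Pa = 13/16, and the three remaining counts are placed in neighbouring categories. Under that assumption the sketch gives Pa = 0.8125, Pr of about 0.2422 and k of about 0.7526, close to the slide's figures. The handling of the interval boundaries in `agreement_label` is also an assumption.

```python
def cohen_kappa(matrix):
    """Return (observed agreement, chance agreement, kappa) for a square matrix."""
    total = sum(sum(row) for row in matrix)
    p_a = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_r = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return p_a, p_r, (p_a - p_r) / (1 - p_r)


def agreement_label(kappa):
    """Interpret kappa with the scale from the slide (boundary handling assumed)."""
    if kappa < 0:
        return "Disagreement"
    scale = [(0.2, "Slight agreement"), (0.4, "Fair agreement"),
             (0.6, "Moderate agreement"), (0.8, "Good agreement")]
    for upper, label in scale:
        if kappa < upper:
            return label
    return "Very good agreement"


# Rows and columns ordered VH, H, M, L, VL.
# ASSUMPTION: off-diagonal placement is not recoverable from the transcript.
matrix = [
    [5, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 3],
]

p_a, p_r, kappa = cohen_kappa(matrix)
print(f"Pa = {p_a:.4f}, Pr = {p_r:.4f}, k = {kappa:.4f}: {agreement_label(kappa)}")
```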
Conclusions and Future Work
▶ Statistical analysis can be performed on the data received, providing valuable information for calculating trust and optimizing the exploitation of the available data
▶ Many tests may be applied, thanks to data normalization into categories
▶ Using more data and calculating more parameters are useful for:
  ▶ Increasing accuracy: more parameters are evaluated
  ▶ Increasing robustness: service data is analysed as a whole
▶ Future work
  ▶ Analyse the correlation between aspects in the trust model: apply statistical hypothesis tests to obtain their relationships
  ▶ Improve the feedback analysis with more tests: Fleiss’ Kappa gives an idea of global agreement between raters
  ▶ What about the relationship between the trust models for services and for cloud infrastructure?
Thank you!
Atos, the Atos logo, Atos Consulting, Atos Worldline, Atos Sphere, Atos Cloud and Atos WorldGrid are registered trademarks of Atos SA. June 2011
© 2011 Atos. Confidential information owned by Atos, to be used by the recipient only. This document, or any part of it, may not be reproduced, copied, circulated and/or distributed nor quoted without prior written approval from Atos.
COIN Model
Release Improvement – Non-Functional Properties Checking
▶ Apply McNemar’s X² test: determine whether claims about non-functional properties (NFPs) are fulfilled before and after the release
  ▶ H0: The number of correct claims about NFPs is unchanged before and after the release
  ▶ H1: The number of correct claims about NFPs has varied after the release
  ▶ Use the claims in the service description, compared to the real values measured
Significance level is p = 0.05; 1 degree of freedom; critical X² = 3.84
Obtained X² = 1.125: H0 is not rejected at the 5% significance level
          F AR    NF AR
F BR       7        0
NF BR      2        1
                   Test 2 Positive    Test 2 Negative
Test 1 Positive           a                  b
Test 1 Negative           c                  d
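McNemar's statistic depends only on the discordant cells b and c of the table above. The sketch below assumes the form (|b - c| - correction)² / (b + c); the function name and the `correction` parameter are illustrative. A correction of 0.5 reproduces the slide's value of 1.125 for the release example (b = 0, c = 2), whereas the commonly used Edwards correction subtracts 1 and the uncorrected statistic subtracts nothing.

```python
def mcnemar_statistic(b, c, correction=0.5):
    """(|b - c| - correction)^2 / (b + c), with b and c the discordant counts."""
    if b + c == 0:
        raise ValueError("McNemar's test needs at least one discordant pair")
    return (abs(b - c) - correction) ** 2 / (b + c)


# Release example from the slide: rows are before release, columns after release.
table = [[7, 0],   # fulfilled before:     fulfilled after, not fulfilled after
         [2, 1]]   # not fulfilled before: fulfilled after, not fulfilled after
b, c = table[0][1], table[1][0]

# Tabulated critical value for p = 0.05 and 1 degree of freedom (from the slide).
CRITICAL_1DF_P05 = 3.84

stat = mcnemar_statistic(b, c)
reject_h0 = stat > CRITICAL_1DF_P05
print(f"X2 = {stat}, reject H0: {reject_h0}")
```

With only two discordant pairs the statistic stays well below 3.84 under any of the three corrections, so the release is not shown to have changed NFP fulfilment.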