Using Statistical Tests in a Trust Model

10/04/23

Fco. Javier Nieto ([email protected])

SERVICE COMPUTATION 2011


Outline

▶ Introduction

▶ State of the Art

▶ Motivation and Objectives

▶ Agreements Fulfilment

▶ Release Improvement

▶ Opinions and Feedback

▶ Conclusions and Future Work



Introduction

▶ Services are a key piece in business processes
  ▶ Functionalities are provided by third parties
  ▶ They are expected to fulfil certain QoS levels
  ▶ Will they perform as expected? Determine trust: belief in the reliability, truth and capability of the service

▶ How to do it?
  ▶ Users' feedback (reputation)
  ▶ Monitoring tools
  ▶ Other platforms
  ▶ Exploit all this information together!

▶ Statistical hypothesis tests are a useful tool for exploiting all the gathered data!



State of the Art

▶ Current approaches only take into account:
  ▶ Feedback received
  ▶ Some QoS aspects (response time, availability, etc.)
  ▶ SLAs

▶ Applied approaches
  ▶ Averages
  ▶ Probabilistic distributions
  ▶ Semantics
  ▶ Fuzzy logic

▶ Go beyond! Calculate more aspects and exploit all the data
  ▶ General characteristics of the service
  ▶ Service description analysis
  ▶ External data sources

▶ Apply statistical hypothesis tests to some new aspects



Motivation and Objectives

▶ Why use statistical tests?
  ▶ More aspects could be taken into account: trust models should be as accurate as possible
  ▶ Trust models may be attacked: robustness is limited to filtering inputs. What about the data as a whole?
  ▶ Statistics may be quite useful when used correctly
  ▶ Past and current experiences must be taken into account

▶ Main objectives
  ▶ Calculate aspects not available in other models: increase the accuracy of the model
  ▶ Improve the robustness of the model: more data from different sources is exploited and analysed deeply, so the model is harder to attack



Agreements Fulfilment

▶ How is the service behaving according to contractual commitments?
  ▶ Compare measured values and agreed values
  ▶ Determine if QoS and other parameters are as expected
  ▶ Failure in agreement fulfilment is always negative

▶ Parameters taken into account
  ▶ Service stability: does the behaviour of the service vary too much?
  ▶ Agreements fulfilment: are the measured values (from monitoring) the expected ones?
  ▶ Count how many parameters are within the agreed boundaries

▶ Apply two statistical tests
  ▶ One test for the variance
  ▶ One test for the average of a set of values
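
As a rough illustration of the counting step, the sketch below checks how many monitored parameters fall inside their agreed range. The parameter names, the (low, high) representation of an agreed bound and all values are hypothetical; the slides do not specify how agreements are encoded.

```python
# Hypothetical SLA: each parameter maps to an agreed (low, high) range.
agreed = {
    "response_time_s": (0.0, 4.0),
    "availability_pct": (99.0, 100.0),
    "throughput_rps": (50.0, float("inf")),
}

# Hypothetical averages obtained from monitoring.
measured = {
    "response_time_s": 4.1,
    "availability_pct": 99.5,
    "throughput_rps": 60.0,
}


def count_fulfilled(measured, agreed):
    """Count parameters whose measured value lies inside the agreed range."""
    return sum(
        1
        for name, (low, high) in agreed.items()
        if name in measured and low <= measured[name] <= high
    )


fulfilled = count_fulfilled(measured, agreed)
print(f"{fulfilled} of {len(agreed)} parameters fulfil the agreement")
```

A plain count like this ignores how far a value is from its bound, which is one reason the two statistical tests are applied on top of it.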



Agreements Fulfilment – Stability Analysis

▶ Apply a Chi-Square test: determine if the variance of the population exceeds a given one
  ▶ H0: δ² ≤ δ₀²
  ▶ H1: δ² > δ₀²
  ▶ For δ₀² we take 5% of the value in the SLA for a parameter

Expected Response Time = 4, so δ₀² = 0.2

RT: 4  4  3.5  4.5  5  4.5  3.5  4  4  4

p = 0.05; 9 degrees of freedom: expected X² = 16.92

Obtained X² = 19.57: the null hypothesis is rejected with 95% confidence
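
A minimal sketch of the variance test, assuming the textbook statistic X² = (n − 1)·s² / δ₀², with s² the unbiased sample variance. The function name is illustrative, and the critical value 16.92 is copied from the slide rather than computed. Applied to the ten response times with δ₀² = 0.2, this form yields 9.5, not the 19.57 reported, so the slide presumably computes the statistic differently; its intermediate steps are not shown.

```python
from statistics import variance


def chi_square_variance_statistic(samples, ref_variance):
    """Return (n - 1) * s^2 / ref_variance for H0: variance <= ref_variance."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    sample_variance = variance(samples)  # unbiased estimate, divides by n - 1
    return (n - 1) * sample_variance / ref_variance


response_times = [4, 4, 3.5, 4.5, 5, 4.5, 3.5, 4, 4, 4]
REF_VARIANCE = 0.2        # 5% of the agreed response time (4), as on the slide
CRITICAL_VALUE = 16.92    # chi-square, 9 degrees of freedom, p = 0.05 (from the slide)

statistic = chi_square_variance_statistic(response_times, REF_VARIANCE)
reject_h0 = statistic > CRITICAL_VALUE
print(f"X2 = {statistic:.2f}, reject H0: {reject_h0}")
```

The decision rule is the same whichever form of the statistic is used: H0 is rejected only when the obtained value exceeds the critical value.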



Agreements Fulfilment – Fulfilment Analysis

▶ Apply a 2-tailed Z test: determine if the average of the population is equal to the agreed value
  ▶ H0: µ = µ₀, µ₀ being the agreed value for the parameter
  ▶ H1: µ ≠ µ₀
  ▶ For µ₀ we take the agreed value of the parameter in the SLA

Agreed Response Time = 4

RT: 4  4  3.5  4.5  5  4.5  3.5  4  4  4

Significance level is p = 0.05

Obtained z = 1.66: go to the table (row 1.6, column 0.06), which gives a tail probability of 0.0485

For the two-tailed test the p-value is 2 × 0.0485 ≈ 0.097; since this is greater than 0.05, H0 is not rejected
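
The table lookup can be sketched with Python's standard library. The slide does not state which standard deviation enters the z statistic, so the sketch takes it as a parameter and, as an assumption, estimates it from the samples; the z it computes therefore need not equal the 1.66 above. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_statistic(samples, mu0, sigma):
    """z = (sample mean - mu0) / (sigma / sqrt(n))."""
    return (mean(samples) - mu0) / (sigma / sqrt(len(samples)))


def two_tailed_p(z):
    """Probability of a value at least as extreme as |z| in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


response_times = [4, 4, 3.5, 4.5, 5, 4.5, 3.5, 4, 4, 4]
ALPHA = 0.05

# Assumption: sigma is unknown and replaced by the sample standard deviation.
z = z_statistic(response_times, mu0=4, sigma=stdev(response_times))
print(f"z = {z:.3f}, p = {two_tailed_p(z):.3f}")

# The z value reported on the slide, looked up without a printed table.
p_slide = two_tailed_p(1.66)
print(f"z = 1.66 -> two-tailed p = {p_slide:.4f}, reject H0: {p_slide < ALPHA}")
```

Comparing the two-tailed p-value with the significance level is equivalent to comparing |z| with the critical value 1.96.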



Release Improvement

▶ How is the service maintained? Are new releases successful?
  ▶ Properly maintaining a service means it will remain updated and working
  ▶ We expect improvements in each release

▶ Parameters taken into account
  ▶ Release periodicity: is it updated too often, almost never, or properly?
  ▶ Release successfulness: has a new release really improved the service?
  ▶ Problems solved: how many problems were solved?
  ▶ Functionalities improved: how many functionalities have been added or improved?

▶ Apply one statistical test: compare averages before and after the release in monitored parameters (check successfulness). McNemar's test could be used as well, in systems with many non-functional parameters, to check previous and current fulfilment.



Release Improvement – Release Successfulness

▶ Apply a Student's t test: unpaired t-test for detecting if a parameter has varied
  ▶ H0: There are no significant differences between the average values of the compared groups
  ▶ H1: There are significant differences between the average values of the compared groups

For r = 10 and p = 0.05

BR (before release): 4  4  3.5  4.5  5  4.5  3.5  4  4  4

AR (after release): 3  3  3.5  2.5  2.5  3  3  3  3.5  3.5

p = 0.05; 18 degrees of freedom: expected t = 2.10 (3.92 with p = 0.001)

Obtained t = 5.78: the null hypothesis is rejected with 99% confidence
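
A minimal sketch of the comparison, assuming the standard pooled-variance form of the unpaired Student's t statistic. The function name is illustrative and the critical value 2.10 is taken from the slide. For the two samples above this form gives t of about 5.63, close to but not identical to the 5.78 reported; the conclusion is the same.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Return (t, degrees of freedom) for Student's t test with pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


before_release = [4, 4, 3.5, 4.5, 5, 4.5, 3.5, 4, 4, 4]
after_release = [3, 3, 3.5, 2.5, 2.5, 3, 3, 3, 3.5, 3.5]
CRITICAL_VALUE = 2.10  # two-tailed, 18 degrees of freedom, p = 0.05 (from the slide)

t, df = unpaired_t(before_release, after_release)
reject_h0 = abs(t) > CRITICAL_VALUE
print(f"t = {t:.2f} with {df} degrees of freedom, reject H0: {reject_h0}")
```

The test only says that the averages differ; whether the release improved the parameter follows from the sign of t (here response time went down).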



Opinions and Feedback

▶ What do others think about a service? What is its reputation?
  ▶ Users may provide feedback about their experience with a service
  ▶ In federated environments, platforms may exchange information about a service's behaviour

▶ Parameters taken into account
  ▶ In COIN, there is a weighted average of evaluations
  ▶ The credibility of each party providing information is determined: if the received information is very different, something is wrong

▶ Apply one statistical test: Cohen's Kappa will determine the degree of agreement between the feedback received and the measurements taken. This time, data is normalized to categories



Opinions and Feedback (II)

▶ Apply Cohen's Kappa for determining the level of agreement
  ▶ Establish categories for continuous data: very low, low, medium, high, very high
  ▶ Compare measured values with received values

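Agreement matrix between measured and received values (categories VH, H, M, L, VL; non-zero counts per row):

VH: 5 1
H:  1 2
M:  2
L:  1 1
VL: 3

Pa = 0.8125
Pr = 0.2418
k = 0.7527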

K value      Agreement
< 0          Disagreement
0 – 0.2      Slight agreement
0.2 – 0.4    Fair agreement
0.4 – 0.6    Moderate agreement
0.6 – 0.8    Good agreement
0.8 – 1      Very good agreement
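
The calculation can be sketched as follows, using k = (Pa − Pr) / (1 − Pr), where Pa is the observed agreement and Pr the agreement expected by chance. Several things here are assumptions: the equal-width binning in `categorize`, the function names, and the columns of the three off-diagonal counts in the matrix. The diagonal (5, 2, 2, 1, 3) is the only one consistent with Pa = 13/16 = 0.8125. With this placement the sketch reproduces Pa exactly and gives Pr ≈ 0.2422 and k ≈ 0.7526, close to the values above.

```python
CATEGORIES = ["VH", "H", "M", "L", "VL"]


def categorize(value, low, high):
    """Map a continuous value in [low, high] onto five equal-width bins (assumed binning)."""
    ascending = CATEGORIES[::-1]  # VL, L, M, H, VH
    fraction = (value - low) / (high - low)
    index = int(fraction * len(ascending))
    return ascending[min(max(index, 0), len(ascending) - 1)]


def cohens_kappa(matrix):
    """Return (Pa, Pr, kappa) for a square agreement matrix of counts."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(size)]
    p_a = sum(matrix[i][i] for i in range(size)) / total
    p_r = sum(row_totals[i] * col_totals[i] for i in range(size)) / total**2
    return p_a, p_r, (p_a - p_r) / (1 - p_r)


def interpret(kappa):
    """Translate kappa into the agreement labels of the table above."""
    if kappa < 0:
        return "Disagreement"
    for upper, label in [(0.2, "Slight"), (0.4, "Fair"), (0.6, "Moderate"), (0.8, "Good")]:
        if kappa < upper:
            return f"{label} agreement"
    return "Very good agreement"


# Rows and columns follow CATEGORIES; off-diagonal column positions are assumed.
matrix = [
    [5, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 3],
]
p_a, p_r, kappa = cohens_kappa(matrix)
print(f"Pa = {p_a:.4f}, Pr = {p_r:.4f}, k = {kappa:.4f} ({interpret(kappa)})")
```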



Conclusions and Future Work

▶ Statistical analysis can be performed on the data received, providing valuable information for calculating trust: the exploitation of the available data is optimized

▶ Many tests may be applied, thanks to data normalization into categories

▶ Using more data and calculating more parameters is useful for:
  ▶ Increasing accuracy: more parameters are evaluated
  ▶ Increasing robustness: service data is analysed as a whole

▶ Future Work
  ▶ Analyse the correlation between aspects in the trust model: apply statistical hypothesis tests to obtain their relationships
  ▶ Improve the feedback analysis with more tests: Fleiss' Kappa gives an idea of global agreement between raters
  ▶ What about the relationship between trust models for services and for cloud infrastructure?


Thank you!

Atos, the Atos logo, Atos Consulting, Atos Worldline, Atos Sphere, Atos Cloud and Atos WorldGrid are registered trademarks of Atos SA. June 2011

© 2011 Atos. Confidential information owned by Atos, to be used by the recipient only. This document, or any part of it, may not be reproduced, copied, circulated and/or distributed nor quoted without prior written approval from Atos.



COIN Model



Release Improvement – Non-Functional Properties Checking

▶ Apply a McNemar's X² test: determine whether claims about non-functional properties (NFPs) are fulfilled before and after the release
  ▶ H0: The number of correct claims about NFPs is invariable before and after the release
  ▶ H1: The number of correct claims about NFPs has varied after the release
  ▶ Use claims in the service description, compared to the real values measured

Significance level is p = 0.05; 1 degree of freedom: expected X² = 3.84

Obtained X² = 1.125: the null hypothesis is not rejected at the 0.05 significance level

          F AR   NF AR
F BR      7      0
NF BR     2      1

                  Test 2 Positive   Test 2 Negative
Test 1 Positive   a                 b
Test 1 Negative   c                 d
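
The statistic on this slide can be reproduced with a short sketch. It assumes the form X² = (|b − c| − k)² / (b + c), where b and c are the discordant cells of the 2×2 table and k is a continuity correction. A value of k = 0.5 gives the 1.125 shown above, whereas the more common Edwards correction uses k = 1. The function name and the reading of F / NF as fulfilled / not fulfilled are assumptions; the critical value 3.84 is taken from the slide.

```python
def mcnemar_statistic(b, c, correction=0.5):
    """McNemar's X^2 from the two discordant counts, with a continuity correction."""
    if b + c == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    # Clamp at zero so the correction never pushes the difference negative.
    return max(abs(b - c) - correction, 0.0) ** 2 / (b + c)


# Table from the slide: rows are before release (BR), columns after release (AR).
fulfilled_then_not = 0   # cell b: F BR, NF AR
not_then_fulfilled = 2   # cell c: NF BR, F AR
CRITICAL_VALUE = 3.84    # chi-square, 1 degree of freedom, p = 0.05 (from the slide)

statistic = mcnemar_statistic(fulfilled_then_not, not_then_fulfilled)
reject_h0 = statistic > CRITICAL_VALUE
print(f"X2 = {statistic:.3f}, reject H0: {reject_h0}")
```

Only the discordant cells b and c enter the statistic; the seven claims fulfilled both before and after the release do not affect it.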
