Statistical Decision Theory

Outlines

Part I: Anomaly detection in networks: state-of-the-art

Part II: Statistical testing: fundamentals

Part III: Statistical testing: sequential approaches

Part IV: Statistical tests: a case study

Statistical Decision Theory

Lionel Fillatre

ENST Bretagne, Computer Science Department

Part I : Anomaly detection in networks

1 Motivation

2 Network anomalies

3 Sources of network data

4 Anomaly detection methods

Part II : statistical testing

5 Motivation

6 Test between two simple hypotheses

7 Test between two composed hypotheses

Part III : sequential approaches

8 Motivation

9 Sequential probability ratio test

10 Change detection: known change

11 Change detection: unknown change

Part IV : a case study

12 DOS attack detection

13 Multichannel parametric CUSUM

14 Multichannel non-parametric CUSUM

15 Practical example

Part I

Anomaly detection in networks

Outlines of Part I

1 Motivation

2 Network anomalies

3 Sources of network data

4 Anomaly detection methods

Motivation

Networks are complex systems: vast amounts of information need to be collected and processed.

It is desirable to detect network anomalies and performance bottlenecks to improve network management.

To detect anomalies, it is necessary:

To give a definition of network anomalies,

To choose the sources of network data relevant to detecting anomalies,

To choose a method to detect anomalies.

Network anomalies

Definition: network anomalies typically refer to circumstances in which network operations deviate from normal network behavior.

Classification: there are two kinds of anomalies:

Network failures: server failures, broadcast storms, transient congestion, ...

Security-related problems: denial of service (DOS) attacks, network intrusions, ...

For the purpose of anomaly detection, we must characterize normal traffic behavior.

Data from network probes

Network probes are specialized tools such as “ping” and “traceroute”.

These methods do not require the cooperation of the network service provider.

Performance metrics derived from such tools can provide only a coarse-grained view of the network.

Hence, the data obtained from probing mechanisms may be of limited value for anomaly detection.

Data from packet filtering

Packet flows are sampled by capturing the IP headers of a select set of packets at different points in the network.

For flow-based monitoring, a flow is identified by its source and destination addresses and its source and destination port numbers.

Data obtained from this method can be used to detect anomalous network flows.

However, the hardware requirements of this measurement method make it difficult to use in practice.
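As an illustration of the flow identification described on this slide, the following minimal Python sketch groups sampled packet headers by their address and port 4-tuple. The names `FlowKey` and `count_packets_per_flow` are hypothetical, and the input tuples stand in for fields that a real capture tool would parse from IP headers.

```python
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical flow identifier: addresses and ports of both endpoints."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


def count_packets_per_flow(headers):
    """Count sampled packets per flow.

    `headers` is an iterable of (src_addr, dst_addr, src_port, dst_port)
    tuples, standing in for fields parsed from captured IP headers.
    """
    return Counter(FlowKey(*h) for h in headers)


# Made-up sample: two packets of one flow, one packet of another.
sampled = [
    ("10.0.0.1", "10.0.0.9", 40000, 80),
    ("10.0.0.1", "10.0.0.9", 40000, 80),
    ("10.0.0.2", "10.0.0.9", 40001, 443),
]
flows = count_packets_per_flow(sampled)
```

Per-flow counts of this kind are one possible input for the detection methods discussed later; the slides do not prescribe a particular feature.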

Data from routing protocols

The data collected can be used to build the network topology and provide link status updates.

Since routing updates occur at frequent intervals, any change in link utilization will be reported in near real time.

However, since routing updates must be kept small, only limited information pertaining to link statistics can be propagated through routing updates.

Network management protocols

Network management protocols provide information about network traffic statistics.

The information obtained can be used to characterize network behavior.

This source of data is obtained by using the Simple Network Management Protocol (SNMP):

This protocol provides a mechanism for communication between the manager and hundreds of SNMP agents.

The SNMP server maintains a database of management variables called the Management Information Base (MIB) variables.

It is a widely deployed protocol and has been standardized for all the different network devices.

Due to the fine-grained data available from SNMP, it is a good data source for network anomaly detection.
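A hedged sketch of how MIB variables could be turned into a traffic statistic: many MIB traffic variables are monotonically increasing counters, so a rate is obtained from the difference between two polls. The slides do not name specific variables; the code below assumes a 32-bit octet counter (in the style of `ifInOctets`) and a link speed in bits per second, and the function names are hypothetical. No SNMP library is used; the polled values are passed in as plain integers.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit counter wraps back to 0 after 2**32 - 1


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Increase of a wrapping counter between two polls (at most one wrap)."""
    return (current - previous) % modulus


def link_utilization(prev_octets, curr_octets, interval_s, link_speed_bps):
    """Fraction of the link capacity used between two polls of an octet counter."""
    bits = 8 * counter_delta(prev_octets, curr_octets)
    return bits / (interval_s * link_speed_bps)
```

For example, 1,250,000 octets transferred in 10 seconds on a 10 Mbit/s link gives a utilization of 0.1. The modulo handles a single counter wrap between polls; longer gaps would need a shorter polling interval or a wider counter.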

Hierarchical scheme of methods

[Figure: tree of anomaly detection methods. Anomaly detection branches into rule-based approaches (deterministic or stochastic), finite state machines, pattern matching approaches, signal processing approaches, and statistical testing approaches (non-sequential or sequential).]

Rule-based approaches (1/2)

Early work in the area of fault or anomaly detection was based on expert systems.

An exhaustive database containing the rules of behavior of the faulty system is used to determine whether a fault has occurred.

Two kinds of rule selection are possible: deterministic or stochastic (belief networks, for example).

Rule-based approaches (2/2)

These rule-based systems rely heavily on the expertise of the network manager and do not adapt well to the evolving network environment.

It is possible to improve such a system by adding a picture of previous fault scenarios, which leads to case-based reasoning systems.

These systems depend heavily on past information, and the number of functions to be learned increases with the number of faults studied.

Finite state machines

Anomaly or fault detection with finite state machines models the alarm sequences that occur during and prior to fault events.

An alarm is modeled as a state of the finite state machine.

Finite state machines are built for a known network fault using historical data.

Not all faults can be captured by a finite sequence of alarms of reasonable length.

Pattern matching

Online learning is used to build a traffic profile for a given network.

Traffic profiles are built using symptom-specific feature vectors such as link utilization.

When the acquired data fail to fit the developed profiles within some confidence interval, an anomaly is declared.

The efficiency depends on the accuracy of the traffic profile generated. It is necessary to spend a considerable amount of time building traffic profiles (this method does not scale gracefully).
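The profile-and-interval idea can be sketched as follows. This is a deliberately simplified illustration, not the method of any particular system: the profile is assumed to be just the mean and standard deviation of historical link utilization, and the band width of 3 standard deviations is an arbitrary choice. The function names and the sample data are hypothetical.

```python
from statistics import mean, stdev


def build_profile(history):
    """Summarize historical link utilization by its mean and standard deviation."""
    return mean(history), stdev(history)


def is_anomalous(value, profile, width=3.0):
    """Flag `value` if it lies outside mean +/- width * std (illustrative band)."""
    mu, sigma = profile
    return abs(value - mu) > width * sigma


# Made-up utilization history for one link.
history = [0.30, 0.32, 0.29, 0.31, 0.33, 0.30, 0.28, 0.31]
profile = build_profile(history)
```

With this history, a new utilization of 0.31 fits the profile while 0.90 is flagged. The quality of the decision depends entirely on how representative `history` is, which is the accuracy issue raised above.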

Signal processing techniques

Signal processing techniques have been used to model data flows.

The normal behavior of data flows is modeled using several approaches: spectral analysis, time series analysis, wavelet decompositions, ...

Anomalies correspond to deviations from the normal behavior of the data flows.

Statistical testing (1/2)

Statistical testing has been used to detect anomalies corresponding both to network failures and to network intrusions.

The statistical nature of the available information is used to define the normal behavior of the network (distribution of packet sizes, ...).

Non-sequential and sequential approaches can be used according to the network manager’s requirements.

Statistical testing (2/2)

Non-sequential approaches allow us to define optimal algorithms: minimization of false alarms and maximization of the probability of anomaly detection.

Sequential approaches are used to minimize the number of observations needed to detect an anomaly.

When data flows are modeled by parametric models, the design of optimal algorithms is possible.

Non-parametric approaches are studied in particular because parametric models are often lacking. These approaches are often suboptimal.

Part II

Statistical testing : fundamentals

Outlines of Part II

5 Motivation

6 Test between two simple hypotheses

7 Test between two composed hypotheses

Main objectives

Given some observations, the aim is to diagnose a system: to detect and identify an anomaly.

Observations are often noisy due to model errors and/or measurement errors.

For our purpose, the final aim is to design automatic systems that monitor a network and raise alarms when an anomaly appears.

Practical examples

To detect Denial of Service (DOS) attacks on a server.

To detect an abrupt change in the link utilizations on a network.

To identify the protocol associated with a flow of packets: HTTP, FTP, ...

Basic notations

Assume that we have two probability distributions $P_1$ and $P_2$.

Let $y_1, \ldots, y_n$ be a sample of size $n$ of independent and identically distributed (i.i.d.) random variables generated by one of these distributions.

It is assumed that $y_i \in \Omega$ for all $i$ (for example $\Omega = \mathbb{R}^m$), and $\Omega^n$ is the observation space.

Let us denote by $E_i[y_k]$ the expectation of $y_k$ when $y_k$ follows the distribution $P_i$, which is denoted $y_k \sim P_i$.

Assume that each distribution $P_i$ has a probability density function (pdf) $f_i(y)$. All results can also be applied to discrete random variables.

Basic definitions

Definition (simple hypothesis)
We call simple hypothesis $H_k$ any assumption concerning the distribution $P_k$ that can be reduced to a single value in the space of probability distributions, which is denoted:

$$H_k = \{y_1, \ldots, y_n \sim P_k\}, \quad k = 1, 2.$$

Definition (statistical test)
We call a statistical test for testing between hypotheses $H_1$ and $H_2$ any measurable mapping $g : \Omega^n \to \{H_1, H_2\}$.

Basic definitions: an illustration

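[Figure: the observations $y_1, \ldots, y_n$ in the observation space $\Omega^n$ are mapped by the test $g(\cdot)$ to a decision in $\{H_1, H_2\}$; the test is designed from the distributions $P_1, P_2$ and a criterion of optimality.]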

Basic definitions

Definition (quality of a test)
The quality of a test is defined with the aid of a set of error probabilities:

$$\alpha_i = \Pr(g(y_1, \ldots, y_n) \neq H_i \mid H_i \text{ true}) = \Pr_i(g(y_1, \ldots, y_n) \neq H_i), \quad i = 1, 2,$$

where $\alpha_i$ is the probability of rejecting hypothesis $H_i$ when it is true.

Remark
$\alpha_1$ is called the probability of false alarm;

$\alpha_2$ is called the probability of miss.

Bayes test (1/2)

Assume that each hypothesis $H_i$ has a known a priori probability $q_i$ such that $q_1 + q_2 = 1$.

Definition (Weighted error probability)

For a test $g$, we define the weighted error probability $\alpha(g)$ by

$$\alpha(g) = q_1 \alpha_1 + q_2 \alpha_2.$$

Definition (Bayes test)
The test $g$ is said to be a Bayes test if it minimizes $\alpha(g)$ for given a priori probabilities $q_i$.

Bayes test (2/2)

Definition (Likelihood ratio)
The Likelihood Ratio (LR) between two pdfs $f_1$ and $f_2$ for the independent sequence of observations $y_1, \ldots, y_n$ is

$$\Lambda(y_1, \ldots, y_n) = \prod_{i=1}^{n} \frac{f_2(y_i)}{f_1(y_i)}.$$

Theorem (Bayes test)

The test $g$ which minimizes $\alpha(g)$ is defined by

$$g(y_1, \ldots, y_n) = \begin{cases} H_1 & \text{if } \Lambda(y_1, \ldots, y_n) < \dfrac{q_1}{q_2} \\ H_2 & \text{if } \Lambda(y_1, \ldots, y_n) \geq \dfrac{q_1}{q_2} \end{cases}.$$
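The Bayes test can be sketched in a few lines of Python. The two Gaussian densities used here, $N(0, 1)$ under $H_1$ and $N(1, 1)$ under $H_2$, are an assumption chosen only for illustration, and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def gaussian_pdf(y, mean):
    """Density of N(mean, 1) at y."""
    return exp(-0.5 * (y - mean) ** 2) / sqrt(2.0 * pi)


def likelihood_ratio(sample, f1, f2):
    """Product of f2(y_i) / f1(y_i) over an i.i.d. sample."""
    ratio = 1.0
    for y in sample:
        ratio *= f2(y) / f1(y)
    return ratio


def bayes_test(sample, f1, f2, q1, q2):
    """Return 'H2' if the likelihood ratio reaches q1 / q2, else 'H1'."""
    return "H2" if likelihood_ratio(sample, f1, f2) >= q1 / q2 else "H1"


f1 = lambda y: gaussian_pdf(y, 0.0)  # hypothetical H1: N(0, 1)
f2 = lambda y: gaussian_pdf(y, 1.0)  # hypothetical H2: N(1, 1)
```

With equal priors the threshold is 1, so the sample `[1.2, 0.9, 1.1]` is assigned to `H2`; with `q1 = 0.99` the threshold rises to 99 and the same sample is assigned to `H1`. For long samples the product can underflow, so a sum of log-ratios compared with `log(q1 / q2)` would be the safer variant.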

Most Powerful Test (1/2)

Definition
Let $K_\alpha$ be the class of tests with a bounded probability of false alarm:

$$K_\alpha = \{g : \alpha_1(g) \leq \alpha\}.$$

Definition (Most powerful test)
We say that a test $g^* \in K_\alpha$ is the Most Powerful (MP) in the class $K_\alpha$ if, for all $g \in K_\alpha$,

$$\alpha_2(g^*) \leq \alpha_2(g),$$

or, equivalently,

$$\beta(g^*) \geq \beta(g),$$

where $\beta(g) = 1 - \alpha_2(g)$ is the power of the test $g$.

Most Powerful Test (2/2)

Theorem (Neyman-Pearson lemma)
The MP test $g^*$ in $K_\alpha$ is given by

$$g^*(y_1, \ldots, y_n) = \begin{cases} H_1 & \text{if } \Lambda(y_1, \ldots, y_n) < \lambda_\alpha \\ H_2 & \text{if } \Lambda(y_1, \ldots, y_n) \geq \lambda_\alpha \end{cases}$$

where $\lambda_\alpha$ is chosen such that $\alpha_1(g^*) = \alpha$.

Remark
This lemma is fundamental from a theoretical point of view, but its practical interest is often limited.

Location testing with Gaussian errors

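Assume $y_i \sim \mathcal{N}(\theta, 1)$.

The two hypotheses are $H_1 : \theta = \theta_1$ and $H_2 : \theta = \theta_2$ with $0 < \theta_1 < \theta_2$.

The pdf of a Gaussian variable $\mathcal{N}(\theta, 1)$ is $\varphi_\theta(x) = \varphi(x - \theta)$ with

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right).$$

Question
Find the Neyman-Pearson test.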

Solution (1/2)

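By subtracting $\theta_1$ from $y_i$, we can suppose that $\theta_1 = 0$.

$$\log \Lambda(y_1, \ldots, y_n) = \theta_2 \left( \sum_{i=1}^{n} y_i - n \frac{\theta_2}{2} \right).$$

The Neyman-Pearson test is given by

$$g^*(y_1, \ldots, y_n) = \begin{cases} H_1 & \text{if } \dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i < \lambda'_\alpha \\ H_2 & \text{if } \dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i \geq \lambda'_\alpha \end{cases}$$

with $\lambda'_\alpha = \log(\lambda_\alpha) / (\theta_2 \sqrt{n}) + \theta_2 \sqrt{n} / 2$, obtained by taking the logarithm of $\Lambda \geq \lambda_\alpha$ and dividing by $\theta_2 \sqrt{n} > 0$.

Under $H_1$, $\Lambda_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i \sim \mathcal{N}(0, 1)$ and $\lambda'_\alpha = \Phi^{-1}(1 - \alpha)$, i.e. $\alpha_1(g^*) = \Pr(\Lambda_n \geq \lambda'_\alpha) = \alpha$, where $\Phi$ is the cumulative distribution function of the standardized Gaussian variable $\mathcal{N}(0, 1)$.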
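A minimal Python sketch of this test, assuming $\theta_1 = 0$ as in the solution: the normalized sum $\frac{1}{\sqrt{n}} \sum_i y_i$ is compared with the threshold $\Phi^{-1}(1 - \alpha)$. The function names are hypothetical; `statistics.NormalDist` from the Python standard library provides $\Phi$ and $\Phi^{-1}$. The miss probability uses the fact that, under $H_2$, the normalized sum follows $\mathcal{N}(\sqrt{n}\,\theta_2, 1)$.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standardized Gaussian N(0, 1)


def np_threshold(alpha):
    """Threshold on (1/sqrt(n)) * sum(y_i) giving false alarm probability alpha."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)


def np_test(sample, alpha):
    """Neyman-Pearson test of H1: theta = 0 against H2: theta = theta2 > 0."""
    statistic = sum(sample) / sqrt(len(sample))
    return "H2" if statistic >= np_threshold(alpha) else "H1"


def miss_probability(theta2, n, alpha):
    """alpha_2: under H2 the statistic follows N(sqrt(n) * theta2, 1)."""
    return STD_NORMAL.cdf(np_threshold(alpha) - sqrt(n) * theta2)
```

For $\alpha = 0.05$ the threshold is about 1.645. With a single observation, $\theta_2 = 5$ and $\alpha = 0.01$, `miss_probability(5.0, 1, 0.01)` is a little under 0.004, so the two hypotheses are well separated.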

Solution (2/2): graphical illustration

[Figure: the densities $\varphi(x)$ (for $\theta_1 = 0$) and $\varphi(x - 5)$ (for $\theta_2 = 5$) plotted for $x$ between $-10$ and $10$, with the threshold $\lambda_{0.01}$ marked on the $x$-axis; the shaded areas represent the error probabilities $\alpha_1(g^*)$ and $\alpha_2(g^*)$.]

Basic notations

Let $y_1, \ldots, y_n$ be a sample of size $n$ of i.i.d. random variables generated by a distribution $P_\theta$ parameterized by a vector $\theta \in \Theta$.

It is assumed that $y_i \in \Omega$ for all $i$ (for example $\Omega = \mathbb{R}^m$), and $\Omega^n$ is the observation space.

Let us denote by $E_\theta[y_k]$ the expectation of $y_k$ when $y_k$ follows the distribution $P_\theta$, which is denoted $y_k \sim P_\theta$.

Assume that each distribution $P_\theta$ has a probability density function (pdf) $f_\theta(y)$. All results can also be applied to discrete random variables.

Basic definitions

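Definition (composed hypothesis)
Any nonsimple hypothesis is called a composed hypothesis.

Definition
Let us denote $H_1 : \theta \in \Theta_1$ and $H_2 : \theta \in \Theta_2$, where $\Theta_1$ and $\Theta_2$ are two specified subsets of $\Theta$ with $\Theta_1 \cap \Theta_2 = \emptyset$.

Definition (size of a test)
Let $\alpha_1(g)$ be the size of a test, defined by:

$$\alpha_1(g) = \sup_{\theta \in \Theta_1} \Pr(g(y_1, \ldots, y_n) \neq H_1 \mid H_1 \text{ true}) = \sup_{\theta \in \Theta_1} \Pr_\theta(g(y_1, \ldots, y_n) \neq H_1),$$

and let $K_\alpha$ be the class of tests whose size is bounded by $\alpha$:

$$K_\alpha = \{g : \alpha_1(g) \leq \alpha\}.$$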

Uniformly most powerful test

Definition (power function of a test)
The power function of a test $g$ is defined by:

$$\beta_g(\theta) = \Pr_\theta(g(y_1, \ldots, y_n) = H_2), \quad \theta \in \Theta_2.$$

Definition (Uniformly Most Powerful test)
A test $g^* \in K_\alpha$ is said to be Uniformly Most Powerful (UMP) in the class $K_\alpha$ of tests with fixed size $\alpha_1(g) = \alpha$ if, for all other tests $g \in K_\alpha$, we have:

$$\forall \theta \in \Theta_2, \quad \beta_g(\theta) \leq \beta_{g^*}(\theta).$$

Graphical interpretation

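[Figure: power functions $\beta(\theta)$, with values between 0 and 1, plotted against $\theta$ for the UMP test and for other tests at level $\alpha$; the $\theta$-axis is split into the regions $\Theta_1$ and $\Theta_2$.]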

Location testing with Gaussian errors

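Assume $y_i \sim \mathcal{N}(\theta, 1)$.

The two hypotheses are $H_1 : \theta = 0$ and $H_2 : \theta \geq \theta_2$ with $\theta_2 > 0$.

The pdf of a Gaussian variable $\mathcal{N}(\theta, 1)$ is $\varphi_\theta(x) = \varphi(x - \theta)$ with

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right).$$

Question
Find the UMP test.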

Solution

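The Neyman-Pearson test between $H_1 : \theta = 0$ and $H_2(\theta_2) : \theta = \theta_2$ is given by

$$g^*(y_1, \ldots, y_n) = \begin{cases} H_1 & \text{if } \dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i < \lambda_\alpha \\ H_2(\theta_2) & \text{if } \dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i \geq \lambda_\alpha \end{cases}.$$

Under $H_1$, $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i \sim \mathcal{N}(0, 1)$ and $\lambda_\alpha = \Phi^{-1}(1 - \alpha)$, where $\Phi$ is the cumulative distribution function of the standardized Gaussian variable $\mathcal{N}(0, 1)$.

Since the decision function $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i$ and the threshold $\lambda_\alpha$ do not depend on $\theta_2$, the test $g^*$ is MP for all $\theta_2 > 0$ and, hence, it is a UMP test.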
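The power function of this test can be evaluated directly, which makes the "same test for every $\theta_2$" argument concrete. The sketch below assumes unit-variance Gaussian observations as in the example; since the normalized sum follows $\mathcal{N}(\sqrt{n}\,\theta, 1)$, the power is $\beta(\theta) = 1 - \Phi(\lambda_\alpha - \sqrt{n}\,\theta)$. The function name and the chosen values of `n` and `alpha` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standardized Gaussian N(0, 1)


def power(theta, n, alpha):
    """beta(theta) = Pr_theta((1/sqrt(n)) * sum(y_i) >= lambda_alpha)."""
    threshold = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(threshold - sqrt(n) * theta)


# Power at a few values of theta for n = 10 observations and alpha = 0.05.
powers = [power(theta, n=10, alpha=0.05) for theta in (0.0, 0.5, 1.0, 2.0)]
```

At $\theta = 0$ the power equals the false alarm probability $\alpha$, and it increases with $\theta$; the threshold itself is computed without any reference to $\theta_2$.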

Generalized Likelihood Ratio test

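Definition
We say that a test $g_{GLR}$ is a Generalized Likelihood Ratio (GLR) test for testing between $H_1 = \{\theta : \theta \in \Theta_1\}$ and $H_2 = \{\theta : \theta \in \Theta_2\}$ when

$$g_{GLR}(y_1, \ldots, y_n) = \begin{cases} H_1 & \text{if } \Lambda_{GLR}(y_1, \ldots, y_n) < \lambda_\alpha \\ H_2 & \text{if } \Lambda_{GLR}(y_1, \ldots, y_n) \geq \lambda_\alpha \end{cases}$$

with

$$\Lambda_{GLR}(y_1, \ldots, y_n) = \frac{\sup_{\theta_2 \in \Theta_2} \prod_{i=1}^{n} f_{\theta_2}(y_i)}{\sup_{\theta_1 \in \Theta_1} \prod_{i=1}^{n} f_{\theta_1}(y_i)}.$$

Remark
The optimality of the GLR test is established in certain cases (exponential families when $n \to +\infty$, for example), but the test is not necessarily optimal in all cases.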

Location testing with Gaussian errors

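Assume $y_i \sim \mathcal{N}(\theta, 1)$.

The two hypotheses are $H_1 : |\theta| \le a$ and $H_2 : |\theta| \ge b$ with $0 < a < b$.

The pdf of a Gaussian variable $\mathcal{N}(\theta, 1)$ is $\varphi_\theta(x) = \varphi(x - \theta)$ with

$$
\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right).
$$

Question
Find the GLR test.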


Solution

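$$
\frac{2}{n} \log \Lambda_{GLR}(y_1,\dots,y_n) = \frac{2}{n} \log \frac{\sup_{|\theta_2| \ge b} \prod_{i=1}^n f_{\theta_2}(y_i)}{\sup_{|\theta_1| \le a} \prod_{i=1}^n f_{\theta_1}(y_i)},
$$

which leads to

$$
\frac{2}{n} \log \Lambda_{GLR}(y_1,\dots,y_n) =
\begin{cases}
-(|\bar{y}| - b)^2 & \text{if } |\bar{y}| \le a \\
-(|\bar{y}| - b)^2 + (|\bar{y}| - a)^2 & \text{if } a \le |\bar{y}| \le b \\
(|\bar{y}| - a)^2 & \text{if } |\bar{y}| \ge b,
\end{cases}
$$

with $\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i$.

Since $\frac{2}{n} \log \Lambda_{GLR}(y_1,\dots,y_n)$ is an increasing function of $|\bar{y}|$, it follows that:

$$
g_{GLR}(y_1,\dots,y_n) =
\begin{cases}
H_1 & \text{if } \bar{y}^2 < \lambda_\alpha \\
H_2 & \text{if } \bar{y}^2 \ge \lambda_\alpha.
\end{cases}
$$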

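When $y_1,\dots,y_n \sim \mathcal{N}(\theta, 1)$, $n\bar{y}^2 \sim \chi^2_1(n\theta^2)$. Under $H_1$ the false alarm probability is largest at $|\theta| = a$, which leads to $\lambda_\alpha = \frac{1}{n} \Psi^{-1}_{1,na^2}(1-\alpha)$, where $\Psi_{1,na^2}$ is the cumulative distribution function of a $\chi^2$ variable with 1 degree of freedom and non-centrality parameter $na^2$.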
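As a sanity check on the three-case expression, the sketch below computes $\frac{2}{n} \log \Lambda_{GLR}$ in two ways: as a difference of squared distances from the sample mean to each hypothesis set (which follows from the Gaussian log-likelihood), and from the piecewise formula written in terms of the absolute sample mean. The function names are illustrative and not taken from the slides.

```python
def glr_statistic(ys, a, b):
    """(2/n) log Lambda_GLR for H1: |theta| <= a against H2: |theta| >= b."""
    m = abs(sum(ys) / len(ys))      # absolute value of the sample mean
    dist_h1 = max(m - a, 0.0)       # distance from the mean to [-a, a]
    dist_h2 = max(b - m, 0.0)       # distance from the mean to {|theta| >= b}
    return dist_h1 ** 2 - dist_h2 ** 2


def glr_statistic_piecewise(ys, a, b):
    """Same quantity, written with the three cases of the solution."""
    m = abs(sum(ys) / len(ys))
    if m <= a:
        return -(m - b) ** 2
    if m <= b:
        return -(m - b) ** 2 + (m - a) ** 2
    return (m - a) ** 2
```

Both functions agree on any sample, and both increase with the absolute sample mean, which is why the test reduces to a threshold on the squared mean.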


Part III

Sequential approaches


Outlines of Part III

8 Motivation

9 Sequential probability ratio test

10 Change detection: known change

11 Change detection: unknown change


Motivation

In the previous part, we have shown that it is possible to minimize the error probabilities for a given sample size $n$.

New problem: for given error probabilities, try to minimize the sample size or, equivalently, to make the decision with as few observations as possible.

Sequential analysis is the theory of solving hypothesis testing problems when the sample size is not fixed a priori.


Basic definitions (1/2)

Definition (Stopping time)
A random variable $T$ is called a stopping time with respect to a process $y_1,\dots,y_n,\dots$ if $T$ takes only integer values and if, for every $n \ge 1$, the event $\{T = n\}$ is determined by $(y_1,\dots,y_n)$.

Example
The first time at which the process $y_1,\dots,y_n,\dots$ visits a set $A$ is a stopping time.

The last time at which the process $y_1,\dots,y_n,\dots$ visits a set $A$ is NOT a stopping time.
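The first example can be sketched in a few lines of Python. The helper name `first_visit_time` and the predicate `in_set` are illustrative assumptions: the point is only that the loop decides to stop at time $n$ after reading $y_1,\dots,y_n$ and nothing later.

```python
def first_visit_time(observations, in_set):
    """Return the first n >= 1 such that y_n is in A, or None if A is never visited.

    The decision to stop at time n uses only y_1, ..., y_n.
    """
    for n, y in enumerate(observations, start=1):
        if in_set(y):
            return n
    return None
```

No such online loop exists for the last visit time: deciding that time $n$ is the last visit requires knowing that no later observation falls in $A$.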


Basic definitions (2/2)

Definition (Sequential test)
A sequential test for testing between simple hypotheses $H_1 = \{y_1,\dots,y_n \sim f_1\}$ and $H_2 = \{y_1,\dots,y_n \sim f_2\}$ is defined to be a pair $(g, T)$ where $T$ is a stopping time and $g(y_1,\dots,y_n)$ is a decision function.

Definition (Closed test)
We say that a sequential test $(g, T)$ is closed if

$$
P(T < +\infty) = 1.
$$

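Remark
For a closed test $(g, T)$, the number of observations necessary to decide between the two hypotheses is finite with probability one. The tests considered in the sequel also have a finite mean number of observations: $E_1(T) < +\infty$ and $E_2(T) < +\infty$.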


Sequential Probability Ratio Test (SPRT)

Definition (SPRT)
The test $(g, T)$ is a Sequential Probability Ratio Test (SPRT) for testing between simple hypotheses $H_1$ and $H_2$ if we sequentially observe data $y_1,\dots,y_n$ and if, at time $n$, we make one of the following decisions:

accept $H_1$ when $S_n \le -a$;

accept $H_2$ when $S_n \ge b$;

continue to observe and to test when $-a < S_n < b$,

where

$$
S_n = \sum_{i=1}^n \log \frac{f_2(y_i)}{f_1(y_i)}
$$

and $a$, $b$ are thresholds such that $-\infty < -a < b < +\infty$.

Remark
The SPRT is closed.
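The decision rule can be sketched as a short loop. This is an illustrative implementation under stated assumptions, not code from the slides: `f1` and `f2` are user-supplied densities (or probability mass functions) that are strictly positive on the data, and the function returns `None` as the decision if the data run out before a threshold is crossed.

```python
from math import log


def sprt(observations, f1, f2, a, b):
    """Run the SPRT and return (decision, n).

    decision is "H1", "H2", or None when the observations are exhausted
    while -a < S_n < b.
    """
    s = 0.0
    n = 0
    for n, y in enumerate(observations, start=1):
        s += log(f2(y) / f1(y))     # S_n = S_{n-1} + log f2(y_n) / f1(y_n)
        if s <= -a:
            return "H1", n
        if s >= b:
            return "H2", n
    return None, n
```

The pair returned corresponds to $(g, T)$: the decision and the random number of observations actually used.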


Sequential location testing

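Assume $y_i \sim \mathcal{N}(\theta, 1)$.

The two hypotheses are $H_1 : \theta = 0$ and $H_2 : \theta = 2$.

The pdf of a Gaussian variable $\mathcal{N}(\theta, 1)$ is $\varphi_\theta(x) = \varphi(x - \theta)$ with

$$
\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right).
$$

Question
Find the SPRT.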


Solution

$$
S_n = \sum_{i=1}^n \log \frac{\varphi(y_i - 2)}{\varphi(y_i)} = 2 \sum_{i=1}^n (y_i - 1).
$$

Simulated data:

[Figure: simulated observations $y_i$ plotted against $i$, for $i$ from 0 to 60.]


Simulated SPRT:

[Figure: $S_n$ plotted against $n$, for $n$ from 0 to 60, with the thresholds $-a$ and $b$ and the acceptance zones of $H_1$ and $H_2$.]
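A simulation in the spirit of this example can be sketched as follows. The seed, the sample size cap of 1000 and the thresholds $a = b = 4.6$ are hypothetical choices made for illustration; they are not the values used for the figures.

```python
import random


def gaussian_sprt(observations, a, b):
    """SPRT for H1: theta = 0 against H2: theta = 2 with N(theta, 1) data."""
    s = 0.0
    n = 0
    for n, y in enumerate(observations, start=1):
        s += 2.0 * (y - 1.0)        # increment of S_n = 2 * sum(y_i - 1)
        if s <= -a:
            return "H1", n
        if s >= b:
            return "H2", n
    return None, n


rng = random.Random(0)                                # fixed seed, illustration only
data = (rng.gauss(2.0, 1.0) for _ in range(1000))     # observations drawn under H2
decision, n_used = gaussian_sprt(data, a=4.6, b=4.6)
print(decision, n_used)
```

Under $H_2$ each increment $2(y_i - 1)$ has mean 2, so the statistic drifts upward and the test typically stops after a handful of observations.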


Optimality of the SPRT

Definition
Denote by $K_{\alpha_1,\alpha_2}$ the class of all (sequential and nonsequential) tests $(g, T)$ such that

$$
\alpha_1(g) \le \alpha_1, \quad \alpha_2(g) \le \alpha_2, \quad E_1(T) < +\infty, \quad E_2(T) < +\infty,
$$

where $E_i(T)$ is the mean number of observations under $H_i$.

Let $(\tilde{g}, \tilde{T}) \in K_{\alpha_1,\alpha_2}$ be an SPRT for testing between hypotheses $H_1$ and $H_2$.

Theorem
For every test $(g, T) \in K_{\alpha_1,\alpha_2}$, we have:

$$
E_1(\tilde{T}) \le E_1(T), \quad E_2(\tilde{T}) \le E_2(T).
$$


Threshold selection: Wald’s identity

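Theorem
The error probabilities of $(g, T)$ satisfy:

$$
\log \frac{\alpha_2(g)}{1 - \alpha_1(g)} \le \min\{0, -a\}, \quad \log \frac{1 - \alpha_2(g)}{\alpha_1(g)} \ge \max\{0, b\}.
$$

Remark
The equalities hold for the SPRT when the excess over the boundaries is small:

$$
\Pr\nolimits_1(S_T = -a \mid H_1 \text{ is accepted}) \simeq \Pr\nolimits_2(S_T = b \mid H_2 \text{ is accepted}) \simeq 1.
$$

The thresholds may be chosen by using the following approximations: $a \simeq \log \frac{1-\alpha_1}{\alpha_2}$, $b \simeq \log \frac{1-\alpha_2}{\alpha_1}$.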
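The two approximations translate directly into code. The function name `wald_thresholds` is an illustrative choice; the inputs are the target error probabilities $\alpha_1$ and $\alpha_2$.

```python
from math import log


def wald_thresholds(alpha1, alpha2):
    """Approximate SPRT thresholds (a, b) for target error probabilities."""
    a = log((1.0 - alpha1) / alpha2)    # H1 is accepted when S_n <= -a
    b = log((1.0 - alpha2) / alpha1)    # H2 is accepted when S_n >= b
    return a, b
```

For example, `wald_thresholds(0.01, 0.01)` gives $a = b = \log 99$. Because the excess over the boundary is ignored, the error probabilities actually achieved may differ slightly from the targets.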


Motivation

The aim is to detect the occurrence of a change as soon as possible, with a fixed rate of false alarms before the unknown change time $t_0$.

Let $y_1, y_2, \dots$ be a random sequence with pdf $f_\theta(y_k)$. Until the unknown time $t_0$, the parameter is $\theta = \theta_1$ and from $t_0$ it becomes $\theta = \theta_2$.

Let $t_a$ be the alarm time (stopping time) at which a detection occurs.

To assess the efficiency of the detection, it is convenient to use the mean time between false alarms and the mean delay for detection.


Basic definitions (1/2)

It is assumed that the change time t0 is non-random.

Definition (Mean time between false alarms)
We define the mean time between false alarms as the following expectation:

$$
T = E_{\theta_1}(t_a)
$$

where $t_a$ is the alarm time.

Definition
Let $K_\gamma = \{t_a : T = E_{\theta_1}(t_a) \ge \gamma\}$ be the class of all sequential algorithms whose mean time between false alarms is bounded below by $\gamma$.


Basic definitions (2/2)

Definition (Essential supremum)
Let $(y_i)_{i \in I}$ be a family of real-valued random variables bounded by another variable. We say that $y$ is an essential supremum for $(y_i)_{i \in I}$, which is denoted $y = \operatorname{ess\,sup}_I y_i$, if

$$
\left(\forall i \in I,\ \Pr(y_i \le z) = 1\right) \Leftrightarrow \Pr(y \le z) = 1.
$$

Definition (Conditional mean delay)
We define the conditional mean delay for detection as:

$$
E_{\theta_1}(t_a - t_0 + 1 \mid t_a \ge t_0, y_1,\dots,y_{t_0-1}).
$$

Definition (Worst mean delay)
We define the worst mean delay for detection as:

$$
\tau^*(t_a) = \sup_{t_0 \ge 1} \operatorname{ess\,sup} E_{\theta_1}(t_a - t_0 + 1 \mid t_a \ge t_0, y_1,\dots,y_{t_0-1}).
$$


CUmulated SUM (CUSUM)

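Definition (CUSUM)
The CUSUM algorithm $t_a$ is defined by:

$$
t_a = \min\{k \ge 1 : g_k \ge h\}
$$

where

$$
g_k = S_k - m_k, \quad S_k = \sum_{i=1}^k s_i = \sum_{i=1}^k \log \frac{f_{\theta_2}(y_i)}{f_{\theta_1}(y_i)}, \quad m_k = \min_{1 \le j < k} S_j,
$$

and $h$ is the threshold.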


Intuitive derivation of the CUSUM

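[Figure: two panels for $k$ from 0 to 50. One shows the observations $y_k$; the other shows the cumulative sum $S_k$ with its running minimum $m_k$, the threshold $h$ and the alarm time. The increments satisfy $E_{\theta_1}(s_i) < 0$ before the change and $E_{\theta_2}(s_i) > 0$ after it.]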


CUmulated SUM (CUSUM)

Definition (CUSUM recursive form)
The CUSUM algorithm $t_a$ can be rewritten:

$$
t_a = \min\{k \ge 1 : G_k \ge h\}
$$

where

$$
G_0 = 0, \quad G_k = \left[ G_{k-1} + \log \frac{f_{\theta_2}(y_k)}{f_{\theta_1}(y_k)} \right]^+, \quad x^+ = \max\{0, x\}.
$$
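Both forms of the CUSUM can be sketched side by side. The functions take the log-likelihood ratio increments $s_k$ as input; their names are illustrative. One convention is an assumption of this sketch: the running minimum starts from $S_0 = 0$, which makes the two alarm times coincide for any threshold $h > 0$.

```python
def cusum_recursive(increments, h):
    """Alarm time from G_k = max(0, G_{k-1} + s_k), or None if no alarm."""
    g = 0.0
    for k, s in enumerate(increments, start=1):
        g = max(0.0, g + s)
        if g >= h:
            return k
    return None


def cusum_direct(increments, h):
    """Alarm time from g_k = S_k - m_k, with S_0 = 0 included in the minimum."""
    total = 0.0          # S_k
    running_min = 0.0    # minimum of S_0, ..., S_{k-1} (assumed convention)
    for k, s in enumerate(increments, start=1):
        total += s
        if total - running_min >= h:
            return k
        running_min = min(running_min, total)
    return None
```

The recursive form needs only one stored number per step, which is why it is the one usually implemented.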


CUmulated SUM (CUSUM)

Definition (Kullback-Leibler distance)
The Kullback-Leibler distance between two probability densities $f_{\theta_1}$ and $f_{\theta_2}$ is defined as:

$$
\rho_{1,2} = \int \log \frac{f_{\theta_1}(y)}{f_{\theta_2}(y)} f_{\theta_1}(y)\,dy.
$$

This distance is always nonnegative and is zero only when the two densities are equal.

Theorem (Lorden)
Let $n(\gamma) = \inf_{t_a \in K_\gamma} \tau^*(t_a)$. Then

$$
n(\gamma) = \frac{\log \gamma}{\rho_{2,1}} (1 + o(1))
$$

as $\gamma \to +\infty$, where $o(1)$ stands for a negligible term such that $o(1) \to 0$ as $\gamma \to +\infty$.
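To give an order of magnitude, the sketch below evaluates the first-order term of Lorden's bound for a unit-variance Gaussian mean change, for which the Kullback-Leibler distance has the closed form $(\theta_2 - \theta_1)^2 / 2$. The Gaussian model and the function names are assumptions made for illustration; `rho_21` stands for the distance in the denominator of the bound.

```python
from math import log


def kl_gaussian_unit_variance(theta_from, theta_to):
    """Kullback-Leibler distance from N(theta_from, 1) to N(theta_to, 1)."""
    return 0.5 * (theta_from - theta_to) ** 2


def lorden_delay(gamma, rho_21):
    """First-order term log(gamma) / rho_21 of the worst mean delay."""
    return log(gamma) / rho_21
```

For instance, with $\theta_1 = 0$, $\theta_2 = 2$ and a mean time between false alarms $\gamma = 1000$, the call `lorden_delay(1000, kl_gaussian_unit_variance(2.0, 0.0))` evaluates $\log(1000) / 2$. The $o(1)$ term is ignored, so this is only an asymptotic guide.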


CUmulated SUM (CUSUM)

Theorem (Lorden)
Let $t_a$ be a CUSUM algorithm designed to satisfy $T = E_{\theta_1}(t_a) = \gamma$ with $\gamma > 0$. Then we have the following equality:

$$
\tau^*(t_a) = \frac{\log \gamma}{\rho_{2,1}} (1 + o(1))
$$

as $\gamma \to +\infty$.

Theorem (Optimality of the CUSUM)
The CUSUM algorithm is asymptotically optimal in the class $K_\gamma$.


Motivation

In practice, the distribution after the change is rarely known.

Let $y_1, y_2, \dots$ be a random sequence with pdf $f_\theta(y_k)$. Until the unknown time $t_0$, the parameter is $\theta_1$ and from $t_0$ it becomes $\theta_2 \in \Theta_2$, where the set $\Theta_2$ is known.

Three main solutions:

Weighted likelihood ratio;

Invariant likelihood ratio;

Generalized likelihood ratio.


Method of weighting functions

It is assumed that $\theta_2$ follows a prior distribution: $\theta_2 \sim p(\theta_2)$.

After the change, the observations $y_{t_0}, y_{t_0+1}, \dots$ follow the distribution:

$$
\tilde{f}(y_k) = \int_{\Theta_2} f_{\theta_2}(y_k)\, p(\theta_2)\, d\theta_2, \quad k \ge t_0
$$

⇒ the hypothesis after the change becomes simple.

We can then apply the CUSUM algorithm.


Invariant principle

Certain problems are invariant with respect to a group of transformations.

The complexity of the hypotheses is then reduced by considering only the maximal invariant statistics.

A maximal invariant statistic is a function of the observations such that:

the function is invariant with respect to the group of transformations;

all other invariant functions depend on this maximal invariant.

The simplified problem is then solved by using classical tools.


Example

Notation: $\mathcal{N}_p(\theta, I_p)$ denotes the $p$-dimensional Gaussian distribution with unit covariance matrix and mean $\theta \in \mathbb{R}^p$.

Problem: the observations $y_1, y_2, \dots$ follow the distribution $\mathcal{N}_p(0, I_p)$ before the change and the distribution $\mathcal{N}_p(\theta, I_p)$ after the change, with $\|\theta\|_2^2 = \sum_{i=1}^p \theta_i^2 = c^2$, $c > 0$ known.

This problem is invariant with respect to the group of $p$-dimensional rotations. The invariant statistics are $\|y_1\|_2^2, \|y_2\|_2^2, \dots$.

These "simplified" observations follow a central $\chi^2$ distribution with $p$ degrees of freedom before the change and a $\chi^2$ distribution with $p$ degrees of freedom and non-centrality parameter $c^2$ after the change.


GLR algorithm

It is based on the principle of the GLR test.

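$$
t_a = \min\{k \ge 1 : g_k \ge h\} \quad \text{with} \quad g_k = \max_{1 \le j \le k} \sup_{\theta_2 \in \Theta_2} \sum_{i=j}^k \log \frac{f_{\theta_2}(y_i)}{f_{\theta_1}(y_i)}.
$$

The optimality properties of this algorithm are not known, except in certain cases (exponential families, ...).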
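As an illustrative special case (an assumption of this sketch, not an example from the slides), take $\mathcal{N}(0, 1)$ observations before the change and $\mathcal{N}(\theta, 1)$ after it, with $\theta$ unrestricted. The inner supremum then has the closed form $\left(\sum_{i=j}^k y_i\right)^2 / (2(k - j + 1))$, and the double maximization can be written as:

```python
def glr_change_detector(observations, h):
    """Alarm time of the GLR algorithm for a mean change in N(0, 1) data.

    Assumes the post-change mean theta is unknown and unrestricted, so that
    sup over theta of sum_{i=j..k} (theta * y_i - theta**2 / 2)
    equals (sum_{i=j..k} y_i)**2 / (2 * (k - j + 1)).
    """
    history = []
    for k, y in enumerate(observations, start=1):
        history.append(y)
        g = 0.0
        window_sum = 0.0
        # Scan the candidate change times j = k, k-1, ..., 1.
        for length, value in enumerate(reversed(history), start=1):
            window_sum += value
            g = max(g, window_sum ** 2 / (2.0 * length))
        if g >= h:
            return k
    return None
```

Unlike the CUSUM, this statistic has no one-step recursion: every past change time is re-examined at each step, so the cost per observation grows with $k$ unless the scan is restricted to a window.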


Part IV

A case study


Outlines of Part IV

12 DOS attack detection

13 Multichannel parametric CUSUM

14 Multichannel non-parametric CUSUM

15 Practical example


Typical “SYN flooding” attack scheme

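The SYN flooding attacks exploit TCP's three-way handshake mechanism and its limitation in maintaining half-open connections.

[Figure: message exchange between client and server: a completed three-way handshake (SYN, SYN/ACK, ACK) opening a TCP connection, and a SYN that is answered by SYN/ACK but never acknowledged ("ACK ???"), leaving a half-open connection until timeout.]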


Typical detection scheme

The aim is to detect Denial of Service (DOS) attacks: SYN flooding attacks, UDP packet storms, ...

A DOS attack is generally characterized by an increase in the number of packets of a particular size.

Principle of monitoring:

split packet sizes into a set of bins (or channels);

monitor these channels simultaneously;

detect a change in one of these channels.


Problem statement

Let $N$ denote the number of channels and $y_k(i)$, $k \ge 1$, the number of packets measured in the $i$-th channel at time $k$.

Until the unknown time $t_0$, each random value $y_k(i)$ follows a distribution $P_{\theta_{0,i}}$. From $t_0$, the distribution of only one of the random variables, say that of the $i$-th channel, changes to $P_{\theta_i}$.

It is assumed that the distributions $P_{\theta_{0,i}}$ and $P_{\theta_i}$ admit pdfs denoted $f_{\theta_{0,i}}$ and $f_{\theta_i}$.


LR-CUSUM

Definition (LR-CUSUM)
The multichannel parametric CUSUM algorithm $t_a$, simply called LR-CUSUM, is defined by:

$$
t_a = \min_{1 \le i \le N} t_a(i)
$$

where $t_a(i) = \min\{k \ge 1 : U_k(i) \ge h_i\}$,

$$
U_k(i) = \max_{1 \le j \le k} S_j^k(i), \quad S_j^k(i) = \sum_{\ell=j}^k s_\ell(i) = \sum_{\ell=j}^k \log \frac{f_{\theta_i}(y_\ell(i))}{f_{\theta_{0,i}}(y_\ell(i))},
$$

and $h_i$ is the threshold adapted to the $i$-th channel.
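A minimal sketch of this multichannel rule follows. It relies on the recursion $U_k(i) = s_k(i) + \max\{0, U_{k-1}(i)\}$ with $U_0(i) = 0$, which follows from the definition of $U_k(i)$ as a maximum over starting points $j$. The function name and the input layout (one list of $N$ increments per time step) are assumptions made for illustration.

```python
def lr_cusum(increments_by_time, thresholds):
    """Multichannel CUSUM on precomputed log-likelihood-ratio increments.

    increments_by_time: iterable of lists [s_k(1), ..., s_k(N)], one per time k.
    thresholds: list [h_1, ..., h_N].
    Returns (alarm_time, channel) with a 1-based channel index, or None.
    """
    stats = [0.0] * len(thresholds)               # U_0(i) = 0 for every channel
    for k, increments in enumerate(increments_by_time, start=1):
        # U_k(i) = s_k(i) + max(0, U_{k-1}(i))
        stats = [s + max(0.0, u) for s, u in zip(increments, stats)]
        for i, (u, h) in enumerate(zip(stats, thresholds), start=1):
            if u >= h:
                return k, i
    return None
```

The alarm time is the first time any channel crosses its threshold; if several channels cross at the same step, this sketch reports the one with the smallest index.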


Criterion of optimality

Definition (False alarm rate)
The False Alarm Rate (FAR) is defined by:

$$
\mathrm{FAR}(t_a) = \frac{1}{E_{\theta_0}[t_a]}.
$$

Definition (Average detection delay)
When the hypothesis $H_{t_0,i}$ = {a change occurs at time $t_0$ in the $i$-th channel} is true, the speed of detection is measured by the conditional Average Detection Delay (ADD):

$$
\mathrm{ADD}_{t_0,i}(t_a) = E_{t_0,i}[t_a - t_0 + 1 \mid t_a \ge t_0], \quad t_0 \ge 1, \quad i = 1,\dots,N.
$$


Optimality of the LR-CUSUM

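Assume $h_i = h$ for all $i = 1,\dots,N$.

Denote $I_i = \int \log \frac{f_{\theta_i}(y)}{f_{\theta_{0,i}}(y)} f_{\theta_i}(y)\,dy$.

Theorem
Suppose $E_{\theta_i}\left[\log \frac{f_{\theta_i}(y_\ell(i))}{f_{\theta_{0,i}}(y_\ell(i))}\right]^2 < +\infty$ for all $i$. Then:

For all $t_0 \ge 1$ and $i = 1,\dots,N$:

$$
\mathrm{ADD}_{t_0,i}(t_a) \sim \frac{h}{I_i} \quad \text{as } h \to +\infty.
$$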

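If $h = \log(N/\gamma)$, then $\mathrm{FAR}(t_a) \le \gamma$ and

$$
\inf_{\tau : \mathrm{FAR}(\tau) \le \gamma} \sup_{t_0 \ge 1} \mathrm{ADD}_{t_0,i}(\tau) \sim \frac{|\log \gamma|}{I_i} \quad \text{as } \gamma \to 0.
$$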


Non-parametric change detection

When the distributions $P_{\theta_i}$ are unknown, the likelihood ratios are also unknown.

The quantities $S_j^k(i)$ should be replaced by appropriate score functions $V_j^k(i)$ such that $E_{\theta_0}[V_j^k(i)] < 0$ and $E_{\theta_i}[V_j^k(i)] > 0$.

Typical DOS attacks lead to abrupt changes in the mean values of the number of packets. Therefore, the decision function should be sensitive to changes in mean values.


Notations and definitions

Let $\mu_i = E_0[y_k(i)]$ and $\theta_i = E_{\theta_i}[y_k(i)]$ denote the pre-change and post-change mean values in the $i$-th channel, with $\mu_i < \theta_i$.

Definition (Score function)
The score functions $V_j^k(i)$ are defined by

$$
V_j^k(i) = \sum_{\ell=j}^k w_i \left( y_\ell(i) - \mu_i - c_{i,\ell} \right), \quad i = 1,\dots,N,
$$

where $w_i > 0$, $c_{i,\ell} > 0$ are tuning parameters.

It is assumed that $c_{i,\ell} = c_i$ for all $\ell$.

Denote $V_i(y_\ell(i)) = w_i (y_\ell(i) - \mu_i - c_i)$. We have:

$$
E_0 V_i(y_\ell(i)) = -w_i c_i < 0 \quad \text{and} \quad E_{\theta_i} V_i(y_\ell(i)) = w_i (\theta_i - \mu_i - c_i) > 0
$$

for $c_i$ judiciously chosen ($0 < c_i < \theta_i - \mu_i$).


Definition of NP-CUSUM

Definition (NP-CUSUM)
The NP-CUSUM algorithm $t'_a$ is defined by:

$$
t'_a = \min\left\{k \ge 1 : \max_{1 \le i \le N} W_k(i) \ge h\right\}
$$

where

$$
W_k(i) = \max_{1 \le j \le k} V_j^k(i)
$$

and $h$ is a threshold.
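The NP-CUSUM can be sketched with the same one-step recursion as a CUSUM, $W_k(i) = V_i(y_k(i)) + \max\{0, W_{k-1}(i)\}$ with $W_0(i) = 0$, using the linear score $w_i (y - \mu_i - c_i)$. The function name, argument names and input layout are illustrative assumptions; only the pre-change means and the tuning parameters are needed, not the post-change distributions.

```python
def np_cusum(counts_by_time, means, weights, offsets, h):
    """Non-parametric multichannel CUSUM with linear scores.

    counts_by_time: iterable of lists [y_k(1), ..., y_k(N)], one per time k.
    means, weights, offsets: lists of mu_i, w_i and c_i.
    Returns the alarm time, or None if the threshold h is never reached.
    """
    stats = [0.0] * len(means)                    # W_0(i) = 0
    for k, counts in enumerate(counts_by_time, start=1):
        for i, y in enumerate(counts):
            score = weights[i] * (y - means[i] - offsets[i])
            stats[i] = score + max(0.0, stats[i])  # W_k(i)
        if max(stats) >= h:
            return k
    return None
```

The offsets $c_i$ control the trade-off: larger values make false alarms rarer but slow down the detection of small increases in the mean.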


A Poisson example

Assume the number of packets in the $i$-th channel follows the Poisson distribution $\mathcal{P}(\mu_i)$ in the pre-change mode and $\mathcal{P}(\theta_i)$ after the change occurs in the $i$-th channel:

$$
\Pr(y_k(i) = m) = \frac{\mu_i^m}{m!} e^{-\mu_i}, \quad k < t_0,
$$

$$
\Pr(y_k(i) = m) = \frac{\theta_i^m}{m!} e^{-\theta_i}, \quad k \ge t_0.
$$

It is assumed that $\theta_i$, $\mu_i$ are known and $\theta_i > \mu_i$.

Question
Find the LR-CUSUM;

Show that the NP-CUSUM is asymptotically optimal when $c_i = \varepsilon_i \theta_i$, where the variables $\varepsilon_i$ need to be specified.


Comparison between the algorithms

The LR-CUSUM is based on the statistics

$$
s_\ell(i) = y_\ell(i) \log(\theta_i / \mu_i) - (\theta_i - \mu_i).
$$

The NP-CUSUM is based on the statistics

$$
V_i(y_\ell(i)) = w_i (y_\ell(i) - \mu_i - \varepsilon_i \theta_i).
$$

It is straightforward to verify that the NP-CUSUM coincides with the LR-CUSUM test if

$$
\varepsilon_i = \frac{Q_i - \log Q_i - 1}{Q_i \log Q_i}, \quad w_i = \log Q_i
$$

with $Q_i = \theta_i / \mu_i$, which proves that the NP-CUSUM is asymptotically optimal.
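The coincidence of the two statistics can be checked numerically. The sketch below evaluates the Poisson log-likelihood ratio increment and the linear score with the stated choice of weight and offset; the function names and the test values $\mu = 10$, $\theta = 15$ are illustrative.

```python
from math import log


def lr_increment(y, mu, theta):
    """Poisson log-likelihood ratio increment: y * log(theta / mu) - (theta - mu)."""
    return y * log(theta / mu) - (theta - mu)


def np_increment(y, mu, theta):
    """Linear score w * (y - mu - eps * theta) with w = log Q and the stated eps."""
    q = theta / mu
    w = log(q)
    eps = (q - log(q) - 1.0) / (q * log(q))
    return w * (y - mu - eps * theta)


# The two increments agree for every count, up to rounding.
for count in range(30):
    assert abs(lr_increment(count, 10.0, 15.0) - np_increment(count, 10.0, 15.0)) < 1e-9
```

Algebraically, $w_i \varepsilon_i \theta_i = \mu_i (Q_i - \log Q_i - 1)$ because $\theta_i / Q_i = \mu_i$, so the constant terms $-w_i \mu_i - w_i \varepsilon_i \theta_i$ sum to $-(\theta_i - \mu_i)$.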