V-detector: a real-valued negative selection algorithm
Zhou Ji, St. Jude Children's Research Hospital
What is negative selection?
Biological background: T cells, thymus
Major steps:
1. Generate candidates randomly
2. Eliminate those that recognize self samples
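The two major steps can be sketched in a few lines of Python. This is only an illustration under assumptions the slide does not state: candidates are points in the unit hypercube, "recognize" means lying within a fixed `self_radius` of a self sample, and `max_tries` is a hypothetical safety limit.

```python
import math
import random


def generate_detectors(self_samples, num_detectors, self_radius, rng, max_tries=100_000):
    """Negative selection with a fixed matching radius (illustrative sketch)."""
    dim = len(self_samples[0])
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == num_detectors:
            break
        # Step 1: generate a candidate at random in the unit hypercube.
        candidate = [rng.random() for _ in range(dim)]
        # Step 2: eliminate the candidate if it recognizes any self sample.
        if any(math.dist(candidate, s) <= self_radius for s in self_samples):
            continue
        detectors.append(candidate)
    return detectors


self_samples = [[0.5, 0.5], [0.52, 0.48], [0.47, 0.51]]
detectors = generate_detectors(self_samples, 20, 0.1, random.Random(0))
```

Every surviving candidate is farther than `self_radius` from all self samples, which is the property the elimination step is meant to guarantee.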
Main steps
1. Generation
2. Detection
What is a matching rule?
The matching rule defines when a sample and a detector are considered to match.
The matching rule plays an important role in a negative selection algorithm. It largely depends on the data representation.
In a real-valued representation, a detector can be visualized as a hypersphere.
Candidate 1: thrown away; candidate 2: made a detector.
Match or not match?
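For the hypersphere picture, a matching rule based on Euclidean distance might look like the sketch below. The function names and the `(center, radius)` tuple layout are illustrative choices, not taken from the slides.

```python
import math


def matches(sample, center, radius):
    """Return True when the sample lies inside the detector's hypersphere."""
    return math.dist(sample, center) <= radius


def is_nonself(sample, detectors):
    """A sample is flagged as non-self if any detector matches it."""
    return any(matches(sample, center, radius) for center, radius in detectors)


detectors = [((0.2, 0.2), 0.1), ((0.8, 0.7), 0.15)]
inside = is_nonself((0.25, 0.2), detectors)   # lies within the first detector
outside = is_nonself((0.5, 0.5), detectors)   # covered by no detector
```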
Main idea of V-detector
By allowing the detectors to have some variable properties, V-detector enhances the negative selection algorithm in several respects:
It takes fewer large detectors to cover the non-self region, saving time and space.
Small detectors cover “holes” better.
Coverage is estimated while the detector set is generated.
The shapes of detectors, or even the types of matching rules, can be extended to be variable too.
Main concept of Negative Selection and V-detector
[Figure panels: constant-sized detectors; variable-sized detectors]
Outline of the algorithm (generation of variable-sized detector set)
Detector Set Generation Algorithm

Constant-sized detectors

Detector-Set(S, m, r_s)
S: set of self samples
m: number of detectors
r_s: self radius
1: D ← ∅
2: Repeat
3:   x ← random sample from [0,1]^n
4:   Repeat for every s_i in S = {s_i, i = 1, 2, ...}
5:     d ← Euclidean distance between s_i and x
6:     if d ≤ r_s, go to 2
7:   D ← D ∪ {x}
8: Until |D| = m
9: return D

Variable-sized detectors

V-Detector-Set(S, T_max, r_s, c_0)
S: set of self samples
T_max: maximum number of detectors
r_s: self radius
c_0: estimated coverage
1: D ← ∅
2: Repeat
3:   t ← 0
4:   T ← 0
5:   r ← infinite
6:   x ← random sample from [0,1]^n
7:   Repeat for every d_i in D = {d_i, i = 1, 2, ...}
8:     d_d ← Euclidean distance between d_i and x, where x(d_i) is the location of d_i
9:     if d_d ≤ r(d_i) then, where r(d_i) is the radius of detector d_i
10:      t ← t + 1
11:      if t ≥ 1/(1 − c_0) then return D
12:      go to 4
13:  Repeat for every s_i in S
14:    d ← Euclidean distance between s_i and x
15:    if d − r_s ≤ r then r ← d − r_s
16:  if r > 0 then D ← D ∪ {<x, r>}, where <x, r> is a detector with location x and radius r
17:  else T ← T + 1
18:  if T > 1/(1 − maximum self coverage) exit
19: Until |D| = T_max
20: return D
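A compact Python sketch of the variable-sized procedure follows. It is one reading of the outline rather than the author's implementation: the counters are handled in one plausible way (the consecutive-coverage counter resets whenever a sample is not covered, the self counter resets when a detector is accepted), and `max_self_coverage` is a hypothetical parameter standing in for the "maximum self coverage" term.

```python
import math
import random


def v_detector_set(self_samples, max_detectors, self_radius, coverage, rng,
                   max_self_coverage=0.99):
    """Generate variable-radius detectors as (center, radius) pairs."""
    dim = len(self_samples[0])
    detectors = []
    covered_limit = 1.0 / (1.0 - coverage)           # threshold on consecutive covered samples
    self_limit = 1.0 / (1.0 - max_self_coverage)     # assumed threshold on samples landing in self
    covered_in_a_row = 0
    in_self = 0
    while len(detectors) < max_detectors:
        x = [rng.random() for _ in range(dim)]
        # A sample already covered by a detector counts towards the coverage estimate.
        if any(math.dist(x, c) <= r for c, r in detectors):
            covered_in_a_row += 1
            if covered_in_a_row >= covered_limit:
                break
            continue
        covered_in_a_row = 0
        # Largest radius that keeps the detector clear of every self hypersphere.
        radius = min(math.dist(x, s) for s in self_samples) - self_radius
        if radius > 0:
            detectors.append((x, radius))
            in_self = 0
        else:
            in_self += 1
            if in_self > self_limit:
                break
    return detectors


self_samples = [[0.5, 0.5], [0.52, 0.48], [0.47, 0.51]]
detectors = v_detector_set(self_samples, 1000, 0.1, 0.99, random.Random(0))
```

Because each radius is the distance to the nearest self sample minus the self radius, no detector overlaps a self hypersphere, and detectors far from the self region come out large.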
Screenshots of the software
[Screenshots: message view; visualization of data points and detectors]
Experiments and Results
Synthetic data: 2D. Training data are randomly chosen from the normal region.
Fisher’s Iris data: one of the three types is considered “normal”.
Biomedical data: abnormal data are the medical measures of disease-carrier patients.
Air pollution data: abnormal data are made by artificially altering the normal air measurements.
Ball bearings: measurement is time series data with preprocessing, 30D and 5D.
Synthetic data - Cross-shaped self space
Shape of self region and example detector coverage
[Figure panels: (a) actual self space; (b) self radius = 0.05; (c) self radius = 0.1]
Synthetic data - Cross-shaped self space
Results
[Figure: detection rate and false alarm rate vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
[Figure: number of detectors vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
Error rates
[Figure: false negative and false positive error rates (percentage) vs. self radius (0.01 to 0.19), for 99% coverage]
Synthetic data - Ring-shaped self space
Shape of self region and example detector coverage
[Figure panels: (a) actual self space; (b) self radius = 0.05; (c) self radius = 0.1]
Synthetic data - Ring-shaped self space
Results
[Figure: detection rate and false alarm rate vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
[Figure: number of detectors vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
Iris data
Comparison with other methods: performance

| Normal class, training fraction | Algorithm | Detection rate (%) | False alarm rate (%) |
|---|---|---|---|
| Setosa 100% | MILA | 95.16 | 0 |
| Setosa 100% | NSA (single level) | 100 | 0 |
| Setosa 100% | V-detector | 99.98 | 0 |
| Setosa 50% | MILA | 94.02 | 8.42 |
| Setosa 50% | NSA (single level) | 100 | 11.18 |
| Setosa 50% | V-detector | 99.97 | 1.32 |
| Versicolor 100% | MILA | 84.37 | 0 |
| Versicolor 100% | NSA (single level) | 95.67 | 0 |
| Versicolor 100% | V-detector | 85.95 | 0 |
| Versicolor 50% | MILA | 84.46 | 19.6 |
| Versicolor 50% | NSA (single level) | 96 | 22.2 |
| Versicolor 50% | V-detector | 88.3 | 8.42 |
| Virginica 100% | MILA | 75.75 | 0 |
| Virginica 100% | NSA (single level) | 92.51 | 0 |
| Virginica 100% | V-detector | 81.87 | 0 |
| Virginica 50% | MILA | 88.96 | 24.98 |
| Virginica 50% | NSA (single level) | 97.18 | 33.26 |
| Virginica 50% | V-detector | 93.58 | 13.18 |
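The slides do not spell out how the two rates are computed. A minimal sketch, assuming the usual definitions (detection rate = detected anomalies / all anomalies, false alarm rate = flagged normal points / all normal points), could be:

```python
def rates(is_abnormal, flagged):
    """Return (detection rate, false alarm rate) in percent.

    is_abnormal: True for test points that are truly abnormal.
    flagged: True for test points matched by at least one detector.
    """
    detected = sum(1 for y, f in zip(is_abnormal, flagged) if y and f)
    false_alarms = sum(1 for y, f in zip(is_abnormal, flagged) if not y and f)
    n_abnormal = sum(1 for y in is_abnormal if y)
    n_normal = len(is_abnormal) - n_abnormal
    return 100.0 * detected / n_abnormal, 100.0 * false_alarms / n_normal


# Toy data: 4 abnormal points (3 detected) and 4 normal points (1 flagged).
truth = [True, True, True, True, False, False, False, False]
flags = [True, True, True, False, True, False, False, False]
detection_rate, false_alarm_rate = rates(truth, flags)
```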
Iris data
Comparison with other methods: number of detectors

| Normal class, training fraction | Mean | Max | Min | SD |
|---|---|---|---|---|
| Setosa 100% | 20 | 42 | 5 | 7.87 |
| Setosa 50% | 16.44 | 33 | 5 | 5.63 |
| Versicolor 100% | 153.24 | 255 | 72 | 38.8 |
| Versicolor 50% | 110.08 | 184 | 60 | 22.61 |
| Virginica 100% | 218.36 | 443 | 78 | 66.11 |
| Virginica 50% | 108.12 | 203 | 46 | 30.74 |
Iris data
Virginica as normal, 50% of points used to train
[Figure: detection rate and false alarm rate vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
[Figure: number of detectors vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
Biomedical data
Blood measurements for a group of 209 patients.
Each patient has four different types of measurement.
75 patients are carriers of a rare genetic disorder. Others are normal.
Biomedical data: results comparison

| Training data | Algorithm | Detection rate (mean) | Detection rate (SD) | False alarm rate (mean) | False alarm rate (SD) | Number of detectors (mean) | Number of detectors (SD) |
|---|---|---|---|---|---|---|---|
| 100% training | MILA | 59.07 | 3.85 | 0 | 0 | 1000* | 0 |
| 100% training | NSA | 69.36 | 2.67 | 0 | 0 | 1000 | 0 |
| 100% training | r = 0.1 | 30.61 | 3.04 | 0 | 0 | 21.52 | 7.29 |
| 100% training | r = 0.05 | 40.51 | 3.92 | 0 | 0 | 14.84 | 5.14 |
| 50% training | MILA | 61.61 | 3.82 | 2.43 | 0.43 | 1000* | 0 |
| 50% training | NSA | 72.29 | 2.63 | 2.94 | 0.21 | 1000 | 0 |
| 50% training | r = 0.1 | 32.92 | 2.35 | 0.61 | 0.31 | 15.51 | 4.85 |
| 50% training | r = 0.05 | 42.89 | 3.83 | 1.07 | 0.49 | 12.28 | 4 |
| 25% training | MILA | 80.47 | 2.80 | 14.93 | 2.08 | 1000* | 0 |
| 25% training | NSA | 86.96 | 2.72 | 19.50 | 2.05 | 1000 | 0 |
| 25% training | r = 0.1 | 43.68 | 4.25 | 1.24 | 0.5 | 12.24 | 3.97 |
| 25% training | r = 0.05 | 57.97 | 5.86 | 2.63 | 0.77 | 8.94 | 2.57 |
Biomedical data
[Figure: detection rate and false alarm rate vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
[Figure: number of detectors vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
Air pollution data
60 original records in total. Each consists of 16 different measurements concerning air pollution.
All the real data are considered normal.
More data are made artificially:
1. Decide the normal range of each of the 16 measurements.
2. Randomly choose a real record.
3. Change three randomly chosen measurements within a larger-than-normal range.
4. If some of the changed measurements are out of range, the record is considered abnormal; otherwise it is considered normal.
A total of 1000 records, including the original 60, are used as test data. The original 60 are used as training data.
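The record-generation steps could be sketched as follows. Several details are assumptions, since the slide does not give them: the normal range is taken as the minimum and maximum over the real records, the "larger" range widens it by a hypothetical factor `widen` on each side, and the stand-in data are random numbers rather than real air-pollution measurements.

```python
import random


def make_record(real_records, rng, widen=0.5, n_changed=3):
    """Return (record, is_abnormal) derived from one randomly chosen real record."""
    n_features = len(real_records[0])
    # Step 1: normal range of each measurement (assumed: min/max of the real data).
    lows = [min(r[j] for r in real_records) for j in range(n_features)]
    highs = [max(r[j] for r in real_records) for j in range(n_features)]
    # Step 2: randomly choose a real record.
    record = list(rng.choice(real_records))
    # Step 3: change three randomly chosen measurements within a wider range.
    abnormal = False
    for j in rng.sample(range(n_features), n_changed):
        margin = widen * (highs[j] - lows[j])
        record[j] = rng.uniform(lows[j] - margin, highs[j] + margin)
        # Step 4: any changed value outside the normal range makes the record abnormal.
        if not lows[j] <= record[j] <= highs[j]:
            abnormal = True
    return record, abnormal


rng = random.Random(0)
real_records = [[rng.uniform(10, 20) for _ in range(16)] for _ in range(60)]
artificial = [make_record(real_records, rng) for _ in range(940)]
```

Together with the 60 real records, the 940 artificial ones give the 1000 test records described above.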
Air pollution data
[Figure: detection rate and false alarm rate vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
[Figure: number of detectors vs. self radius (0.01 to 0.19), for 99% and 99.99% coverage]
Ball bearing data
Raw data: time series of acceleration measurements
Preprocessing (from time domain to representation space for detection):
1. FFT (Fast Fourier Transform) with Hanning windowing: window size 30
2. Statistical moments: up to 5th order
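Both preprocessing options can be illustrated with the standard library alone. The sketch makes several assumptions not stated on the slide: windows are non-overlapping, the spectral features are DFT magnitudes, the moments are raw moments computed over the same window size, and a direct O(n^2) DFT stands in for an FFT routine (the values are the same, only slower to compute).

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def dft_magnitudes(segment):
    """Magnitudes of the discrete Fourier transform, evaluated directly."""
    n = len(segment)
    return [abs(sum(x * cmath.exp(-2j * math.pi * f * k / n)
                    for k, x in enumerate(segment)))
            for f in range(n)]


def fft_features(series, window_size=30):
    """One window_size-dimensional vector per non-overlapping window."""
    window = hann_window(window_size)
    features = []
    for start in range(0, len(series) - window_size + 1, window_size):
        segment = series[start:start + window_size]
        features.append(dft_magnitudes([w * x for w, x in zip(window, segment)]))
    return features


def moment_features(series, window_size=30, max_order=5):
    """Raw moments of order 1..max_order for each non-overlapping window."""
    features = []
    for start in range(0, len(series) - window_size + 1, window_size):
        segment = series[start:start + window_size]
        features.append([sum(x ** k for x in segment) / window_size
                         for k in range(1, max_order + 1)])
    return features


series = [math.sin(0.3 * t) for t in range(300)]   # stand-in for acceleration data
fft_vectors = fft_features(series)                 # 30-dimensional points
moment_vectors = moment_features(series)           # 5-dimensional points
```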
Example of data (raw data of new bearings): first 1000 points
[Figure: raw acceleration values for points 1 to 1000]
Example of data (FFT of new bearings): first 3 coefficients of the first 100 points
[Figure: coefficient 1, coefficient 2, and coefficient 3 for points 1 to 100]
Example of data (statistical moments of new bearings): moments up to 3rd order of the first 100 points
[Figure: 1st order, 2nd order, and 3rd order moments for points 1 to 100]
Ball bearing’s structure and damage
Damaged cage
Ball bearing data: results

Preprocessed with FFT

| Ball bearing conditions | Total number of data points | Number of detected anomalies | Percentage detected |
|---|---|---|---|
| New bearing (normal) | 2739 | 0 | 0% |
| Outer race completely broken | 2241 | 2182 | 97.37% |
| Broken cage with one loose element | 2988 | 577 | 19.31% |
| Damaged cage, four loose elements | 2988 | 337 | 11.28% |
| No evident damage; badly worn | 2988 | 209 | 6.99% |

Preprocessed with statistical moments

| Ball bearing conditions | Total number of data points | Number of detected anomalies | Percentage detected |
|---|---|---|---|
| New bearing (normal) | 2651 | 0 | 0% |
| Outer race completely broken | 2169 | 1674 | 77.18% |
| Broken cage with one loose element | 2892 | 14 | 0.48% |
| Damaged cage, four loose elements | 2892 | 0 | 0% |
| No evident damage; badly worn | 2892 | 0 | 0% |
Ball bearing data: performance summary

| Measure | Fourier transform | Statistical moments |
|---|---|---|
| Detection rate for the worst damage | 97.37 | 77.18 |
| Detection rate for all damages | 37.68 | 21.22 |
| False alarm rate | 3.65 | 0 |
New development of this work
A new algorithm to generate variable-sized detectors.
Purpose: reduce the possible “false negatives” at the boundary of the self region.
Why the issue exists: some self samples may be very close to the boundary.
Main idea: differentiate between “internal self samples” and “boundary self samples”.
Solution: combine the advantages of the algorithms that generate variable-sized and constant-sized detectors described previously.
How much one sample tells
Samples may be on boundary
In terms of detectors
Comparing three methods
[Figure panels: constant-sized detectors; V-detector; new algorithm. Self radius = 0.05]
Comparing three methods
[Figure panels: constant-sized detectors; V-detector; new algorithm. Self radius = 0.1]
Work ongoing
Estimate of coverage using formal statistics.
“Point estimate” is the simplest method.
Two types of statistical inference:
1. Confidence interval
2. Hypothesis testing
Point estimate of proportion
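As a rough illustration of a point estimate of a proportion, the sketch below samples random points and counts how many are covered by a detector set. The normal-approximation interval is an assumption (the slides do not say which interval is used), and the toy example samples the whole unit square rather than only the non-self region.

```python
import math
import random


def coverage_estimate(covered, n_samples, z=1.96):
    """Point estimate and normal-approximation interval for a proportion."""
    p_hat = covered / n_samples
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_samples)
    return p_hat, (max(0.0, p_hat - half_width), min(1.0, p_hat + half_width))


rng = random.Random(1)
detectors = [((0.5, 0.5), 0.5)]   # toy detector set: one circle in the unit square
n_samples = 10_000
covered = sum(
    any(math.dist(p, c) <= r for c, r in detectors)
    for p in ((rng.random(), rng.random()) for _ in range(n_samples))
)
p_hat, (low, high) = coverage_estimate(covered, n_samples)
```

For this toy detector the true covered fraction is pi/4, so `p_hat` should land close to 0.785, with the interval shrinking as `n_samples` grows.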
Summary
1. V-detector uses fewer detectors to obtain similar coverage.
2. Smaller detectors are more acceptable if the total number of detectors is largely controlled.
3. A coverage estimate is superior to a fixed number of detectors.
4. V-detector can deal better with high-dimensional data, including time series.
5. Self radius and estimated coverage are the two control parameters in V-detector.
6. Variable size, variable shape, variable matching rules, or other variable properties of detectors provide an encouraging opportunity to enhance the negative selection mechanism.
9-17-2004