Computational Statistics and Mathematics for Cyber Securitystatisticalcyber.com/talks/Marchette,...
Transcript of Computational Statistics and Mathematics for Cyber Securitystatisticalcyber.com/talks/Marchette,...
Computational Statistics and Mathematics forCyber Security
David J. Marchette
Sept 25–27, 2017
Acknowledgment: This work funded in part by the NSWC
In-House Laboratory Independent Research (ILIR) program.
NSWCDD-PN-17-00345 Distribution A: Approved for Public Release
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topics
1 Introduction
2 Computational Statistics
3 Machine Learning
4 Manifold Learning
5 Topological Data Analysis
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topics
1 Introduction
2 Computational Statistics
3 Machine Learning
4 Manifold Learning
5 Topological Data Analysis
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Take-Away Points
Mathematics and statistics provide many tools for cybersecurity.
Simple can be powerful.
Complicated models or algorithms are not always necessary.Sometimes they are.
“Complicated” things become “simple” with familiarity.
High dimensional data is complicated, messy, and can foolyou.
Know your data!
If your results appear too good to be true, triple check them!
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Two Cultures
“There are two cultures in the use of statistical modeling to reach
conclusions from data. One assumes that the data are generated by a
given stochastic data model. The other uses algorithmic models and
treats the data mechanism as unknown.”*
There are many aspects of this dichotomy:
Modeling – algorithms.
Parametric – non-parametric.
Statistics – machine learning.
Inference – prediction.**
“Small” data – “big data”.
“Traditional” statistics – computational statistics.
*Leo Breiman, Statistical Science 2001, Vol. 16, No. 3, 199231
**Donoho, D. (2015, September). 50 years of Data Science. In Princeton NJ, Tukey Centennial Workshop.
http://www.economicsguy.com/wp-content/uploads/2016/06/50YearsDataScience.pdf accessed 8/8/2017
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
The Illusion of Progress
“. . . [comparative studies] often fail to take into account important
aspects of real problems, so that the apparent superiority of more
sophisticated methods may be something of an illusion.”*
Simple models often produce essentially the same accuracy asmore complicated models. These can be easier to understand,fit, and may have fewer parameters to choose – possiblyresulting in lower variance.
The data you get is rarely (if ever) a true random draw fromthe distribution you will be running your trained/implementedalgorithm on.
This is particularly important in cyber security.By its nature, cyber security data is non-stationary, andtoday’s data may look very different from tomorrow’s.
*David Hand, Statistical Science 2006, Vol. 21, No. 1, 114
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
The Illusion of Progress
When building a model, one makes assumptions, which areoften not testable, and which can impact the ultimateperformance.
Simpler models (may) have fewer assumptions.Non-parametric (may) be superior to parametric in that they(tend to) make fewer assumptions.
However, if the assumptions are true, parametric may besuperior.“Good” non-parametric algorithms would be “nearly as good”as the parametric, while allowing a hedge on the assumptions.
Hand suggests we spend less time developing the “next greatclassifier” and more time on methods that mitigate the aboveissues.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Outline
Probability density estimation.
Kernel estimators.Streaming data.
Machine learning.
Nearest neighbors.Random forests.
Manifold learning.
Graphs.Spectral embedding.
Topological Data Analysis.
We’ll see how much of this we can cover today – see the paper.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topics
1 Introduction
2 Computational Statistics
3 Machine Learning
4 Manifold Learning
5 Topological Data Analysis
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
The HistogramD
ensi
ty
−1 1 3
0.00
0.05
0.10
0.15
0.20
0.25
0.30
● ●● ● ●● ●●● ● ● ●● ●●● ●●● ●
−1 1 3
0.00
0.05
0.10
0.15
0.20
0.25
0.30
● ●● ●●● ●●● ● ● ●● ●●● ●●● ●
−1 1 30.
000.
050.
100.
150.
200.
250.
30
● ●● ●●● ●●● ●
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
The Histogram – The Kernel EstimatorD
ensi
ty
−1 1 3
0.00
0.05
0.10
0.15
0.20
0.25
0.30
● ●● ● ●● ●●● ● ● ●● ●●● ●●● ●
−1 1 3
0.00
0.05
0.10
0.15
0.20
0.25
0.30
● ●● ●●● ●●● ● ● ●● ●●● ●●● ●
−1 1 30.
000.
050.
100.
150.
200.
250.
30
● ●● ●●● ●●● ●
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
The Kernel Estimator
f (x) =1
n
n∑i=1
Kh(x − xi ) =1
nh
n∑i=1
φ
(x − xi
h
).
Easily extended to multivariate versions.
Note that this is an “average”.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Network Flows
http://csr.lanl.gov/data/cyber1/
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Network Flows
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Streaming Data
Averages can be computed in a streaming fashion:
Xn =n − 1
nXn−1 +
1
nXn.
We can implement an exponential window:
Xn =N − 1
NXn−1 +
1
NXn = θXn−1 + (1− θ)Xn,
and apply this idea to the kernel estimator:
fn(x) = θfn−1(x) + (1− θ)φ
(x − Xn
h
).
θ controls how much of the past we “remember”. Note that wehave to set a grid of x points at which we want to compute f .
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Streaming Network Flows: log(#bytes) in a Flow
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topics
1 Introduction
2 Computational Statistics
3 Machine Learning
4 Manifold Learning
5 Topological Data Analysis
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Machine Learning: Classification
Given {(xi , yi )}i=1,...,n ⊂ X × Y with xi corresponding toobservations (flows, programs, email, system calls, log files:“features”), and yi corresponding to class labels (e.g.“malware”, “benign”).
A classifier is a mapping g : X → Y .
Machine learning (pattern recognition, classification) isdesigning a function g from “training data” {(xi , yi )}i=1,...,n
for which “truth” is known.
We are given training data {(xi , yi )}i=1,...,n ⊂ X × Y , and will bepresented with a new x ∈ X for which the label is unknown. Wewish to infer the y associated with x .
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Nearest Neighbors
We are given training data {(xi , yi )}i=1,...,n ⊂ X × Y , and a newx ∈ X for which the label is unknown.
1 Find the closest xi to x:
y = yargmin d(x ,xi ).
2 We must select an appropriate distance (dissimilarity) d .
3 Alternative: We can compute the k closest, and vote: takethe majority class.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Kaggle Malware
10, 868 examples of malware grouped into 9 malware“families”.*
Each file has been byte-dumped and tabulated:
We are using the frequency of times each value 0, . . . , 255occurs in the file.This seems really dumb (computer scientists laugh when I tellthis story).
We’ll look at the nearest neighbor classifier on these data.100 observations of each family are used for training (21observations from the family containing only 42 observations).Test on the remaining.
Remember: sometimes simple is good.
*https://www.kaggle.com/c/malware-classification
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Kaggle Malware: 1NN Performance
True Class
1291 116 6 2 53 12 75 1285 1525 1 4 1 6 5
33 2807 3 1 2 21 63 29 352 12 2 7 10
27 89 19 13 8 23 2155 166 2 10 554 2 32 2830 6 4 3 4 244 4528 243 3 15 920 124 137 7 12 18 709
Error: 16.2%. That is, 84% of the observations are correctlyclassified.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Kaggle Malware: 1NN Performance – Why???
Text analogy: byte-count histogram is analogous to theword-count histogram used in text analysis.
Maybe this is more like a morpheme-count histogram.
Intuitively, a “family” shares a core of code (they aremodifications of the “mother” malware).
The bytes correspond to machine instructions – or at leastthey would if we were counting words instead of bytes.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Kaggle Malware: Smoothed 1NN Performance
Using the kernel estimator instead of the histogram, oneobtains an error of 12.4%.
This is another place for computer scientists to laugh: bytesare not continuous, machine instruction codes are discrete.
. . . and yet it works.
Remember Hand’s paper.
Here is the point at which we need to better understand ourdata.
Unfortunately, we won’t be doing this today.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Random Forests
We are given training data {(xi , yi )}i=1,...,n ⊂ X × Y , and a newx ∈ X for which the label is unknown.
The random forest is an ensemble of decision trees:1 Sample (with replacement) from the training data. Sample a
subset of the variables.2 Build a decision tree using the two samples – don’t bother
with any optimization or pruning.3 Repeat.
With a new observation, vote the trees.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Benign vs Malicious
7738 observations of windows binaries: 2054 benign, 5684malicious.
Random forest performance: 0.65% error.
1.6% of benign misclassified.0.3% of malicious misclassified.
Nearest neighbor classifier is a little worse: overall error of1.1%.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Know Your Data
The results demonstrate that there is something going on withthis byte-count approach.
Logically, the performance seems too good to be true, and yetit does seem to work.
The data are high dimensional (256), so maybe there is a“curse of dimensionality” thing going on here.Perhaps we are finding OS-specific things:
The data collected for the benign files may be a differentversion of the operating system than the malicious.We don’t have version information about the data (beyondthese are Windows files).
Worrisome fact: there are several different sets of benign (ormalicious) data. A classifier can be built to tell which set –which of the benign collections a file belongs to.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Know Your Data?
●
●
●
●
●
●
●
●
●
●●●
●
●
● ●
●
●●
●
●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●● ●
●
●
●
●
●●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●● ●
●
●
●
● ●●
●
●
●
●● ●
●
●
●●
●
●
●
●
●●
●
●
● ●
●
●●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●●
● ●
●●
●
●
●
●
●
●● ●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●●
●
●●
●
●
●
●
●
●
●
●
● ●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●●●
●
●
●
●
●
●
●●
●●
●
●
●● ●●
●
●
●
●
●
●●
●● ●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●●
●
●
●
●
●
●
●
● ●
●
●●●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●●
●
●●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●
●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●●
●
●
●●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●●
●
●
●
●
●
●
● ●●● ●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●●
●
●●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●●
●
●
● ● ●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
● ●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●● ●
●
●
●
●
●
●●
●
●
●
●●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●● ●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●
●
●
●
●●
●
●
●
●
● ●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
● ●
●
●●●
●
●
●
●
● ●
●
●●
●● ●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
● ●●
●
●
● ●●
●
●
●
●
●●
●
●
●
●
●
●●
●
● ●
●
●
●
●●
●
● ●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●● ●
●●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
● ●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
● ●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
● ●
●
● ●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
● ●
●
●●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●●
●
●
●●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●●●
●●
●
●
●
●●
●
●
●●
● ●
●
●
●
●
●
●
●●
●●
●
●
●
●
●●
●
●
●●●
●
●
●
●●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
● ●
●●●●●
●
●
●
●●●
●●●
●
●
●
●
●
●
●
●●●
●
●●●
●
●●
●
●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●● ●●
●
●●
●●
●
●● ●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●● ●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●
●
●●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●●●
●●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
● ●●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
● ●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●●
●
● ●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●● ●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
● ●
●●
●
●
●
●
●
●
●
●
●
●
●● ●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●● ●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●●
●
●
●●●
●
●
●●
●
● ●
● ●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●● ●
●
●●
● ●●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
● ●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●●
● ●
●
●
●
●●●
●● ●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●●
●●●
●
●
●
●
●
●
●
●
●●●●●● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
● ●●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
● ●
●
●●
●●
●
●
●
●
●
●
●
Maaten & Hinton (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topics
1 Introduction
2 Computational Statistics
3 Machine Learning
4 Manifold Learning
5 Topological Data Analysis
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Manifold Learning
Hypothesis: high dimensional data“lives” on a lower dimensionalstructure.
Manifold learning is a set oftechniques to infer this structure,or to embed the data from the highdimensional space into a lowerdimensional space that respects thelocal structure.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Multidimensional Scaling
Problem: Given a distance matrix (or dissimilarity matrix) D,find a set of points X ∈ Rd whose distance d(X ) bestapproximates D.
This is the problem solved by multidimensional scaling (MDS).
Different definitions of “best approximates” lead to differentalgorithms.
Classical MDS utilizes the eigenvector decomposition of (amodified version of) the distance matrix.
Some manifold learning algorithms compute a local distanceand use MDS, others computer eigenvectors of relatedmatrices. These are the algorithms I use most often.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Basic Graph Theory
A graph is a set V of vertices, and E of pairs of vertices(edges).
The edges can be directed or undirected, and can haveweights.
In this talk they will be undirected.
The (graph) distance between two vertices is the length of theshortest path between them in the graph.
The adjacency matrix of a graph on n vertices is the n × nbinary matrix with a 1 in those positions corresponding to theedges of the graph.
The spectrum of a graph is the eigen decomposition of theadjacency matrix A, or more generally, of some function f (A).
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Graph Examples
ε-ball graph with ε = 0.25.
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
3-nearest neighbor graph.
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Basic Steps of Manifold Learning
Given data {x1, . . . , xn} ∈ Rp:1 Construct a graph whose vertices are the xi with edges
between “near” points.
k-nearest neighbor graph.ε-ball graph.Variations.
2 Compute the eigenvectors of:
The adjacency matrix.The Laplacian of the adjacency matrix.Scaled or modified versions of the above.
3 Set Z to the matrix with columns corresponding to the maineigenvectors. That is, the rows {z1, . . . , zn} are the“embedded” data.
Perform inference on Z .
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Manifold Learning
Compute ε-ball graph on the Kaggle training data.
Layout thegraph.
Embed usingscaled Laplacian.
Embed usingadjacencymatrix.
Embed usingMDS on graphdistance.
9
8
6
7
19
2
3
7
3
7
33
2
2
2
6
33
2
7
1
8
9
8
6
8
43
7
8
1
97
77 2
2
6
8
3
8
4
2
7
8
7
8
5
7
2
8
6
12
1
7
64
6
7
5
5
4
1
2
1
9
8
4
9
4
9
1
7
2
6
4
9
6
4
11
6
6
11
4
9 9
1
3
6
4
1
2
3
1
3
5
96
1
4
6
7
2
7
3 44
6
9
6
6
6
9
6
2
1
8
3
1
8
7
1
4 43
6
8
2
7
8
3
8
3
2
3
8
9
5
433
9
4
8
8 6
3
9
6
2
6 46
7
1
2
9
6
9
6
3
6
2
8
3
99
9
4
28
9
8
9
1
4
5
8
9
44
1
9
3
8
8
2
3
3
2
3
9
4
8
9
3
6
4
7
2
2
6
8
86
7
1
9
63
8
9
1
8
7
2
72
6
6
4
4
1
2
9
9
3
7
2
1
1
4
9
2
2
44
1
7
88
6
2
88
7
4
2
8
9
1
333 6
8
61
2
1
9
7
63
8
6
8
1
3
1
7
7 7
9
6
3
9
7
2
7
2
9
6
8
7
5
6
9
6
6
24
8
2
8 1
8
76
2
1
3
4
2
32
4
9
1
2
9
7
9
7
1
1
7
9
6
8
6
1
6
2
4
1
2
9
2
3
4
7
2
4
1
4
6
2
1
9
2
9
7 7
6
1
1
44
9
3
4
5
2
7
181
1
1
6
99
1
7
9
6
42
7 87
2
96
4
1
7
3
9
4
1
7
2
9
9
4
17
9
8
1
6
4
8
77
4
9
4
7
6
6
6
7
2
3
8
4
3
1
2
4
12
34
9
7
3
8
5
3
2
7
2
8
4
6
7
3
9
1
43
9
9
9
1
2
11
4
1
4
5
4
7
1
2
7
1
7
1
1
1
7
6
6
1
6
4
78
5
6
3
9
3
8
8
3
25
9
7
3
1
7
66
3
9
36
6
7
2
8
66
43
4
52
1
4
1
88
2
66
2
8
3
7
9
4
8
7
63
4
7
9
2
6
6
7
4
99
1
3
7
2
8
3
8
1
2
9
9
4
8
1
4
6
89
2
2
5
6
3 4
9
7
5
2
91
3
9
11
81
1
6
5
3333
1
4
1
8
4
8
1
8
42 1
4
9
5
2
2
8
78
7
6
9
8
7
11
6
3
9
88
8
7
9
44
7
9
8
3
2
1
2
2
7
3
1
9
8
3 3
1
6
3
1
9
9
6
2
2
22
4
33
254
8
3
4
79
7
99
7
2
7
2
3
1
99
6
3
68
6333
7
1
8
33
8
8
3
9
6
6
8
2
9
7
5
8
7
4
7
1
4
3
7
3
53
6
2
9
4
8
5
83
1
8
2
3
2
7
33
9
8
9
2
1
8
8
9
6
4
4
4
3
6
6
1
3
7
2
9
8
9
333
2
2
4
8
1
9
7
8
8
3
2
3
8
7
1
9
4
44
8 6
33
9
4
7
3 6
8
4
7
7
8
8
6
8 1
9
6
6
7
7
2
96
3
6
6
7
2
43
6
6
7
1 1
8
24
1
7
2 8
7
2
8
6
6
1
9
4
7
7
4
42
7
2
4
7
2
4
4
4
44
4
44
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Manifold Learning
Compute ε-ball graph on the Kaggle training data.
Layout thegraph.
Embed usingscaledLaplacian.
Embed usingadjacencymatrix.
Embed usingMDS on graphdistance.
9
8
6
7
1923
7
37
33
2 22
6
33
2 71
89 8
6
8
43
7
81
9777
2
26 8
3
8
4 2
787
85
7
2
8
61 21
764 6
7
5541
219 8
4
9
49 1
7
2
6
4
9
64
11
66
114
99
1
3
64
1 23
1
35
9
6
14
6
7
2
7
34 469
66
6
962
18
3
18 7
1
4 43
6
8
2
7
83
8
32
3
8
9
54339
4
88
6
3
9
626 46
7
12
96
9
6
36 2 8
39
9
94 28
9
8
9
14 5
8
9
44
193
8 8233
2
39
48
9
36
4
7
22688 6
7
1
9
6389 18
7
2
7
26
6
44 12
9 9
3
7
211
49
2244
1
7
8
8
62
8 8
7
42
8
91
333 6
8
612
19
7
63
8
6
81
3
1
7 77
963
97
27
29
68
7
56
9
66 24
82
8
18
7
6
21
34
23 24
91
29
797
11
7
96
86 1
62
41
29
23 4
7
24
146 2
1
92
97
7
61
14 4
9
3 452
7
1811 1
6
99
17
9
6
42
787
2 9641
7
3
9
4
1
7
29
9
41
7
98
164
8
7749
4
7
6
6
6
7
2
3
843
1 24 12
3 4
9 7
3
853 2
7
2
8
4
6
73
9
1
4 3
9 9
912
114
145
4
7
12
7
1
7
11 1
7
661
6 4
78
563
9
38
8
3 259
7
31
7
6
6
39
3 66
7
28
6
6
43 452 141
8 8
26 62
8
37
94
87
634
7
9266
7
4991
3
7
2
83
81
29
9
4
8
1
4
6
89
225 6
3 4
97
52
9 1
39
1181
1
653333
1
4
1
8
4
8
18
4214
9
5 22
8
7
8
7
69
8
7
11
6
39
8
8
8
7
9
44
7
98
3
2 12273
1
9
8
33
163
1996 2
222433 254
83 4
7 97
99
7
2
7
2
3
1
99
6
3
68
63337
18
33
88
3 9
6
6 829
7
5
87
47
1
4 37
3 53 6 2 94
8583
1823
2
7
339
8
9
21
88
9
644 4366 1
3
7
2
98 9
333 22
48
1 9
7
8 8
32
3
8
7
19
44 48633
9
4
73 6
8
4
77
88
68
1
9
66
77
2963
66
7
243
6
6
7
11
8
241
7
28
7
2 86
6
1
9
4
7
7442
7
24
7
244
444 444
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Manifold Learning
Compute ε-ball graph on the Kaggle training data.
Layout thegraph.
Embed usingscaled Laplacian.
Embed usingadjacencymatrix.
Embed usingMDS on graphdistance.
9867192
3
7
3
7
33
2226
3
3
271898
6
8
4
3
78197772268
3
8
4
27878572861217
6
4
6755412198
4
9
4
91726
4
96
4
1166114991
3
6412
3
1
3
5961
4
6727
3
4
46966
6
96218
3
1871
4
4
3
68278
3
8
3
2
3
8954
3
3
94886
3
962
6
4
6
7129
6
96
3
628
3
999
4
289891 45894
4
19
3
882
3
3
2
3
9
4
89
3
6
4
72268867196
3
8918727266441299
3
72114922 4
4
17886288742891
33
3
6
86121976
3
8681
3
177796
3
972729
6
875696
6
2
4
828187621
3
42
3
2491297971179
6
861624129
2
3
472
4
14
6
2192977611
44
9
3
4527181116991
7
9642787
2
96
4
17
3
9
4
17
2
99 41798164877494
766672
3
8
43
12
4
12
3
497
3
85
3
2728467
3
91
4
3
9991211
4
145
4
71271711176
6
1
6
4785
6
3
93
883
2597
3
1
7
663
9
3
6
672866
4
3
45214
1882
6
628
3
79487
6
3
479
26674991
3
728
3
81299
4
8146892256
3
4
975291
3
91181165
33
3
31
4
1848184214952287876987116
3
98887944798
3
21227
3
198
3
3
1
6
3
199
6
22224
3
3
25
4
8
3
4797997272
3
19963
68
6
33
3
718
3
3
88
3
96
6
8297587
4
71
4
3
7
3
5
3
6
29
4
8583
182
3
27
3
39892
1
88964
4
43
6
61
3
72989
33
3
22
4
819788
3
2
3
87194
4
4
8
6
3
3
9
4
7
3
6847788
6
8196677296
3
6
6
724
3
6671182417287286619
4
7744
272472
4
44
4
4
444
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Manifold Learning
Compute ε-ball graph on the Kaggle training data.
Layout thegraph.
Embed usingscaled Laplacian.
Embed usingadjacencymatrix.
Embed usingMDS on graphdistance.
9
8
6
7
19
2
3 7
373 3
2
22
6
33
2
71
8
9
8
6
8
43
7
8
1
9
777
2
26
8
3
8
4
2
78
7
8
5
72
8
6 1 2
1
7
64 6
7
55
4
1
2 1
9 8
4
9
4
91
7
2
6
4
9
6
4
1
1
6 6
11
4
9
9
1
3
64
12
3
1
3 5
9
61
4
6
7
2
73 4 4
6
9
66
6
9
62
1
8
3
1
8
7
1
44
3
6
8
2
7
8
3
8
32
3
8
9
54
33
9
4
8 8
6
3
9
62
646
7
1
2
9
69
6
3
62
8
3
99
94
2
8
9
8
9
1
45
8
9
44
1
93
8
8
23
3
2
3
9
48
93
6
4
72 2
6 886
7
1
9
63
89 1
8
72 7
2
6
6
4
41
2
99
3
7
2
1
14
9
2 244
1
7
8
8
62
8
8
7
4 2
8
9
1
33 36
8
61
2
19
7
63
8
6
8
1
3
1
7
77963
97
27
2
9
6
8
7
5
6
9
6
62
4
8
2
8
1
8
7
6
2
1
3
4
2
3 2
4
91
2
9
7
97
11
7
9
6
8
61
6
2
4
1
2
9
2
3
4
7
2
4
1
46 2
1
9
2
9
7
76
1
1
44
9
3
45 2
7
1
8
1
1
1
6
991
7
9
6
42
7872 9
6
4
1
7
3
9
4
1
7
2
9
94
1
7
9
8
1
6 4
8
774
9
4
7
6
6
6
7
2
3
8
4
3
124 12
3
4
973
8
53
2
7
2
84
6
7
3
9
1
4 3
9
9
9
1
2
1
14
1
4 54
7
1
2
7
1
7
1
11
7
66
1
64
785
63
938
8
3
25
9
7
3
1
7
66
3
9
36
6
7
2
8 6
6
434
52 14
1
88
2
662
8
3 7
94 8
7
63
4
7
926
6
7
4
99
1
3
7
283
81
29
9
4
8
1
4
6
8
9
2
2 5 6
34
9
7
5
2
9
1
39
11
8
1
1
65
33 3
3
1
4
1
8
4
8
1
8
4
214
9
52 2
8
7
8
7
6
9
8
7
11
6
3
9
8
8
8
7
9
44
7
9
8
3
21
2
2 73
1
9
8
3 3
1
63
19
9
6 222
2
4
3 3 254
8
3
4
79
7
9
9
7
2
7
2
3
1
99
6
3 6
8
63337
1
8
33
8
8
3 9
6
6 82 9
7
5
8
7
4
7
1
437
3
5362 94
8
58
3
1
8 2
3
2
7
33 9
8
9
2
1
8
8
9
64
4
4
36
61
3
7
2
9
89
33 3 2
2
4 8
1
9
7
8
8
32
3
8
7
1
944
48
633
9
4 7
36
8
4
7
7
8
8
68
1
9
66
77
2 963
6
6
7
2
43
6
6
7
1
1
82
41
7
2
8
7
2
86
6
1
9
4
7
7
44 2
72 4
7
24
4
4444 44
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Manifold Learning Discussion
Different embedding methods extract different informationabout the data.
These two dimensional plots are misleading in that there is noreason to assume the intrinsic dimensionality is 2.
Some care must be taken to ensure that the embeddingmethod can be applied to new data.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Joint Embedding
DW =
(D1 WW D2
)Jointly embed D1 and D2
using DW , whereW = λD1 + (1− λ)D2.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topics
1 Introduction
2 Computational Statistics
3 Machine Learning
4 Manifold Learning
5 Topological Data Analysis
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Topological Data Analysis (TDA)
The basic idea is to use topological features – measures that areinvariant to smooth deformations – to learn about the structure ofthe data.
We will only be able to touch briefly on this subject.
See:
Carlsson, “Topology and Data”, Bulletin of the AmericanMathematical Society, 46, 2009, 255–308.Ghrist, “Elementary Applied Topology”, CreatespaceIndependent Publishing Platform, 2014.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Simplices
● ● ●
● ●
●
●
●
●
● ●
A (geometric) simplex of dimension d is a set of d + 1 points inrelative position.
A 0 simplex is a point, a 1-simplex a line segment, a 2 simplex atriangle, and so on.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Simplicial Complexes
● ● ●
● ●
●
●
●
●
● ●
●
●
● ●
●
A simplicial complex is a collection S of simplices that satisfies thefollowing conditions:
1 If σ ∈ S then so are the faces of σ.
2 If σ1, σ2 ∈ S are k simplices, then either they are disjoint or theyintersect in a lower dimensional simplex which is a face of both.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Persistent Homology
We construct an ε-ball graph on the data, and from this weget a simplicial complex.
We compute a measure of the topology (the rank of theHomology, or the Betti number) – how many “d-dimensionalholes” are there?
Those structures that persist across ranges of ε are“interesting” and more likely to be real structure rather thannoise.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Euler Characteristic
One defines the Euler characteristic as:
χ(X ) =n∑
j=0
(−1)jBettij(X ).
This is equivalent to the standard Euler characteristic onelearns in grade school, extended to general topological spacesand higher dimensions.
The “persistent” version is to compute this on the persistenthomologies from the ε-ball graphs.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Persistent Euler Characteristics of Malware
Distribution A: Approved for Public Release NSWCDD-PN-17-00345
IntroductionComputational Statistics
Machine LearningManifold Learning
Topological Data Analysis
Discussion
Mathematics has many tools for the data analyst, in particularfor the analysis of cyber data.
These tools include:
Computational statistics.Machine learning.Graph theory.Manifold learning.Topological data analysis.
New applications of “pure” mathematics to data analysis aredeveloped every day, and these areas are all huge growth areasfor applied mathematicians.
Distribution A: Approved for Public Release NSWCDD-PN-17-00345