PersLay: a Neural Network for persistence diagrams
6th SmartData@Polito Workshop
Jan. 30th, 2020.
DataShape - Inria Saclay, theo.lacombe@inria.fr
Higher Methods in Data Science and ML and related topics.
M. Carriere, F. Chazal, Y. Ike, T. Lacombe, M. Royer, Y. Umeda
Preliminary ideas about TDA
TDA is about:
• Providing topological descriptors of a given object
• In a quantitative way.
A global overview of TDA
TDA can be applied to any type of structured data, e.g. point clouds in R^d, weighted graphs, time series...
The TDA pipeline: persistent homology
Reminder: We want to encode the topology of an object at all scales.
Homology: Encodes the topological features (connected components, loops, holes...).
Simplicial homology: Homology on a computer.
Example: β0 = 1 connected component, β1 = 2 loops, β2 = 1 hole.
The TDA pipeline: persistent homology
All we need is an increasing sequence of topological spaces (a filtration).
Reminder: We want to encode the topology of an object at all scales.
Example: Given a function f : X → R, consider the sublevel sets F_t := f^{-1}((−∞, t]), i.e. x ∈ F_t ⇔ f(x) ≤ t. The family (F_t)_t is the sublevel-set filtration of f.
Example: Given a point cloud X = {x_1, ..., x_n} ⊂ R^d, take f(x) = dist_X(x), the distance to the point cloud.
Persistence: looking at scales.
The TDA pipeline: persistent homology
• Compute, at all scales, the homology of the current topological space.
• Record the births and deaths of topological features (loops, holes, etc.).
• Plot the output as the persistence diagram of the filtration (one axis for births, one for deaths).
[Figure: snapshots of the filtration at t = 0, 1, 1.5, 2, 3, 5, together with the growing diagram in dimension 1 (loops).]
The TDA pipeline: persistent homology
Summary: Input item + filtration f → sequence of increasing topological spaces → persistence diagram.
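The dimension-0 part of this pipeline fits in a few lines: every connected component is born at t = 0 and dies when it merges with another, which a union-find pass over the sorted edges of the Rips filtration records. Below is a minimal self-contained sketch (illustrative only; `h0_diagram` is a hypothetical helper, not the Gudhi implementation):

```python
# Persistence in dimension 0 (connected components) of a point cloud,
# via Kruskal / union-find on the Rips filtration: each component is
# born at t = 0 and dies at the edge length that merges it away.
import numpy as np

def h0_diagram(points):
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    deaths = []
    for t, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:            # two components merge: one of them dies at t
            parent[ri] = rj
            deaths.append(t)
    # One component survives forever: an essential point with death = infinity.
    return [(0.0, t) for t in deaths] + [(0.0, float("inf"))]

# Two well-separated pairs: one finite H0 point should die late (~4.9).
dgm = h0_diagram([(0, 0), (0.1, 0), (5, 0), (5.1, 0)])
print(dgm)
```

For higher-dimensional features (loops, cavities) one would use a full persistent-homology library such as Gudhi rather than this toy.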
Persistence Diagrams as topological features
Given (X_1, ..., X_n), consider (µ_1, ..., µ_n) their respective diagrams.
• Can we use the (µ_i)'s as inputs in a ML task?
→ PDs can be compared with a metric d_p (theoretically motivated).
→ PDs are "point clouds" with (possibly) different cardinalities.
Pbm: The space of PDs equipped with d_p is ill-structured: no addition, no scalar multiplication, hard to define a mean/barycenter.
Idea: Embed PDs into linear spaces via a feature map Φ.
Persistence Diagrams as topological features
• Two principal approaches: finite-dimensional vectorizations, or infinite-dimensional ones (kernels). The latter do not scale well on large sets of large diagrams.
Persistence Diagrams as topological features
Persistence Images [Adams et al. 2017]
Pipeline: compute PD → rotate PD → compute persistence surface → discretize.
Persistence surface: ρ_µ(·) = Σ_{p∈µ} w_t(p) · exp(−‖· − p‖² / (2σ²)), where w_t is a weight function, e.g. the piecewise-linear w_t((x, y)) = min(y/t, 1).
Discretize the plane into a grid of pixels: for each pixel P, compute I(P) = ∬_P ρ_µ, then concatenate all the I(P) into a single vector PI(µ) (a 2D image).
Prop: [Adams et al. 2017] (stability)
• ‖PI(µ) − PI(ν)‖_2 ≤ √d · C(w, φ_p) · d_1(µ, ν)
Observations: PI(µ) depends on hyperparameters (grid size) but also on trainable parameters (σ, w_t, ...).
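The persistence-surface construction above can be sketched in a few lines of numpy. `persistence_image` below is a hypothetical helper (not the reference implementation of Adams et al.), and it uses a common simplification: it evaluates ρ_µ at pixel centers instead of integrating over each pixel.

```python
# Minimal sketch of a persistence image: rotate the diagram to
# (birth, persistence) coordinates, weight each point, then sum
# Gaussian bumps evaluated on a pixel grid.
import numpy as np

def persistence_image(diagram, grid=20, sigma=0.1, t=1.0, lims=(0.0, 1.0)):
    """diagram: iterable of (birth, death) pairs -> (grid x grid) image."""
    pts = np.asarray(diagram, dtype=float)
    # Rotate: (birth, death) -> (birth, persistence = death - birth).
    pts = np.c_[pts[:, 0], pts[:, 1] - pts[:, 0]]
    # Piecewise-linear weight in the persistence coordinate: min(y/t, 1).
    w = np.minimum(pts[:, 1] / t, 1.0)
    xs = np.linspace(lims[0], lims[1], grid)
    X, Y = np.meshgrid(xs, xs)
    img = np.zeros((grid, grid))
    for (bx, py), wt in zip(pts, w):
        img += wt * np.exp(-((X - bx) ** 2 + (Y - py) ** 2) / (2 * sigma ** 2))
    return img

img = persistence_image([(0.1, 0.9), (0.2, 0.3)])
print(img.shape)
```

The grid size is the hyperparameter mentioned above, while σ and the weight function are the (potentially trainable) parameters.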
Persistence Diagrams as topological features
Persistence Landscapes [Bubenik 2015]: rotate the PD, compute the rank function, then use the boundaries of the rank function.
Persistence Diagrams as topological features
Observation: These vectorizations are all of the form
Φ(µ) = op{φ_θ(p), p ∈ µ},
where op = Σ, max, top-k, etc., θ is a set of trainable parameters, and φ_θ : R² → R^d.
Why? Idea: a PD is a point cloud µ = {p_1, ..., p_n} ⊂ R². We want a function Φ that is permutation invariant:
Φ(p_1, ..., p_n) = Φ(p_π(1), ..., p_π(n)) for any permutation π ∈ S_n.
Deep Sets [Zaheer et al. 2017]
Question: What are the permutation-invariant functions?
[Zaheer et al. 2017]: Φ is permutation invariant iff there exist ρ, φ such that
Φ({p_1, ..., p_n}) = ρ(Σ_{i=1}^n φ(p_i)),
provided (p_i) ⊂ K compact (or countable).
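The structure ρ(Σ φ(p_i)) is easy to check numerically; the particular φ and ρ below are arbitrary illustrative choices, not learned maps:

```python
# Minimal sketch of the Deep Sets decomposition: any function of the
# form rho(sum_i phi(p_i)) is invariant to reordering the inputs.
import numpy as np

def phi(p):                       # pointwise map R^2 -> R^3
    x, y = p
    return np.array([x, y, x * y])

def rho(v):                       # readout on the summed representation
    return float(np.tanh(v).sum())

def deep_set(points):
    return rho(sum(phi(p) for p in points))

pts = [(0.1, 0.9), (0.2, 0.3), (0.5, 0.7)]
print(deep_set(pts), deep_set(pts[::-1]))
```

In a learned model, φ and ρ would be small neural networks; the sum between them is what enforces permutation invariance.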
Reminder: Neural networks
A NN of depth n ∈ N* is a chain of layers f^(0)_{θ0} : R^{d0} → R^{d1}, ..., f^(n)_{θn} : R^{dn} → R^{dn+1}, mapping an input x_i to a predicted label y_i.
Final predictor: F_θ = f^(n)_{θn} ∘ ··· ∘ f^(0)_{θ0}
Goal: Minimize ℓ(θ) = Σ_i ‖F_θ(x_i) − y_i‖²_2 w.r.t. θ.
Requirement: Know ∇f^(i)_{θi} for each i ⇒ chain rule (backpropagation).
Observation: If f^(0)_{θ0} is permutation invariant, so is the whole NN.
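The objective ℓ(θ) above can be made concrete with the smallest possible network, a single scalar layer F_θ(x) = θ·x, trained by gradient descent on the squared loss. This is an illustrative toy, not part of the talk's experiments:

```python
# Gradient descent on l(theta) = sum_i (theta * x_i - y_i)^2 for a
# depth-1 "network" F_theta(x) = theta * x; the gradient comes from
# the chain rule, exactly as in the multi-layer case.
import numpy as np

xs = np.array([1.0, 2.0, 3.0])
ys = 2.0 * xs                    # labels generated with theta* = 2
theta = 0.0
for _ in range(200):
    grad = np.sum(2 * (theta * xs - ys) * xs)   # d l / d theta
    theta -= 0.01 * grad                        # gradient step
print(round(theta, 3))           # converges to theta* = 2
```

With deeper networks the only change is that each layer's gradient is obtained by composing the per-layer Jacobians (backpropagation).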
Application to persistence diagrams
Permutation-invariant layers generalize several TDA approaches:
→ persistence landscapes → silhouettes → Betti curves
PersLay(µ) = ρ (op{w(p) · φ(p)}_{p∈µ}),
where w is a weight function, φ a point transformation, and op a permutation-invariant operation.
Our contribution: Implementing (and sharing) it.
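A minimal numpy sketch of the PersLay formula with op = Σ; the specific w, φ and ρ below are illustrative choices, not the ones shipped in the official implementation:

```python
# PersLay(mu) = rho(op{w(p) * phi(p)}_{p in mu}) with op = sum:
# weight each diagram point, transform it, aggregate with a
# permutation-invariant op, then apply a readout rho.
import numpy as np

def perslay(diagram, op=np.sum):
    pts = np.asarray(diagram, dtype=float)           # n x 2 point cloud
    pers = pts[:, 1] - pts[:, 0]                     # persistence of each point
    w = np.minimum(pers, 1.0)                        # weight function w(p)
    phi = np.stack([pts[:, 0], pts[:, 1], pers], 1)  # point transformation phi(p)
    v = op(w[:, None] * phi, axis=0)                 # permutation-invariant op
    return np.tanh(v)                                # readout rho

out = perslay([(0.1, 0.9), (0.2, 0.3)])
print(out.shape)
```

Swapping op between np.sum and np.max (or making w and φ small trainable networks) recovers the different vectorizations listed above.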
Application to persistence diagrams
Goal: classify orbits of the linked twisted map (flow dynamics).
Orbits are described by (depending on a parameter r):
x_{n+1} = x_n + r y_n (1 − y_n) mod 1
y_{n+1} = y_n + r x_{n+1} (1 − x_{n+1}) mod 1
[Figure: sample orbits for labels 1 to 5.]
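The orbits can be generated directly from the recurrence above; the value of r and the starting point below are arbitrary illustrative choices (each class label corresponds to a specific value of r, not listed here):

```python
# Orbit of the linked twisted map:
#   x_{n+1} = x_n + r * y_n * (1 - y_n)        mod 1
#   y_{n+1} = y_n + r * x_{n+1} * (1 - x_{n+1}) mod 1
import numpy as np

def orbit(r, x0=0.3, y0=0.4, n=1000):
    xs, ys = [x0], [y0]
    for _ in range(n - 1):
        x = (xs[-1] + r * ys[-1] * (1 - ys[-1])) % 1   # note: uses y_n
        y = (ys[-1] + r * x * (1 - x)) % 1             # note: uses x_{n+1}
        xs.append(x)
        ys.append(y)
    return np.c_[xs, ys]       # n x 2 point cloud in [0, 1)^2

pts = orbit(r=3.5)
print(pts.shape)
```

Each orbit is a point cloud in the unit square, so it can be fed to the persistent-homology pipeline above and its diagram to PersLay for classification.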
Take home messages
• Topological Data Analysis builds a topological summary of a given object: the persistence diagram.
• PDs are not straightforward to use in ML pipelines.
• A workaround is to vectorize PDs, adaptively to a given learning task (e.g. classification).
• The "Deep Sets" architecture can be used to incorporate PDs in NN architectures while encompassing existing vectorizations.

Supplementary material
• https://github.com/MathieuCarriere/perslay → soon integrated into the Gudhi library (hopefully).
• PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures, AISTATS 2020.