Computational Topology - Mapper
Jiaqi Ni
Eindhoven University of Technology
June 14, 2018
Outline
Introduction
Mapper in the continuous setting
Mapper in practice
Parameters of Mapper in practice
Applications
Summary
Introduction
- Mapper is a computational method for extracting simple descriptions of high-dimensional data sets in the form of simplicial complexes.
Recap about Reeb Graph
Definition: The Reeb graph R(f) of f is the set of contours of f, i.e. the connected components of its level sets.
With Mapper we can obtain a result similar to the Reeb graph.
With Mapper we can also obtain results that differ from the Reeb graph.
Cover of space
If the set X is a topological space, then a cover C of X is a collection of subsets U of X whose union is the whole space X. In this case we say that C covers X, or that the sets U cover X.
[Figure: a topological space X and a cover of X]
Cover of space
If Y is a subset of X, then a cover of Y is a collection of subsets of X whose union contains Y,
i.e., $C = \{U_\alpha\}_{\alpha \in A}$ is a cover of Y if $Y \subseteq \bigcup_{\alpha \in A} U_\alpha$.
Cover refinement
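- A refinement of a cover C of a topological space X is a new cover D of X such that every set in D is contained in some set in C.
- Formally: $D = \{V_\beta\}_{\beta \in B}$ is a refinement of $C = \{U_\alpha\}_{\alpha \in A}$ when $\forall \beta \in B \;\, \exists \alpha \in A : V_\beta \subseteq U_\alpha$.
[Figure: a space X, a cover of X, and a refinement of that cover]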
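The cover and refinement definitions can be checked mechanically on finite sets. The following is a minimal sketch, not part of the original slides: the function names `is_cover` and `is_refinement` and the example sets are illustrative assumptions. Note that `is_refinement` only tests the containment condition; a refinement must additionally be a cover itself.

```python
def is_cover(sets, space):
    """Return True if the union of `sets` contains every element of `space`."""
    return set(space) <= set().union(*sets)


def is_refinement(fine, coarse):
    """Return True if every set in `fine` is contained in some set of `coarse`."""
    return all(any(set(v) <= set(u) for u in coarse) for v in fine)


# Hypothetical finite example standing in for a topological space X.
X = {1, 2, 3, 4, 5, 6}
C = [{1, 2, 3, 4}, {3, 4, 5, 6}]   # a cover of X
D = [{1, 2}, {3, 4}, {5, 6}]       # a finer cover of X

print(is_cover(C, X), is_cover(D, X))   # both collections cover X
print(is_refinement(D, C))              # every set of D sits inside a set of C
print(is_refinement(C, D))              # the converse fails
```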
Mapper in the continuous setting
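Input:
- Continuous function (filter) $f : X \to \mathbb{R}$
- Cover $\mathcal{C}$ of im(f) by open intervals: $\mathrm{im}(f) \subseteq \bigcup_{c \in \mathcal{C}} c$
Method:
- Compute the pullback cover $\mathcal{U}$ of X: $\mathcal{U} = \{f^{-1}(c)\}_{c \in \mathcal{C}}$
- Refine $\mathcal{U}$ by separating each of its elements into its connected components → connected cover $\mathcal{V}$
- The Mapper is the nerve of $\mathcal{V}$:
  - 1 vertex per element $V \in \mathcal{V}$
  - 1 edge per non-empty intersection $V \cap V' \neq \emptyset$, with $V, V' \in \mathcal{V}$
  - 1 k-simplex per non-empty (k + 1)-fold intersection $\bigcap_{i=0}^{k} V_i \neq \emptyset$, with $V_0, V_1, \dots, V_k \in \mathcal{V}$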
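The nerve construction in the last step can be sketched for a finite cover. This is an illustrative sketch only: the function name `nerve`, the `max_dim` cut-off, and the example cover of a six-point circle are assumptions, not taken from the slides.

```python
from itertools import combinations


def nerve(cover, max_dim=2):
    """Return the simplices of the nerve of `cover` up to dimension `max_dim`.

    `cover` is a list of finite sets; a simplex is a tuple of indices into it.
    """
    simplices = []
    for k in range(max_dim + 1):
        for idx in combinations(range(len(cover)), k + 1):
            common = set.intersection(*(set(cover[i]) for i in idx))
            if common:  # non-empty (k + 1)-fold intersection gives one k-simplex
                simplices.append(idx)
    return simplices


# Hypothetical connected cover of a discrete circle 0-1-2-3-4-5-0.
V = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}]
print(nerve(V))
# [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
```

The three sets intersect pairwise but have no common point, so the nerve is a hollow triangle, which recovers the loop of the circle.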
Example of Mapper in the continuous setting
Mapper in practice
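Input:
- Point cloud P with distance matrix
- Continuous function (filter) $f : P \to \mathbb{R}$
- Cover $\mathcal{C}$ of im(f) by open intervals: $\mathrm{im}(f) \subseteq \bigcup_{c \in \mathcal{C}} c$
Method:
- Compute the pullback cover $\mathcal{U}$ of P: $\mathcal{U} = \{f^{-1}(c)\}_{c \in \mathcal{C}}$
- Refine $\mathcal{U}$ by applying a clustering algorithm (with distance threshold δ) → connected cover $\mathcal{V}$
- The Mapper is the nerve of $\mathcal{V}$:
  - 1 vertex per element $V \in \mathcal{V}$
  - 1 edge per non-empty intersection $V \cap V' \neq \emptyset$, with $V, V' \in \mathcal{V}$
  - 1 k-simplex per non-empty (k + 1)-fold intersection $\bigcap_{i=0}^{k} V_i \neq \emptyset$, with $V_0, V_1, \dots, V_k \in \mathcal{V}$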
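The steps above can be sketched end to end for a small point cloud. This is a minimal illustration under stated assumptions, not the implementation used for the slides: the helper names `clusters` and `mapper_graph` are hypothetical, clustering is done by linking points at distance at most δ, and only vertices and edges of the nerve are computed.

```python
from itertools import combinations
from math import cos, dist, pi, sin


def clusters(points, members, delta):
    """Split `members` (point indices) into groups linked by distances <= delta."""
    remaining, groups = set(members), []
    while remaining:
        stack = [remaining.pop()]
        group = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in remaining if dist(points[i], points[j]) <= delta}
            remaining -= near
            group |= near
            stack.extend(near)
        groups.append(frozenset(group))
    return groups


def mapper_graph(points, filter_values, intervals, delta):
    """Return (nodes, edges): the vertices and edges of the Mapper."""
    nodes = []
    for lo, hi in intervals:
        # Pullback: indices of points whose filter value lies in the open interval.
        preimage = [i for i, v in enumerate(filter_values) if lo < v < hi]
        nodes.extend(clusters(points, preimage, delta))
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]  # clusters sharing at least one point
    return nodes, edges


# Hypothetical data: 8 points on the unit circle, filter = x-coordinate.
P = [(cos(k * pi / 4), sin(k * pi / 4)) for k in range(8)]
f = [x for x, _ in P]
intervals = [(-1.1, -0.5), (-0.8, 0.1), (-0.1, 0.8), (0.5, 1.1)]
nodes, edges = mapper_graph(P, f, intervals, delta=0.8)
print(len(nodes), len(edges))  # 6 6
```

With these parameters the two middle intervals each split into an upper and a lower cluster, so the resulting graph is a cycle of six nodes, matching the loop in the data.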
Example of Mapper in practice
Parameters of Mapper in practice
- Filter $f : P \to \mathbb{R}$
- Cover C of im(f) by open intervals
- Clustering algorithm
Parameters of Mapper in practice - Filter functions
- The outcome of Mapper is highly dependent on the function chosen to partition (filter) the data set, and the choice of function depends mostly on the data set.
- Possible functions:
  - Density
  - Eccentricity
  - Graph Laplacians
  - sum/average/max/min
  - x/y-axis projection
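A few of the listed filters can be sketched in plain Python. The formulas below are one common formulation and are assumptions here, since the slides do not spell them out: eccentricity as the p-mean of distances to all points, and density as an unnormalised Gaussian kernel sum with bandwidth `eps`. All function names are illustrative.

```python
from math import dist, exp


def eccentricity(points, p=1):
    """p-mean of the distances from each point to all points of the cloud."""
    n = len(points)
    return [(sum(dist(x, y) ** p for y in points) / n) ** (1 / p) for x in points]


def gaussian_density(points, eps):
    """Unnormalised Gaussian kernel density estimate with bandwidth `eps`."""
    return [sum(exp(-dist(x, y) ** 2 / eps) for y in points) for x in points]


def projection(points, axis=0):
    """Coordinate projection, e.g. axis=0 for the x-axis."""
    return [x[axis] for x in points]


# Hypothetical point cloud with one far-away point.
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
ecc = eccentricity(P)
dens = gaussian_density(P, eps=1.0)
print(ecc)
print(dens)
print(projection(P, axis=0))
```

The isolated point gets the highest eccentricity and the lowest density, which is the kind of contrast a filter is meant to expose.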
Filter function examples
Parameters of Mapper in practice - Cover
- Uniform cover I:
  - resolution / granularity: r (diameter of intervals)
  - gain: g (percentage of overlap)
- Example: [figure]
- Changing r and g can strongly affect the result.
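A uniform cover with resolution r and gain g can be generated as follows. This is a sketch under an assumed convention that the slides do not state explicitly: every interval has length r, and consecutive intervals share a fraction g of that length. The function name `uniform_cover` and the half-overlap padding at the left end are illustrative choices.

```python
def uniform_cover(lo, hi, r, g):
    """Cover [lo, hi] with open intervals of length r overlapping by fraction g."""
    if r <= 0 or not 0 < g < 1:
        raise ValueError("need r > 0 and 0 < g < 1")
    step = r * (1 - g)        # shift between consecutive left endpoints
    start = lo - r * g / 2    # pad so that lo lies strictly inside the first interval
    intervals = []
    k = 0
    while True:
        left = start + k * step
        intervals.append((left, left + r))
        if left + r > hi:     # the last interval reaches past hi: done
            return intervals
        k += 1


cover = uniform_cover(0.0, 1.0, r=0.5, g=0.5)
print(cover)
# [(-0.125, 0.375), (0.125, 0.625), (0.375, 0.875), (0.625, 1.125)]
```

Smaller r or larger g both increase the number of intervals, and hence the number of Mapper nodes.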
Cover examples
Mapper for Y-shape point cloud data
Parameters of uniform Cover
Parameter r:
- Small r: fine cover, Mapper close to the Reeb graph, but sensitive to δ.
- Large r: rough cover, less sensitive to δ, but Mapper far from the Reeb graph.
Parameter g:
- Large g (close to 1): more points inside intersections, less sensitive to δ, but far from the Reeb graph.
- Small g (close to 0): controlled Mapper dimension, close to the Reeb graph.
Parameters of Mapper in practice - Clustering algorithm
Single-linkage clustering is one of several methods of hierarchical clustering.
- It groups clusters in a bottom-up fashion (agglomerative clustering).
- At each step it combines the two clusters that contain the closest pair of elements not yet belonging to the same cluster.
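The merging rule can be sketched as a naive agglomerative loop that stops once the closest pair of clusters is farther apart than a threshold δ. This is an illustrative sketch, not an efficient implementation; the function name `single_linkage` and the example points are assumptions.

```python
from itertools import combinations
from math import dist


def single_linkage(points, delta):
    """Agglomerative single-linkage clustering, stopped at distance `delta`."""
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def gap(a, b):
        # Single linkage: distance between the closest pair across two clusters.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda ab: gap(*ab))
        if gap(a, b) > delta:
            break  # no remaining pair of clusters is close enough to merge
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return [sorted(c) for c in clusters]


# Hypothetical points on a line: two groups separated by a large gap.
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
print(sorted(single_linkage(P, delta=1.5)))   # [[0, 1, 2], [3, 4]]
print(len(single_linkage(P, delta=0.5)))      # 5
print(single_linkage(P, delta=10.0))          # [[0, 1, 2, 3, 4]]
```

The three calls show how the threshold controls the outcome: a small δ leaves many small clusters, a large δ merges everything.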
Example of Single-linkage clustering
Example of Clustering algorithm with different parameters
Parameters of graph neighborhood size
Parameter δ:
- Large δ: fewer nodes, clean Mapper, but far from the Reeb graph (more straight lines).
- Small δ: topological structure is present, but there are lots of nodes (noisy).
Higher Dimensional Parameter Spaces
- So far we have used 1 function, with $\mathbb{R}$ as our 1-dimensional parameter space.
- We can use M functions and let $\mathbb{R}^M$ be our M-dimensional parameter space; it then remains to find a covering of the M-dimensional hypercube defined by the ranges of the M functions.
Example of parameter space R2
- Assume we have a 2-dimensional point cloud data set P as follows.
- Assume we have two filter functions $f : P \to \mathbb{R}$ and $g : P \to \mathbb{R}$, with $f = f^{-1}$ and $g = g^{-1}$.
- Moreover, assume we have the following cover C, which is also a cover of P since $f = f^{-1}$ and $g = g^{-1}$.
- Assume the clustering algorithm groups all points in each rectangle into one cluster.
- Whenever the clusters corresponding to any n vertices have a non-empty intersection, add the corresponding (n − 1)-simplex.
- Two intersecting clusters = 1 edge.
- Three intersecting clusters = 1 triangle.
- Four intersecting clusters = 1 tetrahedron.
- Final simplicial complex.
Higher Dimensional Parameter Spaces
Mapper can be extended to the parameter space $\mathbb{R}^M$ in a similar fashion (by finding a covering of the M-dimensional hypercube defined by the ranges of the M functions).
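One simple way to cover such a hypercube is to take products of 1-dimensional interval covers, one per filter. The sketch below assumes this product construction, which the slides do not prescribe; the names `box_cover` and `pullback` and the example values are illustrative.

```python
from itertools import product


def box_cover(interval_covers):
    """Cover a hypercube by products of 1-D intervals, one cover per filter."""
    return list(product(*interval_covers))


def pullback(filter_values, box):
    """Indices of points whose M filter values fall inside the open box."""
    return [i for i, vals in enumerate(filter_values)
            if all(lo < v < hi for v, (lo, hi) in zip(vals, box))]


# Hypothetical example with M = 2 filters and two overlapping intervals each.
cover_f = [(-0.1, 0.6), (0.4, 1.1)]
cover_g = [(-0.1, 0.6), (0.4, 1.1)]
boxes = box_cover([cover_f, cover_g])   # 2 x 2 = 4 rectangles

# Filter values (f(p), g(p)) for four hypothetical points.
values = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.2), (0.5, 0.9)]
for box in boxes:
    print(box, pullback(values, box))
```

The point with values (0.5, 0.5) lies in all four rectangles; if each rectangle forms a single cluster, that fourfold intersection contributes a tetrahedron to the nerve.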
Mapper in Applications
Most commonly used in:
- Clustering
- Feature selection (flares, loops)
Applications to Medical science data
145 patients who had diabetes; for each patient, six quantities were measured:
- Age
- Relative weight
- Fasting plasma glucose
- Area under the plasma glucose curve for the three-hour glucose tolerance test (OGTT)
- Area under the plasma insulin curve for the OGTT
- Steady state plasma glucose response
This creates a 6 dimensional data set.
Applications to Medical science data
- Applying projection pursuit methods to obtain a projection into three-dimensional Euclidean space
We want to use Mapper as an automatic tool for detecting such flares in the data.
Applications to Medical science data
- Left: 3 intervals, 50% overlap.
- Right: 4 intervals, 50% overlap.
- For each output:
  - Left flare: adult onset; right flare: juvenile onset
  - Distance function: L2-distance
  - Filter function: density kernel with ε = 130,000
Mapper in Applications
- Innate and adaptive T cells in asthmatic patients: Relationship to severity and disease mechanisms, Hinks et al., J. Allergy and Clinical Immunology, 2015
- Topological Data Analysis for Discovery in Preclinical Spinal Cord Injury and Traumatic Brain Injury, Nielson et al., Nature Communications, 2015
- Using Topological Data Analysis for Diagnosis Pulmonary Embolism, Rucco et al., arXiv preprint, 2014
- CD8 T-cell reactivity to islet antigens is unique to type 1 while CD4 T-cell reactivity exists in both type 1 and type 2 diabetes, Sarikonda et al., J. Autoimmunity, 2013
- Extracting insights from the shape of complex data using topology, Lum et al., Scientific Reports, 2013
- Topological Methods for Exploring Low-density States in Biomolecular Folding Pathways, Yao et al., J. Chemical Physics, 2009
Summary
- Mapper: a computational method which retrieves a higher-level understanding of the structure of data.
- Mapper in the continuous setting.
- Mapper in practice
- Parameters of Mapper in practice
  - filter function.
  - covering algorithm.
  - clustering algorithm.
- Applications
Sources
- [SMG07] G. Singh, F. Mémoli, G. Carlsson, Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition, Eurographics Symposium on Point-Based Graphics, 2007.
- Examples and images from Tutorial of topological data analysis part 3 (Mapper algorithm): https://www.slideshare.net/Eniod/tutorial-of-topological-data-analysis-part-3mapper-algorithm
- Examples and images from Introduction to Topological Data Analysis: https://www.slideshare.net/hendrikarisma/introduction-to-topological-data-analysis-59759836
- Examples and images from KeplerMapper: https://mlwave.github.io/kepler-mapper/