Invariance and Stability of Deep Convolutional Representationslcarin/Liqun1.11.2019.pdf ·...
Invariance and Stability of Deep Convolutional Representations
Alberto Bietti, Julien Mairal
Univ. Grenoble Alpes, Inria
Presented by Liqun Chen
Jan 11th, 2019
Outline
1 Introduction
2 Notation and basic mathematical tools
3 Construction of the Multilayer Convolutional Kernel Network (CKN): patch extraction operator; kernel mapping operator; pooling operator; multilayer construction
4 Stability to deformations
5 Link with CNN
Introduction
Motivation
Understanding the geometry of the functional spaces defined by deep convolutional architectures is a fundamental question.
Representations that are stable to small deformations enable robust models that may exploit these invariances, reducing sample complexity.
Related work
The scattering transform is a recent attempt to characterize convolutional multilayer architectures based on wavelets.
Scattering networks do not involve "learning", since the filters of the networks are pre-defined.
Contribution of this work
This paper studies the translation-invariance properties of the kernel representation and its stability to the action of diffeomorphisms, obtaining similar guarantees as the scattering transform, while preserving signal information.
Notation and basic mathematical tools
Notation and basic mathematical tools (I)
1 A positive definite kernel K that operates on a set X implicitly defines a reproducing kernel Hilbert space (RKHS) H of functions from X to R, along with a mapping φ : X → H.
2 A predictive model associates to every point z in X a label in R. It consists of a linear function f in H such that f(z) = 〈f, φ(z)〉H, where φ(z) is the data representation.
3 Given two points z, z′ ∈ X, the Cauchy–Schwarz inequality allows us to control the variation of the model f: |f(z) − f(z′)| ≤ ‖f‖H ‖φ(z) − φ(z′)‖H. If φ(z) and φ(z′) are close to each other in the RKHS norm, the model outputs similar predictions, provided the model f has reasonably small norm in H.
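This bound is easy to verify numerically in the simplest case, the linear kernel on Rd, where H = Rd and φ is the identity map (a minimal sketch; the function name is made up for illustration):

```python
import numpy as np

def model_variation_bound(f, z, zp):
    """For the linear kernel on R^d, phi is the identity and
    f(z) = <f, z>; Cauchy-Schwarz then gives
    |f(z) - f(z')| <= ||f|| * ||phi(z) - phi(z')||."""
    gap = abs(np.dot(f, z) - np.dot(f, zp))
    bound = np.linalg.norm(f) * np.linalg.norm(z - zp)
    return gap, bound
```

A small-norm f thus yields predictions that vary little between nearby representations.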
Notation and basic mathematical tools (II)
1 A signal x is a function in L2(Ω, H), where Ω is a subset of Rd representing spatial coordinates.
2 Given a linear operator T : L2(Ω, H) → L2(Ω, H′), the operator norm is defined as ‖T‖L2(Ω,H)→L2(Ω,H′) := sup{ ‖Tx‖L2(Ω,H′) : ‖x‖L2(Ω,H) ≤ 1 }.
3 For simplicity, | · | denotes the Euclidean norm on Rd and ‖ · ‖ a Hilbert space norm.
Construction of the Multilayer Convolutional Kernel Network (CKN)
[Figure 1: the multilayer model (image not preserved in the transcript)]
Framework of the model
As shown in Figure 1, a new map xk is built from the previous one xk−1 by applying successively three operators that perform patch extraction (Pk), kernel mapping (Mk) in a new RKHS Hk, and linear pooling (Ak), respectively. When going up in the hierarchy, the points xk(u) carry information from larger signal neighborhoods centered at u in Ω, with more invariance, as we will formally show.
Patch extraction operator
Given the layer xk−1, we consider a patch shape Sk, defined as a compact centered subset of Ω, e.g., a box.
We define the Hilbert space Pk := L2(Sk, Hk−1), equipped with the norm ‖z‖2 = ∫Sk ‖z(u)‖2 dνk(u) for every z in Pk, where dνk is the normalized uniform measure on Sk.
We define the (linear) patch extraction operator Pk : L2(Ω, Hk−1) → L2(Ω, Pk) such that for all u in Ω,
Pkxk−1(u) = (v ↦ xk−1(u + v))v∈Sk ∈ Pk.
Note that since Pk is equipped with a normalized measure, Fubini's theorem gives ‖Pkxk−1‖ = ‖xk−1‖, and hence Pkxk−1 is in L2(Ω, Pk).
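On a discrete image, the patch extraction operator can be sketched as follows (a minimal illustration; `extract_patches` is a hypothetical helper, and zero padding outside the domain is an assumption of this sketch):

```python
import numpy as np

def extract_patches(x, patch_size):
    """Discrete analogue of the patch-extraction operator P_k:
    for each position u, collect the values x(u + v) for v in a
    centered square patch S_k (zero padding outside the domain)."""
    h, w = x.shape
    r = patch_size // 2
    padded = np.pad(x, ((r, r), (r, r)), mode="constant")
    patches = np.empty((h, w, patch_size, patch_size))
    for i in range(h):
        for j in range(w):
            # patch centered at (i, j): patches[i, j][r, r] == x[i, j]
            patches[i, j] = padded[i:i + patch_size, j:j + patch_size]
    return patches
```

Each output position thus carries a full neighborhood of the input, mirroring Pkxk−1(u) = (xk−1(u + v))v∈Sk.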
Kernel mapping operator
Then, we map each patch of xk−1 to an RKHS Hk using the kernel mapping φk : Pk → Hk associated with a positive definite kernel Kk that operates on patches.
We can define the non-linear pointwise operator Mk such that for all u in Ω,
MkPkxk−1(u) := φk(Pkxk−1(u)) ∈ Hk.
This paper uses homogeneous dot-product kernels of the form
Kk(z, z′) = ‖z‖‖z′‖ κk(〈z, z′〉 / (‖z‖‖z′‖)) = 〈φk(z), φk(z′)〉, (1)
where κk(u) = Σ∞j=0 bj u^j with bj ≥ 0 and κk(1) = 1, which ensures that ‖MkPkxk−1(u)‖ = ‖Pkxk−1(u)‖ and that MkPkxk−1 is in L2(Ω, Hk).
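One valid choice is κk(u) = exp(u − 1), whose Taylor coefficients bj = e−1/j! are non-negative and which satisfies κk(1) = 1. A sketch (the function name is made up for illustration):

```python
import numpy as np

def homogeneous_kernel(z, zp, kappa=lambda u: np.exp(u - 1.0)):
    """K(z, z') = ||z|| ||z'|| kappa(<z, z'> / (||z|| ||z'||)).
    With kappa(u) = exp(u - 1): non-negative Taylor coefficients
    and kappa(1) = 1, so K(z, z) = ||z||^2, i.e. the kernel
    mapping is norm-preserving: ||phi(z)|| = ||z||."""
    nz, nzp = np.linalg.norm(z), np.linalg.norm(zp)
    if nz == 0.0 or nzp == 0.0:
        return 0.0
    return nz * nzp * kappa(np.dot(z, zp) / (nz * nzp))
```

Note how the kernel is 1-homogeneous in each argument: scaling a patch scales the kernel value accordingly.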
Convolutional Kernel Networks approximation
Approximate φk(z) by its projection onto span(φk(z1), ..., φk(zp)).
This leads to a tractable, p-dimensional representation ψk(z).
The anchor points z1, ..., zp can be learned from data (K-means or backprop).
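The projection admits explicit coordinates: with Gram matrix KZZ over the anchors, the projected feature is ψk(z) = KZZ^(−1/2) kZ(z), so that 〈ψk(z), ψk(z′)〉 = kZ(z)ᵀ KZZ^(−1) kZ(z′). A sketch under these assumptions (hypothetical helper name; any positive definite kernel can be plugged in):

```python
import numpy as np

def nystrom_features(anchors, kern):
    """Project phi(z) onto span(phi(z_1), ..., phi(z_p)):
    psi(z) = K_ZZ^{-1/2} [kern(z_i, z)]_i, a p-dimensional
    representation whose inner products approximate the kernel."""
    K = np.array([[kern(a, b) for b in anchors] for a in anchors])
    # inverse square root via eigendecomposition (K is PSD)
    w, V = np.linalg.eigh(K)
    w = np.maximum(w, 1e-12)
    K_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

    def psi(z):
        kz = np.array([kern(a, z) for a in anchors])
        return K_inv_sqrt @ kz

    return psi
```

For the linear kernel with orthonormal anchors spanning the space, the projection is exact and ψ recovers the point itself.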
Pooling operator
The last step to build the layer xk consists of pooling neighboring values to achieve local shift-invariance.
We apply a linear convolution operator Ak with a Gaussian filter at scale σk, hσk(u) := σk−d h(u/σk), where h(u) = (2π)−d/2 exp(−|u|2/2).
Then, for all u in Ω,
xk(u) = AkMkPkxk−1(u) = ∫Rd hσk(u − v) MkPkxk−1(v) dv ∈ Hk. (2)
Applying Schur's test, we obtain ‖Ak‖ ≤ 1. Thus, xk is in L2(Ω, Hk), with ‖xk‖ = ‖AkMkPkxk−1‖ ≤ ‖MkPkxk−1‖.
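A discrete 2-D version of this pooling step, with the Gaussian filter normalized to sum to one so the operator is non-expansive (a sketch; the helper name and the truncated filter support are assumptions):

```python
import numpy as np

def gaussian_pooling(x, sigma, truncate=3):
    """Linear pooling A_k: convolve with a discretized Gaussian
    h_sigma, normalized so the weights sum to 1 (the discrete
    analogue of ||A_k|| <= 1 via Schur's test)."""
    r = int(truncate * sigma)
    t = np.arange(-r, r + 1)
    g = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    g /= g.sum()
    # separable 2-D convolution: filter each row, then each column
    out = np.apply_along_axis(lambda row: np.convolve(row, g, mode="same"), 1, x)
    out = np.apply_along_axis(lambda col: np.convolve(col, g, mode="same"), 0, out)
    return out
```

Since the weights are non-negative and sum to one, pooled values never exceed the input maximum, reflecting the contraction property.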
Multilayer construction
Finally, we obtain a multilayer representation by composing the previous operators multiple times. In order to increase invariance with each layer, the patch size Sk and the pooling scale σk grow exponentially with k, with σk and the patch size supc∈Sk |c| of the same order. With n layers, the map xn may then be written
φn(x0) := xn = AnMnPnAn−1Mn−1Pn−1 · · · A1M1P1x0 ∈ L2(Ω, Hn). (3)
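The composition in (3) can be sketched as a simple loop, treating each layer as a triple of callables (a hypothetical interface standing in for patch extraction Pk, kernel mapping Mk and pooling Ak):

```python
def multilayer_representation(x0, layers):
    """Phi_n(x0) = A_n M_n P_n ... A_1 M_1 P_1 x0:
    apply each layer's patch extraction P, kernel mapping M
    and pooling A, in that order, layer by layer."""
    x = x0
    for P, M, A in layers:
        x = A(M(P(x)))
    return x
```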
Stability to deformations
Stability to deformations: Definition
C1-diffeomorphism: τ : Ω → Ω.
Action operator: Lτx(u) = x(u − τ(u)).
The representation Φ(·) is stable if:
‖Φ(Lτx) − Φ(x)‖ ≤ (c1‖∇τ‖∞ + c2‖τ‖∞)‖x‖,
where c1, c2 are two constants, ∇τ is the Jacobian of τ, ‖∇τ‖∞ = supu∈Ω ‖∇τ(u)‖, and ‖τ‖∞ = supu∈Ω |τ(u)|.
Translation invariance corresponds to c2 → 0.
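The action operator Lτ can be illustrated on a discrete 1-D signal with linear interpolation (a sketch; `deform` and the zero-padding convention outside the domain are assumptions of this illustration):

```python
import numpy as np

def deform(x, tau):
    """Action operator L_tau x(u) = x(u - tau(u)) on a discrete
    1-D signal; tau maps a coordinate to a real displacement,
    and off-grid values are linearly interpolated (zeros outside)."""
    n = len(x)
    out = np.zeros(n)
    for u in range(n):
        s = u - tau(u)
        i = int(np.floor(s))
        frac = s - i
        left = x[i] if 0 <= i < n else 0.0
        right = x[i + 1] if 0 <= i + 1 < n else 0.0
        out[u] = (1.0 - frac) * left + frac * right
    return out
```

A constant τ is a pure translation; a non-constant τ bends the signal, and the stability bound controls how much the representation can move in response.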
Stability results
Theorem
Let Φ(x) be a representation given by Φ(x) = Φn(A0x). If ‖∇τ‖∞ ≤ 1/2, we have:
‖Φ(Lτx) − Φ(x)‖ ≤ (c1(1 + n)‖∇τ‖∞ + (c2/σn)‖τ‖∞)‖x‖.
Here we assume that the input signal is x0 = A0x, where A0 is an initial pooling operator used to control the high frequencies. σn is the pooling scale at the last layer (reminder: it grows exponentially with the number of layers n).
Link with CNN
CNN map construction:
CNN function fσ, input image x0 ∈ L2(Ω, Rp0) with p0 channels.
Feature maps are represented at layer k as a function zk ∈ L2(Ω, Rpk).
A set of filters (wik)i=1,...,pk and an activation function δ.
Intermediate feature maps (before the pooling operation) zk:
zik(u) = nk(u) δ(〈wik, Pkzk−1(u)〉/nk(u)).
Here Pk is the patch extractor and nk(u) = ‖Pkzk−1(u)‖.
Homogeneous activations: i.e., δ : z ↦ ‖z‖ δ(〈g, z〉/‖z‖) for all g in Pk.
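Homogeneity can be checked directly: with δ = ReLU, which is itself positively 1-homogeneous, the construction z ↦ ‖z‖ δ(〈g, z〉/‖z‖) reduces to ReLU(〈g, z〉) (a sketch; the function name is hypothetical):

```python
import numpy as np

def homogeneous_activation(z, g, delta=lambda t: np.maximum(t, 0.0)):
    """delta_hom(z) = ||z|| * delta(<g, z> / ||z||), a 1-homogeneous
    non-linearity: scaling z by c > 0 scales the output by c.
    With delta = ReLU this equals relu(<g, z>)."""
    n = np.linalg.norm(z)
    if n == 0.0:
        return 0.0
    return n * delta(np.dot(g, z) / n)
```

This is what lets the CNN feature maps above fit the homogeneous dot-product kernel framework of Eq. (1).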