Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia...

44
MLSS INDO August 2020 Daniel Worrall 1 Advanced Topics in Equivariance: Lecture 3 Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance

Transcript of Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia...

Page 1: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall 1

Advanced Topics in Equivariance: Lecture 3 Machine Learning Summer School, Indonesia 2020

Daniel Worrall

Legong Dance

Page 2: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall 2

This lecture: EquivarianceGrey Box ModelsConvolutionsInvariance and EquivarianceGroup TheoryConvolution Generalizations

Group Convolutions, Steerable Convolutions, Semigroup Correlation, Gauge Convolution

Page 3: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

CNNs are responsible for many successes

3

ImageNet [figure from Wikipedia] Dilated Residual Networks, Yu et al. (2017)

Page 4: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Inductive biases and overfitting

4

Remove degrees of freedom that do not aid generalization error.

• What assumptions can we make?• How we encode these assumptions?• What framework do we use?

Space of modellable functions

Constrained space

True function

F1F2

F3

F

Page 5: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Polynomial example

5

Original quintic polynomial

variance

bias2

bias2 + variance

Bias—variance decomposition

Too flexible

Space of modellable functions

Constrained space

True function

F1F2

F3

Fp(w|⌧ ) = N (w|0, 102I)Unit Gaussian

wMAP = argmaxw

p(w|D, ⌧ )

p(w|D,�2, ⌧ ) =p(D|w,�2)p(w|⌧ )

p(D|⌧ )

Polynomial fitting to a quintic. What order do we use?

y = w0 + w1x...+ w5x5 + ✏

Try multiple orders

� log p(D|⌧ )- evidence

- log-lik. MAP� log p(D|wMAP)

Log-likelihood and evidence

Too rigid

Page 6: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Inductive Biases

6

Black box modelsWhite box models Grey box modelsMost ML models sit here.

Regression • Polynomial order• Likelihood and prior distribution

Classification • Sigmoid function• Linear features• ML or MAP?

Deep learning • Number of layers• Nonlinearities• Optimizer• Structure: Convolutions • …

Page 7: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall 7

EquivarianceBatik making

Page 8: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutions

8

For images we use convolutions.

Convolutions use:1. Translational weight-tying

2. Small reception fields

Drop-in replacement for the linear layer.

Indeed, it is just a linear layer!

e.g. LeCun et al. (1991)

Standard convolution

[F ⇤W](x) =X

y2Z2

F(y)W(y � x)

Filter Feature map

x �(·)

b1

+

W1

Layer

z1⇤

Convolutions can be “reshaped” into matrix-vector products, revealing the weight-tying and locality

Locality

Weight-tying

Page 9: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutions are special

9

https://www.healthypawspetinsurance.com/Images/V3/DogAndPuppyInsurance/Dog_CTA_Desktop_HeroImage.jpghttp://www.gatsby.ucl.ac.uk/~turner/teaching/4g3/2013/stats-recep-field.pdf

Fewer learnable parameters

Parameter efficient: same function — fewer parametersStatistically efficient: need less dataComputationally efficient: need less GPU time

Why learn translational weight-tying when we can build it in?

What about rotation, scaling, warps, shears, color jitter, blur, …?

Locality of pixel statistics Property of the data

Locality

Translational invariance of classification Property of the taskWeight-tying

Page 10: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutional variants

10

Group convolution

Regular-representation conv.

Steerable convolution

Semigroup correlation

Gauge convolution

Unitary convolution

Tangent space convolutions

•Harmonic network•Spherical convolution•Steerable SE(3) network•Tensor Field Network

•Deep Scale Space

•Icosahedral CNN•Gauge Equivariant Mesh Convolution

•Translational convolution•Group CNN•HexaConv•CubeNet•Scale Convolution

• Learning to convolve

•LieConv

Page 11: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Symmetry: Invariance

11

Classification Signal discovery/detection

Symmetry is a property of tasks, not of data!

f(I) = f(T✓[I])

Set of input transformations leaving invariantS✓[f ](I) = f(T✓[I])

Transformation

Image

Function/feature mapping

Disentangling (cocktail party)Dynamical Systems

T✓[I](x) = I(x� ✓)

T✓[I](x) = I(R�1✓ x)

T [I] = (I� µ)/.��1

Translation

Rotation

Data whitening

Notation

Page 12: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Symmetry: Equivariance

12

Convolution (and correlation)

[I ⇤W](x� ✓)| {z }S✓ [I⇤W](x)

= [T✓[I] ⇤W](x)

S✓ = IdInvariance

S✓[f ](I) = f(T✓[I])S✓[f ](I) = f(T✓[I])S✓[f ](I) = f(T✓[I])S✓[f ](I) = f(T✓[I])

Different representations of same transformation

Transformation in feature space

Theorem [Kondor & Trivedi (2018), Cohen et al. (2020a), Bekkers (2020)]: Group convolution is the only linear map* equivariant to group-structured** transformations

*Linear, bounded map between measurable homogeneous spaces **Countable groups and Lie groups

Feature space

Input space

f(Tg[I])= S✓[f(I)]

f(Tg[I])= S✓[f(I)]S✓[f ](I) = f(T✓[I])

Commutative diagram

Page 13: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

S✓[f ](I) = f(T✓[I])S✓[f ](I) = f(T✓[I])

Symmetry: Equivariance

13

S✓[f ](I) = f(T✓[I])S✓[f ](I) = f(T✓[I])

Different representations of same transformation

Transformation in feature space

Commutative diagram

Feature space

Input space

Sophisticated invariance: Equivalence classes preserved

S✓[f ](I) = f(T✓[I])

f(Tg[I])= S✓[f(I)]

f(Tg[I])= S✓[f(I)]

Page 14: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Translational Equivariance Proof

14

[I ⇤W](x) =X

y2Z2

I(y)W(y � x)

[Tt[I] ⇤W](x) =X

y2Z2

Tt[I](y)W(y � x)

=X

y2Z2

I(y � t)W(y � x)

=X

y02Z2

I(y0)W(y0 � (x� t))

= [I ⇤W](x� t) = Tt[I ⇤W](x)

[Tt[I] ⇤W](x) =X

y2Z2

Tt[I](y)W(y � x)

=X

y2Z2

I(y � t)W(y � x)

=X

y02Z2

I(y0)W(y0 � (x� t))

= [I ⇤W](x� t) = Tt[I ⇤W](x)

Tt[I](x) = I(x� t)Hints

[Tt[I] ⇤W](x) =X

y2Z2

Tt[I](y)W(y � x)

=X

y2Z2

I(y � t)W(y � x)

=X

y02Z2

I(y0)W(y0 � (x� t))

= [I ⇤W](x� t) = Tt[I ⇤W](x)

Write out convolution in full[Tt[I] ⇤W](x) =X

y2Z2

Tt[I](y)W(y � x)

=X

y2Z2

I(y � t)W(y � x)

=X

y02Z2

I(y0)W(y0 � (x� t))

= [I ⇤W](x� t) = Tt[I ⇤W](x)

Expand translation operator

[Tt[I] ⇤W](x) =X

y2Z2

Tt[I](y)W(y � x)

=X

y2Z2

I(y � t)W(y � x)

=X

y02Z2

I(y0)W(y0 � (x� t))

= [I ⇤W](x� t) = Tt[I ⇤W](x)

Change of variablesy0 = y � t

[Tt[I] ⇤W](x) =X

y2Z2

Tt[I](y)W(y � x)

=X

y2Z2

I(y � t)W(y � x)

=X

y02Z2

I(y0)W(y0 � (x� t))

= [I ⇤W](x� t) = Tt[I ⇤W](x) Oh look, it’s a convolution!

Page 15: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

A Little Group Theory

15

Abstraction

3. Identity: “do nothing” transformatione � g = g � e = g

1. Closure: compositions well-behaved

g � h 2 G

0�

�90�

90� 90� 90�

90�

180� 180�

I

90�Th[I]

135�Tg[Th[I]]

Tg�h[I] 225�

Action/Transformation

(g � h) � k = g � (h � k) = g � h � k2. Associativity: brackets unnecessary

g � g�1 = g�1 � g = e4. Invertibility: transformation reversible

CompositionSet of transformations

Group theoryMathematical abstraction modeling compositional structure.

Can be used to model transformations

Page 16: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutional variants: Regular-representation convolutions

16

Group convolution

Regular-representation conv.

Steerable convolution

Semigroup correlation

Gauge convolution

Unitary convolution

Tangent space convolutions

•Harmonic network•Spherical convolution•Steerable SE(3) network•Tensor Field Network

•Deep Scale Space

•Icosahedral CNN•Gauge Equivariant Mesh Convolution

•Translational convolution•Group CNN•HexaConv•CubeNet•Scale Convolution

• Learning to convolve

•LieConv

Page 17: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

The Regular Group Convolution

17

Standard convolution

[F ⇤W](x) =X

y2Z2

F(y)W(y � x)

Inner product

Translations

Group convolution

[F ⇤W](x) =X

y2YF(y)W(T �1

x [y])

Inner product

Transformations

Tx[y] = y + x

Hints

Tx[y] = Rxy Tx[y] = Rxy + tx

Translation Rotation Roto-translation

is transformation parameterxAdjusted domain

At each location multiple transformed filter copies

Page 18: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Group Transformation Examples

18

Scalings

Translation

Reflections

Roto-translationRotation

Occlusions

Non-example

Page 19: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

SO(2) Rotation

19

Group convolution

[F ⇤W](x) =X

y2YF(y)W(T �1

x [y])Ac

tivat

ion

spac

e

Input

[TR [F] ⇤W](R✓) = [F ⇤W](R�1 R✓)

Activ

atio

n sp

ace

Input

[F ⇤W](R✓) =X

y2YF(y)W(TR�1

✓[x])

Notice index of convolution

Rotating kernel

Page 20: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

SE(2) Roto-translation

20

Group convolution

[F ⇤W](x) =X

y2YF(y)W(T �1

x [y])Ac

tivat

ion

spac

e

Input

Rotating + translating kernel

[F ⇤W](R✓,x) =X

y2Z2

F(y)W(R�1✓ (y � x))

Activ

atio

n sp

ace

Input

[TR ,z [F] ⇤W](R✓,x) = [F ⇤W](R✓� ,R�1 (x� z))

Page 21: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Sample efficiency

21

Bekkers et al., 2018

Page 22: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Sample efficiency

22

RL sample hungry: exploit symmetries

Transformation equivariant policies via MDP homomorphism theory

Cartpole

GridWorld

Pong

Van der Pol et al. (2020)

Page 23: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutions are special

23

Group convolution

Regular-representation conv.

Steerable convolution

Semigroup correlation

Gauge convolution

Unitary convolution

Tangent space convolutions

•Harmonic network•Spherical convolution•Steerable SE(3) network•Tensor Field Network

•Deep Scale Space

•Icosahedral CNN•Gauge Equivariant Mesh Convolution

•Translational convolution•Group CNN•HexaConv•CubeNet•Scale Convolution

• Learning to convolve

•LieConv

Page 24: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Z ⇡

�⇡F (�� ✓)e�◆m� d� =

Z ⇡�✓

�⇡�✓F (�0)e�◆m(�0+✓) d�0

= e�◆m✓

Z ⇡

�⇡F (�0)e�◆m�0

d�0

Fourier Shift Theorem m 2 Z

◆ =p�1

Steerable Convolutions

24

Can be generalized to other transformation groups

F {T✓[F ]} (m) = e�◆m✓F {F} (m) = S✓[F ] {F} (m)

F {F} (m) =

Z ⇡

�⇡F (�)e�◆m� d�

Another equivariant mapping is the Fourier transform

EquivariantInvariant

K KK KK KK KK KWm(r;�) = R(r)e◆(m�+�)

Learnable Learnable

Fixed

Inner product!

Z ⇡

�⇡F (�� ✓)e�◆m� d� =

Z ⇡�✓

�⇡�✓F (�0)e�◆m(�0+✓) d�0

= e�◆m✓

Z ⇡

�⇡F (�0)e�◆m�0

d�0

Z ⇡

�⇡F (�� ✓)e�◆m� d� =

Z ⇡�✓

�⇡�✓F (�0)e�◆m(�0+✓) d�0

= e�◆m✓

Z ⇡

�⇡F (�0)e�◆m�0

d�0

Page 25: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Lack of Rotation Equivariance

25

Page 26: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Steerable Equivariance

26

Page 27: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Comparison

27

Page 28: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutional variants: Semigroup Correlations

28

Group convolution

Regular-representation conv.

Steerable convolution

Semigroup correlation

Gauge convolution

Unitary convolution

Tangent space convolutions

•Harmonic network•Spherical convolution•Steerable SE(3) network•Tensor Field Network

•Deep Scale Space

•Icosahedral CNN•Gauge Equivariant Mesh Convolution

•Translational convolution•Group CNN•HexaConv•CubeNet•Scale Convolution

• Learning to convolve

•LieConv

Page 29: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Semigroups: How not to rescale

29

Original Subsample Bandlimit + subsamplex2

Page 30: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Semigroups: How not to rescale

30

Bandlimit + subsamplex4

Original Subsample

Page 31: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Semigroups: How not to rescale

31

Bandlimit + subsampleOriginal Subsamplex4

Page 32: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Semigroups: How to rescale

32

Subsample

Bandlimit

s =1

8

does not exist!!!!T �1s [F](x)

Blur appropriately Subsample

Ts[F](x) = Blurs[F](s�1x)

Abstraction1. Closure: compositions well-behaved

g � h 2 G

90� 90� 90�

180� 180�

I

90�Th[I]

135�Tg[Th[I]]

Tg�h[I] 225�

Action/Transformation

(g � h) � k = g � (h � k) = g � h � k2. Associativity: brackets unnecessary

3. Identity: “do nothing” transformatione � g = g � e = g

0�

�90�

90�

g � g�1 = g�1 � g = e4. Invertibility: transformation reversible

Page 33: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

The Semigroup Correlation

33

[I ⇤W](x) =X

y2YI(y)W(Tx�1[y])

e.g. Cohen & Welling (2016)

Group convolution

2) Use non-invertible transformations

e.g. Worrall & Welling (2019)

Semigroup correlation [I ?W](x) =X

y2YTx[I](y)W(y)

[I ⇤W](x) =X

y2YI(y)Tx�1[W](y)Unitary Group Convolution

e.g. Diaconu & Worrall (2019)

1) Rescaling ≠ shuffling pixel locations

Page 34: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

The Semigroup Correlation

34

blur dilated convolution

[I ?W](k, t) =X

y2Z2

[G2k ⇤Z2 I](2ky + t)W(y)

[I ?W](k, t) =X

y2Z2

[G2k ⇤Z2 I](2ky + t)W(y)

[I ?W](k, t) =X

y2Z2

[G2k ⇤Z2 I](2ky + t)W(y)

k = 1

k = 0

k = 2

k = 3 [I ?W](k, t) =X

y2Z2

[G2k ⇤Z2 I](2ky + t)W(y)

Lift

Deep scale-spaceScale-space

Page 35: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutional variants: Gauge Equivariance

35

Group convolution

Regular-representation conv.

Steerable convolution

Semigroup correlation

Gauge convolution

Unitary convolution

Tangent space convolutions

•Harmonic network•Spherical convolution•Steerable SE(3) network•Tensor Field Network

•Deep Scale Space

•Icosahedral CNN•Gauge Equivariant Mesh Convolution

•Translational convolution•Group CNN•HexaConv•CubeNet•Scale Convolution

• Learning to convolve

•LieConv

Page 36: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Gauge Convolutions

36

An alternative route to equivariance is to consider local gauge transformations

vv(1) = v(1)1 e(1)1 + v(1)2 e(1)2

v(2) = v(2)1 e(2)1 + v(2)2 e(2)2

v(1) = v(1)1 e(1)1 + v(1)2 e(1)2

v(2) = v(2)1 e(2)1 + v(2)2 e(2)2

v(1) = v(1)1 e(1)1 + v(1)2 e(1)2

v(2) = v(2)1 e(2)1 + v(2)2 e(2)2

v(1) = v(1)1 e(1)1 + v(1)2 e(1)2

v(2) = v(2)1 e(2)1 + v(2)2 e(2)2

Take inner product between signal and filter represented in all frame orientations

A frame is another name for coordinates. How to make frame independent?

v(1)1 e(1)1 + v(1)2 e(1)2 = v(2)1 e(2)1 + v(2)2 e(2)2

Like a group convolution, but…:• Defined locally• Can work on manifolds Cohen et al. (2019)

Need to map signal manifold → tangent space

manifold → tangent space

Local group convolution, with curvature correction due to manifold

Page 37: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Convolutional variants

37

Group convolutionSteerable convolution

Semigroup correlation Gauge convolution

Regular-representation conv.

Page 38: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall 38

Convolutions

Powerful inductive bias Linear layers + symmetry Fewer learnable parameters Many variants

This lecture: EquivarianceGrey Box ModelsConvolutionsInvariance and EquivarianceGroup TheoryConvolution Generalizations

Group Convolutions, Steerable Convolutions, Semigroup Correlation, Gauge Convolution

Page 39: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

References

39

TheoryGroup CNNs - Cohen & Welling (2016)Steerable CNNs - Cohen & Welling (2017)On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups - Kondor & Trivedi (2018)A General Theory of Equivariant CNNs on Homogeneous Spaces - Cohen et al. (2019)Gauge Equivariance Convolutional Networks - Cohen*, Weiler*, Kicanaoglu* et al. (2019)Deep Scale-spaces: Equivariance Over Scale - Worrall & Welling (2019)

E(2): Discrete subgroupsGroup CNNs - Cohen & Welling (2016)Exploiting cyclic symmetry in convolutional neural networks - Dieleman et al. (2016)Steerable CNNs - Cohen & Welling (2017)HexaConv - Hoogeboom*, Peters* et al. (2018)

Z2 - TranslationHandwritten digit recognition with a Back-Propagation network - Lecun et al. (1990)

SE(2): Discrete subgroupsLearning steerable filters for rotation equivariant CNNs - Weiler et al. (2018)Oriented response networks - Zhou et al. (2017)Roto-translation covariant convolutions networks for medical image analysis - Bekkers et al. (2018)Rotation equivariant vector field networks - Marcos et al. (2017)

SE(2): Full groupHarmonic Networks: Deep translation and rotation equivariance - Worrall et al. (2017)

SE(2): All subgroupsGeneral E(2)-Equivariant Steerable CNNs - Weiler & Cesa (2019)

Page 40: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

References

40

E(3): Discrete subgroups3D G-CNNs for pulmonary nodule detection - Winkels & Cohen (2018)

SE(3): Discrete subgroupsCubeNet: Equivariance to 3D rotation and translation - Worrall & Brostow (2018)

SE(3): Full groupN-body networks: a covariant hierarchical neural network architecture for learning atomic potentials - Risi (2018)Tensor Field networks: Rotation- and translation-equivariant neural networks for 3D point clouds - Thomas, Smidt* et al. (2018)3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data - Weiler*, Geiger* et al. (2018)Clebsch-Gordan Nets: A Fully Fourier Space Spherical Convolutional Neural Networks - Kondor et al. (2018)Cormorant: Covariant molecular neural networks - Anderson et al. (2019)

SO(3) on the sphere: discrete subgroupSpherical CNNs - Cohen*, Geiger*, Köhler et al. (2018)

SO(3) on the sphere: isotropic subgroupLearning SO(3) Equivariant Representations with Spherical CNNs - Esteves et al. (2018)DeepSphere: Efficient spherical Convolutions Neural Networks with HEALPix samplings for cosmological applications - Perraudin et al. (2019)

SO(3) on the sphere: full groupSpherical CNNs on unstructured grids - Jiang et al. (2019)

Page 41: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

References

41

Scale semigroupDeep Scale-spaces: Equivariance Over Scale - Worrall & Welling (2019)

Scale groupScale equivariance in CNNs with vector fields - Marcos et al (2018)Scale steerable filters for locally scale-invariant neural networks - Ghosh & Gupta (2019)Scale-equivariant steerable networks - Sosnovik & Smeulders (2020)

Lie groups: B-Spline CNNs on Lie Groups - Bekkers (2019)Generalizing Convolutional Neural Networks for Equivariance to Lie groups on Arbitrary Continuous Data - Finzi et al. (2020)

PDEsPDE-based Group Equivariant Convolutional Neural Networks - Smets et al. (2020)Incorporating Symmetry into Deep Dynamics Models for Improved Generalization - Wang*, Walters* et al. (2020)

AttentionAffine Self Convolution - Diaconu & Worrall, (2019)Attentive Group Equivariant Convolution Networks - Romero et al., (2020)SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks - Fuchs et al. (2020)

Reverse-complement symmetryAn equivariant Bayesian convolutional network predicts recombination hotspots and accurately resolves binding motifs - Brown & Lunter (2019)

Reinforcement LearningMDP Homomorphic Networks: Group Symmetries in Reinforcement Learning - van der Pol et al. (2020)

Page 42: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall 42

AppendixWayang Kulit Show

Page 43: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Group Equivariance Proof

43

HintsHomomorphism property Inversion Composition

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

[Tz[F] ⇤W](x) =X

y2YTz[F](y)W(T �1

x [y])

=X

y2YF(Tz�1 [y])W(Tx�1 [y])

=X

y02YF(y0)W(T �1

x [Tz[y0]])

=X

y02YF(y0)W(Tx�1z[y

0])

=X

y02YF(y0)W(T �1

z�1x[y0]) = [F ⇤W](z�1x) = Tz[F ⇤W](x)

Write out convolution in full

Expand transformation operator

Change of variables

Rewriting: use the hints

y0 = Tz�1 [y] () y = Tz[y0]

Tz�1 [y] = T �1z [y] (xy)�1 = y�1x�1 TxTy = Txy

It’s a convolution!

Page 44: Advanced Topics in Equivariance: Lecture 3 Machine ... · Machine Learning Summer School, Indonesia 2020 Daniel Worrall Legong Dance. Daniel Worrall MLSS INDO August 2020 2 This lecture:

MLSS INDO August 2020Daniel Worrall

Semigroup Equivariance Proof

44

Write out convolution in full

Semigroup composition

It’s a convolution!

[Tz[F] ?W](x) =X

y2YTx[Tz[F]](y)W(y)

=X

y2YTxz[F](y)W(y)

= [F ?W](xz) = Tz[F ?W](x) Define this as left action on activations