
SurReal: Fréchet Mean and Distance Transform for Complex-Valued Deep Learning
Rudrasis Chakraborty, Jiayun Wang, Stella X. Yu

UC Berkeley / ICSI

Overview

1 New complex-valued deep learning theory that handles scaling ambiguity with equivariance and invariance properties on a manifold.

2 Sur-real experimental validation with a significant performance gain (94% → 98%) at a fraction (8%) of the baseline model size.

New Deep Learning Theory

1 New manifold representation $\mathbb{R}^{+} \times SO(2)$ (see the code sketches after this list):
$$a + ib \;\overset{F}{\mapsto}\; (r, R(\theta)),$$
$$r = |a + ib| = \sqrt{a^2 + b^2}, \qquad \theta = \arg(a + ib) = \operatorname{atan2}(b, a),$$
$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

2 New group action G for complex scaling: the product of planar rotation and scaling.

3 New convolution that is equivariant to G: weighted Fréchet Mean (wFM) filtering (sketched in code after this list):
$$\operatorname{wFM}(\{z_i\}, \{w_i\}) = \arg\min_{m \in \mathbb{C}} \sum_{i=1}^{K} w_i\, d^2(z_i, m),$$
where $\sum_{i=1}^{K} w_i = 1$, $w_i \in (0, 1]\ \forall i$, and
$$d(z_1, z_2) = \sqrt{\log^2\!\left(\frac{r_2}{r_1}\right) + \left\|\operatorname{logm}\!\left(R_1^{-1} R_2\right)\right\|^2}.$$

4 New fully connected layer operator that is invariant to G: distance to the wFM.
$$m = \operatorname{wFM}(\{t_i\}, \{v_i\}), \qquad u_i = d(t_i, m).$$

5 New nonlinear activation function: ReLU in the tangent space, with log/exp maps back to the manifold.
$$(r, R) \mapsto \left(\exp(\operatorname{ReLU}(\log r)),\ \operatorname{expm}(\operatorname{ReLU}(\operatorname{logm}(R)))\right).$$
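A minimal NumPy sketch (not the authors' code) of the manifold representation in item 1 and the distance d used in item 3; the function names to_manifold, logm_so2 and dist are illustrative choices.

    import numpy as np

    def to_manifold(z):
        """Map a non-zero complex number a+ib to (r, R(theta)) in R+ x SO(2)."""
        r = np.abs(z)                        # r = sqrt(a^2 + b^2)
        theta = np.arctan2(z.imag, z.real)   # theta = atan2(b, a)
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        return r, R

    def logm_so2(R):
        """Matrix log of a 2x2 rotation: the skew-symmetric matrix [[0, -t], [t, 0]]."""
        t = np.arctan2(R[1, 0], R[0, 0])
        return np.array([[0.0, -t], [t, 0.0]])

    def dist(z1, z2):
        """d(z1, z2) = sqrt( log^2(r2/r1) + ||logm(R1^{-1} R2)||^2 )."""
        r1, R1 = to_manifold(z1)
        r2, R2 = to_manifold(z2)
        scale_term = np.log(r2 / r1) ** 2
        rot_term = np.linalg.norm(logm_so2(R1.T @ R2), 'fro') ** 2   # R1^{-1} = R1^T
        return np.sqrt(scale_term + rot_term)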
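A sketch of wFM filtering (item 3). The poster defines wFM as the argmin of the weighted squared distances; this sketch assumes the closed form that decouples into a weighted geometric mean of magnitudes and a weighted mean of phases, which holds only when the phases lie within a common half-circle (hence the naive np.unwrap).

    def wfm(z, w):
        """Weighted Frechet Mean of complex samples z with convex weights w."""
        z = np.asarray(z); w = np.asarray(w, dtype=float)
        assert np.isclose(w.sum(), 1.0) and np.all(w > 0)
        r = np.abs(z)
        theta = np.unwrap(np.angle(z))       # naive phase unwrapping (assumption)
        m_r = np.exp(np.sum(w * np.log(r)))  # weighted Frechet mean on (R+, log metric)
        m_theta = np.sum(w * theta)          # weighted mean of angles on SO(2)
        return m_r * np.exp(1j * m_theta)    # back to a complex number

    # Equivariance check: for a rotation-and-scaling g, wFM(g.z, w) == g.wFM(z, w).
    z = np.array([1 + 1j, 2 + 0.5j, 0.3 + 0.9j]); w = [0.5, 0.3, 0.2]
    g = 2.0 * np.exp(1j * 0.7)
    assert np.isclose(wfm(g * z, w), g * wfm(z, w))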
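Sketches of the invariant distance layer (item 4) and the tangent ReLU (item 5), reusing dist() and wfm() from the snippets above. For the rotation part, the ReLU on logm(R) is interpreted here as acting on the rotation angle, i.e. on the tangent-space coordinate; this is a simplifying assumption.

    def distance_transform(t, v):
        """Invariant layer: u_i = d(t_i, m) with m = wFM({t_i}, {v_i}).
        A common rotation-and-scaling of all t_i moves m along with them,
        so the distances u_i are unchanged (invariance)."""
        m = wfm(t, v)
        return np.array([dist(ti, m) for ti in t])

    def tangent_relu(z):
        """Tangent ReLU: (r, R) -> (exp(ReLU(log r)), expm(ReLU(logm R)))."""
        r, theta = np.abs(z), np.angle(z)
        r_new = np.exp(np.maximum(np.log(r), 0.0))  # exp(ReLU(log r))
        theta_new = np.maximum(theta, 0.0)          # ReLU on the angle coordinate
        return r_new * np.exp(1j * theta_new)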

Schematic of Our CNN

[Schematic: complex-valued input channels x_{1,1}, ..., x_{7,8} pass through cascaded wFM convolution layers, then an invariant distance-transform layer producing features t_1, ..., t_d and their distances u, and finally an FC layer yielding class outputs y_1, ..., y_c.]
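A toy forward pass matching the schematic: cascaded wFM convolutions with tangent ReLU, the invariant distance layer, then an ordinary FC + softmax. It reuses wfm(), dist(), distance_transform() and tangent_relu() from the sketches above; the 1-D signal, window size, strides and layer sizes are illustrative assumptions, not the authors' configuration.

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def convex_weights(k, rng):
        """Random weights on the simplex, standing in for learned wFM filter weights."""
        a = rng.random(k)
        return a / a.sum()

    def wfm_conv1d(z, w, stride=2):
        """Slide a convex-weight window over a 1-D complex signal; each output is the wFM of a patch."""
        k = len(w)
        return np.array([wfm(z[i:i + k], w) for i in range(0, len(z) - k + 1, stride)])

    rng = np.random.default_rng(0)
    x = rng.standard_normal(128) + 1j * rng.standard_normal(128)   # toy complex input

    for _ in range(3):                                   # cascaded wFM conv layers
        x = tangent_relu(wfm_conv1d(x, convex_weights(5, rng)))

    u = distance_transform(x, convex_weights(len(x), rng))  # invariant, real-valued features
    W = 0.1 * rng.standard_normal((11, len(u)))             # ordinary FC layer
    probs = softmax(W @ u)                                   # 11-way class probabilities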

Equivariance of Fréchet Mean Filtering / Invariance of Distance Transform

[Figure: applying a rotation-and-scaling g to points z_1, ..., z_5 maps wFM(z; w) to g·wFM(z; w) (equivariance), while the distances d_1, ..., d_5 from each point to the wFM are unchanged (invariance).]

Tangent ReLU for the Complex Plane

Our Results on MSTAR

[Architecture diagrams. Baseline real-valued CNN: 100×100×1 input, five CBRM blocks (feature maps 49×49×16, 23×23×32, 11×11×64, 4×4×128), FC layers of 256 and 512 units, FC 10, softmax. SurReal CNN: 100×100×1 complex input, three CCtR blocks (feature maps 20×20×50 and 4×4×100 shown), dFM layer (200), FC 10, softmax.]

[Bar chart: MSTAR test accuracy (0–100%) for different input encodings, (a, b), r, (a, b, r), (r, θ), z, on raw and unit-norm data; the complex-valued z bars reach 98.2% (raw data) and 97.0% (unit-norm data), with the remaining bars at 96.9%, 94.5%, 93.5%, 89.8%, and 46.0%.]

Our Results on RadioML

[Architecture diagrams. Baseline real-valued CNN: 128×2 input, two CBRM blocks (130×256 feature map shown), FC 256, FC 11, softmax. SurReal CNN: 128×1 complex input, three CCtR blocks (feature maps 25×64 and 5×128 shown), dFM layer (640), FC 11, softmax.]