SOFTASSIGN AND EM-ICP ON GPU

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda

19th Nov. 2010


Contribution of this talk

Fast GPU implementations of registration algorithms for 3D point sets:

• Softassign [Gold et al., 1998]
• EM-ICP [Granger et al., 2002]
• (Weighted) Horn’s method [Horn, 1987]

So, what is “registration”?

What is “registration” or “alignment”?

[Figure: image registration applied to a set of images]

3D registration algorithm

Input: two point sets, $X$ and $Y$.

Output: rotation matrix $R$ and translation vector $\mathbf{t}$, which align $Y$ to $X$ as $R Y + \mathbf{t}$.

Algorithms for registration

Horn’s method
• Corresponding point sets are given.
• Estimate $R$ and $\mathbf{t}$.

ICP (Iterative closest point)
• Unknown correspondence.
• Fast, standard.
• Easily fails due to local minima.
• A lot of variants follow.

Softassign
• Unknown correspondence.
• Robust.
• Very slow because of iterations.

EM-ICP
• Unknown correspondence.
• Robust.
• Very slow because of iterations.


Horn’s method: correspondence is known.

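Known correspondence: $X$ and $Y$ are matrices whose $i$-th rows are corresponding points,

$X = (\mathbf{x}_1, \mathbf{x}_2, \ldots)^T, \quad Y = (\mathbf{y}_1, \mathbf{y}_2, \ldots)^T, \quad \mathbf{x}_1 = (x_{1x}, x_{1y}, x_{1z})^T$

so $\mathbf{x}_1$ corresponds to $\mathbf{y}_1$, $\mathbf{x}_2$ to $\mathbf{y}_2$, and so on. With unknown correspondence, it is not known which point of $Y$ matches each point of $X$.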

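1. Compute the centers $\bar{\mathbf{x}}$ and $\bar{\mathbf{y}}$ of $X$ and $Y$.
2. Centering: $\hat{X} = X - \bar{\mathbf{x}}$, $\hat{Y} = Y - \bar{\mathbf{y}}$.
3. Compute the 3×3 matrix $S = \hat{X}^T \hat{Y}$.
4. Build the matrix $K$ from $S$ and compute its 1st eigenvector $\mathbf{q}$: a quaternion.
5. Convert $\mathbf{q}$ to $R$, and compute $\mathbf{t} = \bar{\mathbf{x}} - R \bar{\mathbf{y}}$.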
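The five steps of Horn’s method can be sketched in NumPy as follows. This is a minimal CPU sketch, not the authors’ GPU code: the function name `horn` is hypothetical, points are assumed to be stored as rows, and the explicit entries of $K$ and the quaternion-to-rotation formula are taken from the standard closed-form quaternion solution rather than from the slides, which do not spell them out.

```python
import numpy as np

def horn(X, Y):
    """Estimate R, t so that X ~ R Y + t; row i of X corresponds to row i of Y."""
    x_bar, y_bar = X.mean(axis=0), Y.mean(axis=0)   # 1. centers
    Xc, Yc = X - x_bar, Y - y_bar                   # 2. centering
    S = Xc.T @ Yc                                   # 3. 3x3 matrix S
    # 4. 4x4 symmetric matrix K. The closed-form formulas use sums of
    #    (moving point) * (fixed point) products, which are the entries of S^T.
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S.T
    K = np.array([
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, syy - sxx - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, szz - sxx - syy],
    ])
    # eigh returns eigenvalues in ascending order, so the last column is the
    # eigenvector of the largest eigenvalue: the unit quaternion (q0, qx, qy, qz).
    _, V = np.linalg.eigh(K)
    q0, qx, qy, qz = V[:, -1]
    # 5. quaternion to rotation matrix, then translation
    R = np.array([
        [q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
        [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
        [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz],
    ])
    t = x_bar - R @ y_bar
    return R, t

# Usage: build X from Y with a known rotation about the z axis and a translation.
rng = np.random.default_rng(0)
Y = rng.normal(size=(100, 3))
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
X = Y @ R_true.T + t_true          # x_i = R y_i + t
R, t = horn(X, Y)
```

With exact correspondences and no noise, the recovered `R` and `t` should match `R_true` and `t_true` up to rounding.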


ICP: correspondence is unknown.

$X$ and $Y$ are matrices with rows $\mathbf{x}_i^T$ and $\mathbf{y}_j^T$, but it is not known which $\mathbf{y}_j$ corresponds to each $\mathbf{x}_i$.

1. For each $\mathbf{x}_i$, find the closest (nearest) point $\mathbf{y}_j$ in $Y$.
2. Put the point $\mathbf{y}_j$ into the $i$-th row of $Y^*$.
3. Horn’s method with $X$ and $Y^*$: estimate $R$ and $\mathbf{t}$.
4. Replace $Y$ by $R Y + \mathbf{t}$ and repeat.

Fast, but easy to fail due to hard correspondence.
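The hard correspondence step of ICP (building $Y^*$) can be illustrated with a brute-force nearest-neighbor search. This is only a sketch under assumptions: the function name `closest_points` is hypothetical, points are stored as rows, and a practical implementation would typically use a spatial index instead of the full distance table.

```python
import numpy as np

def closest_points(X, Y_moved):
    """For each x_i, return the index j of the nearest row of Y_moved."""
    # Table of squared distances between every x_i and every moved y_j.
    d2 = ((X[:, None, :] - Y_moved[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
Y = np.array([[0.1, 1.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.1, 0.0]])
R, t = np.eye(3), np.zeros(3)        # current estimate of the transformation
j = closest_points(X, Y @ R.T + t)   # hard correspondence: one y_j per x_i
Y_star = Y[j]                        # row i of Y* is the point matched to x_i
```

Each $\mathbf{x}_i$ receives exactly one partner, which is why a single wrong match can pull the estimate toward a local minimum.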

Softassign and EM-ICP are robust but very slow because of iterations, so these two are implemented on the GPU.

Softassign: soft correspondence.

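$X$ has points $\mathbf{x}_i$, $Y$ has points $\mathbf{y}_j$, and $m_{ij}$ is the soft correspondence between $\mathbf{x}_i$ and $\mathbf{y}_j$, computed from the distance $\|\mathbf{x}_i - (R \mathbf{y}_j + \mathbf{t})\|$.

1. Compute the matrix $M$ of all $m_{ij}$. Each row and column should be normalized to 1 by Sinkhorn iterations.
2. Weighted Horn’s method with $X$, $Y$ and $M$: estimate $R$ and $\mathbf{t}$.
3. Repeat.

These steps are done on the GPU.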
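As an illustration of a soft correspondence matrix, the sketch below fills $M$ with weights that decay with the distance between $\mathbf{x}_i$ and the moved $\mathbf{y}_j$. The exponential form `exp(-beta * d^2)` and the parameter `beta` are assumptions made for this example (the transcript does not preserve the exact formula), and the function name is hypothetical.

```python
import numpy as np

def soft_correspondence(X, Y, R, t, beta):
    """Return M with m_ij = exp(-beta * |x_i - (R y_j + t)|^2), shape (|X|, |Y|)."""
    Y_moved = Y @ R.T + t
    d2 = ((X[:, None, :] - Y_moved[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-beta * d2)

rng = np.random.default_rng(0)
P = rng.normal(size=(4, 3))
# Identical point sets and the identity transformation: perfect matches
# lie on the diagonal and get the largest possible weight.
M = soft_correspondence(P, P, np.eye(3), np.zeros(3), beta=2.0)
```

Unlike the hard assignment of ICP, every pair of points gets a nonzero weight, so no single match decides the estimate.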

Sinkhorn iterations

Each row and column of $M$ should be normalized to 1 by Sinkhorn iterations:

• Row normalization: the elements $m_{ij}$ of each row sum up to 1.
• Column normalization: the elements $m_{ij}$ of each column sum up to 1.

Repeat row and column normalization until convergence.

Sinkhorn.GPU (row normalization)

1. Compute the row sums $M \mathbf{1}$, using sgemv of CUBLAS.
2. Row-wise division of $M$ by the row sums, using a CUDA kernel.

Column normalization is done in the same way.
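The alternating normalization can be sketched on the CPU with NumPy, mirroring the two GPU steps: a matrix-vector product with a vector of ones for the sums, then a row-wise or column-wise division. The function name `sinkhorn`, the fixed iteration count, and the restriction to a square, strictly positive matrix are assumptions of this sketch.

```python
import numpy as np

def sinkhorn(M, n_iter=50):
    """Alternate row and column normalization for a fixed number of iterations."""
    M = np.array(M, dtype=float)
    ones_cols = np.ones(M.shape[1])
    ones_rows = np.ones(M.shape[0])
    for _ in range(n_iter):
        row_sums = M @ ones_cols       # M 1 (the sgemv step)
        M = M / row_sums[:, None]      # row-wise division (the kernel step)
        col_sums = M.T @ ones_rows     # M^T 1
        M = M / col_sums[None, :]      # column-wise division
    return M

rng = np.random.default_rng(0)
M0 = rng.uniform(0.1, 1.0, size=(6, 6))
M = sinkhorn(M0, n_iter=200)
```

A fixed iteration count is used here instead of a convergence test, in line with the limitation noted at the end of the talk. For a non-square matrix, rows and columns cannot all sum to 1 at the same time, which is why the sketch assumes a square $M$.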

Weighted Horn’s method

Step 3 of Horn’s method, the computation of $S$, changes:

• Normal version: $S = \hat{X}^T \hat{Y}$
• Weighted version: $S = \hat{X}^T M \hat{Y}$, using CUBLAS sgemv twice.

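Centering.GPU (weighted version)

1. $M \mathbf{1}$: the row sums of $M$ give the weight of each point of $X$.
2. Multiply each row of $X$ by its weight (CUDA kernel) and sum up the rows (CUBLAS sasum): weighted sum.
3. Sum up the weights $M \mathbf{1}$ (CUBLAS sasum).
4. Weighted center $\bar{\mathbf{x}}$: the weighted sum divided by the sum of the weights.

Same as for $\bar{\mathbf{y}}$.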
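The weighted centers and the weighted matrix $S = \hat{X}^T M \hat{Y}$ can be sketched together in NumPy. The function name is hypothetical, points are assumed to be rows, and taking the column sums $M^T \mathbf{1}$ as the weights for $Y$ is an assumption of this sketch that follows from treating $m_{ij}$ as the weight of the pair $(\mathbf{x}_i, \mathbf{y}_j)$.

```python
import numpy as np

def weighted_centers_and_S(X, Y, M):
    """Weighted centers of X and Y and the 3x3 matrix S = Xc^T M Yc."""
    row_w = M @ np.ones(M.shape[1])      # M 1: weight of each x_i
    col_w = M.T @ np.ones(M.shape[0])    # M^T 1: weight of each y_j
    total = row_w.sum()                  # sum of all m_ij
    x_bar = (row_w[:, None] * X).sum(axis=0) / total
    y_bar = (col_w[:, None] * Y).sum(axis=0) / total
    Xc, Yc = X - x_bar, Y - y_bar        # centering with the weighted centers
    S = Xc.T @ M @ Yc                    # two matrix products
    return x_bar, y_bar, S

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 3))
Y = rng.normal(size=(5, 3))
M = rng.uniform(size=(4, 5))
x_bar, y_bar, S = weighted_centers_and_S(X, Y, M)
```

When $M$ is the identity matrix, the weights are all 1 and the result reduces to the normal version, $S = \hat{X}^T \hat{Y}$ with ordinary means as centers.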

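Pipeline of Softassign.GPU

1. CPU to GPU: transfer $X$ and $Y$.
2. Compute $M$ with a CUDA kernel.
3. Sinkhorn.GPU: normalize $M$.
4. Centering.GPU: compute $\bar{\mathbf{x}}$ and $\bar{\mathbf{y}}$.
5. Weighted Horn’s method: compute $S = \hat{X}^T M \hat{Y}$.
6. Build $K$ from $S$ and solve the eigenvalue problem: $R$ and $\mathbf{t}$.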


EM-ICP: soft correspondence.

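$Y$ has points $\mathbf{y}_i$, $X$ has points $\mathbf{x}_j$, and $d_{ij} = \|\mathbf{x}_j - (R \mathbf{y}_i + \mathbf{t})\|$.

1. Compute the matrix $A$ from $d_{ij}$. Each row is normalized once.
2. Pseudo correspondence: compute $X'$, whose point $\mathbf{x}'_i$ corresponds to $\mathbf{y}_i$.
3. Weighted Horn’s method with $X'$ and $Y$: estimate $R$ and $\mathbf{t}$.
4. Repeat.

These steps are done on the GPU.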

Row normalization on GPU

1. Compute $C = A \mathbf{1}$, using sgemv of CUBLAS. $A$ is not normalized yet.
2. Row-wise division of $A$ by $C$, plus sqrt, using a CUDA kernel. $A$ is now normalized.

Computing weights

$\boldsymbol{\lambda} = A \mathbf{1}$, using sgemv of CUBLAS, where $A$ is now normalized.

Pseudo correspondence

$X' = A X$, using CUBLAS sgemv, where $A$ is now normalized.

Centering: same as in Softassign.GPU.
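The pseudo correspondence $X' = A X$ can be sketched in NumPy as below. Several details are assumptions of this example rather than statements from the slides: the Gaussian weight `exp(-d_ij^2 / (2 sigma^2))`, the parameter `sigma`, and the function name. The additional sqrt mentioned for the normalization kernel is omitted because the transcript does not explain its role.

```python
import numpy as np

def pseudo_correspondence(X, Y, R, t, sigma):
    """Return the row-normalized A (|Y| x |X|) and X' = A X."""
    Y_moved = Y @ R.T + t
    # d_ij^2 between the moved y_i (rows) and x_j (columns)
    d2 = ((Y_moved[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2.0 * sigma ** 2))   # assumed Gaussian weights
    C = A @ np.ones(X.shape[0])            # row sums, C = A 1
    A = A / C[:, None]                     # row-wise division
    X_prime = A @ X                        # x'_i is a weighted mean of the x_j
    return A, X_prime

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
Y = np.array([[0.9, 0.0, 0.0], [0.0, 0.1, 0.0]])
A, X_prime = pseudo_correspondence(X, Y, np.eye(3), np.zeros(3), sigma=0.1)
```

With a small `sigma`, each $\mathbf{x}'_i$ is dominated by the point of $X$ nearest to $\mathbf{y}_i$, so the soft correspondence approaches the hard correspondence of ICP. A larger `sigma` spreads the weight over more points.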

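Weighted Horn’s method

Weighted version of step 3: $S = \hat{X}'^T \Lambda \hat{Y}$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots)$.

Weighted version (2 steps, not efficient):

1. $\boldsymbol{\lambda} \hat{X}'$: scale the rows of $\hat{X}'$ by $\boldsymbol{\lambda}$, using a CUDA kernel.
2. $S = (\boldsymbol{\lambda} \hat{X}')^T \hat{Y}$, using CUBLAS sgemm.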
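The two-step computation gives the same $S$ as the product with the diagonal matrix $\Lambda$, which the short NumPy check below illustrates. The function name and the random test data are hypothetical, and the inputs are assumed to be already centered.

```python
import numpy as np

def weighted_S_two_steps(Xp_hat, Y_hat, lam):
    """S = (diag(lam) Xp_hat)^T Y_hat without forming diag(lam)."""
    scaled = lam[:, None] * Xp_hat   # step 1: scale row i of X' by lambda_i
    return scaled.T @ Y_hat          # step 2: one matrix-matrix product

rng = np.random.default_rng(0)
n = 6
Xp_hat = rng.normal(size=(n, 3))
Y_hat = rng.normal(size=(n, 3))
lam = rng.uniform(0.5, 1.5, size=n)

S_two_steps = weighted_S_two_steps(Xp_hat, Y_hat, lam)
S_diag = Xp_hat.T @ np.diag(lam) @ Y_hat   # same result via an n x n diagonal matrix
```

The row scaling needs only the vector $\boldsymbol{\lambda}$, whereas the diagonal form stores an $n \times n$ matrix that is almost entirely zeros.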

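Pipeline of EM-ICP.GPU

1. CPU to GPU: transfer $X$ and $Y$.
2. Compute $A$ with a CUDA kernel.
3. Row normalization on GPU.
4. Centering.GPU: compute $\bar{\mathbf{x}}$ and $\bar{\mathbf{y}}$.
5. 2 step weighted Horn’s method: compute $\boldsymbol{\lambda} \hat{X}'$, then $S = (\boldsymbol{\lambda} \hat{X}')^T \hat{Y}$.
6. Build $K$ from $S$ and solve the eigenvalue problem: $R$ and $\mathbf{t}$.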

Computing time over different numbers of points

[Figure: computing time against the number of points. Annotations: “Successfully aligned 5000 points in less than 7 seconds.” and “Slightly faster, but failed.”]

GPU: GeForce 8800GT. CPU: Intel Core2 Quad + OpenMP (4 cores).

Summary

3D registration algorithms implemented on a GPU: Softassign, EM-ICP, and weighted Horn’s method.

EM-ICP.GPU is able to align 5000 points within 7 seconds. It is 60 times faster than EM-ICP.CPU and more robust than ICP.CPU.

Code, binary, and movies are available at: http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/

Limitations

Number of points: should be less than 8000 for a GeForce 8800GT with 512MB memory. More memory, more points.

Stopping condition: requires storing the whole matrix $M$ or $A$ and comparing it with the previous one, which is inefficient. Hence, currently, the number of iterations is fixed.
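A rough back-of-the-envelope calculation gives one plausible reading of these limits. It assumes that both point sets have 8000 points and that the matrix is stored in single precision (4 bytes per value, as the "s" prefix of the CUBLAS routines suggests); the helper name is hypothetical.

```python
def matrix_megabytes(n_x, n_y, bytes_per_value=4):
    """Size in MB (10**6 bytes) of an n_x-by-n_y matrix of single-precision floats."""
    return n_x * n_y * bytes_per_value / 1e6

one_matrix = matrix_megabytes(8000, 8000)   # one copy of M or A
two_matrices = 2 * one_matrix               # current matrix plus the previous one
print(one_matrix, two_matrices)             # 256.0 512.0
```

Under these assumptions a single matrix already takes about half of the 512MB card, and keeping a previous copy for a stopping test would take all of it.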
