On the Continuity of Rotation Representations in …On the Continuity of Rotation Representations in...

1
On the Continuity of Rotation Representations in Neural Networks Yi Zhou* 1 Connelly Barnes* 2 Jingwan Lu 2 Jimei Yang 2 Hao Li 1, 3, 4 1 University of Southern California 2 Adobe Research 3 USC Institute for Creative Technologies 4 Pinscreen Continuous representations in neural networks: - The mapping g in the below diagram needs to be continuous. - In other words, g is an embedding (X is homeomorphic to a subset of R) - Discontinuous representations are hard for neural networks to approximate. Commonly used rotation representations are discontinuous . Quaternion Axis Angle Euler Angles Rodriguez ... 3D Rotation SO(3) Note: For 3D rotations, all representations are discontinuous in the real Euclidean spaces of four or fewer dimensions. Why does my network still occasionally produce big errors even with abundant weights and extensive training? X - Be careful when using 4D or fewer dimensional rotation representations because they have discontinuities which manifest as degenerations (e.g. Gimbal lock or sign flips) when dealing with the whole 3D rotation space. If you must use a discontinuous representation then try to parameterize it so your training set avoids the discontinuities. Let’s find continuous 3D rotation representations. IK results for the two frames with highest pose error from the test set for the network trained using quaternions, and the corresponding results on the same frames for the network trained on the 6D representation. Sanity Test: Inverse Kinematics Test: 6D representation for 3D rotations Mapping from SO(3) (matrix) to a 6D representation (Stiefel Manifold): Mapping from 6D representation to SO(3) (Gram-Schmidt-like orthogonalization): 5D representation for 3D rotations Eliminating one dimension of by stereographically projecting [a13, a21, a22, a23] T to 3D. a11 a12 a13 a21 a22 a23 a11 a12 c1 c2 c3 6D 5D An illustration of stereographic projection in 2D. We are given as input a point p on the unit sphere S 1 . We construct a ray from a fixed projection point N0 = (0, 1) through p and find the intersection of this ray with the plane y = 0. The resulting point p' is the stereographic projection of p. - For the general case of the n dimensional rotation group SO(n), please check our paper. We also discuss other groups such as the orthogonal group and similarity transforms. IK results for the two worst frames for the 6D representation network, and the corresponding results for the quaternion network. - A 2D example: Polar angle representation N(v) = v / || v || (Normalization) Worst two frames using quaternions. Worst two frames using 6D representations.

Transcript of On the Continuity of Rotation Representations in …On the Continuity of Rotation Representations in...

Page 1: On the Continuity of Rotation Representations in …On the Continuity of Rotation Representations in Neural Networks Yi Zhou*1 Connelly Barnes*2 Jingwan Lu2 Jimei Yang2 Hao Li1, 3,

On the Continuity of Rotation Representations in Neural NetworksYi Zhou*1 Connelly Barnes*2 Jingwan Lu2 Jimei Yang2 Hao Li1, 3, 4

1University of Southern California 2Adobe Research 3USC Institute for Creative Technologies 4Pinscreen

Continuous representations in neural networks:

- The mapping g in the below diagram needs to be continuous.- In other words, g is an embedding (X is homeomorphic to a subset of R)

- Discontinuous representations are hard for neural networks to approximate.

Commonly used rotation representations are

discontinuous.

QuaternionAxis Angle

Euler AnglesRodriguez

...

3D RotationSO(3)

Note: For 3D rotations, all representations are discontinuous in the real Euclidean spaces of four or fewer dimensions.

Why does my network still occasionally produce big errors even with abundant weights and extensive training?

X- Be careful when using 4D or fewer dimensional rotation representations

because they have discontinuities which manifest as degenerations (e.g. Gimbal lock or sign flips) when dealing with the whole 3D rotation space. If you must use a discontinuous representation then try to parameterize it so your training set avoids the discontinuities.

Let’s find continuous 3D rotation representations.

IK results for the two frames with highest pose error from the test set for the network trained using quaternions, and the corresponding results on the same frames for the network trained on the 6D representation.

Sanity Test:

Inverse Kinematics Test:

6D representation for 3D rotationsMapping from SO(3) (matrix) to a 6D representation (Stiefel Manifold):

Mapping from 6D representation to SO(3) (Gram-Schmidt-like orthogonalization):

5D representation for 3D rotationsEliminating one dimension of by stereographically projecting[a13, a21, a22, a23]T to 3D.

a11 a12 a13 a21 a22 a23

a11 a12 c1 c2 c3

6D

5D

An illustration of stereographic projection in 2D. We are given as input a point p on the unit sphere S1. We construct a ray from a fixed projection point N0 = (0, 1) through p and find the intersection of this ray with the plane y = 0. The resulting point p' is the stereographic projection of p.

- For the general case of the n dimensional rotation group SO(n), please check our paper. We also discuss other groups such as the orthogonal group and similarity transforms.

IK results for the two worst frames for the 6D representation network, and the corresponding results for the quaternion network.

- A 2D example:Polar anglerepresentation

N(v) = v / || v ||(Normalization)

Worst two frames using quaternions. Worst two frames using 6D representations.