Jeff's Talk

Using Galois Theory to Prove Structure from Motion Algorithms are Optimal
By David Nister, Richard Hartley and Henrik Stewenius

1. Five-point Calibrated Relative Orientation
Two calibrated cameras (i.e., the intrinsic parameters are known).
Problem: Determine the relative orientation of camera R w.r.t. camera L from five corresponding image points.

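L and R are the two camera projection matrices, and the world frame is the camera frame of L:
L = K [ I | 0 ],  R = K' [ R | -R T ]
K and K' are the calibration matrices (3x3 and upper triangular). R and T are the rotation and translation between the two camera frames. (Inside the bracket, R denotes the rotation, not the camera.)
[Figure: Five-point Calibrated Relative Orientation, with image points a, b, normalized points pl, pr and the translation T]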

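S is the skew-symmetric matrix built from T (S p = T x p). S has rank two.
E is the essential matrix, the product of R and S: E = R S.
Given two corresponding image points a, b, multiplying them by the inverses of K, K' yields the normalized points pl, pr. Each correspondence places one linear constraint on E:
pr^T E pl = 0
Knowing E, we can recover R, but T only up to scale (Why?)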

Given two views, the problem is to determine the essential matrix E between the two calibrated cameras from corresponding image points.
E is defined only up to scale, and it has rank two (i.e., det(E) = 0). It has five degrees of freedom, so we need five corresponding image points.
Furthermore, E's two nonzero singular values are equal. This is expressed by the cubic constraint
2 E E^T E - trace(E E^T) E = 0
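The two constraints on E can be checked on a small exact example. The sketch below is only an illustration: the rotation ROT, the translation T and the 3D point are made-up sample values, and it assumes the convention P_r = R (P_l - T) read off from the camera matrix [ R | -R T ], with E = R S and S the skew-symmetric matrix of T. Rational arithmetic is used so that the residuals are exactly zero.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def skew(t):
    """Matrix S with S p = t x p (cross product)."""
    x, y, z = t
    return [[0, -z, y], [z, 0, -x], [-y, x, 0]]

# Hypothetical sample data: a rotation about the z axis with rational entries
# (from the 3-4-5 triangle) and an arbitrary translation.
ROT = [[F(3, 5), F(-4, 5), F(0)],
       [F(4, 5), F(3, 5), F(0)],
       [F(0), F(0), F(1)]]
T = [F(1), F(2), F(3)]
E = matmul(ROT, skew(T))  # essential matrix E = R S

def cubic_residual(E):
    """Entries of 2 E E^T E - trace(E E^T) E."""
    EEt = matmul(E, transpose(E))
    tr = EEt[0][0] + EEt[1][1] + EEt[2][2]
    EEtE = matmul(EEt, E)
    return [[2 * EEtE[i][j] - tr * E[i][j] for j in range(3)] for i in range(3)]

def epipolar_residual(E, P_left):
    """pr^T E pl for a 3D point given in the left camera frame."""
    P_right = matvec(ROT, [P_left[i] - T[i] for i in range(3)])
    Ep = matvec(E, P_left)
    return sum(P_right[i] * Ep[i] for i in range(3))
```

Because all three constraints are homogeneous, the unnormalized points P_left and P_right can stand in for pl and pr.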

The five points provide a system of five linear equations A E = 0 (with E written as a 9-vector).
A (which is 5x9) has a four-dimensional null space, so the solution is
E = x X + y Y + z Z + w W for some x, y, z, w. (E is defined only up to scale: set w = 1 or 0.)
Putting E back into the two constraints gives a set of 10 cubic equations in x, y, z.
These 10 cubic equations can be written as the following linear system:

GG m = 0

where GG is a 10x10 matrix whose entries are polynomials in z, and m is the vector of the 10 monomials in x, y of degree at most three. A nonzero solution requires det(GG) = 0, which gives a tenth-degree polynomial in z.

Question: Can the problem be solved (generally) with polynomials of lower degrees?

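Given a pair of corresponding image points a = (xL, yL), b = (xR, yR) and camera matrices (L, R), find a 3D point P that minimizes the L2 re-projection error:
d(a, L P)^2 + d(b, R P)^2
where d is the image distance between a measured point and the projection of P.
[Figure: image points a and b]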

Finding the optimal P requires the solution of a sixth-degree polynomial (this is explained on page 317 of the book by Hartley and Zisserman).
Question: Can the problem be solved (generally) with polynomials of lower degree?
The answers to both questions are NO. In this sense, the current solutions to these two problems are optimal.
Computer vision stops here. The rest is abstract algebra.

A group G is a set with an operation (product) G x G -> G such that:
The operation is associative: a (b c) = (a b) c.
Identity: there is an e such that a e = e a = a for all a.
Inverse: for each a, there is an a^-1 such that a a^-1 = a^-1 a = e.
The operation is commutative if a b = b a for all a, b in G. In this case, G is called an abelian group.
A subgroup S of G is a subset of G that is itself a group (with respect to G's operation).

The set of integers with addition as the operation is a group; this is an example of an abelian group. It is not a group w.r.t. multiplication (no inverses).
The set of nonsingular n x n matrices with multiplication is a group. This is an example of a non-abelian group (for n > 1).
The group (Zn, +): the elements are 0, 1, 2, ..., n-1, and a + b = c (mod n).
(Zn, +) is an example of a cyclic group: it can be generated by one element a, and every element in the group is a multiple of a.
A homomorphism f between groups G and H is a mapping from G to H that preserves their respective operations:
f(a b) = f(a) f(b)
f(eG) = eH
Two groups are isomorphic if there is a bijective (one-to-one and onto) homomorphism between them.
If A and B are groups, their Cartesian product A x B is also a group.

Permutation group: The set of permutations of n objects is a group Sn (non-abelian for n > 2).
Sn has n! elements.
A permutation is a transposition if it swaps two objects and fixes the rest.
Every permutation can be written as a product of transpositions, and for a given permutation the number of transpositions is either always even or always odd.
An, the alternating group, is the subgroup of Sn consisting of the even permutations.
[Figure: S3]
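The even and odd permutations can be enumerated directly for small n. The sketch below is an illustration only: it represents a permutation of {0, ..., n-1} as a tuple and computes its parity by counting inversions, which agrees with the parity of the number of transpositions.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)

def alternating_group(n):
    """All even permutations of n objects."""
    return [p for p in permutations(range(n)) if sign(p) == 1]

S3 = list(permutations(range(3)))
A4 = set(alternating_group(4))
```

Here S3 has 3! = 6 elements and A4 has 4!/2 = 12. The product of two even permutations is even, which is why An is closed under composition.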

A subgroup N of G is called normal if for every g in G and h in N, there is an h' in N such that g h g^-1 = h', or g h = h' g.
Every subgroup of an abelian group is normal.
Given a normal subgroup N, we can partition G into disjoint subsets such that g and g' are in the same subset if there is an h in N with g' = g h (g and g' are connected through N).
[Figure: G partitioned into subsets by N]
The set of subsets has a group structure, and it is the quotient group (G/N) of G by N. The product of two subsets is well defined:
x' = x n, y' = y m  =>  x' y' = x n y m = x y w, where n, m, w are in N (take w = y^-1 n y m).

Take the group Z of integers, and the subgroup 7Z (integers which are multiples of 7). 7Z is a normal subgroup of Z because Z is abelian.
Z / 7Z is the group Z7 discussed before.
An is the only normal subgroup of Sn for n > 4, other than the trivial subgroup and Sn itself.
The kernel Ker f of a homomorphism f between two groups G, H is the set of elements x of G such that f(x) = eH.
Ker f is a normal subgroup of G.
If f : G -> H is surjective (onto), then H is isomorphic to the quotient group G / Ker f.
For example, take the homomorphism f : Z -> Z7, f(x) = x % 7.
What is Ker f?
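The example f(x) = x % 7 can be explored on a finite window of integers. This is a sketch only: the window -30..30 is an arbitrary choice, and Python's % always returns a value in 0..6 for a positive modulus, which matches the elements of Z7.

```python
def f(x, n=7):
    """The map Z -> Zn sending x to its residue mod n."""
    return x % n

sample = range(-30, 31)  # arbitrary finite window of Z

# f preserves the operation: f(a + b) = f(a) + f(b), with addition mod 7 on the right.
is_homomorphism = all(f(a + b) == (f(a) + f(b)) % 7 for a in sample for b in sample)

# Elements of the window that map to the identity 0 of Z7.
kernel = [x for x in sample if f(x) == 0]

# The subsets of the partition (cosets of the kernel), indexed by their image in Z7.
cosets = {}
for x in sample:
    cosets.setdefault(f(x), []).append(x)
```

On this window the kernel consists of the multiples of 7, and there are exactly seven cosets, one per element of Z7, in line with Z / 7Z being Z7.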

A field K is a set with two associative and commutative operations (+, x) with identities 0 and 1.
The two operations satisfy the distributive law:
a (b + c) = a b + a c
Every element in K has a (+)-inverse, and every element except 0 in K has a (x)-inverse. 0 cannot have one, thanks to the distributive law:
a b = a (b + 0) = a b + a 0. Therefore a 0 = 0.
Examples: Q (the field of rational numbers), R (the field of real numbers) and C (the field of complex numbers).

A field L is called a field extension (L : K) of K if K is a subfield of L.
Key idea: Treat L as a vector space (with multiplication) with coefficients in K.
Vector space: we can add and subtract vectors, and for every real number a and vector v, a v is another vector.
L : K: we can add and subtract elements in L, and for every element k in K and x in L, k x is another element in L.
The degree of the extension [ L : K ] is the dimension of L as a vector space with coefficients in K.

C is a degree-two extension of R.
Take the field Q of rational numbers and the irrational number u = sqrt(2). Q(u) denotes the smallest field (in R) containing both Q and u.
What is the degree [ Q(u) : Q ]?
Two, because u is a root of the polynomial x^2 - 2 = 0 (with coefficients in Q), which is irreducible over Q.
Every element in Q(u) is of the form a + b u with a, b in Q.
Q(u) is closed under multiplication and addition: (a + b u) (c + d u) = A + B u.
Is it closed under taking quotients (inverses)?

In fact, the elements of Q(u) are f(u), where f is a polynomial in Q[x].
Let q(x) = x^2 - 2 and p(x) any polynomial in Q[x]. We know that p(x) = w(x) q(x) + r(x), where the remainder r(x) has degree strictly smaller than the degree of q(x). That is, r(x) is at most a linear polynomial, and since q(u) = 0,
p(u) = w(u) q(u) + r(u) = r(u).

Let r(x) be any nonzero polynomial of degree at most one. Since q(x) is irreducible, r(x) and q(x) are relatively prime, so there are polynomials A(x), B(x) such that

A(x) r(x) + B(x) q(x) = 1, or A(u) r(u) = 1. That is, r(u) has an inverse.
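For u = sqrt(2) the inverse can be written down explicitly: (a + b u)(a - b u) = a^2 - 2 b^2, which is a nonzero rational whenever a + b u is nonzero. The sketch below is an illustration under that formula; an element a + b u is stored as the pair (a, b), and the sample element 3 + 5u is arbitrary.

```python
from fractions import Fraction

def mul(x, y):
    """(a + b u)(c + d u) with u*u = 2, returned as the pair (A, B) meaning A + B u."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(x):
    """Inverse of a nonzero a + b u, namely (a - b u) / (a^2 - 2 b^2)."""
    a, b = x
    norm = a * a - 2 * b * b  # zero only for a = b = 0, because sqrt(2) is irrational
    if norm == 0:
        raise ZeroDivisionError("0 has no inverse")
    return (Fraction(a) / norm, Fraction(-b) / norm)

ONE = (Fraction(1), Fraction(0))
U = (Fraction(0), Fraction(1))      # the element u = sqrt(2)
x = (Fraction(3), Fraction(5))      # arbitrary sample element 3 + 5u
x_inv = inverse(x)                  # (-3/41) + (5/41) u
```

So Q(u) is closed under inverses, and every quotient of elements is again of the form A + B u.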

Theorem 1. Let P(x) be an irreducible polynomial in Q[x] and u a root of P(x). Then [ Q(u) : Q ] = the degree of P(x).
Theorem 2. If L : Q is a finite-degree field extension of Q, then every element in L is algebraic over Q. That is, for every u in L, there is a nonzero polynomial p(x) in Q[x] such that p(u) = 0.
Theorem 3. If M : L and L : K are two field extensions of finite degrees, the extension M : K has degree [ M : L ] [ L : K ].
C is a degree-2 extension of R. [ R : Q ] is infinite.
There are elements u in R (transcendental numbers) that do not satisfy p(u) = 0 for any nonzero polynomial p with coefficients in Q.

It is impossible to use ruler and compass construction to trisect an arbitrary angle (in fact, 60° can't be trisected).
It is impossible to use ruler and compass construction to duplicate a cube of side length 1.
These answer two famous problems raised by the ancient Greeks.
A real number r is said to be constructible if we can locate r on the x or y axis using only ruler and compass, starting from the points 0 and 1.
All integers are constructible.

If a and b are constructible, so are 1/a and ab.
[Figure: constructing ab and 1/a with the lines Y = aX and Y = aX - 1]
So all rational numbers are constructible. In particular, if a number u is constructible, we can construct any number in Q(u).

We introduce a new number by taking the intersection of a circle with a line. Suppose F is a field whose elements are constructible.
a x + b y = c
x^2 + y^2 + d x + e y + f = 0
The X coordinates of the two intersection points satisfy a quadratic equation
X^2 + A X + B = 0
where A, B, a, b, c, d, e, f are in F. Completing the square,
(X + aa)^2 + bb = 0, with aa, bb in F.
That is, X is in the field F(u), u = sqrt(-bb), where u is a square root of an element in F.

Therefore, if a number v is constructible, then there exists a sequence of field extensions
Q < Q(u1) < Q(u1, u2) < ... < Q(u1, u2, ..., un)
such that
Q(u1, u2, ..., ui) = Q(u1, u2, ..., u i-1)(ui),
ui^2 is in Q(u1, u2, ..., u i-1),
v (and hence Q(v)) is in Q(u1, u2, ..., un), which is a subfield of R.
[ Q(u1, u2, ..., un) : Q ] = 2^p for some p (why?)
Suppose we can trisect the 60° angle using ruler and compass.
This means that v = cos 20° is constructible.

We have the formula cos 3a = 4 cos^3 a - 3 cos a. With a = 20° and v = cos 20°, that is 1/2 = 4 v^3 - 3 v. So v is a root of the polynomial 8x^3 - 6x - 1, an irreducible polynomial in Q[x].
This is a contradiction! We have the inclusions Q < Q(v) < Q(u1, u2, ..., un), so
[ Q(u1, u2, ..., un) : Q ] = [ Q(v) : Q ] [ Q(u1, u2, ..., un) : Q(v) ].
What is the degree [ Q(u1, u2, ..., un) : Q ]?
What is the degree [ Q(v) : Q ]?
Where is the contradiction?

If we can duplicate the unit cube, then s, the side length of a cube of volume 2, is constructible. But s is a root of the polynomial X^3 - 2, which is also an irreducible polynomial, so [ Q(s) : Q ] = 3 and the same contradiction arises.
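The irreducibility of the two cubics can be checked by hand or by a few lines of code: a cubic with rational coefficients is reducible over Q only if it has a rational root, and by the rational root theorem any such root p/q has p dividing the constant term and q dividing the leading coefficient. The sketch below is an illustration of that test (the function names are made up for this example); it also checks, for a range of exponents, that 3 never divides 2^p, which is the degree obstruction.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Horner evaluation; coeffs run from the highest degree down to the constant."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def rational_roots(coeffs):
    """All rational roots of an integer polynomial with nonzero leading and constant terms."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = {Fraction(s * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for s in (1, -1)}
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)

TRISECTION = [8, 0, -6, -1]   # 8x^3 - 6x - 1, satisfied by cos 20°
CUBE = [1, 0, 0, -2]          # x^3 - 2, satisfied by the cube root of 2

# Degree obstruction: a degree-3 subextension would need 3 to divide 2^p.
three_divides_power_of_two = any(2 ** p % 3 == 0 for p in range(200))
```

Both cubics have no rational root, so each is irreducible and its root generates a degree-3 extension of Q, which cannot sit inside a tower of degree 2^p.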

Note that this does not imply that no angle can be trisected, just that there is no general method that can trisect every angle.
We can easily trisect the 135° angle using ruler and compass: construct a right angle first and then bisect it.

What does it mean (mathematically) that a polynomial equation p(x) = 0 can be solved by a formula?
Definition. Let p(x) be a polynomial. Its splitting field SF(p) is the smallest field in C that contains all roots of p(x) = 0. That is, in the splitting field SF(p), the polynomial splits into linear factors (up to the leading coefficient):
p(x) = (x - a1) (x - a2) ... (x - an), ai in SF(p).
Suppose you have a formula for solving a polynomial equation a x^5 + b x^4 + c x^3 + d x^2 + e x + f = 0.
You may not be able to evaluate this formula in Q.

Let u1 = (ab + c)^(1/3), u2 = (ab + be + cd)^(1/4), u3 = (d u1)^(1/5).
The formula can be evaluated in the field Q(u1, u2, u3), i.e., the splitting field SF(p) < Q(u1, u2, u3).
Q(u1) : Q, with u1^3 in Q
Q(u1, u2) : Q(u1), with u2^4 in Q(u1)
Q(u1, u2, u3) : Q(u1, u2), with u3^5 in Q(u1, u2)
This is an example of a radical extension.

Definition. A field extension E : Q is called a radical extension if there is a tower of field extensions

E = Q(u1, u2, ..., un) : Q(u1, u2, ..., u n-1) : ... : Q(u1) : Q

such that Q(u1, u2, ..., ui) = Q(u1, u2, ..., u i-1)(ui) and ui^m is in Q(u1, u2, ..., u i-1) for some m.

A polynomial p(x) is solvable by a formula if its splitting field SF(p) is contained in a radical extension of Q.

We have a tenth-degree polynomial p(x). Let E : Q,
E = Q(u1, u2, ..., un) : Q(u1, u2, ..., u n-1) : ... : Q(u1) : Q
be a field extension such that
1. Q(u1, u2, ..., ui) = Q(u1, u2, ..., u i-1)(ui)
2. ui^m is in Q(u1, u2, ..., u i-1) for some m, or ui is a root of a polynomial of degree < 10.
Question: Is the splitting field SF(p) of p contained in such an extension E?
If there is a general algorithm involving only polynomials of degree < 10, then SF(p) should be contained in such an extension field E for any 10th-degree polynomial arising from a five-point calibrated relative orientation problem.

A specific type of field extension E : E1 : E2 : ... : Q, specified by the problem.
A field extension L : Q defined by an instance of the problem (e.g., the 60° angle).
We want to know if L can be included in E, i.e., is L < E?
For the two classical problems, we use the simplest invariant of a field extension, the degree, to show that the inclusion is not possible.
For other problems (including the ones studied in the paper), a more refined invariant (the Galois group) is required.
To show these negative results, one comes up with one instance of L : Q and shows that L cannot be included in E using the invariant.

Let L be a field. An automorphism s of L is a bijective mapping s : L -> L that preserves the field structure:
s(a b) = s(a) s(b)
s(a + b) = s(a) + s(b)
s(0) = 0, s(1) = 1.
The set of automorphisms of L forms a group under composition.
The Galois group Gal(L/K) of L over K consists of the automorphisms of L that fix the elements of K. That is, s is in Gal(L/K) if and only if s(x) = x for every x in K. It follows that
s(x y) = x s(y) for every x in K, y in L.
(Another notation for Gal(L/K) is AutK L.)
Analogy with linear algebra: each s in Gal(L/K) is a K-linear map from L to L (one that also preserves multiplication).

Consider the extension F(u) : F, where u is a root of a polynomial p(x) with coefficients in F. Note that any s in Gal(F(u)/F) is determined by its action on u. This is because F(u) has a basis consisting of powers of u. Furthermore,
0 = s(0) = s(p(u)) = p(s(u)).
That is, s(u) is a root of the polynomial p(x) as well! If F(u) contains all the roots of p(x), then s simply permutes the roots.
Examples:
C = R(i). The Galois group Gal(C/R) is Z2.
Q(sqrt(2)) : Q. The Galois group is again Z2.
Q(sqrt(2), sqrt(3)) = Q(sqrt(2))(sqrt(3)). The Galois group is Z2 x Z2, since 1, sqrt(2), sqrt(3), sqrt(6) form a basis and any s in the Galois group is determined by its values s(sqrt(2)), s(sqrt(3)).
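The last example can be made concrete. In the sketch below (an illustration, not part of the talk) an element of Q(sqrt(2), sqrt(3)) is stored as its four coefficients in the basis 1, sqrt(2), sqrt(3), sqrt(6), and each candidate automorphism is given by a choice of signs s(sqrt(2)) = e2 sqrt(2), s(sqrt(3)) = e3 sqrt(3). Integer coefficients are used for the sample elements, which are arbitrary.

```python
from itertools import product

def mul(x, y):
    """Product in the basis (1, sqrt2, sqrt3, sqrt6)."""
    a1, a2, a3, a6 = x
    b1, b2, b3, b6 = y
    return (
        a1 * b1 + 2 * a2 * b2 + 3 * a3 * b3 + 6 * a6 * b6,   # rational part
        a1 * b2 + a2 * b1 + 3 * (a3 * b6 + a6 * b3),         # sqrt3 * sqrt6 = 3 sqrt2
        a1 * b3 + a3 * b1 + 2 * (a2 * b6 + a6 * b2),         # sqrt2 * sqrt6 = 2 sqrt3
        a1 * b6 + a6 * b1 + a2 * b3 + a3 * b2,               # sqrt2 * sqrt3 = sqrt6
    )

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def automorphism(e2, e3):
    """The map determined by sqrt2 -> e2*sqrt2 and sqrt3 -> e3*sqrt3 (e2, e3 = +1 or -1)."""
    def s(x):
        c1, c2, c3, c6 = x
        return (c1, e2 * c2, e3 * c3, e2 * e3 * c6)
    return s

# The four sign choices give the four elements of the Galois group.
GROUP = {signs: automorphism(*signs) for signs in product((1, -1), repeat=2)}

x = (1, 2, 3, 4)    # 1 + 2 sqrt2 + 3 sqrt3 + 4 sqrt6 (arbitrary sample)
y = (5, -1, 2, 7)   # arbitrary sample
```

Every non-identity element applied twice gives the identity, so the group of order four is Z2 x Z2 rather than the cyclic group Z4.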

Definition. Let P(x) be a polynomial with rational coefficients and SF(P) its splitting field. The Galois group of P(x) is the group Gal(SF(P) / Q).
Assume that P(x) is of degree n and has n distinct roots. SF(P) contains every root of P(x), and it is generated by these roots:
SF(P) = Q(u1, u2, ..., un), where the ui are the roots of P(x).
Therefore, Gal(SF(P) / Q) can be considered as a subgroup of the permutation group Sn on n objects.

There are many polynomials with maximal possible Galois group, the permutation group.

What is the Galois group of the polynomial x^2 - 2?
The Galois group of the polynomial x^3 - 4x + 2 can be shown to be S3.
What about the Galois group of the polynomial x^3 - 1? Is it S3 or something else (Z2)? The three roots are (1, a, a^2).
What about the Galois group of the polynomial x^4 - 1?
What about the Galois group of the polynomial x^5 - 1?

Let F be a field containing all the m-th roots of unity, i.e., the roots a0, a1, a2, a3, ..., a m-1 of p(x) = x^m - 1 = 0.
What is the Galois group Gal(F(u) / F), where u is a root of the polynomial q(x) = x^m - f for some element f in F?
F(u) is a splitting field for q(x) because the roots of q(x) are all in F(u):
u a0, u a1, u a2, u a3, ..., u a m-1
Gal(F(u) / F) is Zm (in general, a subgroup of Zm) and in particular, it is abelian.

Let G be the Galois group of a field extension L : K. Let E be an intermediate field, L : E : K, and H a subgroup of G.

Let E* denote the subset of G consisting of the elements fixing E: E* is a subgroup of G.
Let H† denote the subset of L consisting of the elements fixed by H: H† is a subfield of L.

Let's show that E* is a subgroup of G.
Proof:
1. Is the set E* closed under composition? If r and s are elements in E*, is r s in E*?
2. If s is in E*, is its inverse in E*?
3. Is the identity e in E*?
We have (E*)† > E and (H†)* > H.

The structure of the field extension L : K is encoded in the Galois group AutK L.
Given a field extension L : K and intermediate fields E1 and E2 (disregarding some technical details):
Gal(E1 / K) = Gal(L / K) / Gal(L / E1)

Q < E1 < E2 < ... < E
E is the radical extension containing the splitting field for P(x), and G is the Galois group of the extension E : Q.
G > G1 > G2 > ... > GN > e
The quotient Gi / G i+1 is abelian for all i, and each Gi is a normal subgroup of G i-1.
G is called a solvable group.
[Figure: Q < SF(P) < E]
If the polynomial P(x) is solvable by radicals (formula), then SF(P) is contained in E and the Galois group Gal(P) is a quotient group of G. Gal(P) must be solvable as well.


The groups Sn and An are known to be not solvable precisely for n > 4. Therefore, if the Galois group of a polynomial P(x) is Sn or An for n > 4, there is no way that every root of P(x) can be computed by additions, multiplications, subtractions, divisions and taking radicals using the coefficients of P(x). That is, one cannot have a general formula for solving the roots of polynomials of degree > 4. Note that this does not forbid formulae for some restricted classes of polynomials.
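The claim about Sn can be checked by brute force for small n. The sketch below does not build a chain of normal subgroups with abelian quotients by hand; it uses the derived series instead, a standard equivalent test: a finite group is solvable exactly when repeatedly taking the subgroup generated by all commutators g h g^-1 h^-1 eventually reaches the trivial group. The function names are made up for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, n):
    """Closure of the generators under composition (a subgroup, since the group is finite)."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def derived_subgroup(group, n):
    """Subgroup generated by all commutators g h g^-1 h^-1."""
    commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
                   for g in group for h in group}
    return generated_subgroup(commutators, n)

def derived_series_orders(n):
    """Orders of Sn, its derived subgroup, and so on, until the series stops shrinking."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        smaller = derived_subgroup(group, n)
        if len(smaller) == len(group):
            return orders
        group = smaller
        orders.append(len(group))

def is_solvable_symmetric_group(n):
    return derived_series_orders(n)[-1] == 1
```

For n = 4 the series runs through groups of order 24, 12, 4, 1, so S4 is solvable. For n = 5 it stalls at A5 (order 60), which equals its own derived subgroup, so S5 is not solvable.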

Lemma 4.7. Let Fp, Fq be the splitting fields of two polynomials p, q, respectively, over a base field F, and Fpq the smallest field containing both Fp and Fq. Then Gal(Fpq / Fp) is isomorphic to a normal subgroup of Gal(Fq / F).
[Figure: F -> Fp -> Fpq and F -> Fq -> Fpq. Arrows are inclusions.]
Gal(Fpq / F) / Gal(Fpq / Fq) = Gal(Fq / F)
Gal(Fpq / Fp) is a normal subgroup of Gal(Fpq / F).
Every element of Gal(Fpq / Fp) survives the quotient: if s in Gal(Fpq / Fp) maps to e in Gal(Fq / F), s must be in Gal(Fpq / Fq). Then s fixes both Fp and Fq, hence all of Fpq. This implies that s is the identity.

Theorem 4.8. Consider a sequence of field extensions (minus some details)
F0 < F1 < ... < F N-1 < FN.
Let P be a polynomial of degree n > 4 over F0 with Galois group Sn (or An). If FN is the first field in this sequence containing one of the roots of P, then it contains all the roots of P. Furthermore, Gal(FN / F N-1) has a quotient group isomorphic to Sn or An.

Proof:
F0 < F1 < F2 < ... < F N-1 < FN
F0(P), F1(P), F2(P), ..., F N-1(P), FN(P)
Let Fi(P) be the splitting field of P(x) over Fi.
By Lemma 4.7, Gal(Fi(P) / Fi) is isomorphic to a normal subgroup of Gal(F i-1(P) / F i-1).
In particular, Gal(FN(P) / FN) and Gal(F N-1(P) / F N-1) are isomorphic to normal subgroups of Gal(F0(P) / F0) = Sn. So they are either the identity, An or Sn.
If FN contains one root, then it contains every root.
F N-1 does not contain any root; therefore, Gal(F N-1(P) / F N-1) cannot be the identity.
One more step (F N-1 < F N-1(P) < FN). QED

We have a tenth-degree polynomial p(x). Let E : Q,
E = Q(u1, u2, ..., un) : Q(u1, u2, ..., u n-1) : ... : Q(u1) : Q
be a field extension such that
1. Q(u1, u2, ..., ui) = Q(u1, u2, ..., u i-1)(ui)
2. ui^m is in Q(u1, u2, ..., u i-1) for some m, or ui is a root of a polynomial of degree < 10.
Question: Is the splitting field SF(p) of p contained in such an extension E? No, if the Galois group of p is S10.
If there is a general algorithm involving only polynomials of degree < 10, then SF(p) should be contained in such an extension field E for any 10th-degree polynomial arising from a five-point calibrated relative orientation problem.

Recall Theorem 4.8: Consider a sequence of field extensions (minus some details) F0 < F1 < ... < F N-1 < FN. Let P be a polynomial of degree n > 4 over F0 with Galois group Sn (or An). If FN is the first field in this sequence containing one of the roots of P, then it contains all the roots of P. Furthermore, Gal(FN / F N-1) has a quotient group isomorphic to Sn or An.
Proof of Impossibility: Let P(x) be a tenth-degree polynomial whose Galois group is S10, and let FN be the first field in the tower containing a root of P(x). We got FN by adjoining to F N-1 a root of a polynomial of degree < 10. Gal(FN / F N-1) is then a subgroup of Sm for some m < 10, so it has at most 9! elements. By Theorem 4.8, it has a quotient group that is S10, which has 10! elements. This is a contradiction.

What is left is to come up with a five-point calibrated relative orientation problem that requires us to solve for the roots of a tenth-degree polynomial P(x) such that the Galois group of P(x) is S10.
Using a computer program (MAGMA), it can be checked that the Galois group of this polynomial is S10!
