Computing the determinant and the characteristic polynomial of a matrix via solving linear systems...


Page 1: Computing the determinant and the characteristic polynomial of a matrix via solving linear systems of equations

Information Processing Letters 28 (1988) 71-75
North-Holland, 24 June 1988

COMPUTING THE DETERMINANT AND THE CHARACTERISTIC POLYNOMIAL OF A MATRIX VIA SOLVING LINEAR SYSTEMS OF EQUATIONS

Victor PAN *

Department of Mathematics, Lehman College, The City University of New York, Bronx, NY 10468-1590, U.S.A., and Department of Computer Science, State University of New York at Albany, Albany, NY 12222, U.S.A.

Communicated by E.C.R. Hehner
Received 1 October 1987
Revised 22 February 1988

We combine Cramer's rule, p-adic lifting, and rational interpolation in an unusual way, in order to reduce the problems of computing the determinant, $\det A$, and all other coefficients of the characteristic polynomial, $\det(\lambda I - A)$, of a matrix $A$ to inverting $A$ and/or to solving linear systems of equations. Such a reduction enables us to apply Hensel's effective p-adic lifting to the evaluation of $\det A$ and $\det(\lambda I - A)$; from that practically important point of view, no comparable alternative is known. On the theoretical side, no other ways of reduction of computing $\det A$ to solving few linear systems are known.

Keywords: Determinant, characteristic polynomial

1. Introduction: Problem, general approach, and the algorithm

We combine Cramer's rule, p-adic lifting, and rational interpolation in an unusual way, in order to reduce computing the determinant, $\det A$, and all other coefficients of the characteristic polynomial, $\det(\lambda I - A)$, of a matrix $A$ to inverting $A$ and/or to solving linear systems of equations. Such a reduction enables us to apply Hensel's effective p-adic lifting to the two former problems; from that practically important point of view, no comparable alternative is known. On the theoretical side, no other ways of reduction of computing $\det A$ to solving few linear systems are known.

More generally, let us compute the coefficients of $\det(A - \lambda E)$, that is, of a polynomial of degree $\rho(E)$ in $\lambda$ where $\rho(E) = \operatorname{rank} E$. Then,

$$E = \sum_{i=1}^{\rho} u(i)\,(v(i))^{T} \tag{1}$$

for $\rho = \rho(E)$ and for some vectors $u(i)$, $v(i)$ of length $n$ and, actually, equation (1) implies that rank $E$ is $\rho$ as long as the vectors in each set $\{u(i)\}$ and $\{v(i)\}$ are linearly independent. $\det A$ is the $\lambda$-free term of that polynomial for any choice of the matrix $E$, and the polynomial turns into $(-1)^n \det(\lambda I - A)$ for $E = I$ (in which case $\rho(E) = n$ and the leading coefficient of $\det(A - \lambda E)$ is $(-1)^n$).

* The research reported here was supported by the National Science Foundation under Grant No. DCR-8507573.

Let $u$ be a vector of length $n$, let $A$, $E$ be a pair of $n \times n$ matrices, and let $x(\lambda) = (A - \lambda E)^{-1} u$ be the solution to the linear system

$$(A - \lambda E)\,x(\lambda) = u, \tag{2}$$

where $x(\lambda) = [x_i(\lambda)]$. Then, by Cramer's rule, $x_i(\lambda) = \det B_i(A - \lambda E, u)/\det(A - \lambda E)$ for all $i$, where $B_i(W, u)$ is the matrix formed by replacing the $i$th column of a matrix $W$ by a vector $u$. This implies the following fact.

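Fact. Let $x_i(\lambda) = s_i(\lambda)/p_i(\lambda)$, where $s_i(\lambda)$ and $p_i(\lambda)$ are polynomials in $\lambda$. Then,

$$p_i(\lambda)\,\gcd\bigl(\det B_i(A - \lambda E, u),\ \det(A - \lambda E)\bigr) = \det(A - \lambda E)\,\gcd\bigl(s_i(\lambda),\ p_i(\lambda)\bigr).$$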

0020-0190/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)


Volume 28, Number 2 INFORMATION PROCESSING LETTERS 24 June 1988

Here and hereafter, gcd stands for the greatest common divisor.
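The Cramer's rule identity behind the Fact can be checked numerically at a fixed value of $\lambda$. The following is a minimal sketch in exact rational arithmetic; the helper names (`det`, `replace_column`, `pencil`, `cramer_solve`) are ours and purely illustrative, and the determinants are computed directly, which is exactly what the algorithms of this paper are designed to avoid.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        # find a nonzero pivot in column c
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d  # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def replace_column(W, i, u):
    """B_i(W, u): the matrix W with its i-th column replaced by u."""
    n = len(W)
    return [[u[r] if c == i else W[r][c] for c in range(n)] for r in range(n)]

def pencil(A, E, lam):
    """The matrix A - lam * E for a fixed rational value lam."""
    n = len(A)
    return [[Fraction(A[r][c]) - lam * E[r][c] for c in range(n)] for r in range(n)]

def cramer_solve(A, E, u, lam):
    """x_i(lam) = det B_i(A - lam E, u) / det(A - lam E) for all i."""
    W = pencil(A, E, lam)
    d = det(W)
    return [det(replace_column(W, i, u)) / d for i in range(len(u))]
```

For instance, with $A$ having rows (2, 1) and (1, 3), $E = I$, $u = (1, 1)^T$ and $\lambda = 0$, the call returns the entries 2/5 and 1/5, which indeed satisfy $Ax = u$.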

Remark. Here we assume that the gcd's are defined within constant factors or, for integer matrices $A$ and $E$ and for an integer vector $u$, we will let the coefficients of $p_i(\lambda)$, of $s_i(\lambda)$, and of the gcd's also be integers and will maximize the absolute values of the leading integer coefficients of the gcd's (compare [5, pp. 406-409]).

The above fact suggests the following algorithm for computing a divisor of the polynomial $\det(A - \lambda E)$.

Algorithm A

Input: $n \times n$ matrices $A$ and $E$.

Initialize: choose a vector $u$ of length $n$.

Stage 1. Fix an integer $i$, $1 \le i \le n$, and compute the coefficients of the two polynomials $s_i(\lambda)$ and $p_i(\lambda)$ whose ratio $s_i(\lambda)/p_i(\lambda)$ equals the $i$th entry of the solution vector $x(\lambda)$ to the system of equation (2).

Stage 2. Compute the coefficients of the monic polynomial $d_i(\lambda) = \gcd(s_i(\lambda), p_i(\lambda))$.

Stage 3. Compute and output the coefficients of the polynomial $p_i(\lambda)/d_i(\lambda)$.

2. p-adic lifting: An auxiliary algorithm

At Stage 1 of Algorithm A, we will rely on the following p-adic lifting algorithm of [6].

Algorithm B: p-adic lifting process for linear systems [6]

Input: two positive integers $H = 2^h$ and $p$, an $n \times n$ matrix $A$ (filled with integers and such that $\det A \ne 0 \bmod p$), and an integer vector $u$ of length $n$.

Stage 0 (initialization). Compute

$$S(0) = A^{-1} \bmod p, \qquad x_1 = S(0)\,u \bmod p, \qquad v_1 = A x_1 \bmod p^2.$$

Stage j, j = 1, 2, ..., H - 1. Compute

$$r_j = u - v_j \bmod p^{j+1},$$

$$y_j = \bigl(S(0)\,r_j / p^j\bigr) \bmod p^2,$$

$$v_{j+1} = v_j + p^j A y_j \bmod p^{j+2},$$

$$x_{j+1} = x_j + p^j y_j \bmod p^{j+1}.$$

Stage H. Recover $A^{-1}u$ from $x_H = A^{-1}u \bmod p^H$ [9,4]. Output $A^{-1}u$ (or, in some applications, skip Stage H and output $x_H$).
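The lifting loop and the recovery step can be sketched as follows. This is a simplified variant, not a transcription of Algorithm B: it assumes that $p$ is prime, keeps the residual as an exact integer vector instead of tracking an auxiliary vector modulo growing powers of $p$, and reduces each correction modulo $p$ rather than $p^2$. The function names are ours, the recovery step uses the ordinary extended Euclidean algorithm, and the code relies on `pow(x, -1, p)` and `math.isqrt`, so Python 3.8 or later is assumed.

```python
from fractions import Fraction
from math import isqrt

def inverse_mod_p(A, p):
    """S(0) = A^{-1} mod p by Gauss-Jordan elimination (p assumed prime)."""
    n = len(A)
    M = [[v % p for v in row] + [int(r == c) for c in range(n)]
         for r, row in enumerate(A)]
    for c in range(n):
        # raises StopIteration when det A = 0 mod p
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)
        M[c] = [v * inv % p for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def lift(A, u, p, H):
    """Return x_H with A x_H = u (mod p^H) and entries in [0, p^H)."""
    S0 = inverse_mod_p(A, p)
    x, r, pj = [0] * len(A), list(u), 1
    for _ in range(H):
        # next p-adic digit: y = S(0) r mod p, so that A y = r (mod p)
        y = [sum(s * ri for s, ri in zip(row, r)) % p for row in S0]
        x = [xi + pj * yi for xi, yi in zip(x, y)]
        Ay = [sum(a * yi for a, yi in zip(row, y)) for row in A]
        r = [(ri - ai) // p for ri, ai in zip(r, Ay)]  # exact division by p
        pj *= p
    return x

def rational_reconstruct(a, m):
    """Recover num/den from a = num/den (mod m), |num|, |den| <= sqrt(m/2)."""
    bound = isqrt(m // 2)
    r0, r1, t0, t1 = m, a % m, 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1, t0, t1 = r1, r0 - q * r1, t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound:
        raise ValueError("no rational number within the bound")
    return Fraction(r1, t1)
```

With $A$ having rows (2, 1) and (1, 3), $u = (1, 0)^T$, $p = 7$ and $H = 4$, `lift` returns the residues 961 and 480 modulo $7^4 = 2401$, and `rational_reconstruct` maps them back to 3/5 and -1/5, the entries of $A^{-1}u$. In practice $H$ must be large enough for $p^H$ to exceed twice the product of the numerator and denominator bounds.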

3. The cost estimates for Algorithm A

Hereafter, 'ops' will be our shorthand for 'arithmetic operations'. At first we recall the cost estimates for Algorithm B.

Stage j for j = 1, 2, ..., H - 1 involves $O(n)$ ops in order to add vectors and to multiply them by constants and also involves two lower precision multiplications of matrices by vectors, that is, multiplication modulo $p^2$ of the matrix $S(0)$ by a vector and multiplication of the matrix $A$ by a vector reduced modulo $p^2$. We may avoid computing the matrix $S(0)$ and find the vectors $x_1$ and $y_j$ for all $j$ by solving the linear systems of equations $A x_1 = u \bmod p$, $A y_j = r_j/p^j \bmod p^2$.

Next we will estimate the cost of Stage 1 of Algorithm A, performed in two ways, shown in (a) and (b) below.

(a) Apply Algorithm B replacing $p$ by $\lambda$ and $A$ by $A - \lambda E$, so that $S(0) = A^{-1}$. Then Stage 0 of Algorithm B costs $O(n^3)$ ops due to computing $A^{-1}$. Stages 1, 2, ..., H - 1 of Algorithm B together cost $O(Hn^2)$ ops,

$$H = 1 + \deg\det(A - \lambda E) + \deg\det B_i(A - \lambda E, u) \le \min(1 + 2\operatorname{rank} E,\ 2n). \tag{3}$$

Stage H - 1 outputs $x_j(\lambda) \bmod \lambda^H$ for all $j$. At Stage H, the polynomials $s_i(\lambda)$ and $p_i(\lambda)$ are recovered from $x_i(\lambda) \bmod \lambda^H$ by computing Padé approximation [2]; this costs $O(H \log^2 H)$ ops. The overall cost is $O(n^3)$ ops, dominated at Stage 0.
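Way (a) can be sketched end to end for small matrices. The code below is an illustration under several assumptions of ours: it works in exact rational arithmetic, takes $H = 2n$ (the upper bound in (3)), obtains the power series coefficients of $x(\lambda)$ from the recurrence $A c_0 = u$, $A c_k = E c_{k-1}$ by re-solving a system with $A$ at each step instead of reusing $A^{-1}$, and recovers $s_i/p_i$ with the plain extended Euclidean algorithm rather than the fast Padé algorithm of [2]. It therefore does not reproduce the cost bounds above, and all function names are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * p for a, p in zip(M[r], M[c])]
    return [row[n] for row in M]

def trim(p):
    """Drop trailing zeros; polynomials are low-degree-first lists."""
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return trim(out)

def poly_sub(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])

def poly_divmod(a, b):
    """Quotient and remainder of a by a nonzero polynomial b."""
    a = a[:]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        k, f = len(a) - len(b), a[-1] / b[-1]
        q[k] = f
        for j, bj in enumerate(b):
            a[j + k] -= f * bj
        trim(a)  # the leading term cancels exactly
    return trim(q), a

def pade(series, H, k):
    """Return (s, p) with s = p * series (mod lam^H) and deg s < k."""
    r0, r1 = [Fraction(0)] * H + [Fraction(1)], trim(list(series))
    t0, t1 = [], [Fraction(1)]
    while len(r1) > k:  # i.e. while deg r1 >= k
        q, r = poly_divmod(r0, r1)
        r0, r1, t0, t1 = r1, r, t1, poly_sub(t0, poly_mul(q, t1))
    return r1, t1

def algorithm_a(A, E, u, i):
    """Stage 1 of Algorithm A, way (a): s_i and p_i up to a common constant."""
    n = len(A)
    H = 2 * n
    c = solve(A, u)  # constant term of x(lam)
    series = [c[i]]
    for _ in range(H - 1):
        Ec = [sum(Fraction(e) * v for e, v in zip(row, c)) for row in E]
        c = solve(A, Ec)  # next coefficient vector of the power series
        series.append(c[i])
    return pade(series, H, n)
```

When $\det A \ne 0$, the pair returned by this reconstruction has no common factor, so Stages 2 and 3 reduce to scaling. For $A$ with rows (2, 1) and (1, 3), $E = I$, $u = (1, 0)^T$ and the first entry of $x(\lambda)$, dividing $p$ by its leading coefficient gives the coefficients 5, -5, 1, that is, $\lambda^2 - 5\lambda + 5 = \det(A - \lambda I)$. For the diagonal matrix with entries 1, 2 and $u = (1, 1)^T$, the same call returns $p = 1 - \lambda$, a proper divisor of the characteristic polynomial.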

If $A$ is a special matrix, say Toeplitz, Hankel, or Sylvester, then the cost of Algorithm B decreases to $O((H + \log n)\,n \log n)$ ops; specifically, to $O(n \log^2 n)$ at its Stage 0 and to $O(Hn \log n)$ at its Stages 1, 2, ..., H - 1. If $A$ is a Vandermonde matrix, then computing $S(0)$ should be avoided and replaced by solving $H - 1$ linear systems (with the coefficient matrix $A$) at Stages 1, 2, ..., H - 1 of Algorithm B, for the cost $O(Hn \log^2 n)$, dominating the overall cost of Algorithm B in that case.

(b) This alternative way is suggested only if $A$ is a special matrix such that a linear system $Ax = u$ can be solved using $O(n \log^2 n)$ ops. Compute $x_i(\lambda_h)$ for $H$ distinct values $\lambda_h$, $h = 0, 1, \ldots, H - 1$, such that the overall cost of that computation is $O(Hn \log^2 n)$ ops (assuming that $A$ is a special matrix). Then, recover $s_i(\lambda)$ and $p_i(\lambda)$ by means of rational interpolation [3].

Stages 2 and 3 of Algorithm A. Compute the gcd for the cost $O(H \log^2 H)$ ops and perform polynomial division via FFT for the cost $O(H \log H)$ ops [1,5].

Remark. The values $\lambda_h$ can be chosen so as to facilitate computing $x_i(\lambda_h)$, say by making the matrix $A - \lambda_h E$ strongly diagonally dominant. Then, efficient iterative methods may yield the approximation to $x_i(\lambda_h)$, from which the value $x_i(\lambda_h)$ can be recovered via the algorithm of [7].

4. p-adic lifting for computing $\det(A - \lambda E)$

If $A$ and $E$ are integer matrices we may choose an integer vector $u$ and perform the computations modulo a prime $p$, with subsequent lifting to modulo $p^H$ by means of Algorithm B. Lifting costs $O(Hn^2)$ multiplications modulo $p$ or modulo $p^2$ of polynomials in $\lambda$ of degree $< 1 + 2\operatorname{rank} E$ and $O(Hn)$ similar polynomial multiplications modulo $p^J$ for $J$ ranging from 1 to $H$.

5. The polynomial $\det(A - \lambda E)$ or its divisor?

In many applications (in particular to computing a single eigenvalue of the pencil $\lambda E - A$) the divisors of $\det(A - \lambda E)$ serve very well. On the other hand,

$$\deg\gcd\bigl(\det B_i(A - \lambda E, u),\ \det(A - \lambda E)\bigr) = 0, \tag{4}$$

and then

$$p_i(\lambda)/\gcd\bigl(s_i(\lambda),\ p_i(\lambda)\bigr) = c\,\det(A - \lambda E)$$

for a nonzero constant $c$, unless the resultant of the two polynomials, $\det B_i(A - \lambda E, u)$ and $\det(A - \lambda E)$, is zero. In the latter case, the entries of the matrix $A$ and of the vector $u$ must satisfy a certain polynomial equation [8].

If $E = I$, we should scale $p_i(\lambda)$ and $s_i(\lambda)$ at Stage 1 of Algorithm A so as to set the leading coefficient of $p_i(\lambda)$ to $(-1)^n$; then, $c = 1$.

How likely is it that (4) holds? Let the vector $u$ be fixed and assume a random distribution of the entries of the matrix $A$. Then, equation (4) fails only in an exceptional degenerate case; in particular, the probability of arriving at the vanishing resultant and thus at a proper divisor of $\det(A - \lambda E)$ (rather than at $c \det(A - \lambda E)$) converges to 0 as $n \to \infty$. Actually, the matrix $A$ is not random, and for some special classes of matrices $A$ and $E$ (and for a fixed vector $u$) the above resultant turns into 0. Generally, the resultant depends on $u$ and may hardly vanish for a random vector $u$. More precisely, for a random $u$, equation (4) will hold with probability converging to 1 as $n \to \infty$ as long as the determinants of the matrix $A - \lambda E$ and of all $n$ of its $(n - 1) \times (n - 1)$ submatrices formed by deleting column $i$ and one of the rows of $A - \lambda E$ have only constant common divisors. For some special classes of matrices $A$ and $E$, however, this may not be the case and, moreover, the above resultant may turn into 0 for all the vectors $u$. Then, of course, the random choice of the vector $u$ does not help, but other modifications of the approach may.

6. An example and an extension of the algorithm

Example. Let $A$ be a triangular matrix, $E = I$. Then, trivially,

$$\det(A - \lambda I) = \prod_i (a_{ii} - \lambda).$$

$p_i(\lambda) = a_{ii} - \lambda$ for all $i$, which defines an eigenvalue of $A$ (and, in some cases, this is the problem to be solved). If all the diagonal entries $a_{ii}$ are distinct, we may compute $\det(A - \lambda I)$ as the least common multiple (lcm) of the polynomials $p_i(\lambda)$ for all $i$ or as the denominator of the partial fraction

$$\sum_i v_i\,x_i(\lambda)$$

for any vector $v = [v_i]$ of length $n$ having no zero entries.
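The role of the condition that no entry vanishes can be illustrated in the simplest setting. The sketch below assumes a diagonal matrix with distinct diagonal entries, so that the combined fraction is a sum of terms $w_i/(a_{ii} - \lambda)$, where the weight $w_i$ stands for the product of the $i$th entries of the two vectors involved; the helper names are ours. The common denominator $\prod_i (a_{ii} - \lambda)$ survives in lowest terms exactly when the numerator is nonzero at every $a_{ii}$.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as low-degree-first coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_eval(p, x):
    """Evaluate a polynomial at x by Horner's rule."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def combine_partial_fractions(diag, w):
    """Return (num, den) with num/den = sum_i w[i] / (diag[i] - lam)."""
    n = len(diag)
    den = [Fraction(1)]
    for a in diag:
        den = poly_mul(den, [Fraction(a), Fraction(-1)])  # times (a - lam)
    num = [Fraction(0)] * n
    for i in range(n):
        term = [Fraction(w[i])]
        for j in range(n):
            if j != i:
                term = poly_mul(term, [Fraction(diag[j]), Fraction(-1)])
        num = [x + y for x, y in zip(num, term)]
    return num, den
```

For diagonal entries 1, 2, 3 and weights 1, 1, 1 the numerator is $11 - 12\lambda + 3\lambda^2$, which does not vanish at 1, 2 or 3, so the denominator is the full polynomial $(1 - \lambda)(2 - \lambda)(3 - \lambda)$. With weights 1, 0, 1 the numerator vanishes at $\lambda = 2$, and the factor $2 - \lambda$ cancels.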

Generalizing the latter trick we arrive at the following extension of Algorithm A.

Algorithm C

Input: $n \times n$ matrices $A$ and $E$.

Initialize: choose two vectors $u$ and $v$ of length $n$.

Stage 1. Solve the linear system of equation (2) and find the inner product

$$v^{T} x(\lambda) = \sum_{i=0}^{n-1} v_i\,x_i(\lambda) = s(\lambda)/p(\lambda),$$

that is, compute the coefficients of the polynomials $s(\lambda)$ and $p(\lambda)$.

Stage 2. Find $d(\lambda) = \gcd(p(\lambda), s(\lambda))$.

Stage 3. Output the quotient $p(\lambda)/d(\lambda)$.

The modification changes the cost of Algorithm A only a little but generally increases the probability of outputting the polynomial $c \det(A - \lambda E)$ (rather than its lower degree divisor) assuming a random choice of the vectors $u$ and $v$. It rather easily follows from Cramer's rule that the output is $c \det(A - \lambda E)$ with probability converging to 1 as $n \to \infty$ for random vectors $u$ and $v$ if the gcd of the determinants of the matrix $A - \lambda E$ and of all $n^2$ of its $(n - 1) \times (n - 1)$ submatrices is 1, which is a milder assumption than (4).

Algorithms A and C for $E = I$ output $\det A$ as the $\lambda$-free term of $\det(A - \lambda I)$, but rank $I$ is $n$, and so the cost of Algorithms A and C is rather high (see (3)). We will define a matrix $E$ of lower rank via equation (1), by choosing $\rho$ pairs of random vectors $u(i)$, $v(i)$, $i = 1, 2, \ldots, \rho$; then the cost of the algorithm is lower but for $\lambda = 0$ we arrive at $c \det A$ and generally need to find $c$.

Let us examine the problem assuming that $A$ is an integer matrix, so that the determinants of $A$ and of all of its submatrices are integers. We will fix a natural $\rho$, choose integer vectors $u$, $u(i)$, $v(i)$, $i = 1, 2, \ldots, \rho$ (the integer vectors $u(i)$, $v(i)$ are random vectors that define matrix $E$ via (1)), and apply Algorithm A (or similarly Algorithm C) so as to keep the coefficients of all computed polynomials integer. We will scale the output polynomial by dividing it by its content [5, p. 405], that is, by the gcd of all of its coefficients. Let $q(\lambda)$ denote the resulting polynomial. Note that $\operatorname{content}(q(\lambda)) = 1$. By the Fact of Section 1, $q(0)$ always divides $\det A$; furthermore, $q(0)$ certainly equals $\det A$ if simultaneously (4) holds and

$$\operatorname{content}(\det(A - \lambda E)) = 1. \tag{5}$$

Consider the equation (5) assuming that $\gcd(A)$, that is, the gcd of all the entries of matrix $A$, equals one (otherwise, we will scale matrix $A$ by dividing it by $\gcd(A)$, compute the determinant of the resulting matrix, and correct the output at the end). Since we choose a random matrix $E$ of (1), we will assume that all the $\rho$ coefficients of $\det(A - \lambda E)$ (excluding the $\lambda$-free term, $\det A$) are random and independent of each other. Then, each prime $p$ will simultaneously divide all those coefficients with probability $1/p^{\rho}$. Therefore, the probability that (5) holds will be greater than

$$1 - \sum_{k=2}^{\infty} 1/k^{\rho} > 1 - \rho/2^{\rho-1}$$

and will converge to 1 as $\rho \to \infty$. Thus, our approach gives a probabilistic average case reduction of computing $\det A$ to solving few (say $O(\log n)$) linear systems.
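The rate of convergence in this estimate is easy to tabulate. The sketch below reads the displayed bound as $\sum_{k \ge 2} k^{-\rho} < \rho/2^{\rho-1}$ and checks it numerically for $\rho \ge 2$; the function names and the cutoff of 1000 terms are our own choices, and the infinite tail is bounded from above by the integral of $x^{-\rho}$ from the cutoff to infinity.

```python
def tail_sum_upper(rho, terms=1000):
    """Upper bound on sum_{k >= 2} k**(-rho), valid for rho >= 2."""
    partial = sum(k ** -rho for k in range(2, terms + 1))
    # the remaining terms are below the integral of x**(-rho) over [terms, inf)
    tail = terms ** (1 - rho) / (rho - 1)
    return partial + tail

def stated_bound(rho):
    """The right-hand quantity rho / 2**(rho - 1)."""
    return rho / 2 ** (rho - 1)

def success_lower_bound(rho):
    """Lower estimate 1 - sum_{k >= 2} k**(-rho) for the probability of (5)."""
    return 1 - tail_sum_upper(rho)
```

Under the independence assumption made above, `success_lower_bound(10)` already exceeds 0.99, which is consistent with choosing $\rho$ of order $\log n$.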

References

[1] A.V. Aho, J.E. Hopcroft and J.D. Ullman, The Design and Analysis of Computer Algorithms (Addison-Wesley, Reading, MA, 1976).

[2] R.P. Brent, F.G. Gustavson and D.Y.Y. Yun, Fast solution of Toeplitz systems of equations and computation of Padé approximations, J. Algorithms 1 (1980) 259-295.

[3] F.G. Gustavson and D.Y.Y. Yun, Fast algorithms for rational Hermite approximation and solution of Toeplitz systems, IEEE Trans. Circuits & Systems CAS-26 (9) (1979) 750-755.

[4] E. Kaltofen and H. Rolletschek, Computing greatest common divisors and factorizations in quadratic number fields, Lecture Notes in Computer Science, Vol. 204 (Springer, Berlin, 1985) 279-288.

[5] D.E. Knuth, The Art of Computer Programming: Seminumerical Algorithms, Vol. 2 (Addison-Wesley, Reading, MA, 1981).

[6] R.T. Moenck and J.H. Carter, Approximate algorithms to derive exact solutions to systems of linear equations, Proc. EUROSAM, Lecture Notes in Computer Science, Vol. 72 (Springer, Berlin, 1979) 63-73.

[7] S. Ursic and C. Patarra, Exact solution of systems of linear equations with iterative methods, SIAM J. Algebraic & Discrete Methods (1983) 111-115.

[8] B.L. Van der Waerden, Modern Algebra (Frederick Ungar Publishing Co., New York, 1953).

[9] P. Wang, A p-adic algorithm for univariate partial fractions, Proc. 1981 ACM Symp. on Symbolic and Algebraic Computations (1981) 212-217.