The hyperbolic Schur decomposition


Linear Algebra and its Applications 440 (2014) 90–110


Vedran Šego a,b,∗

a Faculty of Science, University of Zagreb, Croatia
b School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK

Article info

Article history: Received 12 December 2012. Accepted 22 October 2013. Available online 14 November 2013. Submitted by F. Dopico.

MSC: 15A63, 46C20, 65F25

Keywords: Indefinite scalar products; Hyperbolic scalar products; Schur decomposition; Jordan decomposition; Quasitriangularization; Quasidiagonalization; Structured matrices

Abstract

We propose a hyperbolic counterpart of the Schur decomposition, with the emphasis on the preservation of structures related to some given hyperbolic scalar product. We give results regarding the existence of such a decomposition and study the properties of its block triangular factor for various structured matrices.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

The Schur decomposition $A = UTU^*$, sometimes also called Schur's unitary triangularization, is a unitary similarity between any given square matrix $A \in \mathbb{C}^{n\times n}$ and some upper triangular matrix $T \in \mathbb{C}^{n\times n}$. Such a decomposition has a structured form for various structured matrices: for example, $T$ is diagonal if and only if $A$ is normal, real diagonal if and only if $A$ is Hermitian, positive (nonnegative) real diagonal if and only if $A$ is positive (semi)definite, and so on.

Furthermore, the Schur decomposition can be computed in a numerically stable way, making it a good choice for calculating the eigenvalues of $A$ (which are the diagonal elements of $T$) as well as various matrix functions (for more details, see [11]). Its structure-preserving property saves time and memory when working with structured matrices. For example, computing the value of some function of a Hermitian matrix is reduced to working with a diagonal matrix, which involves only the evaluation of the function on the diagonal elements.

* Correspondence to: School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK. E-mail address: [email protected].

0024-3795/$ – see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.laa.2013.10.037

Unitary matrices are very useful when working with the traditional Euclidean scalar product $\langle x, y\rangle = y^*x$, as their columns form an orthonormal basis of $\mathbb{C}^n$. However, many applications require a nonstandard scalar product, which is usually defined by $[x, y]_J = y^*Jx$, where $J$ is some nonsingular matrix, and many of these applications consider Hermitian or skew-Hermitian $J$. The hyperbolic scalar product defined by a signature matrix $J = \operatorname{diag}(j_1, \dots, j_n)$, where $j_k \in \{-1, 1\}$ for all $k$, arises frequently in applications. It is used, for example, in the theory of relativity and in the study of polarized light. More on the applications of such products can be found in [10,13,14,17].

The Euclidean matrix decompositions have some nice structure-preserving properties even in nonstandard scalar products, as shown by Mackey, Mackey and Tisseur [16], but it is often worth looking into versions of such decompositions that respect the structures related to the given scalar product. There is plenty of research on the subject, e.g., the hyperbolic SVD [17,24], the $J_1J_2$-SVD [9], the two-sided hyperbolic SVD [20], the hyperbolic CS decomposition [8,10] and the indefinite QR factorization [19].

There are many advantages of using decompositions related to some specific, nonstandard scalar product, as such decompositions preserve structures related to the given scalar product. They can simplify calculation and provide a better insight into the structures of such structured matrices.

In this paper we investigate the existence of a decomposition which would resemble the traditional Schur decomposition, but with respect to the given hyperbolic scalar product. In other words, our similarity matrix should be unitary-like (orthonormal, to be more precise) with respect to that scalar product.

As we shall see, a hyperbolic Schur decomposition can be constructed, but not for all square matrices. Furthermore, we will have to relax the conditions on both $U$ and $T$. The matrix $U$ will be hyperexchange (a column permutation of a matrix that is unitary with respect to $J$). The matrix $T$ will have to be block upper triangular with diagonal blocks of order 1 and 2. Both of these changes are quite usual in hyperbolic scalar products. For example, they appear when passing from the traditional QR factorization to the hyperbolic QR factorization [19].

Some work on the hyperbolic Schur decomposition was done by Ammar, Mehl and Mehrmann [1, Theorem 8], but with a somewhat different focus. They assume a partitioned $J = I_p \oplus (-I_q)$, denoted in their paper as $\Sigma_{p,q}$, for which they obtain a Schur-like similarity through unitary factors (without permuting $J$), producing more complex triangular factors. Also, their decomposition is applicable only to the set of $J$-unitary matrices, denoted in their paper as the Lie group $O_{p,q}$.

In the symplectic scalar product spaces, Schur-like decompositions were researched by Lin, Mehrmann and Xu [15], by Ammar, Mehl and Mehrmann [1], and by Xu [22,23].

In Section 2, we provide a brief overview of the definitions, properties and other results relating to the hyperbolic scalar products that will be used later. In Section 3, the definition and the construction of the hyperbolic Schur decomposition are presented. We also provide sufficient requirements for its existence and examples showing why such a decomposition does not exist for all matrices. In Section 4 we observe various properties of the proposed decomposition. We finalize the results by providing the necessary and the sufficient conditions for the existence of the hyperbolic Schur decomposition of $J$-Hermitian matrices in Section 5.

The notation used is fairly standard. Capital letters refer to matrices and their blocks, matrix elements are denoted by the appropriate lowercase letter with two subscript indices, while lowercase letters with a single subscript index represent vectors (including matrix columns). By $J = \operatorname{diag}(\pm 1)$ we denote a diagonal signature matrix defining the hyperbolic scalar product, while $P$ and $P_k$ (for some indices $k$) denote permutation matrices. We use

\[
S_n := [\delta_{i,n+1-j}] = \begin{bmatrix} & & 1 \\ & \iddots & \\ 1 & & \end{bmatrix} = S_n^{-1}
\]

for the standard involutory permutation (see [6, Example 2.1.1]), $\mathcal{J}$ for a Jordan matrix and $J_k(\lambda)$ for a single Jordan block of order $k$ associated with the eigenvalue $\lambda$. The vector $e_k$ denotes the $k$-th column of the identity matrix and $\otimes$ denotes the Kronecker product. The symbol $\oplus$ is used to describe a diagonal concatenation of matrices, i.e., $A \oplus B$ is a block diagonal matrix with the diagonal blocks $A$ and $B$.

Another standard notation, although somewhat incorrect in terms of indefinite scalar products, is $|v| := \sqrt{|[v, v]|}$. It plays the role of the norm of the vector $v$ induced by the scalar product $[\cdot,\cdot]$, but one should keep in mind that it does not have the usual properties of a norm (definiteness and the triangle inequality do not hold). It is used nevertheless because of its relation to the scalar product.

2. The hyperbolic scalar products

As mentioned in the introduction, an indefinite scalar product is defined by a nonsingular Hermitian indefinite matrix $J \in \mathbb{C}^{n\times n}$ as $[x, y]_J = y^*Jx$. When $J$ is known from the context, we simply write $[x, y]$ instead of $[x, y]_J$.

When $J$ is a signature matrix, i.e., $J = \operatorname{diag}(\pm 1) := \operatorname{diag}(j_{11}, j_{22}, \dots, j_{nn})$, where $j_{kk} \in \{-1, 1\}$ for all $k$, the scalar product is referred to as hyperbolic and takes the form

\[
[x, y]_J = y^*Jx = \sum_{i=1}^{n} j_{ii}\, x_i\, \overline{y_i}.
\]

Throughout this paper we assume that all considered scalar products are hyperbolic, unless stated otherwise.
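As a small illustration of the formula above, the following sketch evaluates the hyperbolic scalar product for a signature given as a list of signs. The function name `hyp_inner` and the sample vectors are illustrative choices, not part of the paper.

```python
def hyp_inner(x, y, signs):
    """Return [x, y]_J = y^* J x for J = diag(signs), with signs in {-1, +1}."""
    if not (len(x) == len(y) == len(signs)):
        raise ValueError("x, y and signs must have the same length")
    # Linear in x, conjugate-linear in y, weighted by the signature.
    return sum(j * xi * complex(yi).conjugate() for j, xi, yi in zip(signs, x, y))


signs = [1, -1]                       # J = diag(1, -1)
x = [2 + 1j, 1]
y = [1, 1j]
print(hyp_inner(x, y, signs))         # [x, y]
print(hyp_inner(x, x, signs))         # [x, x] is always real, here positive
print(hyp_inner([1, 1], [1, 1], signs))  # a nonzero vector with [v, v] = 0
```

The last call shows that a nonzero vector can have zero scalar product with itself, which is the source of the difficulties discussed in the rest of the paper.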

Indefinite scalar products have another important property which, unfortunately, causes a major problem with the construction of the decomposition. A vector $v \neq 0$ is said to be $J$-degenerate if $[v, v] = 0$; otherwise, we say that it is $J$-nondegenerate. Degenerate vectors are sometimes also called $J$-neutral.

If $[v, v] < 0$ for some vector $v$, we say that $v$ is $J$-negative, while we call it $J$-positive if $[v, v] > 0$. When $J$ is known from the context, we simply say that the vector is degenerate, nondegenerate, neutral, negative or positive.

We extend this notion to matrices as well: a matrix $A$ is $J$-degenerate if $\operatorname{rank} A^*JA < \operatorname{rank} A$. Otherwise, we say that $A$ is $J$-nondegenerate. Again, if $J$ is known from the context, we simply say that $A$ is degenerate or nondegenerate.

We say that the vector $v$ is $J$-normalized, or just normalized when $J$ is known from the context, if $|[v, v]| = 1$. As in the Euclidean scalar product, if a vector $v$ is given, then the vector

\[
v' = \frac{1}{|v|}\, v = \frac{1}{\sqrt{|[v, v]|}}\, v \tag{1}
\]

is a normalization of $v$. Note that degenerate vectors cannot be normalized. Also, for a given vector $x \in \mathbb{C}^n$, $\operatorname{sign}[\xi x, \xi x]$ is constant for all $\xi \in \mathbb{C} \setminus \{0\}$. This means that the normalization (1) does not change the sign of the scalar product, i.e.,

\[
\operatorname{sign}[v, v] = \operatorname{sign}[v', v'] = [v', v'].
\]

As in the Euclidean scalar products, we define the $J$-conjugate transpose (or $J$-adjoint) of $A$ with respect to a hyperbolic $J$, denoted as $A^{[*]_J}$, by $[Ax, y]_J = [x, A^{[*]_J}y]_J$ for all vectors $x, y \in \mathbb{C}^n$. It is easy to see that $A^{[*]_J} = JA^*J$. Again, if $J$ is known from the context, we simply write $A^{[*]}$.
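The identity $A^{[*]} = JA^*J$ can be checked numerically on a small example. The sketch below uses plain nested lists; the helper names (`j_adjoint`, `apply`, `hyp_inner`) and the test data are assumptions made for illustration only.

```python
def hyp_inner(x, y, signs):
    """[x, y]_J = y^* J x for J = diag(signs)."""
    return sum(j * a * complex(b).conjugate() for j, a, b in zip(signs, x, y))


def apply(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]


def j_adjoint(a, signs):
    """Return J A^* J; entry (i, j) is signs[i] * conj(a[j][i]) * signs[j]."""
    n = len(signs)
    return [[signs[i] * complex(a[j][i]).conjugate() * signs[j] for j in range(n)]
            for i in range(n)]


signs = [1, -1]
A = [[1, 2j], [3, 4]]
x = [1, 1j]
y = [2, 1 - 1j]

lhs = hyp_inner(apply(A, x), y, signs)                     # [Ax, y]
rhs = hyp_inner(x, apply(j_adjoint(A, signs), y), signs)   # [x, A^[*] y]
print(lhs, rhs)
```

Both values agree, as the defining relation of the $J$-adjoint requires.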

The usual structured matrices are defined naturally. A matrix $H$ is called $J$-Hermitian (or $J$-selfadjoint) if $H^{[*]} = H$, i.e., if $JH$ is Hermitian. A matrix $U$ is said to be $J$-unitary if $U^{[*]} = U^{-1}$, i.e., if $U^*JU = J$.
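Both characterizations are easy to test directly. The following sketch checks that a hyperbolic rotation is $J$-unitary for $J = \operatorname{diag}(1,-1)$ and that a sample matrix is $J$-Hermitian; the function names, the tolerance and the sample matrices are illustrative assumptions, not taken from the paper.

```python
import math


def gram(u, signs):
    """Return U^* J U for J = diag(signs)."""
    n, m = len(u), len(u[0])
    return [[sum(complex(u[k][p]).conjugate() * signs[k] * u[k][q] for k in range(n))
             for q in range(m)] for p in range(m)]


def is_j_unitary(u, signs, tol=1e-12):
    """Check U^* J U = J entrywise, up to a tolerance."""
    g = gram(u, signs)
    n = len(signs)
    return all(abs(g[p][q] - (signs[p] if p == q else 0)) <= tol
               for p in range(n) for q in range(n))


def is_j_hermitian(h, signs, tol=1e-12):
    """Check that J H is Hermitian, up to a tolerance."""
    n = len(signs)
    return all(abs(signs[i] * h[i][j] - complex(signs[j] * h[j][i]).conjugate()) <= tol
               for i in range(n) for j in range(n))


signs = [1, -1]
t = 0.7
u = [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]  # hyperbolic rotation
h = [[1, 2], [-2, 3]]                                             # J H is symmetric

print(is_j_unitary(u, signs))
print(is_j_hermitian(h, signs))
```

Note that `h` itself is not Hermitian; only the product $JH$ is, which is what the definition asks for.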

Like their traditional counterparts, $J$-unitary matrices are orthonormal with respect to $[\cdot,\cdot]_J$. However, unlike in the Euclidean scalar product, in hyperbolic scalar products we have a wider class of matrices orthonormal with respect to $J$ (which are not necessarily unitary with respect to the same scalar product), called $J$-hyperexchange matrices.


We say that $U$ is $J$-hyperexchange if $U^*JU = P^*JP$ for some permutation $P$. Although the term "hyperexchange" is quite common, we often refer to such matrices as $J$-orthonormal, to emphasize that their columns are $J$-orthonormal vectors. In other words, if the columns of $U$ are denoted as $u_i$, then

\[
[u_i, u_i] = \pm 1, \qquad [u_i, u_j] = 0, \qquad \text{for all } i \neq j.
\]
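The difference between the two classes can be seen on the smallest possible example. The sketch below tests the hyperexchange property by checking that $U^*JU$ is a diagonal sign matrix with the same number of $+1$ entries as $J$, which is equivalent to $U^*JU = P^*JP$ for some permutation $P$. The function names are hypothetical.

```python
def gram(u, signs):
    """Return U^* J U for J = diag(signs)."""
    n = len(signs)
    return [[sum(complex(u[k][p]).conjugate() * signs[k] * u[k][q] for k in range(n))
             for q in range(n)] for p in range(n)]


def is_hyperexchange(u, signs, tol=1e-12):
    """True if U^* J U is diagonal with entries +-1 and the same sign counts as J."""
    g = gram(u, signs)
    n = len(signs)
    for p in range(n):
        for q in range(n):
            target = 0 if p != q else (1 if g[p][p].real > 0 else -1)
            if abs(g[p][q] - target) > tol:
                return False
    return sorted(round(g[p][p].real) for p in range(n)) == sorted(signs)


def is_j_unitary(u, signs, tol=1e-12):
    """True if U^* J U = J."""
    g = gram(u, signs)
    n = len(signs)
    return all(abs(g[p][q] - (signs[p] if p == q else 0)) <= tol
               for p in range(n) for q in range(n))


signs = [1, -1]
swap = [[0, 1], [1, 0]]   # exchanges the positive and the negative direction

print(is_hyperexchange(swap, signs))  # True
print(is_j_unitary(swap, signs))      # False
```

The swap matrix has $J$-orthonormal columns, but it maps $J$ to $\operatorname{diag}(-1, 1)$, so it is hyperexchange without being $J$-unitary.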

More on the definitions and properties related to hyperbolic (and, more generally, indefinite) scalar products can be found in [6].

Throughout the paper, we often consider the diagonal blocks of a given matrix $A$. In order to keep the relation with the given hyperbolic scalar product induced by some $J = \operatorname{diag}(\pm 1)$, we introduce the term the corresponding part of $J$. Let $J = \operatorname{diag}(j_1, \dots, j_n)$, $j_k \in \{-1, 1\}$, define a hyperbolic scalar product and let $A = [A_{ij}] \in \mathbb{C}^{n\times n}$ be a block matrix with elements $a_{pq}$, partitioned into the blocks $A_{ij}$ in such a way that each diagonal block $A_{kk}$ is of order 1 or 2. Observing the block $A_{kk}$ for a given $k$, the corresponding part of $J$, here denoted as $J'$, is defined as $J' = [j_p]$ if $A_{kk} = [a_{pp}]$ is of order 1 and as $J' = \operatorname{diag}(j_p, j_{p+1})$ if

\[
A_{kk} = \begin{bmatrix} a_{pp} & a_{p,p+1} \\ a_{p+1,p} & a_{p+1,p+1} \end{bmatrix}
\]

is of order 2.

3. Definition and existence of the hyperbolic Schur decomposition

In this section we present the definition and the main results regarding the hyperbolic Schur decomposition of a matrix $A \in \mathbb{C}^{n\times n}$.

Like other hyperbolic generalizations of the Euclidean decompositions, this one also has a hyperexchange matrix instead of a unitary one, as well as a block structured factor instead of the triangular/diagonal factor of the Schur decomposition. Block upper triangular matrices with the diagonal blocks of order 1 and 2 are usually referred to as quasitriangular (see [18, Section 3.3]). Similarly, if $T$ is block diagonal with the diagonal blocks of order 1 and 2, we refer to it as quasidiagonal.

Before providing a formal definition, let us first show the obstacles which explain why these changes are necessary.

It is a well known fact that the first column $u_1$ of the similarity matrix $U$ in any similarity triangularization $A = UTU^{-1}$ is an eigenvector of $A$. Also, it is easy to construct a nonsingular matrix with all degenerate columns (see [21, Example 3.1]), which cannot be $J$-normalized. This means that there are some matrices (even diagonalizable ones!) that are not unitarily triangularizable with respect to the scalar product induced by $J$. However, allowing the blocks of $T$ to be of order 2, we relax that condition, as shown in Example 3.1.

For better clarity, we say that a block $X_{jj}$ of a matrix $X$ is irreducible if it cannot be split into smaller blocks without losing the block structure of the matrix. If $X = \bigoplus_{i=1}^{k} X_{ii}$ is block diagonal, we say that $X_{jj}$ is irreducible if it cannot be split as $X_{jj} = Y_1 \oplus Y_2$ for some square blocks $Y_1$ and $Y_2$. For a block triangular $X$, we say that $X_{jj}$ is irreducible if it cannot be split as

\[
X_{jj} = \begin{bmatrix} Y_{11} & Y_{12} \\ & Y_{22} \end{bmatrix},
\]

where $Y_{11}$ and $Y_{22}$ are square blocks. Since we mostly deal with blocks of order 2, the irreducible blocks will be those of order 1 and those of order 2 with one of the nondiagonal elements (in case $X$ is block diagonal) or the bottom left element (in case $X$ is block triangular) being nonzero.

The described notion of irreducible blocks is just a descriptive way of saying that we cannot split $X$ into smaller blocks while preserving its block diagonal or block triangular structure. It is used to simplify some of the statements and proofs by reducing the number of observed cases without losing generality. See [21, Example 3.2] for further clarification of the concept of irreducible blocks.

Unfortunately, allowing the triangular factor to have blocks is not enough. We also need permutations of the decomposing matrix and the scalar product generator $J$. The following example shows why this is needed, simultaneously illustrating a general approach to proving that some matrices do not have a hyperbolic Schur decomposition.


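Example 3.1. Let $J = \operatorname{diag}(1, 1, -1, -1)$ and let $A = S\mathcal{J}S^{-1}$, where $\mathcal{J} = \operatorname{diag}(1, 2, 3, 4)$ and

\[
S = \begin{bmatrix}
\frac{1}{5\sqrt{3}} & \frac{1}{17\sqrt{15}} & \frac{1}{195\sqrt{7}} & \frac{1}{257\sqrt{255}} \\[2pt]
\frac{2}{5\sqrt{3}} & \frac{4}{17\sqrt{15}} & \frac{8}{195\sqrt{7}} & \frac{16}{257\sqrt{255}} \\[2pt]
\frac{4}{5\sqrt{3}} & \frac{16}{17\sqrt{15}} & \frac{64}{195\sqrt{7}} & \frac{256}{257\sqrt{255}} \\[2pt]
\frac{8}{5\sqrt{3}} & \frac{64}{17\sqrt{15}} & \frac{512}{195\sqrt{7}} & \frac{4096}{257\sqrt{255}}
\end{bmatrix}.
\]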

We shall now show that no $J$-unitary $V$ and quasitriangular $T$ exist such that $A = VTV^{-1}$. Since

\[
S^*JS = \begin{bmatrix}
-1 & -0.994393 & -0.970230 & -0.949852 \\
-0.994393 & -1 & -0.993828 & -0.985075 \\
-0.970230 & -0.993828 & -1 & -0.998151 \\
-0.949852 & -0.985075 & -0.998151 & -1
\end{bmatrix} \tag{2}
\]

to 6 significant digits, all eigenvectors of $A$ (i.e., columns of $S$) are normalized negative vectors. Furthermore, if we denote them by $s_i$, then $-1 < [s_i, s_j] < 0$ for all $i \neq j$.
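The Gram matrix (2) can be reproduced numerically. The sketch below assumes that $S$ is read as the matrix whose columns are $(1, b, b^2, b^3)^T$ for $b = 2, 4, 8, 16$, divided by $5\sqrt{3}$, $17\sqrt{15}$, $195\sqrt{7}$ and $257\sqrt{255}$ respectively; this reading of the extracted matrix and all variable names are assumptions of the illustration.

```python
import math

signs = [1, 1, -1, -1]                       # J = diag(1, 1, -1, -1)
bases = [2, 4, 8, 16]
scales = [5 * math.sqrt(3), 17 * math.sqrt(15),
          195 * math.sqrt(7), 257 * math.sqrt(255)]

# columns[k] is the k-th eigenvector s_{k+1} (real, so no conjugation is needed)
columns = [[b ** i / d for i in range(4)] for b, d in zip(bases, scales)]


def hyp_inner(x, y, signs):
    """[x, y]_J for real vectors."""
    return sum(j * a * b for j, a, b in zip(signs, x, y))


gram = [[hyp_inner(columns[q], columns[p], signs) for q in range(4)] for p in range(4)]
for row in gram:
    print(" ".join(f"{g: .6f}" for g in row))
```

Under this reading, every diagonal entry equals $-1$ and every off-diagonal entry lies strictly between $-1$ and $0$, which is exactly what the argument below relies on.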

Let us now assume that there exist a $J$-unitary $V$ and a quasitriangular $T$ such that $A = VTV^{-1}$. We distinguish the following possibilities:

1. The first block of $T$ is of order 1. Then $v_1$ is obviously a $J$-normalized eigenvector of $A$. But this is impossible, since $V$ is $J$-unitary, meaning that $V^*JV = J$, so $[v_1, v_1] = 1$, while $[s_i, s_i] < 0$ for all $i$ and, since normalization does not change the sign of a vector's scalar product with itself, $[v_1, v_1] < 0$.

2. The first block of $T$ is an irreducible block of order 2 (i.e., $t_{21} \neq 0$). Then it is easy to see that

\[
Av_1 = t_{11}v_1 + t_{21}v_2, \qquad Av_2 = t_{12}v_1 + t_{22}v_2.
\]

In other words,

\[
(A - t_{11}I)v_1 = t_{21}v_2, \qquad (A - t_{22}I)v_2 = t_{12}v_1.
\]

Multiplying the second equality by $t_{21}$ and substituting $t_{21}v_2$ with the expression from the first equality, we get

\[
t_{12}t_{21}v_1 = (A - t_{22}I)\,t_{21}v_2 = (A - t_{22}I)(A - t_{11}I)v_1.
\]

In other words, $v_1$ is an eigenvector of $(A - t_{22}I)(A - t_{11}I)$. Using the same argument, we see that $v_2$ is an eigenvector of $(A - t_{11}I)(A - t_{22}I)$. Furthermore,

\[
(A - t_{22}I)(A - t_{11}I)v_1 = A^2v_1 - (t_{11} + t_{22})Av_1 + t_{11}t_{22}v_1 = t_{12}t_{21}v_1.
\]

Now, we have

\[
\begin{aligned}
0 &= A^2v_1 - (t_{11} + t_{22})Av_1 + (t_{11}t_{22} - t_{12}t_{21})v_1 \\
  &= \left(A - \frac{t_{11} + t_{22}}{2}\,I\right)^{2} v_1 - \left(\frac{(t_{11} - t_{22})^2}{4} + t_{12}t_{21}\right) v_1.
\end{aligned}
\]

So, $v_1$ is an eigenvector of

\[
A_2 := \left(A - \frac{t_{11} + t_{22}}{2}\,I\right)^{2}, \tag{3}
\]

i.e.,

\[
A_2v_1 = \lambda' v_1, \qquad \lambda' := \frac{(t_{11} - t_{22})^2}{4} + t_{12}t_{21}. \tag{4}
\]

But, since

\[
(A - t_{22}I)(A - t_{11}I) = (A - t_{11}I)(A - t_{22}I),
\]

$v_1$ and $v_2$ are both (linearly independent) eigenvectors of $A_2$, with the same eigenvalue $\lambda'$. Since $A$ and $A_2$ are diagonalizable, every eigenvector of $A$ is also an eigenvector of $A_2$. Moreover, since the eigenvalues of $A$ are distinct (they are 1, 2, 3 and 4), each eigenvalue of $A_2$ has multiplicity at most 2 (this is easily seen from its Jordan decomposition). In other words, its eigenspaces have dimensions at most 2, so $v_1$ and $v_2$ are linear combinations of $s_i$ and $s_j$ for some $i \neq j$. Let $v_1 = \alpha_1 s_i + \beta_1 s_j$ and $v_2 = \alpha_2 s_i + \beta_2 s_j$. Obviously, $\alpha_1, \beta_1 \neq 0$ (or $v_1$ would be an eigenvector of $A$, hence covered by case 1). Then, from $J = V^*JV$, we get

\[
\begin{aligned}
1 = j_{11} = [v_1, v_1] &= |\alpha_1|^2[s_i, s_i] + |\beta_1|^2[s_j, s_j] + 2\operatorname{Re}\bigl(\alpha_1\overline{\beta_1}[s_i, s_j]\bigr) \\
&= -\bigl(|\alpha_1|^2 + |\beta_1|^2 - 2[s_i, s_j]\operatorname{Re}(\alpha_1\overline{\beta_1})\bigr).
\end{aligned} \tag{5}
\]

From (2) we see that $-1 < [s_i, s_k] < 0$ for $i \neq k$. Using (5) and the well known fact that $|\operatorname{Re}(z)| \le |z|$, i.e., $\operatorname{Re}(z) \ge -|z|$, for all $z \in \mathbb{C}$, we see that

\[
\begin{aligned}
-1 &= |\alpha_1|^2 + |\beta_1|^2 - 2[s_i, s_j]\operatorname{Re}(\alpha_1\overline{\beta_1}) \\
&= |\alpha_1|^2 + |\beta_1|^2 + 2\bigl|[s_i, s_j]\bigr|\operatorname{Re}(\alpha_1\overline{\beta_1}) \\
&\ge |\alpha_1|^2 + |\beta_1|^2 - 2\bigl|[s_i, s_j]\bigr| \cdot |\alpha_1| \cdot |\beta_1| \\
&> |\alpha_1|^2 + |\beta_1|^2 - 2|\alpha_1| \cdot |\beta_1| = \bigl(|\alpha_1| - |\beta_1|\bigr)^2 \ge 0,
\end{aligned}
\]

which is an obvious contradiction.

Since both possible cases have led to a contradiction, the described decomposition does not exist for the pair $(A, J)$.

Knowing the obstacles, we are now ready to define the hyperbolic Schur decomposition.

Definition 3.2 (The hyperbolic Schur decomposition). For a given $A \in \mathbb{C}^{n\times n}$ and $J = \operatorname{diag}(\pm 1)$, a hyperbolic Schur decomposition of $A$ (with respect to $J$) is any $J$-orthonormal similarity of $A$ to the quasitriangular form, i.e.,

\[
A = VTV^{-1}, \qquad V^*JV = P^*JP, \tag{6}
\]

where $T$ is quasitriangular and $P$ is a permutation.

It is easy to show (see [21]) that (6) is equivalent to the following identities:

\[
\begin{aligned}
A &= (P_1UP_2)\,T\,(P_1UP_2)^{-1}, & U^*\widetilde{J}U &= \widetilde{J}, \quad \widetilde{J} = P_1^*JP_1, \\
A &= (VP)\,T\,(VP)^{-1}, & V^*JV &= J, & (7) \\
A &= (PW)\,T\,(PW)^{-1}, & W^*J'W &= J', \quad J' = P^*JP, & (8) \\
P^*AP &= WTW^{-1},
\end{aligned}
\]

where $U$ is $\widetilde{J}$-unitary, $V$ is $J$-unitary, $W$ is $J'$-unitary and $P$, $P_1$ and $P_2$ are permutations such that $P = P_1P_2$.
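One direction of these equivalences is a one-line computation. The following sketch (not taken from [21]) shows that the similarity matrices in (7) and (8) are $J$-orthonormal in the sense of (6), using only that $V$ is $J$-unitary, that $W$ is $J'$-unitary and that $J' = P^*JP$:

```latex
\[
\begin{aligned}
(VP)^* J\,(VP) &= P^*\,(V^*JV)\,P = P^*JP, \\
(PW)^* J\,(PW) &= W^*\,(P^*JP)\,W = W^*J'W = J' = P^*JP.
\end{aligned}
\]
```

In both cases the permutation only reorders the diagonal of $J$, which is why hyperexchange matrices, rather than $J$-unitary ones, are the natural similarity matrices here.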

Throughout the paper we also consider decompositions that resemble the Schur decomposition, but with some of the irreducible blocks in $T$ of order strictly larger than 2. We refer to such decompositions as the hyperbolic Schur-like decompositions.

Naturally, the most interesting question here concerns the existence of such a decomposition. The following theorem shows that all diagonalizable matrices have it.


Theorem 3.3 (Diagonalizable matrix). If $A \in \mathbb{C}^{n\times n}$ is diagonalizable, then it has a hyperbolic Schur decomposition with respect to any given $J = \operatorname{diag}(\pm 1)$.

Proof. Let $A = S\mathcal{J}S^{-1}$, where $\mathcal{J}$ is a diagonal matrix, be a Jordan decomposition of $A$. Since $S$ is nonsingular, and therefore $S^*JS$ is of full rank, by [19, Theorem 5.3] there exist matrices $Q \in \mathbb{C}^{n\times n}$ and $R \in \mathbb{C}^{n\times n}$ and permutations $P_1$ and $P_2$ such that

\[
S = P_1QRP_2^*, \qquad Q^*\widetilde{J}Q = \widetilde{J}, \qquad \widetilde{J} = P_1^*JP_1,
\]

and $R$ is quasitriangular. Note that, since $S$ is nonsingular, $R$ is nonsingular as well and is therefore invertible. So,

\[
A = S\mathcal{J}S^{-1} = P_1QRP_2^*\,\mathcal{J}\,P_2R^{-1}Q^{-1}P_1^*. \tag{9}
\]

Since $\mathcal{J}$ is diagonal, $P_2^*\mathcal{J}P_2$ is diagonal as well, which means it is also (block) upper triangular. We already have that $R$ is quasitriangular, and $R^{-1}$ has the same block triangular structure as $R$. In other words,

\[
T := RP_2^*\,\mathcal{J}\,P_2R^{-1}
\]

is quasitriangular. From this and (9), we see that there exists $P := P_1$ such that

\[
A = (PQ)\,T\,(PQ)^{-1}, \qquad Q^*\widetilde{J}Q = \widetilde{J}, \qquad \widetilde{J} = P^*JP,
\]

so (8) holds (with $W = Q$ and $J' = \widetilde{J}$), which means that $A$ has a hyperbolic Schur decomposition. □

Obviously, there are also some nondiagonalizable matrices that have a hyperbolic Schur decomposition. Trivial examples are all Jordan matrices with at least one diagonal block of order greater than 1, since they are by definition both nondiagonalizable and triangular.

In Theorem 3.3, we assume $A$ to be diagonalizable, which is then conveniently used in the proof. Under the assumption of diagonalizability, we can also mimic the traditional proof from the Euclidean case, using the diagonalizability for a convenient choice of the columns of the similarity matrix. Although technically far more complex than in the Euclidean case, this proof is also fairly straightforward and will be used to prove the existence of the hyperbolic Schur decomposition for some nondiagonalizable matrices in Proposition 3.7.

It is only natural to ask if there exists a nondiagonalizable matrix which does not have a hyperbolic Schur decomposition. As the following example shows, such matrices do exist.

Remark 3.4. When discussing the counterexamples for the existence of a hyperbolic Schur decomposition, we shall often define our matrices via their Jordan decompositions, because the (non)existence of a hyperbolic Schur decomposition heavily depends on the degeneracy and mutual $J$-orthogonality of the (generalized) eigenvectors, i.e., of (some) columns in the similarity matrix $S$.

Example 3.5. Let $J = \operatorname{diag}(1, -1, 1, -1)$ and $A = SJ_4(\lambda)S^{-1}$ for some $\lambda \in \mathbb{C}$ and

\[
S = \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 1 & -1 \\
1 & -1 & 1 & -1 \\
1 & -1 & -1 & -1
\end{bmatrix}.
\]

Let us assume that $A$ has a hyperbolic Schur decomposition (6). As usual, $T_{11}$ denotes the smallest irreducible top left diagonal block of $T = [t_{ij}]$ (i.e., such that the elements of $T$ beneath $T_{11}$ are zero). In other words, $T_{11}$ is of order 1 if $t_{21} = 0$ and of order 2 otherwise.

We denote the columns of $V$ as $v_i$ and the columns of $S$ as $s_i$. Note that

\[
[v_i, v_j] = \begin{cases} \pm 1, & i = j, \\ 0, & i \neq j. \end{cases}
\]


Now, if $T_{11}$ is of order 1, then $v_1$ is an eigenvector of $A$, i.e., it is collinear with $s_1$. In other words, $v_1 = \xi s_1$ for some $\xi \neq 0$. Then

\[
1 = \bigl|[v_1, v_1]\bigr| = |\xi|^2\bigl|[s_1, s_1]\bigr| = 0,
\]

which is an obvious contradiction, so $T_{11}$ is of order 2.

Let $T_{11}$ be of order 2, with the elements denoted as $t_{ij}$, for $i, j \in \{1, 2\}$. Similarly to Example 3.1 (case 2), we define

\[
A_2 := \left(A - \frac{t_{11} + t_{22}}{2}\,I\right)^{2}, \qquad \lambda' := \frac{(t_{11} - t_{22})^2}{4} + t_{12}t_{21}.
\]

Now, the first two columns of $V$, $v_1$ and $v_2$, are both (linearly independent) eigenvectors of $A_2$ with the same eigenvalue $\lambda'$.

It is easy to see that if $A_2$ is nonsingular, then it is similar to $J_4(\lambda')$, which means it has only one eigenvector (up to scaling). This contradicts the assumption that $v_1$ and $v_2$ are linearly independent. Hence, $A_2$ is singular, i.e., $\lambda' = 0$, and we get $t_{11} + t_{22} = 2\lambda$. A simple calculation yields two eigenvectors of $A_2$:

\[
s_1' = [1\ 1\ 0\ 0]^T, \qquad s_2' = [0\ 0\ 1\ 1]^T.
\]

Since they span the same eigenspace as $v_1$ and $v_2$, we conclude that

\[
v_1 = \alpha_{11}s_1' + \alpha_{21}s_2', \qquad v_2 = \alpha_{12}s_1' + \alpha_{22}s_2'
\]

for some $\alpha_{11}, \alpha_{12}, \alpha_{21}, \alpha_{22}$. Note that

\[
[s_1', s_1'] = 0, \qquad [s_2', s_2'] = 0, \qquad [s_1', s_2'] = 0,
\]

which means that $s_1'$ and $s_2'$ are both $J$-degenerate and mutually $J$-orthogonal. The contradiction is now obvious:

\[
\begin{aligned}
\pm 1 = [v_1, v_1] &= [\alpha_{11}s_1' + \alpha_{21}s_2',\ \alpha_{11}s_1' + \alpha_{21}s_2'] \\
&= |\alpha_{11}|^2[s_1', s_1'] + 2\operatorname{Re}\bigl(\alpha_{11}\overline{\alpha_{21}}[s_1', s_2']\bigr) + |\alpha_{21}|^2[s_2', s_2'] = 0.
\end{aligned}
\]

It is fairly easy to construct a matrix $A$ of order $n$ similar to the one in Example 3.5 for the Schur-like decompositions with bigger diagonal blocks of $T$. This means some matrices do not have a hyperbolic Schur-like decomposition, regardless of the order of the biggest diagonal block in $T$, as long as this order is strictly less than $n$. For a more detailed description, see [21].

Since the blocks in $T$ are of order at most 2, it makes sense to ask if all the matrices with the Jordan blocks of order at most 2 (in their Jordan decomposition) have a hyperbolic Schur decomposition. As the following example shows, this is also, unfortunately, not the case.

Example 3.6. Let $J = \operatorname{diag}(1, -1, 1, -1)$ and let $A = S\mathcal{J}S^{-1}$, where

\[
S = \begin{bmatrix}
1 & 1 & 1 & 2 \\
1 & 2 & 1 & 1 \\
1 & 2 & -1 & -1 \\
1 & 1 & -1 & -2
\end{bmatrix}, \qquad \mathcal{J} = J_2(\lambda_1) \oplus J_2(\lambda_2), \qquad \lambda_1 \neq \lambda_2.
\]

Assume that $A$ has a hyperbolic Schur decomposition (6) with $V = [v_1 \cdots v_4]$ hyperexchange and $T$ quasitriangular, and let $s_i = Se_i$. Using the same argumentation as in Example 3.5, we see that the top left block of $T$ has to be of order 2. Also, $v_1$ and $v_2$ must be eigenvectors of some $A_2(\xi) := (A - \xi I)^2$ associated with the same eigenvalue (for some $\xi$). However, $v_1$ and $v_2$ must be linearly independent, and the only 2-dimensional eigenspaces of $A_2(\xi)$ are those spanned by:

1. $\{s_1, s_2\}$ for $\xi = \lambda_1$,

2. $\{s_1, s_3\}$ for $\xi = \frac{1}{2}(\lambda_1 + \lambda_2)$, and

3. $\{s_3, s_4\}$ for $\xi = \lambda_2$.

Each of these sets consists of degenerate, mutually $J$-orthogonal vectors. It is easy to see that the linear combinations of such vectors are also degenerate (and mutually $J$-orthogonal), which contradicts the assumption that $v_1$ and $v_2$ are columns of a hyperexchange matrix $V$.
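The claim about the three spanning sets is a finite check on the columns of $S$. A minimal sketch, with the matrix entered row by row as in Example 3.6 and with illustrative names:

```python
signs = [1, -1, 1, -1]                      # J = diag(1, -1, 1, -1)
S = [[1, 1, 1, 2],
     [1, 2, 1, 1],
     [1, 2, -1, -1],
     [1, 1, -1, -2]]

# cols[k] is the column s_{k+1} of S (real entries, so no conjugation is needed)
cols = [[S[i][k] for i in range(4)] for k in range(4)]


def hyp_inner(x, y, signs):
    """[x, y]_J for real vectors."""
    return sum(j * a * b for j, a, b in zip(signs, x, y))


# Index pairs (0-based) for {s1, s2}, {s1, s3} and {s3, s4}.
candidate_pairs = [(0, 1), (0, 2), (2, 3)]
for p, q in candidate_pairs:
    block = [[hyp_inner(cols[b], cols[a], signs) for b in (p, q)] for a in (p, q)]
    print((p + 1, q + 1), block)
```

Each printed $2\times 2$ Gram block is zero, so the scalar product vanishes identically on each of the three subspaces.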

However, if we limit the matrix to have only one Jordan block of order 2 (the rest of the Jordan form being diagonal), it will always have a hyperbolic Schur decomposition, as shown in the following proposition. Its proof is a constructive one, following the idea of the iterative reduction, very much like the common proof of the Schur decomposition (see [12, Theorem 2.3.1], [7, Theorem 7.1.3] or [21]).

Proposition 3.7. Let $A \in \mathbb{C}^{n\times n}$ have a Jordan decomposition $A = S\mathcal{J}S^{-1}$ such that $\mathcal{J}$ has at most one block of order 2, while all others are of order 1. Then $A$ has a hyperbolic Schur decomposition for any given $J = \operatorname{diag}(\pm 1)$.

Proof. If all Jordan blocks of $A$ are of order 1, the matrix $A$ is diagonalizable and, by Theorem 3.3, has a hyperbolic Schur decomposition. So, we shall only consider the case when $A$ has (exactly one) Jordan block of order 2.

Case 1. If there is a nondegenerate eigenvector $s_1$ of $A$, we can $J$-normalize it, obtaining the $J$-normalized eigenvector $v_1 = s_1/|s_1|$. As explained in [6, p. 10], $v_1$ can be expanded to a $J$-orthonormal basis $\{v_1, v_2, \dots, v_n\}$. Defining a matrix

\[
V := [v_1\ v_2\ \cdots\ v_n],
\]

we see that

\[
A = V \begin{bmatrix} t_{11} & * \\ 0 & A' \end{bmatrix} V^{-1}, \qquad V^*JV = P^*JP.
\]

We repeat the process on $A'$ until we either get to the block of order 2 or to an $A'$ such that all its eigenvectors are $J'$-degenerate, where $J'$ is the corresponding (bottom right) part of $P^*JP$. It is not hard to see that this sequence really gives a hyperbolic Schur-like decomposition such that the blocks in $T$ are of order 1, except maybe for the bottom right one, which may be of arbitrary order and is covered by Case 2.

Case 2. We now focus on $A$ such that all its eigenvectors are $J$-degenerate, i.e., $[s_i, s_i] = 0$ for all $i$ (except, maybe, the second one, since $s_2$ is not an eigenvector but the (second) generalized eigenvector associated with the aforementioned block $J_2(\lambda)$), as this is the only case not resolved by the previously described reductions.

Without loss of generality, we may assume that

\[
\mathcal{J} = J_2(\lambda_1) \oplus \lambda_2 \oplus \cdots \oplus \lambda_{n-1}.
\]

So, by assumption, $[s_i, s_i] = 0$ for all $i \in \{1, 3, 4, \dots, n\}$. Note the absence of the second column, as this one is not an eigenvector, but a generalized eigenvector associated with $J_2(\lambda_1)$.

Case 2.1. If $[s_1, s_2] \neq 0$ and $[s_2, s_2] \neq 0$, we define

\[
v_1' = \xi s_1 + s_2, \qquad v_2' = s_2.
\]

Since $s_1$ and $s_2$ are linearly independent, $v_1'$ and $v_2'$ are also linearly independent for every $\xi \neq 0$. We shall define the appropriate $\xi$ in a moment. Note that

\[
[v_1', v_2'] = [\xi s_1 + s_2, s_2] = \xi[s_1, s_2] + [s_2, s_2].
\]

Since we want to construct a $J$-orthonormal set, we want $[v_1', v_2'] = 0$, so we define

\[
\xi := -\frac{[s_2, s_2]}{[s_1, s_2]}.
\]

Now that $v_1'$ and $v_2'$ are $J$-orthogonal, we need to be able to $J$-normalize them and, in order to do that, we need them to be nondegenerate. The vector $v_2' = s_2$ is nondegenerate by assumption. We check the (non)degeneracy of the vector $v_1'$, using the fact that $[s_2, s_2] \in \mathbb{R}$, which is valid for all vectors in any indefinite scalar product space:

\[
\begin{aligned}
[v_1', v_1'] &= [\xi s_1 + s_2, \xi s_1 + s_2] = |\xi|^2[s_1, s_1] + 2\operatorname{Re}\bigl(\xi[s_1, s_2]\bigr) + [s_2, s_2] \\
&= -2\operatorname{Re}[s_2, s_2] + [s_2, s_2] = -[s_2, s_2] \neq 0.
\end{aligned}
\]

We define $v_1 = v_1'/|v_1'|$ and $v_2 = v_2'/|v_2'|$, obtaining the $J$-orthonormal set $\{v_1, v_2\}$. As we did in Case 1, we expand this set to a $J$-orthonormal basis $\{v_1, v_2, \dots, v_n\}$, define the $J$-orthonormal matrix $V$ with columns $v_1, \dots, v_n$ and, by construction, see that

\[
A = V \begin{bmatrix} T_{11} & * \\ 0 & A' \end{bmatrix} V^{-1}, \qquad V^*JV = P^*JP. \tag{10}
\]

Here, $T_{11}$ is of order 2 and the matrix $A'$ is diagonalizable. Hence, by Theorem 3.3, $A'$ has a hyperbolic Schur decomposition, so $A$ has one too.
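The construction of Case 2.1 can be tried on concrete vectors. The sketch below uses hypothetical sample vectors with $[s_1, s_1] = 0$, $[s_1, s_2] \neq 0$ and $[s_2, s_2] \neq 0$ for $J = \operatorname{diag}(1, -1)$; they are not taken from the paper and only illustrate the choice of $\xi$ and the normalization (1).

```python
def hyp_inner(x, y, signs):
    """[x, y]_J = y^* J x for J = diag(signs)."""
    return sum(j * a * complex(b).conjugate() for j, a, b in zip(signs, x, y))


def normalize(v, signs):
    """Scale v so that |[v, v]| = 1; degenerate vectors cannot be normalized."""
    norm = abs(hyp_inner(v, v, signs)) ** 0.5
    if norm == 0:
        raise ValueError("cannot J-normalize a degenerate vector")
    return [a / norm for a in v]


signs = [1, -1]
s1 = [1, 1j]     # degenerate: [s1, s1] = 0
s2 = [2, 1]      # nondegenerate: [s2, s2] = 3

xi = -hyp_inner(s2, s2, signs) / hyp_inner(s1, s2, signs)
w1 = [xi * a + b for a, b in zip(s1, s2)]    # v'_1 = xi*s1 + s2
w2 = list(s2)                                # v'_2 = s2
v1, v2 = normalize(w1, signs), normalize(w2, signs)

print(hyp_inner(w1, w1, signs))   # equals -[s2, s2], up to rounding
print(hyp_inner(v1, v2, signs))   # vanishes, up to rounding
```

The two normalized vectors have scalar products of opposite signs with themselves, as the identity $[v_1', v_1'] = -[s_2, s_2]$ predicts.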

Case 2.2. Let us now assume that $[s_1, s_2] \neq 0$ and $[s_2, s_2] = 0$. We define

\[
v_1' = \xi s_1 - s_2, \qquad v_2' = \xi s_1 + s_2.
\]

As before, $v_1'$ and $v_2'$ are linearly independent for every $\xi \neq 0$. In order to define the appropriate $\xi$, note that

\[
\begin{aligned}
[v_1', v_2'] &= [\xi s_1 - s_2, \xi s_1 + s_2] = |\xi|^2[s_1, s_1] + 2i\operatorname{Im}\bigl(\xi[s_1, s_2]\bigr) - [s_2, s_2] \\
&= 2i\operatorname{Im}\bigl(\xi[s_1, s_2]\bigr).
\end{aligned}
\]

Hence, to obtain the $J$-orthogonality of $v_1'$ and $v_2'$, we define

\[
\xi := \overline{[s_1, s_2]},
\]

so that $\xi[s_1, s_2] = |[s_1, s_2]|^2$ is real, once again getting a $J$-orthogonal set $\{v_1', v_2'\}$. As before, we check that these vectors are $J$-nondegenerate:

\[
\begin{aligned}
[v_1', v_1'] &= [\xi s_1 - s_2, \xi s_1 - s_2] = |\xi|^2[s_1, s_1] - 2\operatorname{Re}\bigl(\xi[s_1, s_2]\bigr) + [s_2, s_2] = -2\bigl|[s_1, s_2]\bigr|^2 \neq 0, \\
[v_2', v_2'] &= [\xi s_1 + s_2, \xi s_1 + s_2] = |\xi|^2[s_1, s_1] + 2\operatorname{Re}\bigl(\xi[s_1, s_2]\bigr) + [s_2, s_2] = 2\bigl|[s_1, s_2]\bigr|^2 \neq 0.
\end{aligned}
\]

Next, we $J$-normalize the vectors $v_1'$ and $v_2'$ to obtain a $J$-orthonormal set $\{v_1, v_2\}$, and expand it to a $J$-orthonormal basis $\{v_1, \dots, v_n\}$ and a $J$-orthonormal matrix $V$. By construction, (10) holds and we can further decompose $A'$, which is again diagonalizable.

Case 2.3. We have now covered all the cases such that $[s_1, s_2] \neq 0$, so we now assume that $[s_1, s_2] = 0$. Let $k$ be such that $[s_1, s_k] \neq 0$. Obviously, $k \neq 2$, so both vectors $s_1$ and $s_k$ are $J$-degenerate eigenvectors of $A$. Note that such $k$ must exist because, otherwise, the first row and column of $S^*JS$ would be zero, which contradicts the assumption that $S$ and $J$ are nonsingular.

We handle this case in exactly the same way as Case 2.2. The only difference is in the exact formula for the block $T_{11}$ in (10), which we omit, as it is unimportant for the proof. □


We have researched the existence of the hyperbolic Schur decomposition of a given matrix in relation to its Jordan structure. But the Schur decomposition is particularly interesting as a decomposition that preserves structures (with respect to the corresponding scalar product), so it is only natural to research the existence of the hyperbolic Schur decomposition for $J$-Hermitian and $J$-unitary matrices.

A discussion in [6, Section 5.6] gives a detailed analysis of $J$-Hermitian matrices for the case when $J$ (in [6] denoted as $H$) has exactly one negative eigenvalue. In the case of the hyperbolic scalar products, the discussion is basically about the Minkowski spaces, i.e.,

\[
J = \pm\operatorname{diag}(1, -1, \dots, -1) \quad \text{or} \quad J = \pm\operatorname{diag}(1, \dots, 1, -1).
\]

However, Proposition 3.7 cannot be used to conclude that all $J$-Hermitian matrices in the Minkowski space have a hyperbolic Schur decomposition. The problem arises from case (iv) in the aforementioned discussion in [6, Section 5.6], which states that a $J$-Hermitian matrix can have a Jordan block of order 3. Using this case, we construct the following very important example, which shows that there really exist such matrices that have no hyperbolic Schur decomposition.

Example 3.8 (J-Hermitian matrix that does not have a hyperbolic Schur decomposition). Let $J = \operatorname{diag}(1, 1, -1)$ and let

\[ A = \begin{bmatrix} 12 & 11 & -16 \\ 11 & 8 & -13 \\ 16 & 13 & -20 \end{bmatrix} = S \mathcal{J}_3(0) S^{-1}, \qquad S = \begin{bmatrix} 3 & 5 & 3 \\ 4 & 5 & 3 \\ 5 & 7 & 4 \end{bmatrix}. \]

Let us assume that $A$ has a hyperbolic Schur decomposition $A = V T V^{-1}$, $V^* J V = P^* J P$. From the definition of $A$, it is obvious that all the eigenvectors of $A$ are collinear with $s_1$. Note that

\[ S^* J S = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix}, \]

which means that $s_1$ is degenerate and J-orthogonal to $s_2$. Following the previous discussions (see Example 3.5 or the proof of Proposition 3.7), we see that the top left block of $T$ must be of order 2. But the associated (first two) columns of $V$, denoted $v_1$ and $v_2$, must be J-normal, mutually J-orthogonal linear combinations of $s_1$ and $s_2$. This is impossible, since $s_1$ is degenerate and $s_1$ and $s_2$ are J-orthogonal.

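To show this, assume that such vectors exist, i.e., we have $\alpha_{ij}$ such that

\[ v_1 = \alpha_{11} s_1 + \alpha_{12} s_2, \qquad v_2 = \alpha_{21} s_1 + \alpha_{22} s_2. \]

We check the desired properties of $v_1$ and $v_2$. First, J-normality:

\[
\begin{aligned}
1 = [v_1, v_1] &= [\alpha_{11} s_1 + \alpha_{12} s_2, \alpha_{11} s_1 + \alpha_{12} s_2] \\
&= |\alpha_{11}|^2 [s_1, s_1] + 2\operatorname{Re}\bigl(\alpha_{11}\overline{\alpha_{12}}\,[s_1, s_2]\bigr) + |\alpha_{12}|^2 [s_2, s_2] = |\alpha_{12}|^2, \\
1 = [v_2, v_2] &= [\alpha_{21} s_1 + \alpha_{22} s_2, \alpha_{21} s_1 + \alpha_{22} s_2] \\
&= |\alpha_{21}|^2 [s_1, s_1] + 2\operatorname{Re}\bigl(\alpha_{21}\overline{\alpha_{22}}\,[s_1, s_2]\bigr) + |\alpha_{22}|^2 [s_2, s_2] = |\alpha_{22}|^2.
\end{aligned}
\]

So, $|\alpha_{12}| = |\alpha_{22}| = 1$. Using this result, we analyze the J-orthogonality of $v_1$ and $v_2$:

\[
\begin{aligned}
0 = \bigl|[v_1, v_2]\bigr| &= \bigl|[\alpha_{11} s_1 + \alpha_{12} s_2, \alpha_{21} s_1 + \alpha_{22} s_2]\bigr| \\
&= \bigl|\alpha_{11}\overline{\alpha_{21}}\,[s_1, s_1] + \alpha_{11}\overline{\alpha_{22}}\,[s_1, s_2] + \alpha_{12}\overline{\alpha_{21}}\,[s_2, s_1] + \alpha_{12}\overline{\alpha_{22}}\,[s_2, s_2]\bigr| \\
&= \bigl|\alpha_{12}\overline{\alpha_{22}}\,[s_2, s_2]\bigr| = 1,
\end{aligned}
\]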

which is an obvious contradiction, hence no hyperbolic Schur decomposition exists for A.
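The data of Example 3.8 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`jordan_ok`, `gram`, and so on) are illustrative only. It confirms that $A S = S \mathcal{J}_3(0)$, that $JA$ is symmetric (so $A$ is J-Hermitian), and that the Gram matrix $S^* J S$ has the zero pattern used in the argument above.

```python
import numpy as np

# Data copied from Example 3.8 (all entries are integers, so the checks are exact).
J = np.diag([1, 1, -1])
A = np.array([[12, 11, -16],
              [11,  8, -13],
              [16, 13, -20]])
S = np.array([[3, 5, 3],
              [4, 5, 3],
              [5, 7, 4]])
J3 = np.array([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])  # Jordan block of order 3 for the eigenvalue 0

# For nonsingular S, A = S J3 S^{-1} is equivalent to A S = S J3.
s_nonsingular = abs(np.linalg.det(S)) > 1e-9
jordan_ok = np.array_equal(A @ S, S @ J3)

# A is J-Hermitian iff J A is Hermitian (here: real symmetric).
j_hermitian = np.array_equal(J @ A, (J @ A).T)

# Gram matrix of the columns of S in the scalar product induced by J.
gram = S.T @ J @ S

print(s_nonsingular, jordan_ok, j_hermitian)
print(gram)
```

The zero in the (1,1) position of `gram` says that $s_1$ is degenerate, and the zero in the (1,2) position says that $s_1$ and $s_2$ are J-orthogonal, which are the two facts the contradiction relies on.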


The previous example is very significant, as it proves that not even all J-Hermitian matrices have a hyperbolic Schur decomposition. The construction of Example 3.8 is described in [21].

There also exist J-unitary matrices that do not have a hyperbolic Schur decomposition. One can be constructed in a manner similar to Example 3.8.

Example 3.9 (J-unitary matrix without a hyperbolic Schur decomposition). Let $J = \operatorname{diag}(1, 1, -1)$ and let

\[ U = \frac{1}{8} \begin{bmatrix} -1 & -32 & 31 \\ 8 & -8 & 8 \\ 1 & -32 & 33 \end{bmatrix} = S \mathcal{J}_3(1) S^{-1}, \qquad S = \begin{bmatrix} 3 & 10 & 23 \\ 4 & 10 & 23 \\ 5 & 14 & 33 \end{bmatrix}. \]

It is easy to see that $U$ is J-unitary. Also, it does not have a hyperbolic Schur decomposition, which can be shown using exactly the same arguments as in Example 3.8.
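A short numerical check of Example 3.9, written as a sketch under one stated assumption: the factor 1/8 is taken to scale $U$, since that is the scaling for which $U$ is J-unitary (the scaling of $S$ does not affect $S \mathcal{J}_3(1) S^{-1}$). To keep the arithmetic exact, the code works with the integer matrix $8U$; the name `U8` is illustrative.

```python
import numpy as np

J = np.diag([1, 1, -1])
U8 = np.array([[-1, -32, 31],
               [ 8,  -8,  8],
               [ 1, -32, 33]])   # 8 * U (assumed placement of the factor 1/8)
S = np.array([[3, 10, 23],
              [4, 10, 23],
              [5, 14, 33]])
J3 = np.array([[1, 1, 0],
               [0, 1, 1],
               [0, 0, 1]])       # Jordan block of order 3 for the eigenvalue 1

# U^* J U = J  is equivalent to  (8U)^T J (8U) = 64 J.
j_unitary = np.array_equal(U8.T @ J @ U8, 64 * J)

# U S = S J3  is equivalent to  (8U) S = 8 S J3.
jordan_ok = np.array_equal(U8 @ S, 8 * (S @ J3))

# Gram matrix of the Jordan chain: s1 is degenerate and J-orthogonal to s2.
gram = S.T @ J @ S
print(j_unitary, jordan_ok)
print(gram)
```

As in Example 3.8, the only eigenvector direction $s_1$ satisfies $[s_1, s_1] = [s_1, s_2] = 0$, so the same contradiction argument applies.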

Even though the above examples show that some J-Hermitian and J-unitary matrices do not have a hyperbolic Schur decomposition, they also show how rare such matrices are: in both cases we had rather strict conditions that had to be met in order to construct them.

Interestingly, a special class of J-Hermitian matrices, referred to as J-nonnegative matrices, always has a hyperbolic Schur decomposition.

We say that a matrix $A$ is J-nonnegative if $JA$ is positive semidefinite and $A$ is J-positive if $JA$ is positive definite. These are, in a way, hyperbolic counterparts of positive definite and semidefinite matrices and find their applications in the research of J-nonnegative spaces and the semidefinite J-polar decomposition. For details, see [3] and [6].

Theorem 3.10 (J-nonnegative matrix). Let $J = \operatorname{diag}(\pm 1)$. If $A \in \mathbb{C}^{n \times n}$ is J-nonnegative, then it has a hyperbolic Schur decomposition with respect to $J$.

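Proof. Since $JA$ is positive semidefinite, we can write $A = J B^* B$ for some $B$. Then the hyperbolic SVD of $B$, as described in [24, Section 3], where $J$ is denoted as $\Phi$, $A^*$ as $A^H$ and $JV$ as $P$ (we shall use $P$ as a permutation matrix, which is not explicitly used in [24]), is

\[ B = U \begin{bmatrix} \begin{bmatrix} I_j & I_j \end{bmatrix} & & \\ & \operatorname{diag}\bigl(|\lambda_1|^{1/2}, \dots, |\lambda_l|^{1/2}\bigr) & \\ & & 0 \end{bmatrix} (JV)^*, \]

where $U^* U = I$, $(JV)^* J (JV) = V^* J V = P^* J P$ for some permutation $P$, and $j = \operatorname{rank} B - \operatorname{rank} B J B^*$. It is easy to see that

\[ A = J B^* B = V \begin{bmatrix} \begin{bmatrix} I_j & I_j \\ I_j & I_j \end{bmatrix} & & \\ & \operatorname{diag}(|\lambda_1|, \dots, |\lambda_l|) & \\ & & 0 \end{bmatrix} P^* J P \, V^{-1}. \]

Note that $P^* J P$ is diagonal and

\[ \begin{bmatrix} I_j & I_j \\ I_j & I_j \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \otimes I_j \]

is permutationally similar to

\[ \bigoplus_{i=1}^{j} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = I_j \otimes \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \]

which completes the proof of the theorem. □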
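The permutational similarity used at the end of the proof can be made explicit. The sketch below (NumPy assumed; `shuffle_permutation` is a hypothetical helper name, not something from the paper) builds the permutation that interleaves the two halves of the index set and checks that it maps $E \otimes I_j$ to $I_j \otimes E$ for the all-ones $2 \times 2$ matrix $E$.

```python
import numpy as np

def shuffle_permutation(j):
    """Permutation matrix Q with Q.T @ kron(E, I_j) @ Q == kron(I_j, E) for any 2x2 E."""
    # New position a = 2*i + r is filled from old position r*j + i.
    order = [(a % 2) * j + a // 2 for a in range(2 * j)]
    return np.eye(2 * j, dtype=int)[:, order]

j = 3
E = np.ones((2, 2), dtype=int)
I_j = np.eye(j, dtype=int)

Q = shuffle_permutation(j)
lhs = Q.T @ np.kron(E, I_j) @ Q   # permuted block matrix [[I, I], [I, I]]
rhs = np.kron(I_j, E)             # direct sum of j copies of [[1, 1], [1, 1]]
print(np.array_equal(lhs, rhs))
```

The same symmetric permutation has to be applied to the diagonal factor $P^* J P$, which stays diagonal, so the product remains quasitriangular (in fact quasidiagonal).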


Remark 3.11. If $A$ is J-positive, then $JA$ is nonsingular, so $j = 0$ in the above proof, which means that J-positive matrices have a hyperbolic Schur decomposition with $T$ diagonal with positive entries.

In fact, more is known for this case, as shown in [20, Corollary 5.3]:

\[ A = V T V^{-1}, \qquad V^{-1} = V^{[*]}, \qquad T = J \Sigma, \qquad \Sigma = \operatorname{diag}(|\lambda_1|, \dots, |\lambda_n|), \]

i.e., $V$ can be chosen to be J-unitary if we set the signs in $T$ according to those in $J$.
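The sign pattern $T = J\Sigma$ can be observed numerically. This is only a sketch of the eigenvalue statement, not of the construction of $V$ in [20]: it assumes NumPy, uses a randomly generated positive definite $M$ (seeded, purely illustrative), and relies on the standard facts that $A = JM$ is similar to the Hermitian matrix $L^* J L$ for the Cholesky factor $M = L L^*$, and that $L^* J L$ has the same inertia as $J$.

```python
import numpy as np

rng = np.random.default_rng(0)
J = np.diag([1.0, 1.0, -1.0, -1.0, -1.0])
n = J.shape[0]

# A J-positive matrix: A = J M with M Hermitian positive definite.
B = rng.standard_normal((n, n))
M = B.T @ B + n * np.eye(n)
A = J @ M

# A is similar to L^T J L (M = L L^T), which is congruent to J.
L = np.linalg.cholesky(M)
eigs = np.linalg.eigvalsh(L.T @ J @ L)          # real, sorted ascending
eig_A = np.sort(np.linalg.eigvals(A).real)      # eigenvalues of A itself

n_pos = int(np.sum(eigs > 0))
n_neg = int(np.sum(eigs < 0))
print(n_pos, n_neg)
```

The counts of positive and negative eigenvalues agree with the numbers of $+1$ and $-1$ entries of $J$, which is consistent with a diagonal $T$ whose signs follow those of $J$.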

Before exploring the properties of the hyperbolic Schur decomposition, we investigate further the quasitriangular factor $T$. To do this, we define the following very important variant of the hyperbolic Schur decomposition.

Definition 3.12 (The complete hyperbolic Schur decomposition). Let $J = \operatorname{diag}(\pm 1)$ and $A \in \mathbb{C}^{n \times n}$ be of the same order. We say that $A = V T V^{-1}$ is a complete hyperbolic Schur decomposition of the matrix $A$ with respect to $J$ if $V$ is J-orthonormal, $T$ is quasitriangular, and all diagonal blocks $T_{kk}$ of order 2 are indecomposable.¹

Definition 3.12 allows us to work with the fully reduced (in terms of the J-orthonormal similarity) quasitriangular matrices. This makes proofs simpler (by reducing the number of observed cases) and gives us the following properties of the diagonal blocks in an indecomposable matrix $T$. Note that any matrix having a hyperbolic Schur decomposition trivially also has a complete hyperbolic Schur decomposition.

Theorem 3.13 (The complete hyperbolic Schur decomposition). Let $A = V T V^{-1}$ be a complete hyperbolic Schur decomposition of $A$ with respect to some $J = \operatorname{diag}(\pm 1)$. Then all the irreducible diagonal $2 \times 2$ blocks of $T$ have only degenerate eigenvectors with respect to the corresponding part of $\tilde J := V^* J V$.

Furthermore, all such blocks are either degenerate (with respect to the corresponding part of $\tilde J$) or nonsingular.

Proof. It is sufficient to note that, once we have a hyperbolic Schur decomposition, we can further decompose irreducible $2 \times 2$ diagonal blocks of the quasitriangular factor $T$, either using the traditional Schur decomposition (if the corresponding part of $\tilde J$, denoted $J'$, is definite, i.e., $J' = \pm I_2$) or the hyperbolic Schur decomposition (if $J'$ is hyperbolic, i.e., $J' = \operatorname{diag}(1, -1)$ or $J' = \operatorname{diag}(-1, 1)$). In the latter case, we can triangularize the observed $2 \times 2$ block if and only if it has at least one nondegenerate eigenvector, by $J'$-normalizing that eigenvector and expanding it to a $J'$-orthonormal basis, which can always be done for a $J'$-orthonormal set (see [6, p. 10]). So, the only indecomposable $2 \times 2$ blocks of $T$ are those with degenerate eigenvectors with respect to the hyperbolic scalar product induced by the corresponding part of $\tilde J$.

For the second part of the theorem, regarding the irreducible $2 \times 2$ diagonal blocks in the factor $T$, note that all irreducible singular blocks of order 2 must have rank 1; otherwise, they are either 0 (which is reducible as $[0] \oplus [0]$) or nonsingular. Let us observe one such block, denoting it as $T'$ and the corresponding part of $\tilde J$ as $J' = \pm\operatorname{diag}(1, -1)$. We see that $T'$ must be in one of the following two forms:

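1. $T' = S \operatorname{diag}(\lambda, 0) S^{-1}$, $\lambda \ne 0$, $S^* J' S = \begin{bmatrix} 0 & \alpha \\ \bar\alpha & 0 \end{bmatrix}$, $\alpha \in \mathbb{C} \setminus \{0\}$, which gives

\[
\begin{aligned}
(T')^* J' T' &= S^{-*} \operatorname{diag}(\bar\lambda, 0)\, S^* J' S \operatorname{diag}(\lambda, 0)\, S^{-1} \\
&= S^{-*} \operatorname{diag}(\bar\lambda, 0) \begin{bmatrix} 0 & \alpha \\ \bar\alpha & 0 \end{bmatrix} \operatorname{diag}(\lambda, 0)\, S^{-1} = 0, \quad\text{or}
\end{aligned}
\]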

¹ A diagonal block $T_{kk}$ is indecomposable if it cannot be further reduced by the hyperbolic Schur decomposition, i.e., there are no $J_k$-orthogonal $V$ and triangular (not just quasitriangular!) $T$ such that $T_{kk} = V T V^{-1}$, where $J_k$ is the part of $\tilde J := V^* J V$ corresponding to $T_{kk}$.


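2. $T' = S \mathcal{J}_2(0) S^{-1}$, $S^* J' S = \begin{bmatrix} 0 & \alpha \\ \bar\alpha & \beta \end{bmatrix}$, $\alpha \in \mathbb{C} \setminus \{0\}$, $\beta \in \mathbb{C}$, which gives

\[
\begin{aligned}
(T')^* J' T' &= S^{-*} \mathcal{J}_2(0)^T S^* J' S \mathcal{J}_2(0) S^{-1} \\
&= S^{-*} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & \alpha \\ \bar\alpha & \beta \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} S^{-1} = 0.
\end{aligned}
\]

So, in both cases $(T')^* J' T' = 0$, which means that $T'$ is $J'$-degenerate. Since $T'$ was an arbitrary irreducible diagonal singular block in the complete Schur decomposition, this means that all such blocks are degenerate. □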
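A concrete instance of the second form in the proof may help. The sketch below (NumPy assumed; the particular $S$ is a hypothetical choice made only for illustration) takes $J' = \operatorname{diag}(1, -1)$ and a Jordan chain whose eigenvector $s_1 = (1, 1)^T$ is $J'$-degenerate, and confirms that the resulting block satisfies $(T')^* J' T' = 0$.

```python
import numpy as np

Jp = np.diag([1, -1])             # J' = diag(1, -1)
S = np.array([[1, 1],
              [1, 0]])            # s1 = (1, 1) is J'-degenerate: [s1, s1] = 0
S_inv = np.array([[0, 1],
                  [1, -1]])       # exact inverse of S
J2 = np.array([[0, 1],
               [0, 0]])           # Jordan block of order 2 for the eigenvalue 0

T = S @ J2 @ S_inv                # T' = S J_2(0) S^{-1}
gram = S.T @ Jp @ S               # has the pattern [[0, alpha], [conj(alpha), beta]]
product = T.T @ Jp @ T            # (T')^* J' T' for this real T'

print(T)
print(gram)
print(product)
```

Here `gram` is `[[0, 1], [1, 1]]`, i.e., $\alpha = \beta = 1$ in the notation of the proof, and `product` is the zero matrix, so this rank-1 block is $J'$-degenerate.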

The indecomposable blocks are sometimes referred to as atomic (see [4, p. 466]) and Theorem 3.13 states that all irreducible blocks in the complete hyperbolic Schur decomposition are atomic and have a specific eigenstructure. Note that each atomic block can have one or two eigenvectors, as Theorem 3.13 makes no statement on the Jordan structure of such blocks.

4. Properties

In this section, we assume that $J = \operatorname{diag}(\pm 1)$ and $A$ are given such that $A$ has a hyperbolic Schur decomposition, as described in Definition 3.2. We also use $V$, $P$ and $\tilde J$ from (7) and (6).

One of the main properties of the traditional Schur decomposition is that it keeps some structures of the matrix unchanged, i.e., if the decomposed matrix $A$ is normal, Hermitian or unitary, then the triangular block $T$ of a Schur decomposition of $A$ will also be normal, Hermitian or unitary, respectively. Not surprisingly, similar properties hold in the hyperbolic case as well.

Proposition 4.1 (J-conjugate transpose). Let $A$ have a hyperbolic Schur decomposition (7) with respect to $J = \operatorname{diag}(\pm 1)$. Then

\[ A^{[*]_J} = (VP)\, T^{[*]_{\tilde J}} (VP)^{-1}, \qquad \tilde J = P^* J P. \]

Proof. The proof is straightforward. From (7), it follows that

\[
\begin{aligned}
A^{[*]_J} &= J \bigl((VP) T (VP)^{-1}\bigr)^* J = J (VP)^{-*} T^* (VP)^* J \\
&= J V^{-*} P T^* P^* V^* J = (VP) P^* (V^* J V)^{-1} P T^* P^* V^* J V P (VP)^{-1} \\
&= (VP) \tilde J T^* \tilde J (VP)^{-1} = (VP)\, T^{[*]_{\tilde J}} (VP)^{-1}. \qquad □
\end{aligned}
\]

As we have seen in Example 3.5, some matrices do not have a hyperbolic Schur decomposition. However, if a matrix $A$ has it, then its conjugate transposes (both the Euclidean and the hyperbolic one) have it as well.
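The identity of Proposition 4.1 is easy to test numerically. The following sketch makes an assumption about the setting that is not spelled out in this section: it treats $W = VP$ as a J-orthonormal matrix with $W^* J W = \tilde J = P^* J P$, built here from a J-unitary hyperbolic rotation `H` and a permutation `P` (both chosen arbitrarily for illustration). NumPy is assumed.

```python
import numpy as np

J = np.diag([1.0, -1.0, 1.0])

# A J-unitary matrix: hyperbolic rotation acting on coordinates 0 and 1.
c, s = np.cosh(0.7), np.sinh(0.7)
H = np.array([[c, s, 0.0],
              [s, c, 0.0],
              [0.0, 0.0, 1.0]])
P = np.eye(3)[:, [0, 2, 1]]       # permutation swapping coordinates 1 and 2

W = H @ P                         # plays the role of V P
Jt = P.T @ J @ P                  # J-tilde = P^* J P

T = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]], dtype=complex)
T[0, 1] = 2.0 + 1.0j              # make the example genuinely complex
A = W @ T @ np.linalg.inv(W)

def adjoint(X, G):
    """G-adjoint X^{[*]_G} = G X^* G for a signature matrix G = diag(+-1)."""
    return G @ X.conj().T @ G

lhs = adjoint(A, J)                              # A^{[*]_J}
rhs = W @ adjoint(T, Jt) @ np.linalg.inv(W)      # W T^{[*]_{J-tilde}} W^{-1}
print(np.allclose(W.T @ J @ W, Jt), np.allclose(lhs, rhs))
```

Note that $T^{[*]_{\tilde J}}$ is lower triangular here, which is why an additional flip is needed to obtain a decomposition of the adjoint with an upper (quasi)triangular factor.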

Proposition 4.2 (Existence of the hyperbolic Schur decomposition for conjugate transposes). A matrix $A$ has a hyperbolic Schur decomposition with respect to $J = \operatorname{diag}(\pm 1)$ if and only if $A^*$ and $A^{[*]_J}$ have it as well.

Proof. The proof is a direct consequence of Proposition 4.1. Note that if some matrix $X$ is lower triangular, then $S_n^{-1} X S_n$ is upper triangular. Now, we have

\[
\begin{aligned}
A^{[*]_J} &= (VP)\, T^{[*]_{\tilde J}} (VP)^{-1} = (VP) S_n S_n^{-1} T^{[*]_{\tilde J}} S_n S_n^{-1} (VP)^{-1} \\
&= (V P S_n) \bigl(S_n^{-1} T^{[*]_{\tilde J}} S_n\bigr) (V P S_n)^{-1},
\end{aligned}
\]

which is one possible hyperbolic Schur decomposition of $A^{[*]_J}$. A similar proof applies to $A^*$.


For the other implication, we need only apply what was already proven in the first part and use the fact that

\[ A = (A^*)^* = \bigl(A^{[*]_J}\bigr)^{[*]_J}. \qquad □ \]

We now consider J-Hermitian matrices. As seen in the previous section, the hyperbolic Schur decomposition was defined in such a way that it keeps J-Hermitianity. This property can be shown directly as well.
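The flip used in the proof of Proposition 4.2 can be illustrated in a few lines. This sketch assumes that $S_n$ denotes the standard involutory permutation of order $n$ (ones on the antidiagonal), in line with how $S_k$ is described in Section 5; NumPy is assumed and the test matrix is arbitrary.

```python
import numpy as np

n = 4
S_n = np.fliplr(np.eye(n, dtype=int))    # ones on the antidiagonal, S_n^{-1} = S_n

# An arbitrary lower triangular matrix with nonzero entries on and below the diagonal.
X = np.tril(np.arange(1, n * n + 1).reshape(n, n))

Y = S_n @ X @ S_n                        # S_n^{-1} X S_n
is_upper = np.array_equal(Y, np.triu(Y))
print(is_upper)
print(Y)
```

Since `Y[i, j] == X[n-1-i, n-1-j]`, the similarity reverses the order of both rows and columns, which turns a lower (quasi)triangular matrix into an upper (quasi)triangular one.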

Proposition 4.3 (J-Hermitian matrices). If a matrix $A$ has a hyperbolic Schur decomposition with respect to $J = \operatorname{diag}(\pm 1)$, then $A$ is J-Hermitian if and only if $T$ is $\tilde J$-Hermitian and quasidiagonal, where $\tilde J = P^* J P$.

Proof. From Proposition 4.1, we see that for $A = (VP) T (VP)^{-1}$,

\[ A^{[*]_J} = (VP)\, T^{[*]_{\tilde J}} (VP)^{-1}, \]

so $T = T^{[*]_{\tilde J}}$ if and only if $A^{[*]_J} = A$. If $T$ is quasitriangular and $\tilde J$-Hermitian, then it is quasidiagonal. □

At this point, it is worth noting that the spectrum of a J-Hermitian matrix $A$ is always symmetric with respect to the real axis. Moreover, the Jordan structure associated with $\lambda$ is the same as that associated with $\bar\lambda$, as shown in [6, Proposition 4.2.3]. Also, by [6, Corollary 4.2.5], nonreal eigenvalues of J-Hermitian matrices have J-neutral root subspaces, which implies that all their adjoined eigenvectors are J-degenerate. This means that each such eigenvalue will participate in some singular atomic block of order 2 in the quasidiagonal matrix $T$ of the matrix's hyperbolic Schur decomposition. Of course, for $J = \pm I$, all eigenvalues are real and such blocks do not exist.
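The smallest example of these spectral properties is a $2 \times 2$ J-Hermitian matrix with the eigenvalues $\pm i$. The sketch below (NumPy assumed; the matrix and the helper `ip` are chosen here purely for illustration) checks that the spectrum is closed under conjugation and that both eigenvectors are J-degenerate.

```python
import numpy as np

J = np.diag([1.0, -1.0])
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])        # Hermitian
A = J @ H                         # J A = H is Hermitian, so A is J-Hermitian

eigvals, eigvecs = np.linalg.eig(A)

def ip(x, y):
    """Indefinite scalar product [x, y] = y^* J x."""
    return y.conj() @ J @ x

# [v, v] for each eigenvector v; both values should vanish (up to rounding).
self_products = [ip(eigvecs[:, k], eigvecs[:, k]) for k in range(2)]
print(eigvals)
print(self_products)
```

Because both eigenvectors are degenerate, neither can be J-normalized, so this $A$ is itself an atomic block of order 2 with respect to $J = \operatorname{diag}(1, -1)$.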

In the following proposition, we consider J-normal matrices. Recall that $A$ is J-normal if $A A^{[*]_J} = A^{[*]_J} A$.

Proposition 4.4 (J-normal matrices). If a matrix $A$ has a hyperbolic Schur decomposition with respect to $J = \operatorname{diag}(\pm 1)$, then $A$ is J-normal if and only if $T$ is $\tilde J$-normal quasitriangular, where $\tilde J = P^* J P$.

Proof. This follows directly from Proposition 4.1:

\[ A A^{[*]_J} = (VP)\, T T^{[*]_{\tilde J}} (VP)^{-1}, \qquad A^{[*]_J} A = (VP)\, T^{[*]_{\tilde J}} T (VP)^{-1}. \qquad □ \]

Unlike the Euclidean case, where a normal triangular matrix is also diagonal, in the hyperbolic case we have no guarantee that a block triangular J-normal matrix is also block diagonal, as shown in the following example.

Example 4.5 (A block triangular, J-normal matrix which is not block diagonal). Let $J = \operatorname{diag}(1, -1, 1, -1)$ and let $A$ be the following block triangular matrix:

\[ A = \begin{bmatrix} \xi & 1 & 1 & 1 \\ 1 & \xi & 1 & 1 \\ 0 & 0 & \xi & 1 \\ 0 & 0 & 1 & \xi \end{bmatrix}, \]

for some $\xi \in \mathbb{C}$. A simple multiplication shows that $A A^{[*]} = A^{[*]} A$ for any $\xi \in \mathbb{C}$, and $A$ is obviously not block diagonal (with the diagonal blocks of order 2).
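The "simple multiplication" of Example 4.5 can be delegated to a few lines of code. This is a sketch assuming NumPy; the helper names and the sample values of $\xi$ are illustrative.

```python
import numpy as np

J = np.diag([1.0, -1.0, 1.0, -1.0])

def example_matrix(xi):
    """The block triangular matrix A of Example 4.5 for a given complex xi."""
    return np.array([[xi, 1, 1, 1],
                     [1, xi, 1, 1],
                     [0, 0, xi, 1],
                     [0, 0, 1, xi]], dtype=complex)

def j_adjoint(X):
    """J-adjoint X^{[*]} = J X^* J."""
    return J @ X.conj().T @ J

def is_j_normal(X):
    Xa = j_adjoint(X)
    return np.allclose(X @ Xa, Xa @ X)

results = {xi: is_j_normal(example_matrix(xi)) for xi in (0, 1, -2.5, 1j, 3 - 4j)}
print(results)
```

Every sampled value of $\xi$ gives a J-normal matrix, while the top right $2 \times 2$ block stays nonzero, so block triangularity does not force block diagonality here.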

Where does this difference between the Euclidean and the hyperbolic case come from? In the Euclidean case, for an upper triangular normal $A$, we have

\[ \sum_{k \le n} |a_{1k}|^2 = (A A^*)_{11} = (A^* A)_{11} = |a_{11}|^2, \tag{11} \]

from which we conclude that $\sum_{1 < k \le n} |a_{1k}|^2 = 0$, so $|a_{1k}| = 0$ for all $k > 1$. Then we do the same for $(A A^*)_{22}$, $(A A^*)_{33}$, etc. But, in the hyperbolic case, for a block upper triangular J-normal $A$, (11) takes the following form:

\[ j_{11} \sum_{k \le n} j_{kk} |a_{1k}|^2 = (A A^{[*]})_{11} = (A^{[*]} A)_{11} = |a_{11}|^2. \]

Since $J$ contains both positive and negative numbers on its diagonal, the sum on the left hand side of the previous equation may also contain both positive and negative elements, so we can make no direct conclusion about the elements $a_{1k}$ for any $k$.

Similarly to J-normal and J-Hermitian, we can also analyze J-unitary matrices. Somewhat surprisingly, this property is much closer to the Euclidean case than the previous result regarding J-normal matrices.

Let us first review the structure of quasitriangular hyperexchange matrices, as this will give us more insight into the structure of the J-unitary matrices (which are a special case of the hyperexchange matrices).

Proposition 4.6 (Block triangular hyperexchange matrices). Let $T$ be a quasitriangular hyperexchange matrix with respect to some given $J = \operatorname{diag}(\pm 1)$. Then $T$ is also quasidiagonal.

Proof. Since $T$ is a hyperexchange matrix, there exists a permutation $P$ such that $T^* J T = P J P^*$. This means that $T^{-1} = P J P^* T^* J$. Because $T$ is block upper triangular and $J$ and $P J P^*$ are diagonal,

1. $T^{-1}$ is block upper triangular, and
2. $T^*$ and $P J P^* T^* J$ are block lower triangular.

Hence, $T^{-1}$ is both block upper and block lower triangular, i.e., $T^{-1}$ and therefore $T$ are quasidiagonal. □

We are now ready to analyze the hyperbolic Schur decomposition of a J-unitary matrix.

Proposition 4.7 (J-unitary matrices). If a matrix $A$ has a hyperbolic Schur decomposition $A = V T V^{-1}$ with respect to $J = \operatorname{diag}(\pm 1)$, then $A$ is J-unitary if and only if the quasitriangular factor $T$ is $\tilde J$-unitary and quasidiagonal, where $\tilde J = P^* J P$.

Proof. By Proposition 4.1, using the same arguments as in the proof of Proposition 4.4, $T$ is $\tilde J$-unitary. The quasidiagonality of $T$ follows directly from Proposition 4.6 and the fact that every $\tilde J$-unitary matrix is also $\tilde J$-hyperexchange. □

The previous proposition can also be proven in a more straightforward manner, by analyzing the top right block of dimensions $1 \times (n - 1)$ or $2 \times (n - 2)$ in $T^* \tilde J T$ and then repeating the process iteratively on the bottom right block of order $n - 1$ or $n - 2$.

Eigenvalues of J-unitary and J-Hermitian matrices have well researched properties, nicely presented in [16, Section 7], with J-unitary matrices being referred to as members of the automorphism group $\mathbb{G}$ and J-Hermitian matrices being referred to as members of the Jordan algebra $\mathbb{J}$. These results apply to various scalar products, but when it comes to hyperbolic products and a hyperbolic Schur decomposition, more can be said about the atomic blocks in $T$.

Proposition 4.8 (Nondiagonalizable atomic blocks). Let $J = \operatorname{diag}(\pm 1)$ and $A$ of the same order be given such that $A$ has a complete hyperbolic Schur decomposition (7). Furthermore, let $\hat T$ be a nondiagonalizable atomic block on the diagonal of $T$, let $\hat J$ be the corresponding part of $\tilde J$ and let $s_1$ and $s_2$ denote the columns of the similarity matrix $S$ in the Jordan decomposition $\hat T = S \mathcal{J}_2(\lambda) S^{-1}$. Then the following is true:

1. if $A$ is J-unitary, then $|\lambda| = 1$ and $[s_1, s_2]_{\hat J} \in i\mathbb{R} \setminus \{0\}$, and
2. if $A$ is J-Hermitian, then $\lambda \in \mathbb{R}$ and $[s_1, s_2]_{\hat J} \in \mathbb{R} \setminus \{0\}$.

Proof. By Propositions 4.7 and 4.3, if $A$ is J-unitary or J-Hermitian, $T$ is quasidiagonal $\tilde J$-unitary or $\tilde J$-Hermitian, respectively. Since we have assumed a complete hyperbolic Schur decomposition, by Theorem 3.13 the eigenvector of every nondiagonalizable atomic block $\hat T$ is $\hat J$-degenerate, also implying that $\hat J = \pm\operatorname{diag}(1, -1)$. This means that

\[ \hat T = S \mathcal{J}_2(\lambda) S^{-1}, \qquad S^* \hat J S = \begin{bmatrix} 0 & \alpha \\ \bar\alpha & \beta \end{bmatrix}. \]

Since $S^* \hat J S$ is Hermitian, $\beta \in \mathbb{R}$. Note that $\alpha = [s_2, s_1]_{\hat J} \ne 0$ because $S^* \hat J S$ is nonsingular.

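Let us now assume that $A$ is J-unitary. As stated before, $T$ is $\tilde J$-unitary and, since it is quasidiagonal, its block $\hat T$ is $\hat J$-unitary. So,

\[ \hat J = \hat T^* \hat J \hat T = S^{-*} \mathcal{J}_2(\lambda)^* S^* \hat J S \mathcal{J}_2(\lambda) S^{-1} = S^{-*} \begin{bmatrix} 0 & \alpha \\ \bar\alpha & 2\lambda \operatorname{Re}(\alpha) + \beta \end{bmatrix} S^{-1}. \]

Premultiplying by $S^*$ and postmultiplying by $S$, we get

\[ \begin{bmatrix} 0 & \alpha \\ \bar\alpha & \beta \end{bmatrix} = S^* \hat J S = \begin{bmatrix} 0 & \alpha \\ \bar\alpha & 2\lambda \operatorname{Re}(\alpha) + \beta \end{bmatrix}. \]

From here, we see that $2\lambda \operatorname{Re}(\alpha) = 0$. Since $\hat T$ is $\hat J$-unitary and, hence, nonsingular, $\lambda \ne 0$. Obviously, $\operatorname{Re}(\alpha) = 0$, i.e., $\alpha$ is imaginary, so $[s_1, s_2] \in i\mathbb{R}$.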

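The J-Hermitian case is similar. For a J-Hermitian $A$, as before, we conclude that $T$ is $\tilde J$-Hermitian, and its block $\hat T$ is $\hat J$-Hermitian. From $\hat T^{[*]} = \hat T$ it follows that $\hat J \hat T^* = \hat T \hat J$, so

\[ \hat J S^{-*} \mathcal{J}_2(\lambda)^* S^* = S \mathcal{J}_2(\lambda) S^{-1} \hat J. \]

Premultiplying by $S^* \hat J$ and postmultiplying by $S^{-*}$, we get

\[ \mathcal{J}_2(\lambda)^* = S^* \hat J S \mathcal{J}_2(\lambda) S^{-1} \hat J S^{-*} = \bigl(S^* \hat J S\bigr) \mathcal{J}_2(\lambda) \bigl(S^* \hat J S\bigr)^{-1}. \]

Expanding all these matrices and multiplying those on the right hand side, we see that

\[ \begin{bmatrix} \bar\lambda & 0 \\ 1 & \bar\lambda \end{bmatrix} = \begin{bmatrix} \lambda & 0 \\ \bar\alpha / \alpha & \lambda \end{bmatrix}. \]

Here, we see that $\lambda \in \mathbb{R}$ (which also follows straight from [16, Theorem 7.6]) and $\alpha = \bar\alpha$, i.e., $\alpha$ is real, so $[s_1, s_2] \in \mathbb{R}$. □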

It is worth noting that J-Hermitian J-unitary matrices always have a hyperbolic Schur decomposition with a specific structure of the atomic blocks of order 2, as shown in the following proposition.

Proposition 4.9 (J-Hermitian J-unitary matrices). Let $J = \operatorname{diag}(\pm 1)$ and let $A \in \mathbb{C}^{n \times n}$ be both J-Hermitian and J-unitary. Then $A$ has a complete hyperbolic Schur decomposition (7) such that each diagonal atomic block $\hat T$ of order 2 is of the form

\[ \hat T = \begin{bmatrix} \alpha & \beta \\ -\bar\beta & -\alpha \end{bmatrix}, \qquad \hat J = \pm\operatorname{diag}(1, -1), \]

for some $\alpha \in \mathbb{R}$, $\beta \in \mathbb{C}$ such that $\alpha^2 = |\beta|^2 + 1$. As before, $\hat J$ denotes the part of $\tilde J$ corresponding to $\hat T$.
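As a quick sanity check of the block form in Proposition 4.9, the sketch below (NumPy assumed; the value of `beta` is arbitrary) builds one such block and verifies that it is Hermitian and unitary with respect to $\operatorname{diag}(1, -1)$, and therefore involutory.

```python
import numpy as np

Jh = np.diag([1.0, -1.0])                 # the corresponding part of J-tilde
beta = 0.75 - 2.0j                        # arbitrary complex number
alpha = np.sqrt(abs(beta) ** 2 + 1)       # real, with alpha^2 = |beta|^2 + 1

T = np.array([[alpha, beta],
              [-np.conj(beta), -alpha]])

T_adj = Jh @ T.conj().T @ Jh              # adjoint of T with respect to Jh
is_j_hermitian = np.allclose(T_adj, T)
is_j_unitary = np.allclose(T.conj().T @ Jh @ T, Jh)
is_involutory = np.allclose(T @ T, np.eye(2))
print(is_j_hermitian, is_j_unitary, is_involutory)
```

The check only confirms that matrices of this form have the stated properties; that every atomic block must have this form is the content of the proposition itself.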

Proof. From the J-unitarity and the J-Hermitianity of $A$, we get

\[ I = A^{[*]} A = A^2, \]

which means that $A$ is involutory and, hence, diagonalizable (see [2, Fact 5.12.13]) which, by Theorem 3.3, means that it has a hyperbolic Schur decomposition. By Theorem 3.13, we can choose such a decomposition in a way that each diagonal block of order 2 in $T$, denoted $\hat T$, has only degenerate eigenvectors with respect to $\hat J$ (the corresponding part of $\tilde J$), which is possible if and only if that part is either $\hat J = \operatorname{diag}(1, -1)$ or $\hat J = \operatorname{diag}(-1, 1)$.

By Proposition 4.3, $\hat T$ is $\hat J$-Hermitian and, by Proposition 4.7, it is also $\hat J$-unitary. So,

\[ I_2 = \hat T^{[*]_{\hat J}} \hat T = \hat T^2. \tag{12} \]

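Let us denote the elements of $\hat T$ as $t_{ij}$ ($i, j \in \{1, 2\}$):

\[ \hat T = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix}, \qquad \hat T^{[*]_{\hat J}} = \hat J \hat T^* \hat J = \begin{bmatrix} \overline{t_{11}} & -\overline{t_{21}} \\ -\overline{t_{12}} & \overline{t_{22}} \end{bmatrix}. \]

Since $\hat T = \hat T^{[*]_{\hat J}}$, we see that $t_{11}, t_{22} \in \mathbb{R}$ and $t_{21} = -\overline{t_{12}}$. Note that we have chosen $\hat T$ to be irreducible (because it comes from the complete hyperbolic Schur decomposition), i.e., $t_{12} \ne 0$ or $t_{21} \ne 0$, which means that both of the nondiagonal elements of $\hat T$ are nonzero. Furthermore, from (12), we get

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} t_{11}^2 - |t_{12}|^2 & t_{12}(t_{11} + t_{22}) \\ -\overline{t_{12}}(t_{11} + t_{22}) & t_{22}^2 - |t_{12}|^2 \end{bmatrix}. \]

Since $t_{12} \ne 0$, we get $t_{11} = -t_{22}$. Defining $\alpha := t_{11}$ and $\beta := t_{12}$ completes this proof. □

5. Existence of the hyperbolic Schur decomposition for J-Hermitian matrices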

As we have seen in Example 3.8, some J-Hermitian matrices do not have a hyperbolic Schur decomposition. Also, Proposition 3.7 provides a sufficient, but not necessary, condition for the existence of such a decomposition in the general case. In this section we give a necessary and sufficient condition for the existence of the hyperbolic Schur decomposition of J-Hermitian matrices.

To achieve our goal, we briefly leave the hyperbolic scalar product spaces and move to more general, indefinite scalar product spaces. A detailed analysis of such spaces is given in [6], from where we use Theorem 5.1.1, which states that for every nonsingular indefinite $J$ and for every J-Hermitian $A$ (i.e., $JA = A^* J$) there exists a nonsingular matrix $X$ such that

\[ A = X \mathcal{J} X^{-1}, \qquad X^* J X = S, \qquad \mathcal{J} = \bigoplus_{k=1}^{\beta} \mathcal{J}_k, \qquad S = \bigoplus_{k=1}^{\beta} \varepsilon_k S_k, \tag{13} \]

where $\mathcal{J}$ is a Jordan normal form of $A$. Blocks $\mathcal{J}_k$ for $k = 1, \dots, \alpha$ are associated with the real eigenvalues $\lambda_1, \dots, \lambda_\alpha$ and blocks $\mathcal{J}_k$ for $k = \alpha + 1, \dots, \beta$ are associated with conjugate pairs of nonreal eigenvalues $\lambda_{\alpha+1}, \dots, \lambda_\beta$ in the upper half-plane, i.e., $\mathcal{J}_k = \mathcal{J}(\lambda_k) \oplus \mathcal{J}(\bar\lambda_k)$ for $k = \alpha + 1, \dots, \beta$. Matrices $S_k$ denote standard involutory permutations of the same order as $\mathcal{J}_k$, $\varepsilon_k \in \{-1, 1\}$ for $k \le \alpha$ and $\varepsilon_k = 1$ for $k > \alpha$. In [6], $J$, $X$ and $S$ are denoted as $H$, $T^{-1}$ and $J$, respectively.

The decomposition (13) is referred to as the canonical form of the pair $(A, J)$ and has many applications. See [6, Chapter 5] for the applications in the indefinite scalar product spaces and [15] for its symplectic version and the application in the symplectic scalar product spaces.

The set of signs $\{\varepsilon_1, \dots, \varepsilon_\alpha\}$ is called the sign characteristic of the pair $(A, J)$. In [15], this notion is adapted to the symplectic scalar product spaces, losing the property that $\varepsilon_k = 1$ for $k > \alpha$. For more details, see [15, Section 3.2].

Apart from the applications to the nonstandard scalar product spaces, the sign characteristic plays a significant role in the research of the selfadjoint matrix polynomials [5, Chapter 12].

Throughout this paper, the Jordan normal form of a matrix plays a crucial role in the existence of the hyperbolic Schur decomposition. To better describe it, we associate with each Jordan block $\mathcal{J}_d(\lambda)$ the term partial multiplicity, which is the size $d$ of that Jordan block. Obviously, each eigenvalue has as many partial multiplicities as it has associated Jordan blocks, and the largest partial multiplicity of each eigenvalue $\lambda$ of a matrix $A$ is equal to the size of its largest Jordan block or, equivalently, to $\max\{d \in \mathbb{N} : (x - \lambda)^d \mid \mu_A(x)\}$, where $\mu_A(x)$ is the minimal polynomial of $A$.
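For small matrices with rational entries and a known rational eigenvalue, partial multiplicities can be computed exactly from the ranks of the powers of $A - \lambda I$, using the standard fact that the number of Jordan blocks of size at least $d$ equals $\operatorname{rank}(A - \lambda I)^{d-1} - \operatorname{rank}(A - \lambda I)^{d}$. The following is a sketch in pure Python with exact fractions; the function names are hypothetical and the approach is not taken from the paper.

```python
from fractions import Fraction

def rank(M):
    """Exact rank of a matrix with rational entries (Gaussian elimination)."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def partial_multiplicities(A, lam):
    """Sizes of the Jordan blocks of A for the eigenvalue lam, largest first."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    ranks = [n]                                   # rank of N^0 = I
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n + 1):
        power = matmul(power, N)
        ranks.append(rank(power))
    # at_least[d-1] = number of Jordan blocks of size >= d
    at_least = [ranks[d - 1] - ranks[d] for d in range(1, n + 2)]
    sizes = []
    for d in range(1, n + 1):
        sizes += [d] * (at_least[d - 1] - at_least[d])
    return sorted(sizes, reverse=True)

# The J-Hermitian matrix of Example 3.8 and its only eigenvalue 0.
A = [[12, 11, -16],
     [11,  8, -13],
     [16, 13, -20]]
sizes = partial_multiplicities(A, 0)
print(sizes, max(sizes) <= 2)
```

For the matrix of Example 3.8 the real eigenvalue 0 has a single partial multiplicity equal to 3, which is consistent with that example having no hyperbolic Schur decomposition.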

We are now ready to present and prove the necessary and sufficient conditions for the existence of the hyperbolic Schur decomposition of a J-Hermitian matrix.

Theorem 5.1 (Hyperbolic Schur decomposition of a J-Hermitian matrix). Let $J = \operatorname{diag}(\pm 1)$ and let $A \in \mathbb{C}^{n \times n}$ be a J-Hermitian matrix of the same order. Then $A$ has a hyperbolic Schur decomposition with respect to $J$ if and only if its real eigenvalues have partial multiplicities at most 2 and its nonreal eigenvalues have partial multiplicities at most 1.

Proof. Let us first show that if such a decomposition of $A$ exists, then the real eigenvalues of $A$ have partial multiplicities at most 2 and its nonreal eigenvalues have partial multiplicities at most 1.

Let $A = U T U^{-1}$, where $U$ is J-orthonormal, and let $\tilde J := U^* J U$. By Proposition 4.3, $T$ is quasidiagonal and $\tilde J$-Hermitian. This means that we can write

\[ T = \bigoplus_k T_k, \qquad \tilde J = \bigoplus_k J_k, \]

where each $T_k$ is irreducible (i.e., either of order 1 or nondiagonal of order 2) and $J_k$ has the same order as $T_k$. Furthermore, each $T_k$ is $J_k$-Hermitian. Trivially, this means that all blocks $T_k$ of order 1 are real.

Notice that if $T_k$ of order 2 has nonreal eigenvalues $\lambda_1^{(k)}$ and $\lambda_2^{(k)}$, then they form a complex conjugate pair, i.e., $\lambda_1^{(k)} = \overline{\lambda_2^{(k)}}$, due to [16, Theorem 7.6], so their partial multiplicity must be 1. If they are real, their partial multiplicity is 1 or 2, depending on the diagonalizability of $T_k$. Since $A$ is similar to $\bigoplus_k T_k$, it has the same Jordan structure, i.e., the same eigenvalues with the same partial multiplicities.

Let us now show that any J-Hermitian matrix $A$ with the described partial multiplicities of its eigenvalues has a hyperbolic Schur decomposition with respect to $J$. We observe the canonical form of $(A, J)$ from (13). By the assumption, all $\mathcal{J}_k$ are of order at most 2.

Let $k$ be such that $\mathcal{J}_k$ is of order 2. Then $S_k = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, which is congruent to $\operatorname{diag}(1, -1)$. By the Sylvester law of inertia, there exists $Y_k$ such that $Y_k^* S_k Y_k = \operatorname{diag}(1, -1)$. For all $k$ such that $\mathcal{J}_k$ is of order 1 we define $Y_k = 1$, so $Y_k^* S_k Y_k = I_1$ for such $k$. Furthermore, we define

\[ Y := \bigoplus_k Y_k, \qquad U := XY, \qquad T := Y^{-1} \mathcal{J} Y. \tag{14} \]
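The construction in (13) and (14) can be followed on the smallest nontrivial case. The sketch below is an illustration under assumptions made here, not data from the paper: it takes $J = \operatorname{diag}(1, -1)$, a J-Hermitian $A$ with the eigenvalues $\pm i$, a hand-computed canonical form with $X^* J X = S_1$, and the particular choice $Y_1 = \frac{1}{\sqrt 2}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$. NumPy is assumed.

```python
import numpy as np

J = np.diag([1.0, -1.0])
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])                     # J-Hermitian, eigenvalues +i and -i

# Canonical form of (A, J): A = X Jord X^{-1}, X^* J X = S1 (one conjugate pair, sign +1).
X = np.array([[1.0, 1.0],
              [1.0j, -1.0j]]) / np.sqrt(2)
Jord = np.diag([1.0j, -1.0j])
S1 = np.array([[0.0, 1.0],
               [1.0, 0.0]])

# One possible Y_1 with Y_1^* S_1 Y_1 = diag(1, -1); then U and T as in (14).
Y = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)
U = X @ Y
T = np.linalg.inv(Y) @ Jord @ Y

print(np.round(U, 12))
print(np.round(T, 12))
print(np.round(U.conj().T @ J @ U, 12))
```

In this case $U = \operatorname{diag}(1, i)$ and $T$ has a zero diagonal with both off-diagonal entries equal to $i$; $U^* J U = \operatorname{diag}(1, -1)$, so $U$ is J-orthonormal and $A = U T U^{-1}$ is a hyperbolic Schur decomposition with a single atomic block.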

Note that $T$ has the same block structure as $\mathcal{J}$, so it is quasidiagonal, and

\[ Y^* S Y = \Bigl(\bigoplus_k Y_k^*\Bigr) \Bigl(\bigoplus_k \varepsilon_k S_k\Bigr) \Bigl(\bigoplus_k Y_k\Bigr) = \bigoplus_k \bigl(\varepsilon_k Y_k^* S_k Y_k\bigr) =: J', \tag{15} \]

where $J' = \operatorname{diag}(j_1, \dots, j_n)$ for some $j_1, \dots, j_n \in \{-1, 1\}$. Using (13), (14) and (15), we see that

\[
\begin{aligned}
A &= X \mathcal{J} X^{-1} = U Y^{-1} \mathcal{J} Y U^{-1} = U T U^{-1}, \\
U^* J U &= (XY)^* J (XY) = Y^* X^* J X Y = Y^* S Y = J',
\end{aligned}
\]

which proves that $A$ is J-orthonormally similar to some quasidiagonal $T$, hence $A$ has a hyperbolic Schur decomposition with respect to $J$. □

6. Conclusion

In this paper we have introduced the hyperbolic Schur decomposition of a square matrix with respect to the scalar product induced by $J = \operatorname{diag}(\pm 1)$. We have shown that all diagonalizable matrices have such a decomposition, which means that the set of matrices that do not have a hyperbolic Schur decomposition is a subset of the set of nondiagonalizable matrices, which is a set of measure zero (this follows from [12, Section 2.4.7]). We have also given examples of matrices that do not have such a decomposition.

By its design, the hyperbolic Schur decomposition preserves structures of the structured matrices, albeit with respect to a somewhat changed (symmetrically permuted) $J$, denoted by $\tilde J$, which is a common property of hyperbolic decompositions. We have analyzed the properties of such matrices, as well as the properties of atomic (indecomposable) blocks on the diagonal of the quasitriangular factor $T$.

In Example 3.8 we have shown that there exist J-Hermitian (and, therefore, J-normal) matrices that do not have a hyperbolic Schur decomposition, even in spaces as simple as the Minkowski space. In Example 3.9 we have provided an example of a J-unitary matrix for which there is no hyperbolic Schur decomposition, also showing that not all hyperexchange matrices have such a decomposition (since J-unitary matrices are a special case).

In Section 5, we have given sufficient and necessary conditions under which a J-Hermitian matrix has a hyperbolic Schur decomposition.

We have shown that those structured matrices that do have a hyperbolic Schur decomposition also have desirable properties. The only exception to this rule are general J-normal matrices (which are not "nicer", i.e., neither J-Hermitian nor hyperexchange), for which block triangularity does not imply block diagonality. It remains to be researched if the block triangular ones can always be block diagonalized via J-orthonormal similarities (i.e., by a hyperbolic Schur decomposition), as always happens in the Euclidean case. Interestingly enough, all important subclasses of J-normal matrices maintain the block diagonality of the factor $T$.

Another subject that remains to be researched is the algorithm to calculate such a decomposition. The first thing that might come to mind here is the QR algorithm used for the Euclidean Schur decomposition. However, this would not work, at least not as directly as one might hope. Firstly, every nonsingular matrix has a hyperbolic QR factorization, as shown by [19, Theorem 5.3]. But, as shown in the examples in this paper, not all such matrices have a hyperbolic Schur decomposition (see Example 3.5 for $\lambda \ne 0$, Example 3.6 for $\lambda_1, \lambda_2 \ne 0$ and Example 3.9). This means that any algorithm employing the hyperbolic QR factorization to calculate the hyperbolic Schur decomposition will diverge for such matrices. Secondly, many singular matrices have a hyperbolic Schur decomposition, while it is unclear if they also have a hyperbolic QR factorization, since [19, Theorem 5.3] covers only the matrices $A$ such that $A^* J A$ is of full rank. This means that it may be possible for some matrices to have a hyperbolic Schur decomposition which is uncomputable via the hyperbolic QR factorization.

Acknowledgements

Most of the work on this paper was done at the School of Mathematics, University of Manchester, where I was invited as a research visitor by Françoise Tisseur, whom I thank dearly for the opportunity and a great working experience as well as many suggestions that helped me with this work. The paper was also proofread by Nataša Strabic, whom I thank for all the suggestions that have considerably improved the paper. I would also like to thank the anonymous referee at the LAA, who made a special impact on this paper and without whose suggestions Section 5 would not exist.

References

[1] G. Ammar, C. Mehl, V. Mehrmann, Schur-like forms for matrix Lie groups, Lie algebras and Jordan algebras, Linear Algebra Appl. 287 (1–3) (1999) 11–39.

[2] D.S. Bernstein, Matrix Mathematics: Theory, Facts, and Formulas with Application to Linear Systems Theory, Princeton University Press, Princeton, NJ, USA, 2005.

[3] Y. Bolshakov, C.V.M. van der Mee, A. Ran, B. Reichstein, L. Rodman, Extension of isometries in finite-dimensional indefinite scalar product spaces and polar decompositions, SIAM J. Matrix Anal. Appl. 18 (3) (July 1997) 752–774.

[4] P. Davies, N. Higham, A Schur–Parlett algorithm for computing matrix functions, SIAM J. Matrix Anal. Appl. 25 (2) (2006) 464–485.

[5] I. Gohberg, P. Lancaster, L. Rodman, Matrix Polynomials, Classics Appl. Math., Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1982.

[6] I. Gohberg, P. Lancaster, L. Rodman, Indefinite Linear Algebra and Applications, Birkhäuser, Basel, Switzerland, 2005.

[7] G. Golub, C.F. Van Loan, Matrix Computations, third ed., Johns Hopkins University Press, Baltimore, MD, USA, 1996.

[8] E. Grimme, D. Sorensen, P. Van Dooren, Model reduction of state space systems via an implicitly restarted Lanczos method, Numer. Algorithms 12 (1–2) (1996) 1–31.

[9] S. Hassi, A Singular Value Decomposition of Matrices in a Space with an Indefinite Scalar Product, Ann. Acad. Sci. Fenn., a 1, Math. Diss., vol. 79, Suomalainen Tiedeakatemia, Helsinki, 1990.

[10] N.J. Higham, J-orthogonal matrices: Properties and generation, SIAM Rev. 45 (3) (2003) 504–519.

[11] N.J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, 2008.

[12] R.A. Horn, C.R. Johnson, Matrix Analysis, second ed., Cambridge University Press, Cambridge, UK, 2013.

[13] A. Kılıçman, Z.A. Zhour, The representation and approximation for the weighted Minkowski inverse in Minkowski space, Math. Comput. Modelling 47 (3–4) (2008) 363–371.

[14] B.C. Levy, A note on the hyperbolic singular value decomposition, Linear Algebra Appl. 277 (1–3) (1998) 135–142.

[15] W.-W. Lin, V. Mehrmann, H. Xu, Canonical forms for Hamiltonian and symplectic matrices and pencils, Linear Algebra Appl. 302–303 (1999) 469–533.

[16] D.S. Mackey, N. Mackey, F. Tisseur, Structured factorizations in scalar product spaces, SIAM J. Matrix Anal. Appl. 27 (3) (2006) 821–850.

[17] R. Onn, A.O. Steinhardt, A. Bojanczyk, The hyperbolic singular value decomposition and applications, in: Applied Mathematics and Computing, Trans. 8th Army Conf., Ithaca, NY, USA, in: ARO Rep., vol. 91-1, 1991, pp. 93–108.

[18] G. Sewell, Computational Methods of Linear Algebra, second ed., Wiley, 2005.

[19] S. Singer, Indefinite QR factorization, BIT 46 (1) (2006) 141–161.

[20] V. Šego, Two-sided hyperbolic SVD, Linear Algebra Appl. 433 (7) (2010) 1265–1275.

[21] V. Šego, The hyperbolic Schur decomposition (extended), http://eprints.ma.man.ac.uk/2026/, October 2013.

[22] H. Xu, An SVD-like matrix decomposition and its applications, Linear Algebra Appl. 368 (2003) 1–24.

[23] H. Xu, A numerical method for computing an SVD-like decomposition, SIAM J. Matrix Anal. Appl. 26 (4) (2005) 1058–1082.

[24] H. Zha, A note on the existence of the hyperbolic singular value decomposition, Linear Algebra Appl. 240 (1996) 199–205.