Stats 346.3 Multivariate Data Analysis Stats 848.3.

Post on 24-Dec-2015


Stats 346.3

Multivariate Data Analysis

Stats 848.3

Instructor: W.H.Laverty

Office: 235 McLean Hall

Phone: 966-6096

Lectures: M W F, 12:30pm - 1:20pm, Biol 123

Evaluation: Assignments, Term tests - 40%; Final Examination - 60%

Dates for midterm tests:

1. Friday, February 06

2. Friday, March 20

Each test and the Final Exam are Open Book

Students are allowed to take in notes, texts, formula sheets, and calculators (including laptop computers).

Text:

Stat 346 –Multivariate Statistical Methods – Donald Morrison

Not Required - I will give a list of other useful texts that will be in the library

Bibliography

1. Cooley, W.W., and Lohnes, P.R. (1962), Multivariate Procedures for the Behavioural Sciences, Wiley, New York.

2. Fienberg, S. (1980), Analysis of Cross-Classified Data, MIT Press, Cambridge, Mass.

3. Fingleton, B. (1984), Models of Category Counts, Cambridge University Press.

4. Johnson, R.A., and Wichern, D.W., Applied Multivariate Statistical Analysis, Prentice Hall.

5. Morrison, D.F. (1976), Multivariate Statistical Methods, McGraw-Hill, New York.

6. Seal, H. (1968), Multivariate Statistical Analysis for Biologists, Methuen, London.

7. Agresti, A. (1990), Categorical Data Analysis, Wiley, New York.

• The lectures will be given in Power Point

• They are now posted on the Stats 346 web page

Course Outline

Introduction

Review of Linear Algebra and Matrix Analysis (Chapter 2)

Review of Linear Statistical Theory (Chapter 1)

Multivariate Normal distribution (Chapter 3)

• Multivariate Data plots

• Correlation - sample estimates and tests

• Canonical Correlation

Mean Vectors and Covariance matrices (Chapter 4)

• Single sample procedures

• Two sample procedures

• Profile Analysis

Multivariate Analysis of Variance (MANOVA) (Chapter 5)

Classification and Discrimination (Chapter 6)

• Discriminant Analysis

• Logistic Regression (if time permits)

• Cluster Analysis

The structure of correlation (Chapter 9)

• Principal Components Analysis (PCA)

• Factor Analysis

Multivariate Multiple Regression (if time permits) (References: TBA)

Discrete Multivariate Analysis (if time permits) (References: TBA)

Introduction

Multivariate Data

• We have collected data for each case in the sample or population on not just one variable but on several variables – X1, X2, … Xp

• This is the usual situation – data are very rarely collected on a single variable.

• The variables may be

1. Discrete (Categorical)

2. Continuous (Numerical)

• The variables may be

1. Dependent (Response variables)

2. Independent (Predictor variables)

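A chart illustrating Statistical Procedures

(classified by the type of the dependent variables and the type of the independent variables)

Dependent variables Categorical:

• Independent variables Categorical: Multiway frequency Analysis (Log Linear Model)

• Independent variables Continuous: Discriminant Analysis

• Independent variables Continuous & Categorical: Discriminant Analysis

Dependent variables Continuous:

• Independent variables Categorical: ANOVA (single dep var), MANOVA (mult dep var)

• Independent variables Continuous: Multiple Regression (single dep var), Multivariate Multiple Regression (mult dep var)

• Independent variables Continuous & Categorical: ANACOVA (single dep var), MANACOVA (mult dep var)

Dependent variables Continuous & Categorical:

• Independent variables Categorical: ??

• Independent variables Continuous: ??

• Independent variables Continuous & Categorical: ??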

Multivariate Techniques

Multivariate Techniques can be classified as follows:

1. Techniques that are direct analogues of univariate procedures.

• There are univariate techniques that are then generalized to the multivariate situation

• e.g. the two independent sample t test, generalized to Hotelling's T² test

• ANOVA (Analysis of Variance) generalized to MANOVA (Multivariate Analysis of Variance)

2. Techniques that are purely multivariate procedures.

• Correlation, Partial correlation, Multiple correlation, Canonical Correlation

• Principal Components Analysis, Factor Analysis - These are techniques for studying complicated correlation structure amongst a collection of variables

3. Techniques for which a univariate procedure could exist but which become much more interesting in the multivariate setting.

• Cluster Analysis and Classification - Here we try to identify subpopulations from the data

• Discriminant Analysis - In Discriminant Analysis, we attempt to use a collection of variables to identify the unknown population of which a case is a member

An Example:

A survey was given to 132 students

• Male=35,

• Female=97

They rated, on a Likert scale of 1 to 5, their agreement with each of 40 statements.

All statements are related to the Meaning of Life

Questions and Statements

1. How religious/spiritual would you say you are?

2. To have trustworthy and intimate friend(s)

3. To have a fulfilling career

4. To be closely connected to family

5. To share values/beliefs with others in your close circle or community

6. To have and raise children

7. To continually set short and long-term, achievable goals for yourself

8. To feel satisfied with yourself (feel good about yourself)

9. To live up to the expectations of family and close friends

10. To contribute to world peace

Statements - continued

11. To be involved in an intimate relationship with a significant person

12. To give of yourself to others.

13. To be able to plan and take time for leisure.

14. To act on your own personal beliefs, despite outside pressure.

15. To be seen as physically attractive.

16. To feel confident in choosing new experiences to better yourself.

17. To care about the state of the physical/natural environment.

18. To take responsibility for your mistakes.

19. To make restitution for your mistakes, if necessary.

20. To be involved with social or political causes.

21. To keep up with media and popular-culture trends.

22. To adhere to religious practices based on tradition or rituals.

23. To use your own creativity in a way that you believe is worthwhile.

24. The meaning of life is found in understanding one's ultimate purpose for life.

25. The meaning of life can be discovered through intentionally living a life that glorifies a Spiritual being.

26. There is a reason for everything that happens.

27. Obtaining things in life that are material and tangible is only part of discovering the meaning of life.

28. People unearth the same basic values when attempting to find the meaning of life.

29. It is more important to cultivate character than to be consumed with outward rewards, or, awards.

30. Some aims or goals in life are more valuable than other goals.

31. The purpose of life lies in promoting the ends of truth, beauty, and goodness.

32. A meaningful life is one that contributes to the well-being of others.

33. The meaning of life is the same as a happy life.

34. The meaning of life is found in realizing my potential.

35. Life has purpose only in the everyday details of living.

36. There is no, one, universal way of obtaining a meaningful life for all people.

37. People passionately desire different things. Obtaining these things contributes to making life more meaningful for them.

38. What contributes to a meaningful life varies according to each person (or group).

39. Lives can be meaningful even without the existence of a God or spiritual realm.

40. Our lives have no significance, but we must live as if they do.

Cluster Analysis of n = 132 university students using responses from Meaning of Life questionnaire (40 questions)

[Figure: dendrogram of the cases; vertical axis: Linkage Distance, from 0 to 80]

Fig. 1. Dendrogram showing clustering using Ward's method of Euclidean distances

Discriminant Analysis of n = 132 university students into the three identified populations

[Figure: scatter plot of the cases; horizontal axis: F1 (Discriminant function 1), from -4 to 6; vertical axis: F2 (Discriminant function 2), from 0 to 8. Groups shown: Semi-Religious, Religious, Humanistic. Axis annotations: Optimistic / Pessimistic and Religious / Non-religious.]

Fig. 4. Cluster map

A Review of Linear Algebra

With some Additions

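Matrix Algebra

Definition

An n × m matrix, A, is a rectangular array of elements

\[ A = (a_{ij}) = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \]

n = # of rows

m = # of columns

dimensions = n × m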

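Definition

A vector, v, of dimension n is an n × 1 matrix (a rectangular array of elements with a single column)

\[ v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \]

Vectors will be column vectors (they may also be row vectors).

A vector, v, of dimension n can be thought of as a point in n-dimensional space.

[Figure: the vector v = (v_1, v_2, v_3)' plotted as a point in 3-dimensional space with axes v1, v2, v3]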

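Matrix Operations

Addition

Let A = (a_ij) and B = (b_ij) denote two n × m matrices. Then the sum, A + B, is the matrix

\[ A + B = (a_{ij} + b_{ij}) = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} & \cdots & a_{1m}+b_{1m} \\ a_{21}+b_{21} & a_{22}+b_{22} & \cdots & a_{2m}+b_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1}+b_{n1} & a_{n2}+b_{n2} & \cdots & a_{nm}+b_{nm} \end{bmatrix} \]

The dimensions of A and B are required to be both n × m.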

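Scalar Multiplication

Let A = (a_ij) denote an n × m matrix and let c be any scalar. Then cA is the matrix

\[ cA = (c\,a_{ij}) = \begin{bmatrix} ca_{11} & ca_{12} & \cdots & ca_{1m} \\ ca_{21} & ca_{22} & \cdots & ca_{2m} \\ \vdots & \vdots & & \vdots \\ ca_{n1} & ca_{n2} & \cdots & ca_{nm} \end{bmatrix} \]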

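Addition for vectors

\[ v + w = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} + \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} v_1 + w_1 \\ v_2 + w_2 \\ v_3 + w_3 \end{bmatrix} \]

[Figure: the vectors v, w and v + w drawn in 3-dimensional space with axes v1, v2, v3]

Scalar Multiplication for vectors

\[ cv = c \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} cv_1 \\ cv_2 \\ cv_3 \end{bmatrix} \]

[Figure: the vectors v and cv drawn in 3-dimensional space with axes v1, v2, v3]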

Matrix multiplication

Let A = (a_ij) denote an n × m matrix and B = (b_jl) denote an m × k matrix. Then the n × k matrix C = (c_il), where

\[ c_{il} = \sum_{j=1}^{m} a_{ij} b_{jl} \]

is called the product of A and B and is denoted by A∙B.

In the case that A = (a_ij) is an n × m matrix and B = v = (v_j) is an m × 1 vector, then w = A∙v = (w_i), where

\[ w_i = \sum_{j=1}^{m} a_{ij} v_j \]

is an n × 1 vector.

[Figure: a vector v = (v_1, v_2, v_3)' in the space with axes v1, v2, v3 is mapped by A to the vector w = Av = (w_1, w_2, w_3)' in the space with axes w1, w2, w3]

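Definition

An n × n identity matrix, I, is the square matrix

\[ I = I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]

Note:

1. AI = A

2. IA = A.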

Definition (The inverse of an n × n matrix)

Let A denote the n × n matrix

\[ A = (a_{ij}) = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \]

Let B denote an n × n matrix such that

AB = BA = I.

If the matrix B exists then A is called invertible. Also B is called the inverse of A and is denoted by $A^{-1}$.

The Woodbury Theorem

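\[ (A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \]

where the inverses $A^{-1}$, $C^{-1}$ and $\left(C^{-1} + DA^{-1}B\right)^{-1}$ exist.

Proof:

Let $H = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}$.

Then all we need to show is that

H(A + BCD) = (A + BCD) H = I.

\[
\begin{aligned}
H(A + BCD) &= \left[A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}\right](A + BCD) \\
&= A^{-1}A - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}A + A^{-1}BCD - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}BCD \\
&= I - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}D + A^{-1}BCD - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}BCD \\
&= I + A^{-1}BCD - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}\left[I + DA^{-1}BC\right]D \\
&= I + A^{-1}BCD - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}\left[C^{-1} + DA^{-1}B\right]CD \\
&= I + A^{-1}BCD - A^{-1}BCD = I
\end{aligned}
\]

The product (A + BCD)H = I is verified in the same way.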
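The identity can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sizes n = 5 and k = 2 and the particular matrices are illustrative choices that are not part of the notes (A diagonal, C = I and D = B', so that every inverse in the theorem exists).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2                                   # sizes chosen only for illustration
A = np.diag(rng.uniform(1.0, 2.0, size=n))    # easy-to-invert n x n matrix
B = rng.normal(size=(n, k))
C = np.eye(k)                                 # C = I, so C^{-1} exists
D = B.T                                       # A + BCD = A + BB' is positive definite

A_inv = np.diag(1.0 / np.diag(A))
C_inv = np.linalg.inv(C)

# Right-hand side of the Woodbury identity: only a k x k matrix is inverted.
inner = np.linalg.inv(C_inv + D @ A_inv @ B)
woodbury = A_inv - A_inv @ B @ inner @ D @ A_inv

# Left-hand side: direct inversion of the n x n matrix.
direct = np.linalg.inv(A + B @ C @ D)
print(np.allclose(woodbury, direct))
```

The practical appeal is that when $A^{-1}$ is cheap (here A is diagonal) the only matrix that has to be inverted is k × k rather than n × n.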

The Woodbury theorem can be used to find the inverse of some pattern matrices:

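Example: Find the inverse of the n × n matrix

\[
\begin{bmatrix} b & a & \cdots & a \\ a & b & \cdots & a \\ \vdots & & \ddots & \vdots \\ a & a & \cdots & b \end{bmatrix}
= (b - a)\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}
+ a\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & & & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix}
= (b - a)I + a\,\mathbf{1}\mathbf{1}' = A + BCD
\]

where

\[ A = (b - a)I, \quad B = \mathbf{1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \quad C = a \ (1 \times 1), \quad D = \mathbf{1}' = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} \]

hence

\[ A^{-1} = \frac{1}{b - a}I, \qquad C^{-1} = \frac{1}{a} \]

and

\[ C^{-1} + DA^{-1}B = \frac{1}{a} + \frac{1}{b - a}\mathbf{1}'I\mathbf{1} = \frac{1}{a} + \frac{n}{b - a} = \frac{b - a + an}{a(b - a)} = \frac{b + a(n - 1)}{a(b - a)} \]

Thus

\[ \left(C^{-1} + DA^{-1}B\right)^{-1} = \frac{a(b - a)}{b + a(n - 1)} \]

Now using the Woodbury theorem

\[
\begin{aligned}
(A + BCD)^{-1} &= A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \\
&= \frac{1}{b - a}I - \frac{1}{b - a}I\,\mathbf{1}\,\frac{a(b - a)}{b + a(n - 1)}\,\mathbf{1}'\,\frac{1}{b - a}I \\
&= \frac{1}{b - a}I - \frac{a}{(b - a)\left(b + a(n - 1)\right)}\mathbf{1}\mathbf{1}'
\end{aligned}
\]

Thus

\[
\begin{bmatrix} b & a & \cdots & a \\ a & b & \cdots & a \\ \vdots & & \ddots & \vdots \\ a & a & \cdots & b \end{bmatrix}^{-1}
= \frac{1}{b - a}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}
- \frac{a}{(b - a)\left(b + a(n - 1)\right)}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & & & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix}
= \begin{bmatrix} c & d & \cdots & d \\ d & c & \cdots & d \\ \vdots & & \ddots & \vdots \\ d & d & \cdots & c \end{bmatrix}
\]

where

\[ d = -\frac{a}{(b - a)\left(b + a(n - 1)\right)} \]

and

\[ c = \frac{1}{b - a} - \frac{a}{(b - a)\left(b + a(n - 1)\right)} = \frac{1}{b - a}\left[1 - \frac{a}{b + a(n - 1)}\right] = \frac{b + a(n - 2)}{(b - a)\left(b + a(n - 1)\right)} \]

Note: for n = 2

\[ d = -\frac{a}{(b - a)(b + a)} = -\frac{a}{b^2 - a^2} \quad \text{and} \quad c = \frac{b}{(b - a)(b + a)} = \frac{b}{b^2 - a^2} \]

Thus

\[ \begin{bmatrix} b & a \\ a & b \end{bmatrix}^{-1} = \frac{1}{b^2 - a^2}\begin{bmatrix} b & -a \\ -a & b \end{bmatrix} \]

Also

\[
\begin{bmatrix} b & a & \cdots & a \\ a & b & \cdots & a \\ \vdots & & \ddots & \vdots \\ a & a & \cdots & b \end{bmatrix}
\begin{bmatrix} c & d & \cdots & d \\ d & c & \cdots & d \\ \vdots & & \ddots & \vdots \\ d & d & \cdots & c \end{bmatrix}
= \begin{bmatrix} bc + (n-1)ad & bd + ac + (n-2)ad & \cdots & bd + ac + (n-2)ad \\ bd + ac + (n-2)ad & bc + (n-1)ad & \cdots & bd + ac + (n-2)ad \\ \vdots & & \ddots & \vdots \\ bd + ac + (n-2)ad & bd + ac + (n-2)ad & \cdots & bc + (n-1)ad \end{bmatrix}
\]

Now

\[ d = -\frac{a}{(b - a)\left(b + a(n - 1)\right)} \quad \text{and} \quad c = \frac{b + a(n - 2)}{(b - a)\left(b + a(n - 1)\right)} \]

so that

\[
bc + (n - 1)ad = \frac{b\left(b + a(n - 2)\right) - (n - 1)a^2}{(b - a)\left(b + a(n - 1)\right)}
= \frac{b^2 + ab(n - 2) - (n - 1)a^2}{b^2 + ab(n - 2) - (n - 1)a^2} = 1
\]

and

\[
bd + ac + (n - 2)ad = \frac{-ab + a\left(b + a(n - 2)\right) - (n - 2)a^2}{(b - a)\left(b + a(n - 1)\right)} = 0
\]

This verifies that we have calculated the inverse.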
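The closed form can be checked numerically as well. This is a small sketch, assuming NumPy; the function name `pattern_inverse` and the values a = 0.3, b = 2, n = 4 are illustrative assumptions, not part of the notes.

```python
import numpy as np

def pattern_inverse(a, b, n):
    """Inverse of the n x n matrix with b on the diagonal and a elsewhere."""
    denom = (b - a) * (b + a * (n - 1))
    d = -a / denom                       # off-diagonal element of the inverse
    c = (b + a * (n - 2)) / denom        # diagonal element of the inverse
    return d * np.ones((n, n)) + (c - d) * np.eye(n)

a, b, n = 0.3, 2.0, 4                    # requires b != a and b + a(n - 1) != 0
M = a * np.ones((n, n)) + (b - a) * np.eye(n)
print(np.allclose(M @ pattern_inverse(a, b, n), np.eye(n)))
```

The formula breaks down exactly when b = a or b + a(n - 1) = 0, which are the two cases in which the pattern matrix is singular.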

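Block Matrices

Let the n × m matrix

\[ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \]

be partitioned into sub-matrices $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$, where the row blocks have q and n − q rows and the column blocks have p and m − p columns.

Similarly partition the m × k matrix

\[ B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} \]

where the row blocks have p and m − p rows and the column blocks have l and k − l columns.

Product of Blocked Matrices

Then

\[ A \cdot B = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix} \]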

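The Inverse of Blocked Matrices

Let the n × n matrix

\[ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \]

be partitioned into sub-matrices $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$, where the row blocks and the column blocks both have sizes p and n − p.

Similarly partition the n × n matrix

\[ B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} \]

Suppose that $B = A^{-1}$. Then

\[ A \cdot B = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ 0 & I_{n-p} \end{bmatrix} \]

Hence

\[
\begin{aligned}
A_{11}B_{11} + A_{12}B_{21} &= I \qquad (1) \\
A_{11}B_{12} + A_{12}B_{22} &= 0 \qquad (2) \\
A_{21}B_{11} + A_{22}B_{21} &= 0 \qquad (3) \\
A_{21}B_{12} + A_{22}B_{22} &= I \qquad (4)
\end{aligned}
\]

From (1)

\[ A_{11} + A_{12}B_{21}B_{11}^{-1} = B_{11}^{-1} \]

From (3)

\[ A_{22}^{-1}A_{21} + B_{21}B_{11}^{-1} = 0 \quad \text{or} \quad B_{21}B_{11}^{-1} = -A_{22}^{-1}A_{21} \]

Hence

\[ A_{11} - A_{12}A_{22}^{-1}A_{21} = B_{11}^{-1} \]

or, using the Woodbury Theorem,

\[ B_{11} = \left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1} = A_{11}^{-1} + A_{11}^{-1}A_{12}\left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1}A_{21}A_{11}^{-1} \]

Similarly

\[ B_{22} = \left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1} = A_{22}^{-1} + A_{22}^{-1}A_{21}\left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1}A_{12}A_{22}^{-1} \]

From (3), $A_{21}B_{11} + A_{22}B_{21} = 0$, so that

\[ A_{22}^{-1}A_{21}B_{11} + B_{21} = 0 \]

and

\[ B_{21} = -A_{22}^{-1}A_{21}B_{11} = -A_{22}^{-1}A_{21}\left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1} \]

similarly

\[ B_{12} = -A_{11}^{-1}A_{12}B_{22} = -A_{11}^{-1}A_{12}\left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1} \]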

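Summarizing

Let

\[ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \]

be an n × n matrix whose row blocks and column blocks have sizes p and n − p.

Suppose that

\[ A^{-1} = B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} \]

then

\[
\begin{aligned}
B_{11} &= \left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1} = A_{11}^{-1} + A_{11}^{-1}A_{12}\left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1}A_{21}A_{11}^{-1} \\
B_{22} &= \left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1} = A_{22}^{-1} + A_{22}^{-1}A_{21}\left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1}A_{12}A_{22}^{-1} \\
B_{21} &= -A_{22}^{-1}A_{21}B_{11} = -A_{22}^{-1}A_{21}\left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1} \\
B_{12} &= -A_{11}^{-1}A_{12}B_{22} = -A_{11}^{-1}A_{12}\left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1}
\end{aligned}
\]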

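Example

Let

\[ A = \begin{bmatrix} aI & bI \\ cI & dI \end{bmatrix} \]

where each block is p × p (so A is 2p × 2p) and I is the p × p identity matrix.

Find

\[ A^{-1} = B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} \]

Here $A_{11} = aI$, $A_{12} = bI$, $A_{21} = cI$, $A_{22} = dI$, so

\[
\begin{aligned}
B_{11} &= \left(aI - bI\,\tfrac{1}{d}I\,cI\right)^{-1} = \left(a - \tfrac{bc}{d}\right)^{-1}I = \frac{d}{ad - bc}I \\
B_{22} &= \left(dI - cI\,\tfrac{1}{a}I\,bI\right)^{-1} = \left(d - \tfrac{bc}{a}\right)^{-1}I = \frac{a}{ad - bc}I \\
B_{21} &= -A_{22}^{-1}A_{21}B_{11} = -\tfrac{1}{d}I\,(cI)\,\frac{d}{ad - bc}I = -\frac{c}{ad - bc}I \\
B_{12} &= -A_{11}^{-1}A_{12}B_{22} = -\tfrac{1}{a}I\,(bI)\,\frac{a}{ad - bc}I = -\frac{b}{ad - bc}I
\end{aligned}
\]

hence

\[ A^{-1} = \begin{bmatrix} \dfrac{d}{ad - bc}I & -\dfrac{b}{ad - bc}I \\[2ex] -\dfrac{c}{ad - bc}I & \dfrac{a}{ad - bc}I \end{bmatrix} \]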
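The block formulas translate directly into code. The sketch below assumes NumPy; the helper name `block_inverse` and the values a = 2, b = 1, c = 3, d = 4, p = 3 are illustrative assumptions. It requires $A_{11}$, $A_{22}$ and both Schur complements to be invertible, as the derivation does.

```python
import numpy as np

def block_inverse(A, p):
    """Invert A from its blocks; the leading block A11 is p x p."""
    A11, A12 = A[:p, :p], A[:p, p:]
    A21, A22 = A[p:, :p], A[p:, p:]
    A11_inv, A22_inv = np.linalg.inv(A11), np.linalg.inv(A22)
    B11 = np.linalg.inv(A11 - A12 @ A22_inv @ A21)
    B22 = np.linalg.inv(A22 - A21 @ A11_inv @ A12)
    B21 = -A22_inv @ A21 @ B11
    B12 = -A11_inv @ A12 @ B22
    return np.block([[B11, B12], [B21, B22]])

# The example above: A = [[aI, bI], [cI, dI]] with p x p blocks.
a, b, c, d, p = 2.0, 1.0, 3.0, 4.0, 3    # illustrative values with ad - bc != 0
I = np.eye(p)
A = np.block([[a * I, b * I], [c * I, d * I]])
expected = np.block([[d * I, -b * I], [-c * I, a * I]]) / (a * d - b * c)
print(np.allclose(block_inverse(A, p), expected))
```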

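The transpose of a matrix

Consider the n × m matrix, A

\[ A = (a_{ij}) = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \]

then the m × n matrix, A' (also denoted by $A^T$)

\[ A' = (a_{ji}) = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{bmatrix} \]

is called the transpose of A.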

Symmetric Matrices

• An n × n matrix, A, is said to be symmetric if

\[ A' = A \]

Note:

\[ (AB)' = B'A' \]

\[ (AB)^{-1} = B^{-1}A^{-1} \]

\[ (A')^{-1} = \left(A^{-1}\right)' \]

The trace and the determinant of a square matrix

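Let A denote the n × n matrix

\[ A = (a_{ij}) = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \]

Then

\[ \operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii} \]

also

\[ |A| = \det \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} = \text{the determinant of } A = \sum_{j=1}^{n} a_{ij}A_{ij} \]

where $A_{ij}$ = cofactor of $a_{ij}$ = $(-1)^{i+j}$ times the determinant of the matrix after deleting the i-th row and j-th column. For example

\[ \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21} \]

Some properties

1. $|I| = 1$, $\operatorname{tr}(I) = n$

2. $|AB| = |A|\,|B|$, $\operatorname{tr}(AB) = \operatorname{tr}(BA)$

3. $\left|A^{-1}\right| = \dfrac{1}{|A|}$

4. For a partitioned matrix

\[
|A| = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix}
= |A_{22}|\,\left|A_{11} - A_{12}A_{22}^{-1}A_{21}\right|
= |A_{11}|\,\left|A_{22} - A_{21}A_{11}^{-1}A_{12}\right|
\]

and $|A| = |A_{22}|\,|A_{11}|$ if $A_{12} = 0$ or $A_{21} = 0$.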
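These properties are easy to confirm on a numerical example. The following sketch assumes NumPy; the 4 × 4 matrices and the 2 + 2 partition are arbitrary illustrative choices (A is built to be positive definite so that A, $A_{11}$ and $A_{22}$ are all invertible).

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(4, 4))
A = G @ G.T + np.eye(4)        # positive definite, so A, A11 and A22 are invertible
B = rng.normal(size=(4, 4))
A11, A12, A21, A22 = A[:2, :2], A[:2, 2:], A[2:, :2], A[2:, 2:]
det, inv = np.linalg.det, np.linalg.inv

checks = {
    "|I| = 1 and tr(I) = n": np.isclose(det(np.eye(4)), 1.0) and np.trace(np.eye(4)) == 4,
    "|AB| = |A||B|": np.isclose(det(A @ B), det(A) * det(B)),
    "tr(AB) = tr(BA)": np.isclose(np.trace(A @ B), np.trace(B @ A)),
    "|A^-1| = 1/|A|": np.isclose(det(inv(A)), 1.0 / det(A)),
    "|A| = |A22||A11 - A12 A22^-1 A21|": np.isclose(det(A), det(A22) * det(A11 - A12 @ inv(A22) @ A21)),
    "|A| = |A11||A22 - A21 A11^-1 A12|": np.isclose(det(A), det(A11) * det(A22 - A21 @ inv(A11) @ A12)),
}
for name, ok in checks.items():
    print(name, bool(ok))
```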

Some additional Linear Algebra

Inner product of vectors

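Let x and y denote two p × 1 vectors. Then

\[ x'y = \begin{bmatrix} x_1 & \cdots & x_p \end{bmatrix}\begin{bmatrix} y_1 \\ \vdots \\ y_p \end{bmatrix} = x_1y_1 + \cdots + x_py_p = \sum_{i=1}^{p} x_iy_i \]

Note:

\[ \sqrt{x'x} = \sqrt{x_1^2 + \cdots + x_p^2} = \text{the length of } x \]

Let x and y denote two p × 1 vectors. Then

\[ \cos\theta = \frac{x'y}{\sqrt{x'x}\,\sqrt{y'y}}, \qquad \theta = \text{angle between } x \text{ and } y \]

[Figure: the vectors x and y with the angle θ between them]

Note: $x'y = 0$ if $\cos\theta = 0$ and $\theta = \pi/2$.

Thus if $x'y = 0$, then x and y are orthogonal.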

Special Types of Matrices

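1. Orthogonal matrices

– A matrix is orthogonal if P'P = PP' = I

– In this case $P^{-1} = P'$.

– Also the rows (columns) of P have length 1 and are orthogonal to each other

Suppose P is an orthogonal matrix, then P'P = PP' = I.

Let x and y denote p × 1 vectors.

Let u = Px and v = Py.

Then

\[ u'v = (Px)'(Py) = x'P'Py = x'y \]

and

\[ u'u = (Px)'(Px) = x'P'Px = x'x \]

Orthogonal transformations preserve length and angles – Rotations about the origin, Reflections

Example

The following matrix P is orthogonal

\[ P = \begin{bmatrix} \tfrac{1}{\sqrt{3}} & \tfrac{1}{\sqrt{3}} & \tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} & 0 \\ \tfrac{1}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} & -\tfrac{2}{\sqrt{6}} \end{bmatrix} \]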
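A quick numerical check of the example, assuming NumPy. The signs of the entries did not survive in the transcript, so the sign pattern used below (rows (1, 1, 1)/√3, (1, −1, 0)/√2, (1, 1, −2)/√6) is an assumption; it is one arrangement of those magnitudes that is orthogonal. The test vectors x and y are arbitrary.

```python
import numpy as np

s2, s3, s6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)
P = np.array([[1 / s3,  1 / s3,  1 / s3],
              [1 / s2, -1 / s2,  0.0],
              [1 / s6,  1 / s6, -2 / s6]])   # assumed sign pattern

# Orthogonality: P'P = PP' = I
print(np.allclose(P.T @ P, np.eye(3)), np.allclose(P @ P.T, np.eye(3)))

# The transformation preserves inner products, hence lengths and angles.
x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 2.0])
u, v = P @ x, P @ y
print(np.isclose(u @ v, x @ y), np.isclose(u @ u, x @ x))
```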

Special Types of Matrices(continued)

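2. Positive definite matrices

– A symmetric matrix, A, is called positive definite if:

\[ x'Ax = a_{11}x_1^2 + \cdots + a_{nn}x_n^2 + 2a_{12}x_1x_2 + \cdots + 2a_{n-1,n}x_{n-1}x_n > 0 \quad \text{for all } x \neq 0 \]

– A symmetric matrix, A, is called positive semi definite if:

\[ x'Ax \geq 0 \quad \text{for all } x \neq 0 \]

If the matrix A is positive definite then the set of points, x, that satisfy $x'Ax = c$, where $c > 0$, are on the surface of an n-dimensional ellipsoid centered at the origin, 0.

Theorem The matrix A is positive definite if

\[ |A_1| > 0, \ |A_2| > 0, \ |A_3| > 0, \ \ldots, \ |A_n| > 0 \]

where

\[ A_1 = \begin{bmatrix} a_{11} \end{bmatrix}, \quad A_2 = \begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix}, \quad A_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}, \ \ldots \]

and

\[ A_n = A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{12} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix} \]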
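The determinant criterion in the theorem can be sketched as follows, assuming NumPy. The function names and the two test matrices are illustrative assumptions; for larger or ill-conditioned matrices an eigenvalue or Cholesky test is usually preferred in practice.

```python
import numpy as np

def leading_minors(A):
    """Determinants |A_1|, |A_2|, ..., |A_n| of the leading principal sub-matrices."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):
    """Apply the theorem: a symmetric A is positive definite if all leading minors are > 0."""
    return all(m > 0 for m in leading_minors(A))

A = np.array([[5.0, 4.0, 2.0],
              [4.0, 10.0, 1.0],
              [2.0, 1.0, 2.0]])      # leading minors 5, 34, 39
B = np.array([[1.0, 2.0],
              [2.0, 1.0]])           # leading minors 1, -3

print(is_positive_definite(A), is_positive_definite(B))
```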

Special Types of Matrices(continued)

3. Idempotent matrices

– A symmetric matrix, E, is called idempotent if:

\[ EE = E \]

– Idempotent matrices project vectors onto a linear subspace

\[ E(Ex) = Ex \]

[Figure: the vector x and its projection Ex onto the linear subspace]

Definition

Let A be an n × n matrix

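Let x and λ be such that

\[ Ax = \lambda x \quad \text{with } x \neq 0 \]

then λ is called an eigenvalue of A and x is called an eigenvector of A.

Note:

\[ (A - \lambda I)x = 0 \]

If $|A - \lambda I| \neq 0$ then $x = (A - \lambda I)^{-1}0 = 0$,

thus $|A - \lambda I| = 0$ is the condition for an eigenvalue.

\[ \det(A - \lambda I) = \begin{vmatrix} a_{11} - \lambda & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} - \lambda \end{vmatrix} = 0 \]

= polynomial of degree n in λ.

Hence there are n possible eigenvalues $\lambda_1, \ldots, \lambda_n$.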

Theorem If the matrix A is symmetric then the eigenvalues of A, $\lambda_1, \ldots, \lambda_n$, are real.

Theorem If the matrix A is positive definite then the eigenvalues of A, $\lambda_1, \ldots, \lambda_n$, are positive.

Proof A is positive definite if $x'Ax > 0$ if $x \neq 0$.

Let λ and x be an eigenvalue and corresponding eigenvector of A,

then $Ax = \lambda x$

and $x'Ax = \lambda x'x$, or

\[ \lambda = \frac{x'Ax}{x'x} > 0 \]

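Theorem If the matrix A is symmetric and the eigenvalues of A are $\lambda_1, \ldots, \lambda_n$, with corresponding eigenvectors $x_1, \ldots, x_n$, i.e. $Ax_i = \lambda_i x_i$,

If $\lambda_i \neq \lambda_j$ then $x_i'x_j = 0$.

Proof: Note

\[ x_j'Ax_i = \lambda_i x_j'x_i \]

and

\[ x_i'Ax_j = \lambda_j x_i'x_j \]

Since A is symmetric, $x_j'Ax_i = x_i'Ax_j$, and also $x_j'x_i = x_i'x_j$. Subtracting the two equations gives

\[ 0 = (\lambda_i - \lambda_j)\,x_i'x_j \]

hence $x_i'x_j = 0$.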

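Theorem If the matrix A is symmetric with distinct eigenvalues, $\lambda_1, \ldots, \lambda_n$, with corresponding eigenvectors $x_1, \ldots, x_n$.

Assume $x_i'x_i = 1$,

then

\[ A = \lambda_1 x_1x_1' + \cdots + \lambda_n x_nx_n' = \begin{bmatrix} x_1, \ldots, x_n \end{bmatrix}\begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}\begin{bmatrix} x_1' \\ \vdots \\ x_n' \end{bmatrix} = PDP' \]

Proof

Note $x_i'x_i = 1$ and $x_i'x_j = 0$ if $i \neq j$, so

\[ P'P = \begin{bmatrix} x_1' \\ \vdots \\ x_n' \end{bmatrix}\begin{bmatrix} x_1, \ldots, x_n \end{bmatrix} = \begin{bmatrix} x_1'x_1 & \cdots & x_1'x_n \\ \vdots & & \vdots \\ x_n'x_1 & \cdots & x_n'x_n \end{bmatrix} = \begin{bmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{bmatrix} = I \]

P is called an orthogonal matrix,

therefore $P' = P^{-1}$ and $PP' = P'P = I$,

thus

\[ I = PP' = \begin{bmatrix} x_1, \ldots, x_n \end{bmatrix}\begin{bmatrix} x_1' \\ \vdots \\ x_n' \end{bmatrix} = x_1x_1' + \cdots + x_nx_n' \]

now $Ax_i = \lambda_i x_i$ and $Ax_ix_i' = \lambda_i x_ix_i'$, so

\[
\begin{aligned}
Ax_1x_1' + \cdots + Ax_nx_n' &= \lambda_1 x_1x_1' + \cdots + \lambda_n x_nx_n' \\
A\left(x_1x_1' + \cdots + x_nx_n'\right) &= \lambda_1 x_1x_1' + \cdots + \lambda_n x_nx_n' \\
A &= \lambda_1 x_1x_1' + \cdots + \lambda_n x_nx_n'
\end{aligned}
\]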

Comment

The previous result is also true if the eigenvalues are not distinct.

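Namely if the matrix A is symmetric with eigenvalues, $\lambda_1, \ldots, \lambda_n$, with corresponding eigenvectors of unit length $x_1, \ldots, x_n$,

then

\[ A = \lambda_1 x_1x_1' + \cdots + \lambda_n x_nx_n' = \begin{bmatrix} x_1, \ldots, x_n \end{bmatrix}\begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}\begin{bmatrix} x_1' \\ \vdots \\ x_n' \end{bmatrix} = PDP' \]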
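The decomposition A = PDP' can be illustrated numerically. This is a sketch assuming NumPy, using `numpy.linalg.eigh` (the library routine for symmetric matrices) rather than any method from the notes; the matrix below is an illustrative choice with a repeated eigenvalue (1, 3, 3), to match the comment that distinct eigenvalues are not required.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])       # symmetric, eigenvalues 1, 3, 3

lam, P = np.linalg.eigh(A)            # columns of P are unit-length eigenvectors
D = np.diag(lam)

# Sum of rank-one terms lambda_i x_i x_i'
rank_one_sum = sum(l * np.outer(x, x) for l, x in zip(lam, P.T))

print(np.allclose(P.T @ P, np.eye(3)))        # P is orthogonal
print(np.allclose(P @ D @ P.T, A))            # A = P D P'
print(np.allclose(rank_one_sum, A))           # A = sum of lambda_i x_i x_i'
```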

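An algorithm for computing eigenvectors, eigenvalues of positive definite matrices

• Generally to compute eigenvalues of a matrix we need to first solve the equation $|A - \lambda I| = 0$ for all values of λ (a polynomial of degree n in λ).

• Then solve the equation $Ax = \lambda x$ for the eigenvector x.

Recall that if A is positive definite then

\[ A = \lambda_1 x_1x_1' + \cdots + \lambda_n x_nx_n' \]

where $x_1, x_2, \ldots, x_n$ are the orthogonal eigenvectors of length 1, i.e. $x_i'x_i = 1$ and $x_i'x_j = 0$ if $i \neq j$, and $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n > 0$ are the eigenvalues.

It can be shown that

\[ A^2 = \lambda_1^2 x_1x_1' + \lambda_2^2 x_2x_2' + \cdots + \lambda_n^2 x_nx_n' \]

and that

\[ A^m = \lambda_1^m x_1x_1' + \lambda_2^m x_2x_2' + \cdots + \lambda_n^m x_nx_n' = \lambda_1^m\left[x_1x_1' + \left(\frac{\lambda_2}{\lambda_1}\right)^m x_2x_2' + \cdots + \left(\frac{\lambda_n}{\lambda_1}\right)^m x_nx_n'\right] \]

Thus for large values of m

\[ A^m \approx \lambda_1^m x_1x_1' \]

The algorithm

1. Compute powers of A - $A^2$, $A^4$, $A^8$, $A^{16}$, ...

2. Rescale (so that largest element is 1 (say))

3. Continue until there is no change. The resulting matrix will be $A^m = c\,x_1x_1'$ (c a constant)

4. Find b so that $bb' = A^m = c\,x_1x_1'$

5. Find $x_1 = \dfrac{1}{\sqrt{b'b}}\,b$ and $\lambda_1$ using $Ax_1 = \lambda_1 x_1$

To find $\lambda_2$ and $x_2$, Note:

\[ A - \lambda_1 x_1x_1' = \lambda_2 x_2x_2' + \cdots + \lambda_n x_nx_n' \]

6. Repeat steps 1 to 5 with the above matrix to find $\lambda_2$ and $x_2$

7. Continue to find $\lambda_3$ and $x_3$, $\lambda_4$ and $x_4$, ..., $\lambda_n$ and $x_n$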

Example

\[ A = \begin{bmatrix} 5 & 4 & 2 \\ 4 & 10 & 1 \\ 2 & 1 & 2 \end{bmatrix} \]

                 1           2           3

eigenvalue    12.54461    3.589204    0.866182

eigenvector   0.496986    0.677344    0.542412

              0.849957   -0.50594    -0.14698

              0.174869    0.534074   -0.82716
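The steps of the algorithm can be sketched in code as follows, assuming NumPy. The function names, the tolerance and the cap on the number of squarings are illustrative assumptions, and the sketch assumes the eigenvalues are distinct so that each dominant term really does take over. An eigenvector is only determined up to sign, so a computed vector may be the negative of the one tabulated above.

```python
import numpy as np

def dominant_eigenpair(A, tol=1e-12, max_squarings=60):
    """Largest eigenvalue and unit eigenvector of A by repeated squaring (steps 1 to 5)."""
    M = A / np.abs(A).max()
    for _ in range(max_squarings):
        M_next = M @ M                               # step 1: A^2, A^4, A^8, ...
        M_next = M_next / np.abs(M_next).max()       # step 2: rescale, largest element 1
        if np.abs(M_next - M).max() < tol:           # step 3: stop when nothing changes
            break
        M = M_next
    # M_next is now proportional to x1 x1', so each column is proportional to x1.
    j = np.argmax(np.diag(M_next))
    b = M_next[:, j]                                 # step 4
    x = b / np.sqrt(b @ b)                           # step 5: unit length
    lam = x @ A @ x                                  # from A x1 = lambda1 x1 and x1'x1 = 1
    return lam, x

def eigen_by_powers(A):
    """All eigenpairs of a positive definite matrix, largest eigenvalue first (steps 6, 7)."""
    A = np.array(A, dtype=float)
    pairs = []
    for _ in range(A.shape[0]):
        lam, x = dominant_eigenpair(A)
        pairs.append((lam, x))
        A = A - lam * np.outer(x, x)                 # remove the term lambda x x'
    return pairs

pairs = eigen_by_powers([[5, 4, 2], [4, 10, 1], [2, 1, 2]])
for lam, x in pairs:
    print(round(lam, 6), np.round(x, 6))
```

Repeated squaring reaches a high power of A in few multiplications; the price is that rounding error from each removed term carries into the later, smaller eigenvalues.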

Differentiation with respect to a vector, matrix

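Differentiation with respect to a vector

Let x denote a p × 1 vector. Let f(x) denote a function of the components of x.

\[ \frac{df(x)}{dx} = \begin{bmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(x)}{\partial x_p} \end{bmatrix} \]

Rules

1. Suppose $f(x) = a'x = a_1x_1 + \cdots + a_px_p$

then

\[ \frac{df(x)}{dx} = \begin{bmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(x)}{\partial x_p} \end{bmatrix} = \begin{bmatrix} a_1 \\ \vdots \\ a_p \end{bmatrix} = a \]

2. Suppose A is symmetric and

\[ f(x) = x'Ax = a_{11}x_1^2 + \cdots + a_{pp}x_p^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + \cdots + 2a_{p-1,p}x_{p-1}x_p \]

then

\[ \frac{df(x)}{dx} = \begin{bmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(x)}{\partial x_p} \end{bmatrix} = 2Ax \]

i.e.

\[ \frac{\partial f(x)}{\partial x_i} = 2a_{i1}x_1 + \cdots + 2a_{ii}x_i + \cdots + 2a_{ip}x_p \]

Example

1. Determine when $f(x) = x'Ax + b'x + c$ is a maximum or minimum.

solution

\[ \frac{df(x)}{dx} = 2Ax + b = 0 \quad \text{or} \quad x = -\tfrac{1}{2}A^{-1}b \]

2. Determine when $f(x) = x'Ax$ is a maximum if $x'x = 1$. Assume A is a positive definite matrix.

solution

let $g(x) = x'Ax + \lambda(1 - x'x)$; λ is the Lagrange multiplier.

\[ \frac{dg(x)}{dx} = 2Ax - 2\lambda x = 0 \quad \text{or} \quad Ax = \lambda x \]

This shows that x is an eigenvector of A,

and $f(x) = x'Ax = \lambda x'x = \lambda$.

Thus x is the eigenvector of A associated with the largest eigenvalue, λ.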
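Both examples can be checked numerically. The sketch below assumes NumPy; the helper `numerical_gradient`, the matrix A, the vector b, the constant 3 and the random trial vectors are illustrative assumptions. The finite-difference gradient is only an approximation used to confirm the rule df/dx = 2Ax + b.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation to the vector of partial derivatives df/dx."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2 * h)
    return grad

A = np.array([[5.0, 4.0, 2.0],
              [4.0, 10.0, 1.0],
              [2.0, 1.0, 2.0]])                 # symmetric, positive definite
b = np.array([1.0, -2.0, 0.5])
f = lambda x: x @ A @ x + b @ x + 3.0

# Example 1: the gradient is 2Ax + b, and it vanishes at x = -(1/2) A^{-1} b.
x0 = np.array([0.3, -1.2, 2.0])
print(np.allclose(numerical_gradient(f, x0), 2 * A @ x0 + b, atol=1e-5))
x_star = -0.5 * np.linalg.solve(A, b)
print(np.allclose(2 * A @ x_star + b, 0.0))

# Example 2: on x'x = 1 the maximum of x'Ax is the largest eigenvalue of A.
lam, vecs = np.linalg.eigh(A)                   # eigenvalues in ascending order
x_max = vecs[:, -1]
rng = np.random.default_rng(4)
trial = rng.normal(size=(1000, 3))
trial /= np.linalg.norm(trial, axis=1, keepdims=True)   # random unit vectors
values = np.einsum("ij,jk,ik->i", trial, A, trial)      # x'Ax for each trial vector
print(np.isclose(x_max @ A @ x_max, lam[-1]), values.max() <= lam[-1] + 1e-12)
```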

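Differentiation with respect to a matrix

Let X denote a q × p matrix. Let f(X) denote a function of the components of X then:

\[ \frac{df(X)}{dX} = \left(\frac{\partial f(X)}{\partial x_{ij}}\right) = \begin{bmatrix} \dfrac{\partial f(X)}{\partial x_{11}} & \cdots & \dfrac{\partial f(X)}{\partial x_{1p}} \\ \vdots & & \vdots \\ \dfrac{\partial f(X)}{\partial x_{q1}} & \cdots & \dfrac{\partial f(X)}{\partial x_{qp}} \end{bmatrix} \]

Example

Let X denote a p × p matrix. Let f(X) = ln |X|

then

\[ \frac{d \ln|X|}{dX} = \left(X^{-1}\right)' \]

which is equal to $X^{-1}$ when X is symmetric.

Solution

\[ |X| = x_{i1}X_{i1} + \cdots + x_{ij}X_{ij} + \cdots + x_{ip}X_{ip} \]

Note $X_{ij}$ are cofactors, so

\[ \frac{\partial \ln|X|}{\partial x_{ij}} = \frac{1}{|X|}X_{ij} = (j,i)\text{th element of } X^{-1} \]

Example

Let X and A denote p × p matrices.

Let f(X) = tr(AX)

then

\[ \frac{d \operatorname{tr}(AX)}{dX} = A' \]

Solution

\[ \operatorname{tr}(AX) = \sum_{i=1}^{p}\sum_{k=1}^{p} a_{ik}x_{ki} \]

\[ \frac{\partial \operatorname{tr}(AX)}{\partial x_{ij}} = a_{ji} \]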
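The two matrix derivatives can be checked element by element with finite differences. This is a sketch assuming NumPy; the helper name and the matrices X and A are illustrative assumptions (X is chosen non-symmetric with |X| > 0). Treating every $x_{ij}$ as a free variable, the (i, j) element of d ln|X|/dX is the (j, i) element of $X^{-1}$, so the result is the transpose of $X^{-1}$; the two coincide when X is symmetric.

```python
import numpy as np

def numerical_matrix_derivative(f, X, h=1e-6):
    """Matrix whose (i, j) element approximates the partial derivative of f with respect to x_ij."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

X = np.array([[4.0, 1.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])       # non-symmetric, |X| = 46 > 0
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])

G_logdet = numerical_matrix_derivative(lambda M: np.log(np.linalg.det(M)), X)
G_trace = numerical_matrix_derivative(lambda M: np.trace(A @ M), X)

print(np.allclose(G_logdet, np.linalg.inv(X).T, atol=1e-6))   # d ln|X| / dX = (X^{-1})'
print(np.allclose(G_trace, A.T, atol=1e-6))                   # d tr(AX) / dX = A'
```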

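Differentiation of a matrix of functions

Let U = (u_ij) denote a q × p matrix of functions of x then:

\[ \frac{dU}{dx} = \left(\frac{du_{ij}}{dx}\right) = \begin{bmatrix} \dfrac{du_{11}}{dx} & \cdots & \dfrac{du_{1p}}{dx} \\ \vdots & & \vdots \\ \dfrac{du_{q1}}{dx} & \cdots & \dfrac{du_{qp}}{dx} \end{bmatrix} \]

Rules:

1. \[ \frac{d(aU)}{dx} = a\frac{dU}{dx} \]

2. \[ \frac{d(U + V)}{dx} = \frac{dU}{dx} + \frac{dV}{dx} \]

3. \[ \frac{d(UV)}{dx} = \frac{dU}{dx}V + U\frac{dV}{dx} \]

4. \[ \frac{d\left(U^{-1}\right)}{dx} = -U^{-1}\frac{dU}{dx}U^{-1} \]

Proof:

\[
\begin{aligned}
U^{-1}U &= I \\
\frac{dU^{-1}}{dx}U + U^{-1}\frac{dU}{dx} &= 0_{p \times p} \\
\frac{dU^{-1}}{dx}U &= -U^{-1}\frac{dU}{dx} \\
\frac{dU^{-1}}{dx} &= -U^{-1}\frac{dU}{dx}U^{-1}
\end{aligned}
\]

5. \[ \frac{d \operatorname{tr}(AU)}{dx} = \operatorname{tr}\left(A\frac{dU}{dx}\right) \]

Proof:

\[ \operatorname{tr}(AU) = \sum_{i=1}^{p}\sum_{k=1}^{p} a_{ik}u_{ki} \]

\[ \frac{\partial \operatorname{tr}(AU)}{\partial x} = \sum_{i=1}^{p}\sum_{k=1}^{p} a_{ik}\frac{\partial u_{ki}}{\partial x} = \operatorname{tr}\left(A\frac{dU}{dx}\right) \]

6. \[ \frac{d \operatorname{tr}\left(AU^{-1}\right)}{dx} = -\operatorname{tr}\left(AU^{-1}\frac{dU}{dx}U^{-1}\right) \]

Proof: This follows from rules 5 and 4.

7. \[ \frac{\partial \operatorname{tr}\left(AX^{-1}\right)}{\partial x_{ij}} = -\operatorname{tr}\left(E_{ij}X^{-1}AX^{-1}\right) \]

Proof:

\[ \frac{\partial \operatorname{tr}\left(AX^{-1}\right)}{\partial x_{ij}} = -\operatorname{tr}\left(AX^{-1}\frac{\partial X}{\partial x_{ij}}X^{-1}\right) = -\operatorname{tr}\left(AX^{-1}E_{ij}X^{-1}\right) = -\operatorname{tr}\left(E_{ij}X^{-1}AX^{-1}\right) \]

where

\[ E_{ij} = \left(e_{kl}^{ij}\right), \qquad e_{kl}^{ij} = \begin{cases} 1 & k = i, \ l = j \\ 0 & \text{otherwise} \end{cases} \]

8. \[ \frac{d \operatorname{tr}\left(AX^{-1}\right)}{dX} = -\left(X^{-1}AX^{-1}\right)' \]

since $\operatorname{tr}(E_{ij}M)$ is the (j,i)th element of M. When A and X are symmetric this is equal to $-X^{-1}AX^{-1}$.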

The Generalized Inverse of a matrix

Recall

B (denoted by A-1) is called the inverse of A if

AB = BA = I

• A-1 does not exist for all matrices A

• A-1 exists only if A is a square matrix and |A| ≠ 0

• If $A^{-1}$ exists then the system of linear equations $Ax = b$ has a unique solution $x = A^{-1}b$

Definition

B (denoted by A-) is called the generalized inverse (Moore – Penrose inverse) of A if

1. ABA = A

2. BAB = B

3. (AB)' = AB

4. (BA)' = BA

Note: A- is unique

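Proof: Let $B_1$ and $B_2$ both satisfy (for i = 1, 2)

1. $AB_iA = A$

2. $B_iAB_i = B_i$

3. $(AB_i)' = AB_i$

4. $(B_iA)' = B_iA$

Hence

\[
\begin{aligned}
B_1 &= B_1AB_1 = B_1AB_2AB_1 = B_1(AB_2)'(AB_1)' \\
&= B_1B_2'A'B_1'A' = B_1B_2'A' = B_1(AB_2)' = B_1AB_2 \\
&= B_1AB_2AB_2 = (B_1A)(B_2A)B_2 = (B_1A)'(B_2A)'B_2 \\
&= A'B_1'A'B_2'B_2 = A'B_2'B_2 = (B_2A)'B_2 \\
&= B_2AB_2 = B_2
\end{aligned}
\]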

The general solution of a system of Equations

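\[ Ax = b \]

The general solution

\[ x = A^{-}b + \left(I - A^{-}A\right)z \]

where z is arbitrary.

Suppose a solution exists

\[ Ax_0 = b \]

Let

\[ x = A^{-}b + \left(I - A^{-}A\right)z \]

then

\[
\begin{aligned}
Ax &= AA^{-}b + A\left(I - A^{-}A\right)z \\
&= AA^{-}b + \left(A - AA^{-}A\right)z \\
&= AA^{-}Ax_0 = Ax_0 = b
\end{aligned}
\]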
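A short numerical illustration of the general solution, assuming NumPy and using `numpy.linalg.pinv` for the Moore-Penrose inverse. The rank-deficient matrix A, the particular solution x0 and the random vectors z are illustrative assumptions; b is constructed as A x0 so that a solution is guaranteed to exist.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])           # rank 1, so solutions are not unique
x0 = np.array([1.0, 0.0, 1.0])
b = A @ x0                                # guarantees that Ax = b is consistent
A_g = np.linalg.pinv(A)                   # Moore-Penrose generalized inverse

rng = np.random.default_rng(3)
solutions = []
for _ in range(3):
    z = rng.normal(size=3)                # arbitrary z
    x = A_g @ b + (np.eye(3) - A_g @ A) @ z
    solutions.append(x)
    print(np.allclose(A @ x, b))          # every choice of z gives a solution
```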

Calculation of the Moore-Penrose g-inverse

Let A be a p×q matrix of rank q < p,

then

\[ A^{-} = (A'A)^{-1}A' \]

Proof

\[ A^{-}A = (A'A)^{-1}A'A = I \]

thus $AA^{-}A = AI = A$ and $A^{-}AA^{-} = IA^{-} = A^{-}$,

also $A^{-}A = I$ is symmetric

and $AA^{-} = A(A'A)^{-1}A'$ is symmetric.

Let B be a p×q matrix of rank p < q,

then

\[ B^{-} = B'(BB')^{-1} \]

Proof

\[ BB^{-} = BB'(BB')^{-1} = I \]

thus $BB^{-}B = IB = B$ and $B^{-}BB^{-} = B^{-}I = B^{-}$,

also $BB^{-} = I$ is symmetric

and $B^{-}B = B'(BB')^{-1}B$ is symmetric.

Let C be a p×q matrix of rank k < min(p,q),

then C = AB where A is a p×k matrix of rank k and B is a k×q matrix of rank k,

then

\[ C^{-} = B'(BB')^{-1}(A'A)^{-1}A' \]

Proof

\[ CC^{-} = AB\,B'(BB')^{-1}(A'A)^{-1}A' = A(A'A)^{-1}A' \]

is symmetric, as well as

\[ C^{-}C = B'(BB')^{-1}(A'A)^{-1}A'AB = B'(BB')^{-1}B \]

Also

\[ CC^{-}C = A(A'A)^{-1}A'AB = AB = C \]

and

\[ C^{-}CC^{-} = B'(BB')^{-1}B\,B'(BB')^{-1}(A'A)^{-1}A' = B'(BB')^{-1}(A'A)^{-1}A' = C^{-} \]
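The three formulas can be verified against the four defining conditions. This is a sketch assuming NumPy; the factors A (3 × 2, rank 2) and B (2 × 4, rank 2) and the helper `is_moore_penrose` are illustrative assumptions, and `numpy.linalg.pinv` is used only as an independent check.

```python
import numpy as np

def is_moore_penrose(M, G):
    """Check the four conditions MGM = M, GMG = G, (MG)' = MG, (GM)' = GM."""
    return (np.allclose(M @ G @ M, M) and np.allclose(G @ M @ G, G)
            and np.allclose((M @ G).T, M @ G) and np.allclose((G @ M).T, G @ M))

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                  # 3 x 2 of rank 2 (full column rank)
B = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0]])        # 2 x 4 of rank 2 (full row rank)
C = A @ B                                   # 3 x 4 of rank 2 < min(3, 4)

A_g = np.linalg.inv(A.T @ A) @ A.T          # (A'A)^{-1} A'
B_g = B.T @ np.linalg.inv(B @ B.T)          # B'(BB')^{-1}
C_g = B_g @ A_g                             # B'(BB')^{-1} (A'A)^{-1} A'

print(is_moore_penrose(A, A_g), is_moore_penrose(B, B_g), is_moore_penrose(C, C_g))
print(np.allclose(C_g, np.linalg.pinv(C)))
```

Because the Moore-Penrose inverse is unique, agreement with `numpy.linalg.pinv` is expected whichever full-rank factorization C = AB is used.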

References

1. Matrix Algebra Useful for Statistics, Shayle R. Searle

2. Mathematical Tools for Applied Multivariate Analysis, J. Douglas Carroll, Paul E. Green