
Transcript of Stats 346.3 Multivariate Data Analysis Stats 848.3.

Page 1: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Stats 346.3

Multivariate Data Analysis

Stats 848.3

Page 2: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Instructor: W.H.Laverty

Office: 235 McLean Hall

Phone: 966-6096

Lectures: M W F, 12:30 pm - 1:20 pm, Biol 123

Evaluation: Assignments, Term tests - 40%; Final Examination - 60%

Page 3: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Dates for midterm tests:
1. Friday, February 06
2. Friday, March 20

Each test and the Final Exam are Open Book.

Students are allowed to take in notes, texts, formula sheets, and calculators (laptop computers).

Page 4: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Text:

Stat 346 – Multivariate Statistical Methods – Donald Morrison

Not required - I will give a list of other useful texts that will be in the library.

Page 5: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Bibliography

1. Cooley, W.W. and Lohnes, P.R. (1962), Multivariate Procedures for the Behavioural Sciences, Wiley, New York.

2. Fienberg, S. (1980), Analysis of Cross-Classified Data, MIT Press, Cambridge, Mass.

3. Fingleton, B. (1984), Models for Category Counts, Cambridge University Press.

4. Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, Prentice Hall.

5. Morrison, D.F. (1976), Multivariate Statistical Methods, McGraw-Hill, New York.

6. Seal, H. (1968), Multivariate Statistical Analysis for Biologists, Methuen, London.

7. Agresti, A. (1990), Categorical Data Analysis, Wiley, New York.

Page 6: Stats 346.3 Multivariate Data Analysis Stats 848.3.

• The lectures will be given in PowerPoint

• They are now posted on the Stats 346 web page

Page 7: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Course Outline

Page 8: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Introduction

Page 9: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Review of Linear Algebra and Matrix Analysis

Review of Linear Statistical Theory

Chapter 2

Chapter 1

Page 10: Stats 346.3 Multivariate Data Analysis Stats 848.3.

• Multivariate Normal distribution
• Multivariate Data plots
• Correlation - sample estimates and tests
• Canonical Correlation

Chapter 3

Page 11: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Mean Vectors and Covariance matrices
• Single sample procedures
• Two sample procedures
• Profile Analysis

Chapter 4

Page 12: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Multivariate Analysis of Variance (MANOVA)

Chapter 5

Page 13: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Classification and Discrimination
• Discriminant Analysis
• Logistic Regression (if time permits)
• Cluster Analysis

Chapter 6

Page 14: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The structure of correlation
• Principal Components Analysis (PCA)
• Factor Analysis

Chapter 9

Page 15: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Multivariate Multiple Regression

(if time permits)

References TBA

Page 16: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Discrete Multivariate Analysis

(if time permits)

References: TBA

Page 17: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Introduction

Page 18: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Multivariate Data

• We have collected data for each case in the sample or population on not just one variable but on several variables – X1, X2, … Xp

• This is likely the situation – very rarely do you collect data on a single variable.

• The variables may be:
1. Discrete (Categorical)
2. Continuous (Numerical)

• The variables may be:
1. Dependent (Response variables)
2. Independent (Predictor variables)

Page 19: Stats 346.3 Multivariate Data Analysis Stats 848.3.

A chart illustrating Statistical Procedures (rows: type of dependent variables; columns: type of independent variables)

Dependent variables: Categorical
• Independent Categorical: Multiway Frequency Analysis (Log-Linear Model)
• Independent Continuous: Discriminant Analysis
• Independent Continuous & Categorical: Discriminant Analysis

Dependent variables: Continuous
• Independent Categorical: ANOVA (single dependent variable), MANOVA (multiple dependent variables)
• Independent Continuous: Multiple Regression (single dependent variable), Multivariate Multiple Regression (multiple dependent variables)
• Independent Continuous & Categorical: ANACOVA (single dependent variable), MANACOVA (multiple dependent variables)

Dependent variables: Continuous & Categorical
• ?? (no procedure listed for any of the three columns)

Page 20: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Multivariate Techniques

Multivariate Techniques can be classified as follows:

1. Techniques that are direct analogues of univariate procedures.

• There are univariate techniques that are then generalized to the multivariate situation.

• e.g. the two independent sample t test, generalized to Hotelling's T² test

• ANOVA (Analysis of Variance) generalized to MANOVA (Multivariate Analysis of Variance)

Page 21: Stats 346.3 Multivariate Data Analysis Stats 848.3.

2. Techniques that are purely multivariate procedures.

• Correlation, Partial correlation, Multiple correlation, Canonical Correlation

• Principal Component Analysis, Factor Analysis
- These are techniques for studying complicated correlation structure amongst a collection of variables

Page 22: Stats 346.3 Multivariate Data Analysis Stats 848.3.

3. Techniques for which a univariate procedure could exist but which become much more interesting in the multivariate setting.

• Cluster Analysis and Classification
- Here we try to identify subpopulations from the data

• Discriminant Analysis
- In Discriminant Analysis, we attempt to use a collection of variables to identify the unknown population of which a case is a member

Page 23: Stats 346.3 Multivariate Data Analysis Stats 848.3.

An Example:

A survey was given to 132 students

• Male=35,

• Female=97

They rated, on a Likert scale from 1 to 5, their agreement with each of 40 statements.

All statements are related to the Meaning of Life

Page 24: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Questions and Statements

1. How religious/spiritual would you say you are?

2. To have trustworthy and intimate friend(s)

3. To have a fulfilling career

4. To be closely connected to family

5. To share values/beliefs with others in your close circle or community

6. To have and raise children

7. To continually set short and long-term, achievable goals for yourself

8. To feel satisfied with yourself (feel good about yourself)

9. To live up to the expectations of family and close friends

10. To contribute to world peace

Page 25: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Statements - continued

11. To be involved in an intimate relationship with a significant person.

12. To give of yourself to others.

13. To be able to plan and take time for leisure.

14. To act on your own personal beliefs, despite outside pressure.

15. To be seen as physically attractive.

16. To feel confident in choosing new experiences to better yourself.

17. To care about the state of the physical/natural environment.

18. To take responsibility for your mistakes.

19. To make restitution for your mistakes, if necessary.

20. To be involved with social or political causes.

Page 26: Stats 346.3 Multivariate Data Analysis Stats 848.3.

21. To keep up with media and popular-culture trends.

22. To adhere to religious practices based on tradition or rituals.

23. To use your own creativity in a way that you believe is worthwhile.

24. The meaning of life is found in understanding one's ultimate purpose for life.

25. The meaning of life can be discovered through intentionally living a life that glorifies a Spiritual being.

26. There is a reason for everything that happens.

27. Obtaining things in life that are material and tangible is only part of discovering the meaning of life.

28. People unearth the same basic values when attempting to find the meaning of life.

29. It is more important to cultivate character than to be consumed with outward rewards or awards.

30. Some aims or goals in life are more valuable than other goals.

Page 27: Stats 346.3 Multivariate Data Analysis Stats 848.3.

31. The purpose of life lies in promoting the ends of truth, beauty, and goodness.

32. A meaningful life is one that contributes to the well-being of others.

33. The meaning of life is the same as a happy life.

34. The meaning of life is found in realizing my potential.

35. Life has purpose only in the everyday details of living.

36. There is no one, universal way of obtaining a meaningful life for all people.

37. People passionately desire different things. Obtaining these things contributes to making life more meaningful for them.

38. What contributes to a meaningful life varies according to each person (or group).

39. Lives can be meaningful even without the existence of a God or spiritual realm.

40. Our lives have no significance, but we must live as if they do.

Page 28: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Cluster Analysis of n = 132 university students using responses from Meaning of Life questionnaire (40 questions)

[Figure: dendrogram of the 132 cases; horizontal axis: Cases, vertical axis: Linkage Distance (0 to 80)]

Fig. 1. Dendrogram showing clustering using Ward's method of Euclidean distances

Page 29: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Discriminant Analysis of n = 132 university students into the three identified populations

[Figure: scatter plot of discriminant scores, F1 (Discriminant function 1) on the horizontal axis and F2 (Discriminant function 2) on the vertical axis; the three identified groups are labelled Religious, Semi-Religious and Humanistic, with the axes interpreted as Religious vs. Non-religious and Optimistic vs. Pessimistic]

Fig. 4. Cluster map

Page 30: Stats 346.3 Multivariate Data Analysis Stats 848.3.

A Review of Linear Algebra

With some Additions

Page 31: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Matrix Algebra

Definition

An n × m matrix, A, is a rectangular array of elements

A = (aij) = [a11 a12 … a1n; a21 a22 … a2n; … ; am1 am2 … amn]

n = # of columns, m = # of rows, dimensions = n × m

Page 32: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Definition

A vector, v, of dimension n is an n × 1 matrix (a rectangular array of elements arranged in a single column)

v = (v1, v2, … , vn)'

Vectors will be column vectors (they may also be row vectors).

Page 33: Stats 346.3 Multivariate Data Analysis Stats 848.3.

A vector, v, of dimension n

v = (v1, v2, … , vn)'

can be thought of as a point in n-dimensional space.

Page 34: Stats 346.3 Multivariate Data Analysis Stats 848.3.

[Figure: the vector v = (v1, v2, v3)' plotted as a point in 3-dimensional space, with coordinate axes v1, v2, v3]

Page 35: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Matrix Operations

Addition

Let A = (aij) and B = (bij) denote two n × m matrices. Then the sum, A + B, is the matrix

A + B = (aij + bij) = [a11+b11 a12+b12 … a1n+b1n; a21+b21 a22+b22 … a2n+b2n; … ; am1+bm1 am2+bm2 … amn+bmn]

The dimensions of A and B are required to both be n × m.

Page 36: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Scalar Multiplication

Let A = (aij) denote an n × m matrix and let c be any scalar. Then cA is the matrix

cA = (caij) = [ca11 ca12 … ca1n; ca21 ca22 … ca2n; … ; cam1 cam2 … camn]

Page 37: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Addition for vectors

If v = (v1, v2, v3)' and w = (w1, w2, w3)', then

v + w = (v1 + w1, v2 + w2, v3 + w3)'

[Figure: v, w and v + w shown in 3-dimensional space with axes v1, v2, v3]

Page 38: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Scalar Multiplication for vectors

If v = (v1, v2, v3)', then

cv = (cv1, cv2, cv3)'

[Figure: v and cv shown in 3-dimensional space with axes v1, v2, v3]

Page 39: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Matrix multiplication

Let A = (aij) denote an n × m matrix and B = (bjl) denote an m × k matrix. Then the n × k matrix C = (cil), where

cil = Σ (j = 1 to m) aij bjl,

is called the product of A and B and is denoted by A∙B.

Page 40: Stats 346.3 Multivariate Data Analysis Stats 848.3.

In the case that A = (aij) is an n × m matrix and B = v = (vj) is an m × 1 vector, then w = A∙v = (wi), where

wi = Σ (j = 1 to m) aij vj,

is an n × 1 vector.

[Figure: the vector v = (v1, v2, v3)' and its image w = Av = (w1, w2, w3)' in 3-dimensional space]

Page 41: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Definition

An n × n identity matrix, I, is the square matrix

I = In = [1 0 … 0; 0 1 … 0; … ; 0 0 … 1]

Note:
1. AI = A
2. IA = A

Page 42: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Definition (The inverse of an n × n matrix)

Let A denote the n × n matrix

A = (aij) = [a11 a12 … a1n; a21 a22 … a2n; … ; an1 an2 … ann]

Let B denote an n × n matrix such that

AB = BA = I.

If the matrix B exists then A is called invertible. Also B is called the inverse of A and is denoted by A⁻¹.

Page 43: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The Woodbury Theorem

(A + BCD)⁻¹ = A⁻¹ − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹

where the inverses A⁻¹, C⁻¹ and (C⁻¹ + DA⁻¹B)⁻¹ exist.
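The identity is easy to check numerically. The sketch below is a minimal illustration (not part of the original notes); the particular matrices A, B, C, D are arbitrary choices that satisfy the invertibility conditions.

```python
# Numerical check of the Woodbury identity with arbitrary test matrices.
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([2.0, 3.0, 5.0, 7.0])          # 4 x 4, invertible
B = rng.standard_normal((4, 2))
C = np.array([[1.5, 0.2], [0.2, 2.5]])     # 2 x 2, invertible
D = rng.standard_normal((2, 4))

Ainv = np.linalg.inv(A)
middle = np.linalg.inv(np.linalg.inv(C) + D @ Ainv @ B)
woodbury = Ainv - Ainv @ B @ middle @ D @ Ainv

direct = np.linalg.inv(A + B @ C @ D)
print(np.allclose(direct, woodbury))       # True (up to rounding)
```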

Page 44: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Proof:

Let H = A⁻¹ − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹.

Then all we need to show is that H(A + BCD) = (A + BCD)H = I.

H(A + BCD)
= [A⁻¹ − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹](A + BCD)
= A⁻¹A − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹A + A⁻¹BCD − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹BCD

Page 45: Stats 346.3 Multivariate Data Analysis Stats 848.3.

= I − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹D + A⁻¹BCD − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹BCD

= I + A⁻¹BCD − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹[I + DA⁻¹BC]D

= I + A⁻¹BCD − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹[C⁻¹ + DA⁻¹B]CD

= I + A⁻¹BCD − A⁻¹BCD = I

Page 46: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The Woodbury theorem can be used to find the inverse of some patterned matrices.

Example: Find the inverse of the n × n matrix

[b a … a; a b … a; … ; a a … b]
= (b − a)[1 0 … 0; 0 1 … 0; … ; 0 0 … 1] + a[1 1 … 1; 1 1 … 1; … ; 1 1 … 1]
= (b − a)I + a 1 1' = A + BCD

Page 47: Stats 346.3 Multivariate Data Analysis Stats 848.3.

where

A = (b − a)I,  B = 1 = (1, 1, … , 1)',  C = (a),  D = 1' = (1, 1, … , 1)

A⁻¹ = 1/(b − a) I, hence C⁻¹ = 1/a

and

C⁻¹ + DA⁻¹B = 1/a + 1/(b − a) 1'1 = 1/a + n/(b − a) = [(b − a) + na] / [a(b − a)]

Page 48: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Thus

[C⁻¹ + DA⁻¹B]⁻¹ = a(b − a) / [(b − a) + na]

Now using the Woodbury theorem

(A + BCD)⁻¹ = A⁻¹ − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹

= 1/(b − a) I − 1/(b − a) 1 [a(b − a) / ((b − a) + na)] 1' 1/(b − a)

= 1/(b − a) I − a / [(b − a)((b − a) + na)] 1 1'

Page 49: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Thus

[b a … a; a b … a; … ; a a … b]⁻¹

= 1/(b − a) [1 0 … 0; 0 1 … 0; … ; 0 0 … 1] − a / [(b − a)((b − a) + na)] [1 1 … 1; 1 1 … 1; … ; 1 1 … 1]

= [c d … d; d c … d; … ; d d … c]

Page 50: Stats 346.3 Multivariate Data Analysis Stats 848.3.

where

d = −a / [(b − a)((b − a) + na)]

and

c = 1/(b − a) − a / [(b − a)((b − a) + na)] = [(b − a) + (n − 1)a] / [(b − a)((b − a) + na)]
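A quick numerical check of the formulas for c and d (a sketch, not part of the notes; n, a, b are arbitrary test values):

```python
# Verify the inverse of the patterned matrix with b on the diagonal, a elsewhere.
import numpy as np

n, a, b = 5, 2.0, 7.0
M = (b - a) * np.eye(n) + a * np.ones((n, n))    # b on the diagonal, a elsewhere

d = -a / ((b - a) * ((b - a) + n * a))
c = 1.0 / (b - a) + d                            # = ((b - a) + (n - 1)a) / ((b - a)((b - a) + na))
Minv = (c - d) * np.eye(n) + d * np.ones((n, n)) # c on the diagonal, d elsewhere

print(np.allclose(Minv, np.linalg.inv(M)))       # True
```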

Page 51: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Note: for n = 2

d = −a / [(b − a)(b + a)] = −a / (b² − a²)

and

c = [(b − a) + a] / [(b − a)(b + a)] = b / (b² − a²)

Thus

[b a; a b]⁻¹ = 1/(b² − a²) [b −a; −a b]

Page 52: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Also

[b a … a; a b … a; … ; a a … b][c d … d; d c … d; … ; d d … c]

= [bc + (n−1)ad    bd + ac + (n−2)ad    …    bd + ac + (n−2)ad;
   bd + ac + (n−2)ad    bc + (n−1)ad    …    bd + ac + (n−2)ad;
   … ;
   bd + ac + (n−2)ad    bd + ac + (n−2)ad    …    bc + (n−1)ad]

Page 53: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Now

d = −a / [(b − a)((b − a) + na)]

and

c = [(b − a) + (n − 1)a] / [(b − a)((b − a) + na)]

so

bc + (n − 1)ad = {b[(b − a) + (n − 1)a] − (n − 1)a²} / [(b − a)((b − a) + na)]

= [b² + (n − 2)ab − (n − 1)a²] / [b² + (n − 2)ab − (n − 1)a²] = 1

since (b − a)((b − a) + na) = b² + (n − 2)ab − (n − 1)a².

Page 54: Stats 346.3 Multivariate Data Analysis Stats 848.3.

and

bd + ac + (n − 2)ad = {−ab + a[(b − a) + (n − 1)a] − (n − 2)a²} / [(b − a)((b − a) + na)] = 0

This verifies that we have calculated the inverse.

Page 55: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Block Matrices

Let the n × m matrix

A = [A11 A12; A21 A22]

be partitioned into sub-matrices A11, A12, A21, A22, where A11 is p × q, A12 is p × (m − q), A21 is (n − p) × q and A22 is (n − p) × (m − q).

Similarly partition the m × k matrix

B = [B11 B12; B21 B22]

where B11 is q × l, B12 is q × (k − l), B21 is (m − q) × l and B22 is (m − q) × (k − l).

Page 56: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Product of Blocked Matrices

Then

A∙B = [A11 A12; A21 A22][B11 B12; B21 B22]

= [A11B11 + A12B21   A11B12 + A12B22;   A21B11 + A22B21   A21B12 + A22B22]

Page 57: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The Inverse of Blocked Matrices

Let the n × n matrix

A = [A11 A12; A21 A22]

be partitioned into sub-matrices A11, A12, A21, A22, where A11 is p × p, A12 is p × (n − p), A21 is (n − p) × p and A22 is (n − p) × (n − p).

Similarly partition the n × n matrix

B = [B11 B12; B21 B22]

Suppose that B = A⁻¹.

Page 58: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Product of Blocked Matrices

Then

A∙B = [A11 A12; A21 A22][B11 B12; B21 B22]

= [A11B11 + A12B21   A11B12 + A12B22;   A21B11 + A22B21   A21B12 + A22B22]

= [Ip 0; 0 In−p]

Page 59: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Hence

A11B11 + A12B21 = I      (1)

A11B12 + A12B22 = 0      (2)

A21B11 + A22B21 = 0      (3)

A21B12 + A22B22 = I      (4)

From (1): A11 + A12B21B11⁻¹ = B11⁻¹

From (3): A21 + A22B21B11⁻¹ = 0, or B21B11⁻¹ = −A22⁻¹A21

Page 60: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Hence

A11 − A12A22⁻¹A21 = B11⁻¹

or

B11 = [A11 − A12A22⁻¹A21]⁻¹ = A11⁻¹ + A11⁻¹A12[A22 − A21A11⁻¹A12]⁻¹A21A11⁻¹

using the Woodbury Theorem.

Similarly

B22 = [A22 − A21A11⁻¹A12]⁻¹ = A22⁻¹ + A22⁻¹A21[A11 − A12A22⁻¹A21]⁻¹A12A22⁻¹

Page 61: Stats 346.3 Multivariate Data Analysis Stats 848.3.

From (3): A21B11 + A22B21 = 0, so A22⁻¹A21B11 + B21 = 0

and

B21 = −A22⁻¹A21B11 = −A22⁻¹A21[A11 − A12A22⁻¹A21]⁻¹

similarly

B12 = −A11⁻¹A12B22 = −A11⁻¹A12[A22 − A21A11⁻¹A12]⁻¹

Page 62: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Summarizing

Let the n × n matrix

A = [A11 A12; A21 A22]

be partitioned as before (A11 is p × p, A22 is (n − p) × (n − p)), and suppose that

A⁻¹ = B = [B11 B12; B21 B22]

then

B11 = [A11 − A12A22⁻¹A21]⁻¹ = A11⁻¹ + A11⁻¹A12[A22 − A21A11⁻¹A12]⁻¹A21A11⁻¹

B22 = [A22 − A21A11⁻¹A12]⁻¹ = A22⁻¹ + A22⁻¹A21[A11 − A12A22⁻¹A21]⁻¹A12A22⁻¹

B21 = −A22⁻¹A21B11 = −A22⁻¹A21[A11 − A12A22⁻¹A21]⁻¹

B12 = −A11⁻¹A12B22 = −A11⁻¹A12[A22 − A21A11⁻¹A12]⁻¹
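These partitioned-inverse formulas can be checked numerically. The sketch below is only an illustration; the 5 × 5 matrix and the 2 + 3 partition are arbitrary choices for which all the required inverses exist.

```python
# Verify the block-inverse formulas against a direct inverse.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)   # well-conditioned test matrix
p = 2
A11, A12 = A[:p, :p], A[:p, p:]
A21, A22 = A[p:, :p], A[p:, p:]
inv = np.linalg.inv

B11 = inv(A11 - A12 @ inv(A22) @ A21)
B22 = inv(A22 - A21 @ inv(A11) @ A12)
B12 = -inv(A11) @ A12 @ B22
B21 = -inv(A22) @ A21 @ B11
B = np.block([[B11, B12], [B21, B22]])

print(np.allclose(B, np.linalg.inv(A)))           # True
```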

Page 63: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Example

Let

A = [aIp bIp; cIp dIp]

Find

A⁻¹ = B = [B11 B12; B21 B22]

Page 64: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Here A11 = aI, A12 = bI, A21 = cI, A22 = dI.

B11 = [A11 − A12A22⁻¹A21]⁻¹ = [aI − bI(1/d)I cI]⁻¹ = [a − bc/d]⁻¹ I = d/(ad − bc) I

B22 = [A22 − A21A11⁻¹A12]⁻¹ = [dI − cI(1/a)I bI]⁻¹ = [d − bc/a]⁻¹ I = a/(ad − bc) I

B21 = −A22⁻¹A21B11 = −(1/d)I cI [d/(ad − bc)] I = −c/(ad − bc) I

B12 = −A11⁻¹A12B22 = −(1/a)I bI [a/(ad − bc)] I = −b/(ad − bc) I

hence

A⁻¹ = [d/(ad − bc) I   −b/(ad − bc) I;   −c/(ad − bc) I   a/(ad − bc) I]

Page 65: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The transpose of a matrix

Consider the n × m matrix, A

A = (aij) = [a11 a12 … a1n; a21 a22 … a2n; … ; am1 am2 … amn]

then the m × n matrix (also denoted by Aᵀ)

A' = (aji) = [a11 a21 … am1; a12 a22 … am2; … ; a1n a2n … amn]

is called the transpose of A.

Page 66: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Symmetric Matrices

• An n × n matrix, A, is said to be symmetric if A' = A

Note:

(A')' = A

(A⁻¹)' = (A')⁻¹

(AB)' = B'A'

(AB)⁻¹ = B⁻¹A⁻¹

Page 67: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The trace and the determinant of a square matrix

Let A denote the n × n matrix

A = (aij) = [a11 a12 … a1n; a21 a22 … a2n; … ; an1 an2 … ann]

Then the trace of A is

tr A = Σ (i = 1 to n) aii

Page 68: Stats 346.3 Multivariate Data Analysis Stats 848.3.

det A = |A| = the determinant of

A = [a11 a12 … a1n; a21 a22 … a2n; … ; an1 an2 … ann]

also

|A| = Σ (j = 1 to n) aij Aij

where Aij = cofactor of aij = (−1)^(i+j) times the determinant of the matrix obtained after deleting the i-th row and j-th column of A.

For example,

det [a11 a12; a21 a22] = |a11 a12; a21 a22| = a11a22 − a12a21

Page 69: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Some properties

1. |I| = 1, tr(I) = n

2. |AB| = |A||B|, tr(AB) = tr(BA)

3. |A⁻¹| = 1/|A|

4. |A| = |A11 A12; A21 A22| = |A22||A11 − A12A22⁻¹A21| = |A11||A22 − A21A11⁻¹A12|
   = |A11||A22| if A12 = 0 or A21 = 0

Page 70: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Some additional Linear Algebra

Page 71: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Inner product of vectors

Let x and y denote two p × 1 vectors. Then

x'y = (x1, … , xp)(y1, … , yp)' = x1y1 + … + xpyp = Σ (i = 1 to p) xiyi

Page 72: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Note: x'x = x1² + … + xp² = |x|², where |x| = the length of x.

Let x and y denote two p × 1 vectors. Then

cos θ = x'y / [√(x'x) √(y'y)]

where θ = the angle between x and y.

[Figure: the vectors x and y with the angle θ between them]

Page 73: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Note: Let x and y denote two p × 1 vectors. Then

cos θ = x'y / [√(x'x) √(y'y)]

and θ = π/2 if x'y = 0.

Thus, if x'y = 0, then x and y are orthogonal.
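A small numerical illustration of the angle formula (a sketch; the two vectors are arbitrary examples, chosen so that x'y = 0):

```python
# Compute the angle between two vectors from cos(theta) = x'y / (|x||y|).
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 0.0, -1.0])

cos_theta = (x @ y) / (np.sqrt(x @ x) * np.sqrt(y @ y))
theta = np.arccos(cos_theta)
print(np.degrees(theta))   # 90 degrees here, since x'y = 0 (x and y are orthogonal)
```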

Page 74: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Special Types of Matrices

1. Orthogonal matrices
– A matrix P is orthogonal if P'P = PP' = I
– In this case P⁻¹ = P'.
– Also the rows (columns) of P have length 1 and are orthogonal to each other.

Page 75: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Suppose P is an orthogonal matrix, then

P'P = PP' = I

Let x and y denote p × 1 vectors, and let u = Px and v = Py.

Then u'v = (Px)'(Py) = x'P'Py = x'y

and u'u = (Px)'(Px) = x'P'Px = x'x

Orthogonal transformations preserve length and angles – rotations about the origin, reflections.

Page 76: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Example

The following matrix P is orthogonal

P = [1/√3   1/√3   1/√3;
     1/√2   −1/√2   0;
     1/√6   1/√6   −2/√6]
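The orthogonality of P, and the length-preserving property from the previous page, can be checked directly (the vector x below is an arbitrary example):

```python
# Check P'P = PP' = I and that multiplication by P preserves length.
import numpy as np

P = np.array([[1/np.sqrt(3),  1/np.sqrt(3),  1/np.sqrt(3)],
              [1/np.sqrt(2), -1/np.sqrt(2),  0.0],
              [1/np.sqrt(6),  1/np.sqrt(6), -2/np.sqrt(6)]])

print(np.allclose(P.T @ P, np.eye(3)))   # True
print(np.allclose(P @ P.T, np.eye(3)))   # True

x = np.array([3.0, -1.0, 2.0])
u = P @ x
print(np.isclose(u @ u, x @ x))          # True: u'u = x'x
```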

Page 77: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Special Types of Matrices (continued)

2. Positive definite matrices
– A symmetric matrix, A, is called positive definite if:

x'Ax = a11x1² + a22x2² + … + annxn² + 2a12x1x2 + … > 0 for all x ≠ 0

– A symmetric matrix, A, is called positive semi-definite if:

x'Ax ≥ 0 for all x

Page 78: Stats 346.3 Multivariate Data Analysis Stats 848.3.

If the matrix A is positive definite then the set of points, x, that satisfy x'Ax = c (where c > 0) lie on the surface of an n-dimensional ellipsoid centered at the origin, 0.

Page 79: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Theorem: The matrix A is positive definite if

|A1| > 0, |A2| > 0, |A3| > 0, … , |An| > 0

where

A1 = (a11),  A2 = [a11 a12; a21 a22],  A3 = [a11 a12 a13; a21 a22 a23; a31 a32 a33],  … ,

An = [a11 a12 … a1n; a21 a22 … a2n; … ; an1 an2 … ann] = A
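This criterion is easy to apply numerically. The sketch below checks the leading principal minors of the 3 × 3 matrix used in the eigenvalue example later in these notes (only an illustration of the theorem):

```python
# Check positive definiteness via the leading principal minors |A1|, ..., |An|.
import numpy as np

A = np.array([[5.0,  4.0, 2.0],
              [4.0, 10.0, 1.0],
              [2.0,  1.0, 2.0]])

minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print(minors)                      # 5.0, 34.0, 39.0 (approximately)
print(all(m > 0 for m in minors))  # True, so A is positive definite
```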

Page 80: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Special Types of Matrices (continued)

3. Idempotent matrices
– A symmetric matrix, E, is called idempotent if:

E² = EE = E

– Idempotent matrices project vectors onto a linear subspace:

E(Ex) = E²x = Ex

[Figure: the vector x and its projection Ex onto a linear subspace]

Page 81: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Definition

Let A be an n × n matrix. Let x and λ be such that

Ax = λx with x ≠ 0,

then λ is called an eigenvalue of A and x is called an eigenvector of A.

Page 82: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Note:

(A − λI)x = 0

If |A − λI| ≠ 0 then x = (A − λI)⁻¹0 = 0.

Thus

|A − λI| = 0

is the condition for an eigenvalue.

Page 83: Stats 346.3 Multivariate Data Analysis Stats 848.3.

det(A − λI) = |a11 − λ  …  a1n; … ; an1  …  ann − λ| = 0

= a polynomial of degree n in λ.

Hence there are n possible eigenvalues λ1, … , λn.

Page 84: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Theorem: If the matrix A is symmetric then the eigenvalues of A, λ1, … , λn, are real.

Theorem: If the matrix A is positive definite then the eigenvalues of A, λ1, … , λn, are positive.

Proof: A is positive definite if x'Ax > 0 for all x ≠ 0.

Let λ be an eigenvalue and x the corresponding eigenvector of A.

Then Ax = λx

and x'Ax = λx'x, or λ = x'Ax / x'x > 0.

Page 85: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Theorem: If the matrix A is symmetric and the eigenvalues of A are λ1, … , λn, with corresponding eigenvectors x1, … , xn (i.e. Axi = λixi), then if λi ≠ λj, xi'xj = 0.

Proof: Note

xj'Axi = λi xj'xi

and xi'Axj = λj xi'xj

Since A is symmetric these two quantities are equal, so (λi − λj) xi'xj = 0

hence xi'xj = 0.

Page 86: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Theorem: If the matrix A is symmetric with distinct eigenvalues, λ1, … , λn, and corresponding eigenvectors x1, … , xn, assumed to satisfy xi'xi = 1, then

A = λ1x1x1' + … + λnxnxn'

= [x1, … , xn] [λ1 0 … 0; 0 λ2 … 0; … ; 0 0 … λn] [x1'; … ; xn']

= PDP'

Page 87: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Proof

Note xi'xi = 1 and xi'xj = 0 if i ≠ j, so

P'P = [x1'; … ; xn'][x1, … , xn] = [x1'x1 … x1'xn; … ; xn'x1 … xn'xn] = [1 … 0; … ; 0 … 1] = I

P is called an orthogonal matrix.

Page 88: Stats 346.3 Multivariate Data Analysis Stats 848.3.

therefore

I = PP' = [x1, … , xn][x1'; … ; xn'] = x1x1' + … + xnxn'

thus P' = P⁻¹ and PP' = PP⁻¹ = I.

now Axi = λixi

A = AI = A[x1x1' + … + xnxn'] = Ax1x1' + … + Axnxn'

and Axixi' = λixixi'

hence A = λ1x1x1' + … + λnxnxn'

Page 89: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Comment

The previous result is also true if the eigenvalues are not distinct.

Namely, if the matrix A is symmetric with eigenvalues λ1, … , λn, and corresponding eigenvectors x1, … , xn of unit length, then

A = λ1x1x1' + … + λnxnxn' = [x1, … , xn] [λ1 0 … 0; … ; 0 0 … λn] [x1'; … ; xn'] = PDP'

Page 90: Stats 346.3 Multivariate Data Analysis Stats 848.3.

An algorithm for computing eigenvectors and eigenvalues of positive definite matrices

• Generally, to compute the eigenvalues of a matrix we need to first solve the equation
– |A – λI| = 0 (a polynomial of degree n in λ)
for all values of λ.

• Then, for each λ, solve the equation Ax = λx for the eigenvector x.

Page 91: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Recall that if A is positive definite then

A = λ1x1x1' + λ2x2x2' + … + λnxnxn'

where x1, x2, … , xn are the orthogonal eigenvectors of unit length, i.e. xi'xi = 1 and xi'xj = 0 if i ≠ j, and λ1 ≥ λ2 ≥ … ≥ λn > 0 are the eigenvalues.

It can be shown that

A² = λ1²x1x1' + λ2²x2x2' + … + λn²xnxn'

and that

A^m = λ1^m x1x1' + λ2^m x2x2' + … + λn^m xnxn'

= λ1^m [x1x1' + (λ2/λ1)^m x2x2' + … + (λn/λ1)^m xnxn']

Page 92: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Thus for large values of m

A^m ≈ λ1^m x1x1' = c x1x1' (c a constant)

The algorithm

1. Compute powers of A: A², A⁴, A⁸, A¹⁶, …

2. Rescale (so that the largest element is 1, say).

3. Continue until there is no change. The resulting matrix will be A^m ≈ c x1x1'.

4. Find a vector b so that A^m ≈ bb' = c x1x1'.

5. Find x1 = (1/√(b'b)) b, and find λ1 using Ax1 = λ1x1.

Page 93: Stats 346.3 Multivariate Data Analysis Stats 848.3.

To find λ2 and x2:

Note: A − λ1x1x1' = λ2x2x2' + … + λnxnxn'

6. Repeat steps 1 to 5 with the above matrix to find λ2 and x2.

7. Continue in this way to find λ3 and x3, λ4 and x4, … , and λn and xn.

Page 94: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Example

A = [5 4 2; 4 10 1; 2 1 2]

eigenvalues:      12.54461     3.589204     0.866182

eigenvectors (as columns):
  0.496986    0.677344    0.542412
  0.849957   −0.50594    −0.14698
  0.174869    0.534074   −0.82716
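A rough Python sketch of the squaring-and-rescaling algorithm described above, applied to this example matrix; it is only an illustration, and it assumes the first component of x1 is nonzero so that the first column of the limiting matrix can be used. It should reproduce λ1 ≈ 12.54461 and x1 ≈ (0.497, 0.850, 0.175)'.

```python
# Largest eigenvalue/eigenvector by repeated squaring and rescaling of A.
import numpy as np

A = np.array([[5.0,  4.0, 2.0],
              [4.0, 10.0, 1.0],
              [2.0,  1.0, 2.0]])

def largest_eigenpair(A, max_iter=100):
    """Approximate (lambda1, x1) by squaring and rescaling powers of A."""
    M = A / np.abs(A).max()
    for _ in range(max_iter):
        M_new = M @ M
        M_new /= np.abs(M_new).max()     # rescale so the largest element is 1
        if np.allclose(M_new, M):        # no further change: M is proportional to x1 x1'
            break
        M = M_new
    b = M[:, 0]                          # any nonzero column is proportional to x1
    x1 = b / np.sqrt(b @ b)              # unit-length eigenvector
    lam1 = x1 @ A @ x1                   # A x1 = lambda1 x1 and x1'x1 = 1
    return lam1, x1

lam1, x1 = largest_eigenpair(A)
print(lam1)   # about 12.54461
print(x1)     # about (0.497, 0.850, 0.175)

# lambda2 and x2 could then be found by repeating the steps on A - lam1 * np.outer(x1, x1).
```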

Page 95: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Differentiation with respect to a vector, matrix

Page 96: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Differentiation with respect to a vector

Let x denote a p × 1 vector. Let f(x) denote a function of the components of x. Then

df(x)/dx = (∂f(x)/∂x1, ∂f(x)/∂x2, … , ∂f(x)/∂xp)'

Page 97: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Rules

1. Suppose f(x) = a'x = a1x1 + … + apxp, then

df(x)/dx = (a1, … , ap)' = a

Page 98: Stats 346.3 Multivariate Data Analysis Stats 848.3.

2. Suppose f(x) = x'Ax = a11x1² + … + appxp² + 2a12x1x2 + 2a13x1x3 + … + 2a(p−1,p)x(p−1)xp (A symmetric), then

df(x)/dx = 2Ax

i.e. ∂f(x)/∂xi = 2ai1x1 + 2ai2x2 + … + 2aiixi + … + 2aipxp

Page 99: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Example

1. Determine when

f(x) = x'Ax + b'x + c

is a maximum or minimum.

Solution

df(x)/dx = 2Ax + b = 0,  or  x = −½A⁻¹b
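A quick numerical check of this stationarity condition (a sketch; A, b and c are arbitrary values, with A positive definite so that the stationary point is a minimum):

```python
# Verify that the gradient 2Ax + b vanishes at x = -0.5 * A^-1 b.
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([-2.0, 5.0])
c = 7.0

x_star = -0.5 * np.linalg.solve(A, b)
grad = 2 * A @ x_star + b
print(x_star)                    # the stationary (here minimizing) point
print(np.allclose(grad, 0.0))    # True: the gradient vanishes there
```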

Page 100: Stats 346.3 Multivariate Data Analysis Stats 848.3.

2. Determine when f(x) = x'Ax is a maximum subject to x'x = 1. Assume A is a positive definite matrix.

Solution

Let g(x) = x'Ax − λ(x'x − 1), where λ is the Lagrange multiplier.

dg(x)/dx = 2Ax − 2λx = 0

or Ax = λx

This shows that x is an eigenvector of A.

Also f(x) = x'Ax = λx'x = λ.

Thus x is the eigenvector of A associated with the largest eigenvalue, λ.

Page 101: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Differentiation with respect to a matrix

Let X denote a q × p matrix. Let f(X) denote a function of the components of X. Then

df(X)/dX = (∂f(X)/∂xij) = [∂f(X)/∂x11 … ∂f(X)/∂x1p; … ; ∂f(X)/∂xq1 … ∂f(X)/∂xqp]

Page 102: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Example

Let X denote a p × p matrix. Let f(X) = ln|X|, then

d ln|X| / dX = (X⁻¹)'

Solution

|X| = xi1Xi1 + xi2Xi2 + … + xipXip (expansion along the i-th row)

∂ ln|X| / ∂xij = (1/|X|) Xij = the (i, j)-th element of (X⁻¹)'

Note: the Xij are cofactors.
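A finite-difference check of this derivative (a sketch; X is an arbitrary example matrix with |X| > 0):

```python
# Check d ln|X| / dX = (X^-1)' by central finite differences.
import numpy as np

X = np.array([[2.0, 0.5, 0.0],
              [0.3, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
eps = 1e-6

grad = np.zeros_like(X)
for i in range(3):
    for j in range(3):
        Xp = X.copy(); Xp[i, j] += eps
        Xm = X.copy(); Xm[i, j] -= eps
        grad[i, j] = (np.log(np.linalg.det(Xp)) - np.log(np.linalg.det(Xm))) / (2 * eps)

print(np.allclose(grad, np.linalg.inv(X).T, atol=1e-6))   # True
```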

Page 103: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Example

Let X and A denote p × p matrices, and let f(X) = tr(AX). Then

d tr(AX) / dX = A'

Solution

tr(AX) = Σ (i = 1 to p) Σ (k = 1 to p) aik xki

∂ tr(AX) / ∂xij = aji

Page 104: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Differentiation of a matrix of functions

Let U = (uij) denote a q × p matrix of functions of x. Then

dU/dx = (duij/dx) = [du11/dx … du1p/dx; … ; duq1/dx … duqp/dx]

Page 105: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Rules:

1. d(aU)/dx = a dU/dx

2. d(U + V)/dx = dU/dx + dV/dx

3. d(UV)/dx = (dU/dx)V + U(dV/dx)

Page 106: Stats 346.3 Multivariate Data Analysis Stats 848.3.

4. dU⁻¹/dx = −U⁻¹ (dU/dx) U⁻¹

Proof: U⁻¹U = I, so

(dU⁻¹/dx)U + U⁻¹(dU/dx) = 0 (the p × p zero matrix)

(dU⁻¹/dx)U = −U⁻¹(dU/dx)

dU⁻¹/dx = −U⁻¹(dU/dx)U⁻¹

Page 107: Stats 346.3 Multivariate Data Analysis Stats 848.3.

5. d tr(AU)/dx = tr(A dU/dx)

Proof:

tr(AU) = Σ (i = 1 to p) Σ (k = 1 to p) aik uki

∂ tr(AU)/∂x = Σ (i = 1 to p) Σ (k = 1 to p) aik (duki/dx) = tr(A dU/dx)

6. d tr(AU⁻¹)/dx = −tr(AU⁻¹ (dU/dx) U⁻¹)

Page 108: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Proof:

d tr(AU⁻¹)/dx = tr(A dU⁻¹/dx) = −tr(AU⁻¹ (dU/dx) U⁻¹)

7. ∂ tr(AX⁻¹)/∂xij = −tr(Eij X⁻¹AX⁻¹)

∂ tr(AX⁻¹)/∂xij = −tr(AX⁻¹ (∂X/∂xij) X⁻¹) = −tr(AX⁻¹Eij X⁻¹) = −tr(Eij X⁻¹AX⁻¹)

where Eij = (ekl) with ekl = 1 if (k, l) = (i, j) and 0 otherwise.

8. d tr(AX⁻¹)/dX = −(X⁻¹AX⁻¹)'

Page 109: Stats 346.3 Multivariate Data Analysis Stats 848.3.

The Generalized Inverse of a matrix

Page 110: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Recall

B (denoted by A⁻¹) is called the inverse of A if

AB = BA = I

• A⁻¹ does not exist for all matrices A

• A⁻¹ exists only if A is a square matrix and |A| ≠ 0

• If A⁻¹ exists then the system of linear equations Ax = b has a unique solution x = A⁻¹b

Page 111: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Definition

B (denoted by A⁻) is called the generalized inverse (Moore-Penrose inverse) of A if

1. ABA = A

2. BAB = B

3. (AB)' = AB

4. (BA)' = BA

Note: A⁻ is unique.

Proof: Let B1 and B2 both satisfy

1. ABiA = A

2. BiABi = Bi

3. (ABi)' = ABi

4. (BiA)' = BiA

Page 112: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Hence

B1 = B1AB1 = B1AB2AB1 = B1(AB2)'(AB1)'

= B1B2'A'B1'A' = B1B2'(AB1A)' = B1B2'A' = B1AB2

= B1AB2AB2 = (B1A)(B2A)B2 = (B1A)'(B2A)'B2

= A'B1'A'B2'B2 = (AB1A)'B2'B2 = A'B2'B2

= (B2A)'B2 = B2AB2 = B2

The general solution of a system of equations

Ax = b

The general solution is

x = A⁻b + (I − A⁻A)z

where z is arbitrary.

Page 113: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Suppose a solution x0 exists:

Ax0 = b

Let x = A⁻b + (I − A⁻A)z, then

Ax = AA⁻b + (A − AA⁻A)z = AA⁻b = AA⁻Ax0 = Ax0 = b
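A short numerical sketch of the general solution for a consistent system (the matrix A, the right-hand side b, and the two z values are arbitrary illustrations; np.linalg.pinv supplies a generalized inverse A⁻):

```python
# Every x = A⁻ b + (I - A⁻ A) z solves Ax = b when the system is consistent.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])        # rank 1, so many solutions exist
b = np.array([6.0, 12.0])              # consistent: b lies in the column space of A

A_g = np.linalg.pinv(A)                # Moore-Penrose generalized inverse
for z in (np.zeros(3), np.array([1.0, -2.0, 0.5])):
    x = A_g @ b + (np.eye(3) - A_g @ A) @ z
    print(np.allclose(A @ x, b))       # True for every choice of z
```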

Page 114: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Calculation of the Moore-Penrose g-inverse

Let A be a p × q matrix of rank q < p, then

A⁻ = (A'A)⁻¹A'

Proof

A⁻A = (A'A)⁻¹A'A = I

thus AA⁻A = AI = A and A⁻AA⁻ = IA⁻ = A⁻

also A⁻A = I is symmetric

and AA⁻ = A(A'A)⁻¹A' is symmetric

Page 115: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Let B be a p × q matrix of rank p < q, then

B⁻ = B'(BB')⁻¹

Proof

BB⁻ = BB'(BB')⁻¹ = I

thus BB⁻B = IB = B and B⁻BB⁻ = B⁻I = B⁻

also BB⁻ = I is symmetric

and B⁻B = B'(BB')⁻¹B is symmetric

Page 116: Stats 346.3 Multivariate Data Analysis Stats 848.3.

Let C be a p × q matrix of rank k < min(p, q). Then C = AB where A is a p × k matrix of rank k and B is a k × q matrix of rank k, and

C⁻ = B'(BB')⁻¹(A'A)⁻¹A'

Proof

CC⁻ = AB B'(BB')⁻¹(A'A)⁻¹A' = A(A'A)⁻¹A' is symmetric, as well as

C⁻C = B'(BB')⁻¹(A'A)⁻¹A'AB = B'(BB')⁻¹B

Also CC⁻C = A(A'A)⁻¹A'AB = AB = C

and C⁻CC⁻ = B'(BB')⁻¹B B'(BB')⁻¹(A'A)⁻¹A' = B'(BB')⁻¹(A'A)⁻¹A' = C⁻
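A numerical check of the rank-factorization formula against NumPy's built-in pseudoinverse (a sketch; the particular full-rank factors A and B below are arbitrary test matrices):

```python
# Verify C⁻ = B'(BB')⁻¹(A'A)⁻¹A' for a rank-2 matrix C = AB.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 2))        # 5 x 2, rank 2
B = rng.standard_normal((2, 4))        # 2 x 4, rank 2
C = A @ B                              # 5 x 4 matrix of rank 2
inv = np.linalg.inv

C_g = B.T @ inv(B @ B.T) @ inv(A.T @ A) @ A.T
print(np.allclose(C_g, np.linalg.pinv(C)))   # True
print(np.allclose(C @ C_g @ C, C))           # condition 1: C C⁻ C = C
```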

Page 117: Stats 346.3 Multivariate Data Analysis Stats 848.3.

References

1. Searle, Shayle R., Matrix Algebra Useful for Statistics.

2. Carroll, J. Douglas and Green, Paul E., Mathematical Tools for Applied Multivariate Analysis.