
CHAPTER 4

INCREMENTAL PRINCIPAL COMPONENT ANALYSIS ON EIGENFACES – 2D AND 3D MODEL

4.1 INTRODUCTION

Learning and representing are the two important concepts that characterize any cognitive vision system. In a real-world scenario, learning is neither a terminating nor a moribund process: once knowledge of an object is acquired, it keeps being updated whenever a new piece of information pertaining to that object is encountered. Hence, realistic approaches are needed to update the already existing learnt representations in such a cognitive vision system. In general, visual learning is viewed as appearance-based modeling of scenes and objects using principal component analysis. However, principal component analysis assumes a static database, where all faces are given as input to the system prior to processing; that is, the training set is known beforehand. This does not work if input images arrive on-line. In such a situation, there is a need to design and develop a procedure for step-by-step learning [92], capable of processing new face images and updating the face space as images are appended sequentially. As the face space grows with the addition of a new face image, the mean and the covariance matrix need to be updated accordingly, based on the existing mean and covariance matrix.

The dimensions of the covariance matrix also grow gradually as new face images are added to the existing dataset. This, in turn, affects the eigenvectors and eigenvalues: each time a new face is added to the face space, the number of eigenvectors and eigenvalues also increases [93]. Furthermore, computing the new covariance matrix and the subsequent new eigenvectors and eigenvalues involves increased complexity in terms of time and space. Incremental PCA is the relevant technique for reducing this complexity while retaining the information in the original data.

4.2 MODELING INCREMENTAL PRINCIPAL COMPONENT ANALYSIS (IPCA)

Let X_1, X_2, X_3, ..., X_M be the training images in 1-D vector format. The mean image vector is given by

Ψ = (X_1 + X_2 + ... + X_M) / M = (1/M) Σ_{i=1}^{M} X_i    (4.1)

where M is the number of face images and Ψ is the mean of the M images. Eq. 4.1 can be rewritten recursively as

Ψ_M = (1/M) Σ_{i=1}^{M} X_i
    = (1/M) (X_M + Σ_{i=1}^{M-1} X_i)
    = (1/M) (X_M + (M-1) Ψ_{M-1})
    = X_M/M + Ψ_{M-1} - Ψ_{M-1}/M
    = Ψ_{M-1} + (1/M) (X_M - Ψ_{M-1})    (4.2)

The above equation estimates the mean of a stationary random process {X_M}, where M is the time index and X_M is a column vector in a d-dimensional space, used to generate the eigenspace. The samples are obtained sequentially over time.

Hence,

Ψ'_{M+1} = Ψ_M + (1/(M+1)) (X_{M+1} - Ψ_M)    (4.3)

where Ψ'_{M+1} is the estimated mean for (M+1) images.
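The recursion of Eqs. 4.2 and 4.3 can be sketched as follows; this is a minimal illustration with synthetic vectors, where the array shapes and variable names are assumptions rather than part of the model:

```python
import numpy as np

# Incremental mean update: Psi_M = Psi_{M-1} + (1/M)(X_M - Psi_{M-1}).
rng = np.random.default_rng(0)
images = rng.random((10, 64))        # 10 synthetic images, 64 pixels each

mean = images[0].copy()              # Psi_1 = X_1
for M, x in enumerate(images[1:], start=2):
    mean += (x - mean) / M           # Eq. 4.2: update with the M-th image

# The recursive estimate matches the batch mean of Eq. 4.1.
assert np.allclose(mean, images.mean(axis=0))
```

Each update touches only the current image and the previous mean, so the earlier images never need to be stored.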

If X_M is a non-stationary random process, i.e. one with a time-varying mean, then the estimated mean is given as

Ψ'_M = (X_M + α X_{M-1} + α² X_{M-2} + ...) / (1 + α + α² + α³ + ...)    (4.4)

where α is the decay parameter. This parameter down-weights the previous samples (images) and determines their contribution to the current mean. In general, α takes a value between 0 and 1, so that

1 + α + α² + α³ + ... = 1/(1 - α)    (4.5)

(4.5)

The weighted mean is defined as

Ψ_w = (Σ_{i=1}^{M} w_i X_i) / (Σ_{i=1}^{M} w_i)    (4.6)

where w_i is the weight associated with each sample.

If the weights are all equal, say w_i = W, then

Ψ = (Σ_{i=1}^{M} w_i X_i) / (Σ_{i=1}^{M} w_i)
  = (W Σ_{i=1}^{M} X_i) / (M W)
  = (1/M) Σ_{i=1}^{M} X_i    (4.7)

which is equivalent to Eq. 4.1.

If all the samples are different, the weights can be interpreted as sample frequencies; they can then be converted into probabilities p_i = w_i / Σ_i w_i.

Let W_M = Σ_{i=1}^{M} w_i, the sum of the weights.

Then,

Ψ_M = (Σ_{i=1}^{M} w_i X_i) / (Σ_{i=1}^{M} w_i)
    = (1/W_M) Σ_{i=1}^{M} w_i X_i
    = (1/W_M) (w_M X_M + Σ_{i=1}^{M-1} w_i X_i)
    = (1/W_M) (w_M X_M + W_{M-1} Ψ_{M-1})
    = (1/W_M) (w_M X_M + (W_M - w_M) Ψ_{M-1})
    = (1/W_M) (W_M Ψ_{M-1} + w_M X_M - w_M Ψ_{M-1})
    = Ψ_{M-1} + (w_M / W_M) (X_M - Ψ_{M-1})    (4.8)
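A quick numerical check of the recursion in Eq. 4.8 against the direct weighted mean of Eq. 4.6, with synthetic weights and vectors assumed purely for illustration:

```python
import numpy as np

# Weighted-mean recursion: Psi_M = Psi_{M-1} + (w_M/W_M)(X_M - Psi_{M-1}).
rng = np.random.default_rng(5)
images = rng.random((8, 16))     # 8 synthetic sample vectors
w = rng.random(8) + 0.1          # positive weights w_i

mean = images[0].copy()          # Psi_1 = X_1 (since W_1 = w_1)
W = w[0]
for x, wi in zip(images[1:], w[1:]):
    W += wi                      # running weight sum W_M
    mean += (wi / W) * (x - mean)    # Eq. 4.8

direct = (w[:, None] * images).sum(axis=0) / w.sum()   # Eq. 4.6
assert np.allclose(mean, direct)
```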

Let w_M / W_M be a constant a with 0 < a < 1, and let α = 1 - a. This produces the standard formula for the exponentially weighted moving average:

Ψ_M = Ψ_{M-1} + a (X_M - Ψ_{M-1})
    = Ψ_{M-1} + a X_M - a Ψ_{M-1}
    = a X_M + (1 - a) Ψ_{M-1}
    = α Ψ_{M-1} + (1 - α) X_M    (4.9)

Substituting Eq. 4.5 in Eq. 4.4, we get

Ψ'_M = (X_M + α X_{M-1} + α² X_{M-2} + ... + α^{M-1} X_1) / (1/(1 - α))
     = (1 - α) (X_M + α X_{M-1} + α² X_{M-2} + ... + α^{M-1} X_1)
     = (1 - α) X_M + α (1 - α) (X_{M-1} + α X_{M-2} + ...)
     = (1 - α) X_M + α (X_{M-1} + α X_{M-2} + ...) / (1 + α + α² + α³ + ...)
     = (1 - α) X_M + α Ψ'_{M-1}
     = α Ψ'_{M-1} + (1 - α) X_M    (4.10)

where Ψ'_{M-1} is the estimated mean of M-1 images, X_M is the Mth image and Ψ'_M is the estimated mean of M images.

Hence, the estimated mean for M+1 images can be given as

Ψ'_{M+1} = α Ψ'_M + (1 - α) X_{M+1}    (4.11)
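The recursion of Eqs. 4.9-4.11 can be sketched as follows; the decay value and data are assumptions chosen for illustration:

```python
import numpy as np

# Exponentially weighted mean: Psi'_M = alpha*Psi'_{M-1} + (1-alpha)*X_M.
alpha = 0.9
rng = np.random.default_rng(1)
images = rng.random((50, 64))    # synthetic image stream

ewm = images[0].copy()           # Psi'_1 = X_1
for x in images[1:]:
    ewm = alpha * ewm + (1 - alpha) * x    # Eq. 4.9 / 4.11

# Closed form: X_1 keeps weight alpha^(M-1); X_k (k >= 2) gets (1-alpha)*alpha^(M-k).
M = len(images)
weights = np.concatenate(([alpha ** (M - 1)],
                          (1 - alpha) * alpha ** np.arange(M - 2, -1, -1)))
closed = weights @ images
assert np.allclose(ewm, closed)
```

Because α < 1, older images decay geometrically, which is what lets the estimator track a time-varying mean.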

Based on the current sample and the previously estimated mean, it is thus possible to obtain the new estimated mean recursively. Now assume that the images in the image space change over time. Consider Γ' to be the new input image, which has to be appended to the existing database, and let Ψ be the previously calculated (current) mean. The mean-normalized vector is given as

Φ = Γ' - Ψ    (4.12)

The new mean value is estimated by using the parameter α, which ranges from 0 to 1:

Ψ' = α Ψ + (1 - α) Γ'    (4.13)

Substituting the value of Γ' from Eq. 4.12 in Eq. 4.13, we get

Ψ' = α Ψ + (1 - α) (Φ + Ψ)
   = α Ψ + Φ + Ψ - α Φ - α Ψ
   = Ψ + Φ - α Φ
   = Ψ + (1 - α) Φ    (4.14)
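A one-line numerical check that Eq. 4.14 agrees with Eq. 4.13 (arbitrary synthetic values):

```python
import numpy as np

# Eq. 4.13 vs. Eq. 4.14: both must produce the same updated mean.
alpha = 0.8
rng = np.random.default_rng(2)
psi = rng.random(16)        # current mean Psi
gamma = rng.random(16)      # new input image Gamma'
phi = gamma - psi           # mean-normalized vector (Eq. 4.12)

via_413 = alpha * psi + (1 - alpha) * gamma   # Eq. 4.13
via_414 = psi + (1 - alpha) * phi             # Eq. 4.14
assert np.allclose(via_413, via_414)
```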

Let C be the existing, already calculated covariance matrix with M' significant eigenvectors. The new covariance matrix C' is estimated by using α, which again ranges from 0 to 1:

C' = α C + (1 - α) Φ Φ^T    (4.15)

Using the first M' significant eigenvectors and the related eigenvalues, the covariance matrix C can be approximated as

C ≈ μ E μ^T    (4.16)

where μ is the N x M' matrix whose columns are the M' eigenvectors of C, and E is the M' x M' diagonal matrix of the corresponding eigenvalues.

Substituting Eq. 4.16 in Eq. 4.15,

C' = α Σ_{i=1}^{M'} λ_i μ_i μ_i^T + (1 - α) Φ Φ^T    (4.17)

Let A = [a_1, a_2, ..., a_{M'+1}], where

a_i = √(α λ_i) μ_i,  i = 1 to M'
a_{M'+1} = √(1 - α) Φ    (4.18)

Then the covariance matrix in Eq. 4.17 can be expressed as

C' = A A^T    (4.19)

As discussed earlier, the smaller matrix L = A^T A is used to compute the eigenvectors of C'. We obtain the (M'+1) eigenvectors of L by solving its eigenvalue problem. Let ρ_i' be the new eigenvectors and λ_i' the new eigenvalues of L. Then,

L ρ_i' = λ_i' ρ_i',  i = 1 to M'+1    (4.20)

Replacing L in Eq. 4.20, we get

A^T A ρ_i' = λ_i' ρ_i'

Multiplying both sides by A, we obtain

A A^T A ρ_i' = A λ_i' ρ_i'    (4.21)

Since λ_i' is a scalar, Eq. 4.21 becomes

A A^T (A ρ_i') = λ_i' (A ρ_i')    (4.22)

Using the value of C' from Eq. 4.19 in Eq. 4.22, we get

C' (A ρ_i') = λ_i' (A ρ_i')    (4.23)

Hence μ_i' = A ρ_i' are the (M'+1) eigenvectors and λ_i' the eigenvalues of C'.
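A sketch of this update in code, assuming the current eigenvectors μ (as columns), their eigenvalues, and the decay parameter are given; the function name and array shapes are illustrative, not from any library:

```python
import numpy as np

def update_eigenspace(mu, lam, phi, alpha):
    """Update an eigenspace per Eqs. 4.15-4.23.

    mu:  N x M' matrix of current eigenvectors (columns)
    lam: M' current eigenvalues
    phi: mean-normalized new image (Eq. 4.12)
    """
    A = np.hstack([np.sqrt(alpha * lam) * mu,           # a_i = sqrt(alpha*lam_i)*mu_i
                   np.sqrt(1 - alpha) * phi[:, None]])  # a_{M'+1} = sqrt(1-alpha)*phi
    L = A.T @ A                               # small (M'+1) x (M'+1) matrix
    lam_new, rho = np.linalg.eigh(L)          # eigenpairs of L (Eq. 4.20)
    mu_new = A @ rho                          # mu_i' = A rho_i (Eq. 4.23)
    mu_new /= np.linalg.norm(mu_new, axis=0)  # rescale to unit length
    return mu_new, lam_new
```

Diagonalizing the small matrix L = A^T A instead of the N x N matrix C' = A A^T is what keeps the update cheap when N, the number of pixels, is large.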

4.2.1 RECOGNITION PROCEDURE FROM IPCA MODEL

The new weights of the training set can be calculated, for each training image X_k, as

w_i = μ'_i^T Φ_k = μ'_i^T (X_k - Ψ')    (4.24)

Arranging the weights as an (M'+1) x 1 vector, we have

Ω = [w_1, w_2, w_3, ..., w_{M'+1}]^T    (4.25)

To test the above procedure on an unknown image Γ, it is necessary to compute its weights w_i, which are generated as follows:

w_i = μ'_i^T (Γ - Ψ')    (4.26)

Let Ω denote the weight vector of the unknown image. The Euclidean distance ε_k between the given unknown image and each face class is defined by

ε_k² = ║Ω - Ω_k║²,  k = 1, 2, ..., N_c    (4.27)

where N_c represents the number of face classes. The Euclidean distance ε between the unknown image and the reconstructed image is also calculated, to separate genuine faces from dissimilar face-like images:

ε² = ║Γ - Γ_f║²    (4.28)

The Euclidean distance is simply the square root of the sum of the squares of the differences between corresponding values. The calculated distances are compared against a set of empirical threshold values, and based on the thresholds the input image is categorized as a known or an unknown image.
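The recognition step can be sketched as follows, assuming an already-updated eigenspace; the helper names and the threshold value are illustrative assumptions:

```python
import numpy as np

def project(mu, psi, image):
    """Weights of an image in the eigenspace: w_i = mu_i'^T (image - Psi')."""
    return mu.T @ (image - psi)                            # Eqs. 4.24 / 4.26

def classify(mu, psi, train_images, probe, threshold):
    """Nearest face class by Euclidean distance (Eq. 4.27), thresholded."""
    omegas = [project(mu, psi, x) for x in train_images]   # Omega_k per class
    omega = project(mu, psi, probe)                        # Omega of the probe
    dists = [np.linalg.norm(omega - om) for om in omegas]  # eps_k
    k = int(np.argmin(dists))
    return k, ("known" if dists[k] < threshold else "unknown")
```

A probe identical to a stored training image projects onto the same weight vector, so its distance to that class is zero and it is labeled known for any positive threshold.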

There are a number of advantages in using incremental principal component analysis. First, the eigenvectors can be derived more efficiently [89] than with a traditional batch-mode PCA. Secondly, the incremental procedure permits construction of the eigenspace using less storage, and hence makes it feasible for a huge dataset. Thirdly, in some applications, such as on-line training, the full training data may not be available initially; in such a case, an incremental training algorithm avoids re-running the entire PCA every time a new image is added. The disadvantage of incremental methods lies in their accuracy when compared to batch methods. When the incremental updates are few, the inaccuracies are negligible and probably acceptable for the majority of applications; but when many thousands of updates are made, the inaccuracies also accumulate.

The model presented in this chapter is tested on the standard Yale face space and the FEI face space. The GRIET face space is also used for testing purposes. A detailed discussion of the results for these databases is presented in the next chapter.