Handling Outliers and Missing Data in Statistical Data Models Kaushik Mitra Date: 17/1/2011 ECSU Seminar, ISI

Handling Outliers and Missing Data in Statistical Data Models

Kaushik Mitra

Date: 17/1/2011

ECSU Seminar, ISI

Statistical Data Models

• Goal: Find structure in data
• Applications
  – Finance
  – Engineering
  – Sciences
    • Biological
  – Wherever we deal with data
• Some examples
  – Regression
  – Matrix factorization
• Challenges: Outliers and missing data

Outliers Are Quite Common

[Figure: Google search results for 'male faces']

Need to Handle Outliers Properly

[Figure: Removing salt-and-pepper (outlier) noise. Panels: noisy image; Gaussian filtered image; desired result.]

Missing Data Problem

Completing missing tracks

[Figure: Missing tracks in structure from motion. Panels: incomplete tracks; tracks completed by a sub-optimal method; desired result.]

Our Focus

• Outliers in regression

– Linear regression

– Kernel regression

• Matrix factorization in presence of missing data

Robust Linear Regression for High Dimension Problems

What is Regression?

• Regression

– Find functional relation between y and x

• x: independent variable

• y: dependent variable

– Given

• data: (yi,xi) pairs

• Model y = f(x, w)+n

– Estimate w

– Predict y for a new x
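
A minimal sketch of the estimate-then-predict steps above, assuming a linear form f(x, w) = w[0]*x + w[1] and synthetic data (both are illustrative assumptions, not taken from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (y_i, x_i) pairs: y = 2*x + 1 plus small Gaussian noise (assumed).
x = rng.uniform(-1.0, 1.0, size=50)
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal(50)

# Linear model f(x, w) = w[0]*x + w[1]; the column of ones carries the offset.
X = np.column_stack([x, np.ones_like(x)])

# Estimate w by ordinary least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict y for a new x.
x_new = 0.5
y_new = w[0] * x_new + w[1]
print(w, y_new)
```

Ordinary least squares is the non-robust baseline here; it is the estimator that outliers make unreliable.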

Robust Regression

• Real world data corrupted with outliers
• Outliers make estimates unreliable
• Robust regression
  – Unknown
    • Parameter, w
    • Outliers
  – Combinatorial problem
    • N data and k outliers
    • C(N,k) ways
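
To give a feel for how quickly C(N, k) grows, here is a small sketch using Python's standard library. The function name and the values of N and k are chosen for illustration only.

```python
from math import comb

def count_outlier_subsets(n_points, n_outliers):
    """Number of ways to choose which n_outliers of n_points are the outliers."""
    return comb(n_points, n_outliers)

# With N = 100 data points, the count explodes as k grows.
for k in (5, 10, 20):
    print(k, count_outlier_subsets(100, k))
```

Exhaustive search over all subsets is therefore impractical except for very small problems.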

Prior Work

• Combinatorial algorithms

– Random sample consensus (RANSAC)

– Least Median Squares (LMedS)

• Exponential in dimension

• M-estimators

– Robust cost functions

– local minima

Robust Linear Regression model

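• Linear regression model: $y_i = x_i^T w + e_i$
  – $e_i$: Gaussian noise
• Proposed robust model: $e_i = n_i + s_i$
  – $n_i$: inlier noise (Gaussian)
  – $s_i$: outlier noise (sparse)
• Matrix-vector form
  – $y = Xw + n + s$
• Estimate w, s

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}
=
\begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_D \end{bmatrix}
+
\begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_N \end{bmatrix}
+
\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_N \end{bmatrix}
$$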

Simplification

• Objective (RANSAC): Find w that minimizes the number of outliers

$$\min_{s,\,w} \|s\|_0 \quad \text{subject to} \quad \|y - Xw - s\|_2 \le \epsilon$$

• Eliminate w
  – Model: $y = Xw + n + s$
  – Premultiply by C, where $CX = 0$ (N ≥ D)
    • $Cy = CXw + Cs + Cn$
    • $z = Cs + g$
    • g Gaussian
• Problem becomes:

$$\min_{s} \|s\|_0 \quad \text{subject to} \quad \|z - Cs\|_2 \le \epsilon$$

• Solve for s -> identify outliers -> LS -> w
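
One way to obtain a matrix C with CX = 0 is to take an orthonormal basis of the left null space of X from a full SVD. The sketch below assumes X has full column rank; the data sizes, noise level and outlier positions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 20, 3

# Synthetic model y = Xw + n + s with two sparse outliers (assumed values).
X = rng.standard_normal((N, D))
w_true = rng.standard_normal(D)
s = np.zeros(N)
s[[2, 7]] = [5.0, -4.0]
n = 0.01 * rng.standard_normal(N)
y = X @ w_true + n + s

# Full SVD: the last N - D left singular vectors are orthogonal to range(X).
U, _, _ = np.linalg.svd(X, full_matrices=True)
C = U[:, D:].T            # shape (N - D, N), rows are orthonormal

# Premultiplying by C eliminates w: z = Cy = Cs + Cn.
z = C @ y
print(np.abs(C @ X).max())
```

Because the rows of this C are orthonormal, g = Cn stays Gaussian with the same variance as n.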

Relation to Sparse Learning

• Solve:

$$\min_{s} \|s\|_0 \quad \text{subject to} \quad \|z - Cs\|_2 \le \epsilon$$

  – Combinatorial problem
• Sparse Basis Selection / Sparse Learning
• Two approaches:
  – Basis Pursuit (Chen, Donoho, Saunders 1995)
  – Bayesian Sparse Learning (Tipping 2001)

Basis Pursuit Robust regression (BPRR)

• Solve

$$\min_{s} \|s\|_1 \quad \text{such that} \quad \|z - Cs\|_2 \le \epsilon$$

  – Basis Pursuit Denoising (Chen et al. 1995)
  – Convex problem
  – Cubic complexity: $O(N^3)$
• From Compressive Sensing theory (Candes 2005)
  – Equivalent to original problem if
    • s is sparse
    • C satisfies the Restricted Isometry Property (RIP)
      – Isometry: $\|s_1 - s_2\| = \|C(s_1 - s_2)\|$
      – Restricted: to the class of sparse vectors
• In general, no guarantees for our problem
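
The sketch below illustrates the pipeline "solve for s, identify outliers, least squares on the inliers". It uses the Lagrangian form of Basis Pursuit Denoising, min 0.5*||z - Cs||^2 + lam*||s||_1, solved by plain iterative soft thresholding (ISTA). ISTA, the value of `lam`, the detection threshold and the synthetic data are assumptions made for illustration; the talk does not say which solver was used.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(2)
N, D = 50, 3
X = rng.standard_normal((N, D))
w_true = np.array([1.0, -2.0, 0.5])
outlier_idx = [3, 11, 25, 30, 42]
s_true = np.zeros(N)
s_true[outlier_idx] = [10.0, -10.0, 10.0, -10.0, 10.0]
y = X @ w_true + 0.05 * rng.standard_normal(N) + s_true

# C with orthonormal rows and CX = 0 (left null space of X).
U, _, _ = np.linalg.svd(X, full_matrices=True)
C = U[:, D:].T
z = C @ y

# ISTA with step size 1, valid because the spectral norm of C is 1.
lam = 0.5                      # assumed regularization weight
s_hat = np.zeros(N)
for _ in range(500):
    s_hat = soft_threshold(s_hat + C.T @ (z - C @ s_hat), lam)

# Identify outliers, then ordinary least squares on the remaining inliers.
detected = np.flatnonzero(np.abs(s_hat) > 1.0)   # assumed threshold
inliers = np.setdiff1d(np.arange(N), detected)
w_hat, *_ = np.linalg.lstsq(X[inliers], y[inliers], rcond=None)
print(detected, w_hat)
```

A constrained solver for the epsilon form could be substituted; the soft-thresholding version is chosen here only because it fits in a few lines.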

Bayesian Sparse Robust Regression (BSRR)

• Sparse Bayesian learning technique (Tipping 2001)
  – Puts a sparsity promoting prior on s: $p(s) = \prod_{i=1}^{N} \frac{1}{|s_i|}$
  – Likelihood: $p(z \mid s) = \mathcal{N}(Cs, \epsilon I)$
  – Solves the MAP problem $p(s \mid z)$
  – Cubic complexity: $O(N^3)$
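
As a rough illustration of sparse Bayesian learning on z = Cs + g, the sketch below gives each s_i a zero-mean Gaussian prior with its own variance gamma_i and updates the variances with a standard EM rule. This is a common textbook variant and not necessarily the exact update used in the talk; the noise variance is assumed known, and all data values are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
N, D = 50, 3
X = rng.standard_normal((N, D))
w_true = np.array([1.0, -2.0, 0.5])
outlier_idx = [3, 11, 25, 30, 42]
s_true = np.zeros(N)
s_true[outlier_idx] = [10.0, -10.0, 10.0, -10.0, 10.0]
sigma = 0.05
y = X @ w_true + sigma * rng.standard_normal(N) + s_true

# Eliminate w: C has orthonormal rows and CX = 0.
U, _, _ = np.linalg.svd(X, full_matrices=True)
C = U[:, D:].T
z = C @ y

gamma = np.ones(N)         # prior variances of s_i (hyperparameters)
noise_var = sigma ** 2     # assumed known
for _ in range(200):
    # Marginal covariance of z: noise_var * I + C diag(gamma) C^T.
    Sigma_z = noise_var * np.eye(N - D) + (C * gamma) @ C.T
    # Posterior mean of s: diag(gamma) C^T Sigma_z^{-1} z.
    mu = gamma * (C.T @ np.linalg.solve(Sigma_z, z))
    # Posterior variances: diagonal of Gamma - Gamma C^T Sigma_z^{-1} C Gamma.
    G = np.linalg.solve(Sigma_z, C)
    post_var = gamma - gamma ** 2 * np.sum(C * G, axis=0)
    # EM update of the hyperparameters.
    gamma = mu ** 2 + np.maximum(post_var, 0.0)

# The entries with the largest posterior mean are the outlier candidates.
top = np.sort(np.argsort(-np.abs(mu))[:len(outlier_idx)])
print(top)
```

Most gamma_i shrink toward zero during the iterations, which is how the sparsity of s emerges in this kind of model.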

Setup for Empirical Studies

• Synthetically generated data

• Performance criteria

  – Angle between ground truth and estimated hyper-planes
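
A possible way to compute this criterion, assuming the hyperplane y = x^T w is represented by its normal vector [w, -1] in (x, y) space. That convention and the function name are assumptions; the talk does not spell out the definition.

```python
import math

def hyperplane_angle_deg(w_true, w_est):
    """Angle in degrees between the hyperplanes y = x^T w_true and y = x^T w_est."""
    # Normal vectors in (x, y) space (assumed convention).
    u = list(w_true) + [-1.0]
    v = list(w_est) + [-1.0]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

print(hyperplane_angle_deg([0.0], [1.0]))
```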

Vary Outlier Fraction

BSRR performs well in all dimensions

Combinatorial algorithms like RANSAC, MSAC, LMedS not practical in high dimensions

[Figure panels: Dimension = 2; Dimension = 8; Dimension = 32]

Facial Age Estimation

• FG-NET dataset: 1002 images of 82 subjects
• Regression
  – y: Age
  – x: Geometric feature vector

Outlier Removal by BSRR

• Label data as inliers and outliers

• Detected 177 outliers in 1002 images

• Leave-one-out testing

              BSRR
Inlier MAE    3.73
Outlier MAE   19.14
Overall MAE   6.45

Summary for Robust Linear Regression

• Modeled outliers as sparse variable

• Formulated robust regression as Sparse Learning problem

– BPRR and BSRR

• BSRR gives the best performance

• Limitation: linear regression model

– Kernel model

Robust RVM Using Sparse Outlier Model

Relevance Vector Machine (RVM)

• RVM model:

$$y(x) = \sum_{i=1}^{N} w_i \, k(x, x_i) + w_0 + e$$

  – $k(x, x_i)$: kernel function
• Examples of kernels
  – $k(x_i, x_j) = (x_i^T x_j)^2$: polynomial kernel
  – $k(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$: Gaussian kernel
• Kernel trick: $k(x_i, x_j) = \psi(x_i)^T \psi(x_j)$
  – Map $x_i$ to feature space $\psi(x_i)$
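
The two kernels and the kernel trick can be checked numerically. In the sketch below the function names, the 2-D inputs and the explicit feature map for the degree-2 polynomial kernel are illustrative assumptions.

```python
import numpy as np

def polynomial_kernel(A, B):
    """k(a, b) = (a^T b)^2 for every row a of A and row b of B."""
    return (A @ B.T) ** 2

def gaussian_kernel(A, B, sigma=1.0):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for every pair of rows."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def poly_feature_map(A):
    """Explicit feature map psi for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = A[:, 0], A[:, 1]
    return np.column_stack([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

X = np.array([[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]])
K = gaussian_kernel(X, X, sigma=2.0)
P = polynomial_kernel(X, X)
Psi = poly_feature_map(X)

# Kernel trick: the kernel equals an inner product in feature space.
print(np.allclose(P, Psi @ Psi.T))
```

For the Gaussian kernel the corresponding feature space is infinite-dimensional, so only the kernel matrix K is ever formed.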

RVM: A Bayesian Approach

• Bayesian approach
  – Prior distribution: $p(w)$
  – Likelihood: $p(y \mid x, w)$
• Prior specification
  – $p(w)$: sparsity promoting prior, $p(w_i) = 1/|w_i|$
  – Why sparse?
    • Use a smaller subset of training data for prediction
    • Support vector machine
• Likelihood
  – Gaussian noise
    • Non-robust: susceptible to outliers

Robust RVM model

• Original RVM model

$$y(x) = \sum_{j=1}^{N} w_j \, k(x, x_j) + w_0 + e$$

  – e: Gaussian noise
• Explicitly model outliers, $e_i = n_i + s_i$
  – $n_i$: inlier noise (Gaussian)
  – $s_i$: outlier noise (sparse and heavy-tailed)
• Matrix-vector form
  – $y = Kw + n + s$
• Parameters to be estimated: w and s

Robust RVM Algorithms

• $y = [K \mid I]\, w_s + n$

  – $w_s = [w^T \; s^T]^T$: sparse vector

• Two approaches

– Bayesian

– Optimization
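
The augmented form can be verified with a few lines of NumPy. The Gaussian kernel width, the data and the single outlier below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 6
X = rng.standard_normal((N, 2))

# Gaussian kernel matrix with sigma = 1 (assumed).
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
K = np.exp(-sq / 2.0)

w = rng.standard_normal(N)
s = np.zeros(N)
s[2] = 4.0                          # one sparse outlier
n = 0.01 * rng.standard_normal(N)

Phi = np.hstack([K, np.eye(N)])     # [K | I], shape (N, 2N)
w_s = np.concatenate([w, s])        # [w^T s^T]^T
y = Phi @ w_s + n

# Same observations as the unstacked model y = Kw + n + s.
print(np.allclose(y, K @ w + n + s))
```

Any sparse solver that accepts a design matrix can therefore be run on Phi in place of K.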

Robust Bayesian RVM (RB-RVM)

• Prior specification

– w and s independent : p(w, s) = p(w)p(s)

– Sparsity promoting prior for s: p(si)= 1/|si|

• Solve for posterior p(w, s|y)

• Prediction: use w inferred above

• Computation: a bigger RVM

– ws instead of w

– [K|I] instead of K

Basis Pursuit RVM (BP-RVM)

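• Optimization approach

$$\min_{w_s} \|w_s\|_0 \quad \text{subject to} \quad \|y - [K \mid I]\, w_s\|_2 \le \epsilon$$

  – Combinatorial
• Closest convex approximation

$$\min_{w_s} \|w_s\|_1 \quad \text{subject to} \quad \|y - [K \mid I]\, w_s\|_2 \le \epsilon$$

• From compressive sensing theory
  – Same solution if $[K \mid I]$ satisfies RIP
    • In general, this cannot be guaranteed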

Experimental Setup

Prediction : Asymmetric Outliers Case

Image Denoising

• Salt and pepper noise

– Outliers

• Regression formulation

– Image as a surface over 2D grid

• y: Intensity

• x: 2D grid

• Denoised image obtained by prediction
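
The regression formulation amounts to turning an image into (x, y) training pairs. A small sketch follows; the 3x3 image and its values are made up for illustration.

```python
import numpy as np

# Toy grayscale image; the centre pixel plays the role of a "salt" outlier.
image = np.array([[0.1, 0.2, 0.3],
                  [0.4, 1.0, 0.6],
                  [0.7, 0.8, 0.9]])

# x: 2D grid coordinates, one row per pixel.
rows, cols = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]),
                         indexing="ij")
x = np.column_stack([rows.ravel(), cols.ravel()])   # shape (N, 2)

# y: intensity at each grid location.
y = image.ravel()                                    # shape (N,)

print(x.shape, y.shape)
```

A regressor fitted to these pairs is then evaluated at the same grid points, and its predictions, reshaped to the image size, form the denoised image.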

Salt and Pepper Noise

Some More Results

[Figure panels: RVM; RB-RVM; Median Filter]

Age Estimation from Facial Images

• RB-RVM detected 90 outliers

• Leave-one-person-out testing

Summary for Robust RVM

• Modeled outliers as sparse variables

• Jointly estimated parameter and outliers

• Bayesian approach gives very good result

Limitations of Regression

• Regression: $y = f(x, w) + n$
  – Noise in only "y"
  – Not always reasonable
• All variables have noise
  – $M = [x_1 \; x_2 \; \ldots \; x_N]$
  – Principal component analysis (PCA)
    • $[x_1 \; x_2 \; \ldots \; x_N] = AB^T$
      – A: principal components
      – B: coefficients
  – $M = AB^T$: matrix factorization (our next topic)

Matrix Factorization in the presence of Missing Data

Applications in Computer Vision

• Matrix factorization: $M = AB^T$
• Applications: build 3-D models from images
  – Geometric approach (multiple views): Structure from Motion (SfM)
  – Photometric approach (multiple lightings): Photometric stereo

Matrix Factorization

• Applications in Vision
  – Affine Structure from Motion (SfM): $M = \begin{bmatrix} x_{ij} \\ y_{ij} \end{bmatrix} = CS^T$, rank 4 matrix
  – Photometric stereo: $M = NS^T$, rank = 3
• Solution: SVD
  – $M = USV^T$
  – Truncate S to rank r
    • $A = US^{0.5}$, $B = VS^{0.5}$
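
When no data are missing, the SVD recipe above takes only a few lines. The matrix sizes and the random rank-3 test matrix below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 8, 6, 3

# Synthetic rank-r matrix M (assumed data).
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# M = U S V^T, truncated to rank r.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
sqrt_S = np.sqrt(S[:r])

A = U[:, :r] * sqrt_S        # A = U S^0.5, shape (m, r)
B = Vt[:r].T * sqrt_S        # B = V S^0.5, shape (n, r)

print(np.allclose(A @ B.T, M))
```

Splitting S evenly between the two factors is one convention; any invertible r x r matrix Q gives another valid pair (AQ, BQ^{-T}).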

Missing Data Scenario

• Missed feature tracks in SfM

• Specularities and shadow in photometric stereo

Incomplete feature tracks

Challenges in Missing Data Scenario

• Can't use SVD
• Solve:

$$\min_{A,\,B} \|W \odot (M - AB^T)\|_F^2 + \lambda \left( \|A\|_F^2 + \|B\|_F^2 \right)$$

• W: binary weight matrix, λ: regularization parameter
• Challenges
  – Non-convex problem
  – Newton's method based algorithm (Buchanan et al. 2005)
    • Very slow
• Design algorithm
  – Fast (handle large scale data)
  – Flexible enough to handle additional constraints
    • Ortho-normality constraints in orthographic SfM
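
To make the objective concrete, the sketch below evaluates it and decreases it with regularized alternating least squares (ALS). ALS is a simple baseline swapped in for illustration; it is neither the Newton-type method cited above nor the algorithm proposed in the talk. Sizes, the 30% missing rate and `lam` are assumptions.

```python
import numpy as np

def objective(M, W, A, B, lam):
    """||W * (M - A B^T)||_F^2 + lam * (||A||_F^2 + ||B||_F^2)."""
    residual = W * (M - A @ B.T)
    return np.sum(residual ** 2) + lam * (np.sum(A ** 2) + np.sum(B ** 2))

def als_step(M, W, A, B, lam):
    """One sweep: ridge-regress each row of A, then each row of B, on observed entries."""
    r = A.shape[1]
    A, B = A.copy(), B.copy()
    for i in range(M.shape[0]):
        obs = W[i] > 0
        Bo = B[obs]
        A[i] = np.linalg.solve(Bo.T @ Bo + lam * np.eye(r), Bo.T @ M[i, obs])
    for j in range(M.shape[1]):
        obs = W[:, j] > 0
        Ao = A[obs]
        B[j] = np.linalg.solve(Ao.T @ Ao + lam * np.eye(r), Ao.T @ M[obs, j])
    return A, B

rng = np.random.default_rng(6)
m, n, r, lam = 20, 15, 2, 1e-3
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
W = (rng.uniform(size=(m, n)) > 0.3).astype(float)   # 1 = observed, 0 = missing

A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))
history = [objective(M, W, A, B, lam)]
for _ in range(50):
    A, B = als_step(M, W, A, B, lam)
    history.append(objective(M, W, A, B, lam))

print(history[0], history[-1])
```

Each block update is an exact minimization, so the objective never increases, but like any method for this non-convex problem ALS can stall in a poor local minimum.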

Proposed Solution

• Formulate matrix factorization as a low-rank semidefinite program (LRSDP)

– LRSDP: fast implementation of SDP (Burer, 2001)

• Quasi-Newton algorithm

• Advantages of the proposed formulation:

– Solve large-scale matrix factorization problem

– Handle additional constraints

Low-rank Semidefinite Programming (LRSDP)

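• Stated as:

$$\min_{R} \; \operatorname{trace}(C R R^T) \quad \text{subject to} \quad \operatorname{trace}(A_l R R^T) = b_l, \quad l = 1, 2, \ldots, k$$

• Variable: R
• Constants
  • C: cost
  • $A_l$, $b_l$: constants
• Challenge
  • Formulating matrix factorization as LRSDP
  • Designing C, $A_l$, $b_l$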

Matrix factorization as LRSDP: Noiseless Case

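• We want to formulate:

$$\min_{A,\,B} \|A\|_F^2 + \|B\|_F^2 \quad \text{subject to} \quad (AB^T)_{i,j} = M_{i,j} \ \text{for} \ (i,j) \in \Omega$$

  where Ω denotes the set of observed entries.

• As:

$$\|A\|_F^2 = \operatorname{trace}(AA^T), \qquad \|B\|_F^2 = \operatorname{trace}(BB^T)$$

$$\|A\|_F^2 + \|B\|_F^2 = \operatorname{trace}(RR^T)$$

$$(AB^T)_{i,j} = (RR^T)_{i,\,j+m} = M_{i,j}$$

  Here R stacks A (m rows) on top of B, as the index j+m implies.

• LRSDP formulation:

$$\min_{R} \; \operatorname{trace}(C R R^T) \quad \text{subject to} \quad \operatorname{trace}(A_l R R^T) = b_l, \quad l = 1, 2, \ldots, |\Omega|$$

  C: identity matrix, $A_l$: indicator matrix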
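
The identities behind this formulation can be checked numerically. The sketch assumes R is A stacked on top of B and uses a symmetric indicator matrix with entries 1/2; that symmetric choice is one possible convention, not necessarily the one used in the talk.

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))

R = np.vstack([A, B])          # shape (m + n, r)
G = R @ R.T                    # RR^T, shape (m + n, m + n)

# Cost with C = identity: trace(C RR^T) = ||A||_F^2 + ||B||_F^2.
cost = np.trace(np.eye(m + n) @ G)
frob = np.sum(A ** 2) + np.sum(B ** 2)

def indicator(i, j):
    """Symmetric A_l that picks out (RR^T)[i, j + m] (assumed convention)."""
    A_l = np.zeros((m + n, m + n))
    A_l[i, j + m] = 0.5
    A_l[j + m, i] = 0.5
    return A_l

# Constraint for observed entry (i, j): trace(A_l RR^T) = (AB^T)[i, j] = b_l.
i, j = 1, 3
constraint_value = np.trace(indicator(i, j) @ G)

print(np.isclose(cost, frob), np.isclose(constraint_value, (A @ B.T)[i, j]))
```

The upper-right m x n block of RR^T is exactly AB^T, which is why each observed entry becomes one linear constraint on RR^T.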

Affine SfM

• Dinosaur sequence

• MF-LRSDP gives the best reconstruction

72% missing data

Photometric Stereo

• Face sequence

• MF-LRSDP and damped Newton give the best results

42% missing data

Additional Constraints: Orthographic Factorization

• Dinosaur sequence

Summary

• Formulated missing data matrix factorization as LRSDP
  – Large scale problems
  – Handle additional constraints
• Overall summary
  – Two statistical data models
    • Regression in presence of outliers
      – Role of sparsity
    • Matrix factorization in presence of missing data
      – Low rank semidefinite program

Thank you! Questions?
