L4 linear model and matrix representation


Transcript of L4 linear model and matrix representation

Page 1: L4 linear model and matrix representation

Introduction to Linear Model
Statistical Methods in Finance

Lecture 4

Ta-Wei Huang

December 7, 2016

Page 2: L4 linear model and matrix representation

Table of Contents

Regression analysis is almost certainly the most important tool at the econometrician's disposal. But what is regression? Let's see what we'll talk about in today's lecture.

1 Basic Idea in Regression

2 Matrix Representation

3 The Least Square Estimator

4 Next Lecture

Page 6: L4 linear model and matrix representation

Basic Idea in Regression

What is a Linear Model? (1)

In very general terms, regression (the linear model) is concerned with describing and evaluating the linear relationship between a given variable and one or more other variables.

More specifically, a linear model is an attempt to explain movements in a variable by reference to movements in one or more other variables. (Causal relationship)

Page 7: L4 linear model and matrix representation

Basic Idea in Regression

What is a Linear Model? (2)

Causal relationship: $Y = f(X_1, \cdots, X_k) + \varepsilon$

$Y$: output variable (response, effect)
$X_i$: input variables (causes)
$f$: a function representing the causal relationship
$\varepsilon$: a random error term

The causal relationship $f$ is deterministic but unknown. Can we approximate it by
$$f(X_1, \cdots, X_k) \approx \sum_i \beta_i g_i(X_1, \cdots, X_k),$$
where each $g_i$ is known (and chosen by ourselves)?

Page 8: L4 linear model and matrix representation

Basic Idea in Regression

Definition of Linear Model

Definition (Linear Model)

The linear model of an output variable $Y$ with input variables $X_1, \cdots, X_k$ has the general form
$$Y = \sum_{i=1}^{p} \beta_i g_i(X_1, \cdots, X_k) + \varepsilon,$$
where $X_1, \cdots, X_k$ are accurately measured deterministic variables, $g_i$ is a known function of $X_1, \cdots, X_k$ for $i = 1, \ldots, p$, $\beta_i$ is an unknown parameter entering linearly for $i = 1, \ldots, p$, and $\varepsilon$ is a random error term.
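As a concrete illustration, here is a minimal NumPy sketch (my own example, not from the slides) that simulates data from such a model with hand-chosen basis functions $g_1(x) = 1$, $g_2(x) = x$, $g_3(x) = x^2$:

```python
# Minimal sketch (illustrative, not from the lecture): simulating
# Y = sum_i beta_i g_i(X1) + eps with hand-chosen basis functions.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0.0, 2.0, size=n)        # one input variable X1

# Each column is a known g_i evaluated on the data; the model stays
# linear because the unknown beta_i enter linearly.
G = np.column_stack([np.ones(n), x, x**2])

beta = np.array([1.0, -2.0, 0.5])        # unknown in practice; fixed here
eps = rng.normal(0.0, 0.3, size=n)       # random error acts on Y only
y = G @ beta + eps
```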

Page 9: L4 linear model and matrix representation

Basic Idea in Regression

Explanation of Linear Model

$$Y = \sum_{i=1}^{p} \beta_i g_i(X_1, \cdots, X_k) + \varepsilon$$

The definition implies that we know the form $g_i$ of the effects on $Y$, but we don't know the magnitude $\beta_i$ of the effects on $Y$. ⇒ signal

In a linear model, we assume that variation due to the random error $\varepsilon$ occurs only on the output $Y$. ⇒ error

Page 10: L4 linear model and matrix representation

Basic Idea in Regression

Rationale of the Linear Model (1)

A general model is given by $Y = f(X_1, \cdots, X_k) + \varepsilon$, where $f$ is unknown and arbitrary.

By Taylor's theorem, $f(X) = \sum_{i=0}^{\infty} \frac{1}{i!} D^i f(x_0) \cdot (X - x_0)^i$ around a point $x_0$, which implies that there are infinitely many parameters to be estimated.

Usually we don't have enough data to estimate $f$ directly, and so we have to assume that it has some more restricted form.

A local approximation of $f$, over the range of the data $X_1, \cdots, X_k$, may be achievable by a linear model (see the sketch after this list).

Because the predictors can be transformed and combined in any way, linear models are actually very flexible.
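The following sketch (my own example, with $f(x) = \sin(x)$ as a stand-in for the unknown $f$) makes the local-approximation point numerically: a straight line fits well over a narrow range of $x$ but poorly over a wide one.

```python
# Sketch of the "local approximation" claim: a line fits sin(x) well on
# a narrow interval but poorly on a wide one.
import numpy as np

def linear_fit_mse(lo, hi, n=200):
    x = np.linspace(lo, hi, n)
    y = np.sin(x)                            # the "unknown" f, known here
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ beta) ** 2)

print(linear_fit_mse(0.0, 0.5))              # tiny error: good local fit
print(linear_fit_mse(0.0, 6.0))              # much larger error globally
```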

Page 11: L4 linear model and matrix representation

Basic Idea in Regression

Rationale of the Linear Model (2)

Let $\hat{f}(X) = \sum_{i=1}^{p} \beta_i g_i(X)$. Then the predicted value is $\hat{Y} = \hat{f}(X)$.

The mean squared error (MSE) is
$$E(Y - \hat{Y})^2 = E\big(f(X) + \varepsilon - \hat{f}(X)\big)^2 = \underbrace{\big(f(X) - \hat{f}(X)\big)^2}_{\text{reducible}} + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}$$

An intuitive approach is to locally minimize the reducible error $\big(f(X) - \hat{f}(X)\big)^2$, but we do not always do that, for example to keep the variance of the fitted model under control.
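A quick numerical check of the decomposition (synthetic data, my own assumption for $f$): with the true $f$ known, the empirical MSE should be close to the reducible part plus $\operatorname{Var}(\varepsilon)$.

```python
# Numerical check: empirical MSE ~ reducible error + Var(eps).
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 100_000, 0.5
x = rng.uniform(-1.0, 1.0, size=n)
f = np.exp(x)                                # the true f, known only here
y = f + rng.normal(0.0, sigma, size=n)

X = np.column_stack([np.ones(n), x])         # fhat: best-fitting line
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fhat = X @ beta

print(np.mean((y - fhat) ** 2))              # empirical MSE
print(np.mean((f - fhat) ** 2) + sigma**2)   # reducible + irreducible
```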

Page 12: L4 linear model and matrix representation

Basic Idea in Regression

When to Use a Linear Model (1)

Examine the causal relationship between an output variable $Y$ (effect) and some input variables $X = (X_1, \cdots, X_k)$ (causes).

Question

Let $Y_{i,t} = R_{i,t} - R_{f,t}$ and $X_{i,t} = R_{m,t} - R_{f,t}$. We find that the linear model $Y_{i,t} = \alpha_{i,t} + \beta_{i,t} X_{i,t} + \varepsilon_{i,t}$ fits well. Can we conclude that the reason for a higher return on one stock is the higher market premium?

From the above question, we know that sometimes the causal relationship is not quite clear, especially in financial data. Can we still use linear models to model behaviours in financial markets?

Page 13: L4 linear model and matrix representation

Basic Idea in Regression

When to Use a Linear Model (2)

Even where no sensible causal relationship exists between $X$ and $Y$, we may wish to relate them by some sort of mathematical equation (rationale: sampling from a multivariate normal population), since there is a strong association between $X$ and $Y$.

From a mathematical derivation, one can see that if $(X, Y)$ follows a joint normal distribution, then $Y \mid X$ has the pattern of a linear model. (See: Sampling Model)
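For reference, the standard result behind this remark (stated from memory rather than taken from the slide) can be written as follows:

```latex
% If (X, Y) is jointly normal, then the conditional law of Y given X = x
% is normal with a mean that is linear in x, i.e., the linear-model pattern:
\[
  Y \mid X = x \;\sim\; \mathcal{N}\!\left(
      \mu_Y + \Sigma_{YX}\,\Sigma_{XX}^{-1}(x - \mu_X),\;
      \sigma_Y^2 - \Sigma_{YX}\,\Sigma_{XX}^{-1}\,\Sigma_{XY}
  \right)
\]
```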

Page 14: L4 linear model and matrix representation

Basic Idea in Regression

When to Use a Linear Model (3)

Question

To investigate the behaviour of returns, which model should be used?

1. $R_{i,t} - R_{f,t} = \alpha_{i,t} + \beta_{i,t}(R_{m,t} - R_{f,t}) + \varepsilon_{i,t}$
2. $R_{m,t} - R_{f,t} = \alpha_{i,t} + \beta_{i,t}(R_{i,t} - R_{f,t}) + \varepsilon_{i,t}$

Domain knowledge is important for determining the input and output variables, and that is why theoretical models are still important even if they are not realistic. However, in machine learning and prediction, an association relationship is enough!

Page 15: L4 linear model and matrix representation

Basic Idea in Regression

Data Type in Linear Model

A response/output/dependent variable $Y$ is modeled or explained by predictor/input/independent/regressor variables that are functions of $X = (X_1, \cdots, X_k)$.

$Y$: an "approximately" continuous random variable
$X$: continuous/discrete/categorical deterministic variables

Note that if $X$ is a random vector, then conditioning on $X$ (working with $Y \mid X$) allows us to treat $X$ as a deterministic vector.

Page 16: L4 linear model and matrix representation

Basic Idea in Regression

Types of Linear Model

$X = (X_1, \cdots, X_k)$: quantitative ⇒ multiple regression
$X = (X_1, \cdots, X_k)$: qualitative + quantitative ⇒ analysis of covariance (ANCOVA)
$X = (X_1, \cdots, X_k)$: qualitative ⇒ analysis of variance (ANOVA)
multiple $Y$'s ⇒ multivariate regression
qualitative $Y$ ⇒ generalized linear model (e.g., logistic regression)

Page 17: L4 linear model and matrix representation

Basic Idea in Regression

Statistical Procedure of Linear Model

[Figure: flowchart of the statistical procedure for a linear model; the image is not recoverable from the transcript.]

Page 18: L4 linear model and matrix representation

Matrix Representation

The Data Structure

The data structure of $n$ records with one output variable $Y$ and $k$ input variables $X_1, \ldots, X_k$ is

Y     X1     X2     ...    Xk
y1    x11    x12    ...    x1k
y2    x21    x22    ...    x2k
...   ...    ...    ...    ...
yn    xn1    xn2    ...    xnk

Page 19: L4 linear model and matrix representation

Matrix Representation

Matrix Representation

The functional form of a linear model is
$$Y_i = \beta_0 + \beta_1 g_1(x_{i1}, \cdots, x_{ik}) + \cdots + \beta_{p-1} g_{p-1}(x_{i1}, \cdots, x_{ik}) + \varepsilon_i,$$
for $i = 1, 2, \ldots, n$.

The matrix representation of that model is
$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$
where
$$\mathbf{Y}_{n \times 1} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad \mathbf{X}_{n \times p} = \begin{pmatrix} 1 & g_{11} & \cdots & g_{1,p-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & g_{n1} & \cdots & g_{n,p-1} \end{pmatrix}, \quad \boldsymbol{\beta}_{p \times 1} = \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_{p-1} \end{pmatrix}, \quad \boldsymbol{\varepsilon}_{n \times 1} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$

Here $g_{ij} = g_j(x_{i1}, \cdots, x_{ik})$.

Page 20: L4 linear model and matrix representation

Matrix Representation

Common Linear Models

Definition

A linear model $Y = X\beta + \varepsilon$ is said to be

a least squares model if there is no assumption on $\varepsilon$. The parameter space is $\Theta = \{\beta : \beta \in \mathbb{R}^p\}$.

a Gauss-Markov model if $E(\varepsilon) = 0$ and $\operatorname{cov}(\varepsilon) = \sigma^2 I$. The parameter space is $\Theta = \{(\beta, \sigma^2) : \beta \in \mathbb{R}^p, \sigma^2 \in \mathbb{R}^+\}$.

an Aitken model if $E(\varepsilon) = 0$ and $\operatorname{cov}(\varepsilon) = \sigma^2 V$, where $V$ is known. The parameter space is $\Theta = \{(\beta, \sigma^2) : \beta \in \mathbb{R}^p, \sigma^2 \in \mathbb{R}^+\}$.

a general linear mixed model if $E(\varepsilon) = 0$ and $\operatorname{cov}(\varepsilon) = \Sigma \equiv \Sigma(\theta)$. The parameter space is $\Theta = \{(\beta, \theta) : \beta \in \mathbb{R}^p, \theta \in \Omega\}$, where $\Omega$ is the set of all values of $\theta$ such that $\Sigma(\theta)$ is positive definite.

Page 21: L4 linear model and matrix representation

Matrix Representation

Gauss-Markov Model

Definition

A linear model $Y = X\beta + \varepsilon$ is said to be a Gauss-Markov model if $E(\varepsilon) = 0$ and $\operatorname{cov}(\varepsilon) = \sigma^2 I$. The parameter space of this model is $\Theta = \{(\beta, \sigma^2) : \beta \in \mathbb{R}^p, \sigma^2 \in \mathbb{R}^+\}$.

Common Gauss-Markov Models

One-sample Problem
Simple Linear Regression
Multiple Linear Regression
ANOVA and ANCOVA

Page 22: L4 linear model and matrix representation

Matrix Representation

Example 1 (One-sample Problem)

Assume that $Y_1, \cdots, Y_n$ is an iid sample with mean $\mu$ and variance $\sigma^2 > 0$. If $\varepsilon_1, \cdots, \varepsilon_n$ are iid with mean $E(\varepsilon_i) = 0$ and common variance $\sigma^2$, then the functional form of the GM model is $Y_i = \mu + \varepsilon_i$. The matrix form of this model is $Y = X\beta + \varepsilon$, where
$$\mathbf{Y}_{n \times 1} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad \mathbf{X}_{n \times 1} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \quad \beta_{1 \times 1} = \mu, \quad \boldsymbol{\varepsilon}_{n \times 1} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$
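A quick check of this example (my own data): with $X$ a single column of ones, least squares recovers the sample mean as the estimate of $\mu$.

```python
# One-sample problem: the least squares estimate of mu is ybar.
import numpy as np

rng = np.random.default_rng(3)
y = 3.0 + rng.normal(0.0, 1.0, size=100)     # iid sample with mu = 3
X = np.ones((100, 1))

mu_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(mu_hat[0], y.mean())                   # the two values agree
```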

Page 23: L4 linear model and matrix representation

Matrix Representation

Example 2 (Simple Linear Regression)

Consider the model where a response variable $Y$ is linearly related to an independent variable $x$ via $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for $i = 1, 2, \ldots, n$, where the $\varepsilon_i$ are uncorrelated random variables with mean 0 and common variance $\sigma^2 > 0$. The matrix form of this model is $Y = X\beta + \varepsilon$, where
$$\mathbf{Y}_{n \times 1} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad \mathbf{X}_{n \times 2} = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad \boldsymbol{\beta}_{2 \times 1} = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad \boldsymbol{\varepsilon}_{n \times 1} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$
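A sketch of this example in code (my own data), previewing the least squares computation via the standard normal equations $X'X b = X'Y$:

```python
# Simple linear regression fit by solving the normal equations.
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 + 0.7 * x + rng.normal(0.0, 1.0, size=200)

X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)                              # roughly [2.0, 0.7]
```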

Page 24: L4 linear model and matrix representation

Matrix Representation

Example 3 (Multiple Linear Regression)

Consider the model where a response variable $Y$ is linearly related to several independent variables, say $x_1, \cdots, x_k$, via $Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \varepsilon_i$ for $i = 1, 2, \ldots, n$, where the $\varepsilon_i$ are uncorrelated with mean 0 and common variance $\sigma^2 > 0$. The matrix form of this model is $Y = X\beta + \varepsilon$, where $p = k + 1$ and
$$\mathbf{Y}_{n \times 1} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad \mathbf{X}_{n \times p} = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}, \quad \boldsymbol{\beta}_{p \times 1} = \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_k \end{pmatrix}, \quad \boldsymbol{\varepsilon}_{n \times 1} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$
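The same sketch extends directly to several predictors (again my own synthetic data, with $k = 3$ and $p = k + 1$):

```python
# Multiple linear regression with k = 3 predictors.
import numpy as np

rng = np.random.default_rng(5)
n, k = 500, 3
Z = rng.normal(size=(n, k))                  # columns x_i1, ..., x_ik
X = np.column_stack([np.ones(n), Z])         # n x p design matrix
beta = np.array([1.0, 0.5, -0.3, 2.0])
y = X @ beta + rng.normal(0.0, 0.2, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 2))                 # close to beta
```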

Page 25: L4 linear model and matrix representation

The Least Square Estimator

Introduction

Consider the GM linear model $Y = X\beta + \varepsilon$, where $Y$ is an $n \times 1$ vector of observed responses, $X$ is an $n \times p$ matrix of functions of the input variables, $\beta$ is a $p \times 1$ vector of unknown parameters to be estimated, and $\varepsilon$ is an $n \times 1$ vector of random errors.

If $Y$ is a random vector but the input variables $x_1, \cdots, x_n$ are fixed constants, then $E(Y) = X\beta$ and $\operatorname{cov}(Y) = \sigma^2 I$.

If $Y$ and the input variables $X_1, \cdots, X_n$ are all random, then $E(Y \mid X_1, \cdots, X_n) = X\beta$ and $\operatorname{cov}(Y \mid X_1, \cdots, X_n) = \sigma^2 I$.

Page 26: L4 linear model and matrix representation

The Least Square Estimator

A Geometric Viewpoint: Simple Case 1

Now, consider a simple regression model $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $i = 1, 2, 3$.
$$\Rightarrow \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix} = \beta_0 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \beta_1 \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix}$$

Then the random vector $Y \in \mathbb{R}^3$ has two dimensions coming from $\beta_0 (1, 1, 1)' + \beta_1 (x_1, x_2, x_3)'$ and one dimension coming from $(\varepsilon_1, \varepsilon_2, \varepsilon_3)'$.

Let $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)'$ be an estimator for $\beta = (\beta_0, \beta_1)'$.

Page 27: L4 linear model and matrix representation

The Least Square Estimator

A Geometric Viewpoint: Simple Case 2

Let $\Omega = \{\beta_0 (1, 1, 1)' + \beta_1 (x_1, x_2, x_3)' : \beta_0, \beta_1 \in \mathbb{R}\}$. Then $\dim(\Omega) = 2$.

Finding $\hat{\beta}$ is equivalent to finding a vector $X\hat{\beta}$ in the two-dimensional subspace $\Omega$.

Our target is to find $\hat{\beta}$ such that $Y = X\beta + \varepsilon = X\hat{\beta} + \hat{\varepsilon}$.

Question

Which $\hat{Y} = X\hat{\beta}$ captures the information best?

Answer

Find $\hat{\beta}$ such that $\hat{Y} = X\hat{\beta}$ is close to $Y$ (see the sketch below). ⇒ What is the meaning of "close"? The Euclidean distance?
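Previewing the geometric answer in code (my own numbers): under the Euclidean notion of closeness, $X\hat{\beta}$ is the orthogonal projection of $Y$ onto the plane $\Omega$ spanned by the columns of $X$, and the residual is orthogonal to that plane.

```python
# The fitted vector is the orthogonal projection of Y onto col(X).
import numpy as np

x = np.array([1.0, 2.0, 4.0])
X = np.column_stack([np.ones(3), x])         # columns span Omega
Y = np.array([1.0, 3.0, 2.0])

P = X @ np.linalg.inv(X.T @ X) @ X.T         # projection onto col(X)
Y_hat = P @ Y
print(X.T @ (Y - Y_hat))                     # ~ [0, 0]: residual orthogonal to Omega
```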

Page 28: L4 linear model and matrix representation

The Least Square Estimator

A Geometric Viewpoint: Simple Case 3

[Figure: geometric picture of $Y$, the plane $\Omega$, and the fitted vector $X\hat{\beta}$; the image is not recoverable from the transcript.]

Page 29: L4 linear model and matrix representation

The Least Square Estimator

A General Case

Consider the GM linear model $Y = X\beta + \varepsilon$, where $Y$ is an $n \times 1$ vector of observed responses, $X$ is an $n \times p$ matrix of functions of the input variables, $\beta$ is a $p \times 1$ vector of unknown parameters to be estimated, and $\varepsilon$ is an $n \times 1$ vector of random errors.

Question

The squared Euclidean distance is $(Y - X\beta)'(Y - X\beta) = \varepsilon'\varepsilon$. Under what assumptions is this a good measure of closeness?

Page 30: L4 linear model and matrix representation

The Least Square Estimator

Assumptions

Assumption: $\operatorname{cov}(\varepsilon) = \sigma^2 I$

Homoskedasticity: every error has the same variance $\sigma^2$.
Uncorrelatedness: the errors are mutually uncorrelated.

Page 31: L4 linear model and matrix representation

The Least Square Estimator

The Ordinary Least Square Estimator

Definition (Least Squares Estimator)

An estimator $\hat{\beta}$ is a least squares estimate of $\beta$ if
$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} (Y - X\beta)'(Y - X\beta).$$
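As a sketch (using the standard closed form, which is not derived on this slide): when $X'X$ is invertible, the minimizer is $\hat{\beta} = (X'X)^{-1}X'Y$, and it agrees with a generic least squares solver.

```python
# Closed-form least squares vs. a generic solver.
import numpy as np

rng = np.random.default_rng(6)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
Y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, 0.5, size=50)

beta_closed = np.linalg.inv(X.T @ X) @ X.T @ Y
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta_closed, beta_lstsq))  # True
```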

Page 32: L4 linear model and matrix representation

Next Lecture

The Next Lecture

In the next lecture, I will introduce a classical optimization method, the method of least squares, and then discuss the geometric interpretation of the ordinary least squares estimator.
