Introduction to Mathematical Economics Part 1 - Loglinear Publications


An Introduction to Mathematical Economics

Part 1

Michael Sampson

Loglinear Publishing


Copyright © 2001 Michael Sampson. Loglinear Publications: http://www.loglinear.com Email: [email protected].

Terms of Use

This document is distributed "AS IS" and with no warranties of any kind, whether express or implied. Until November 1, 2001 you are hereby given permission to print one (1) and only one hardcopy version free of charge from the electronic version of this document (i.e., the pdf file) provided that:

1. The printed version is for your personal use only.
2. You make no further copies from the hardcopy version. In particular no photocopies, electronic copies or any other form of reproduction.
3. You agree not to ever sell the hardcopy version to anyone else.
4. You agree that if you ever give the hardcopy version to anyone else that this page, in particular the Copyright Notice and the Terms of Use are included and the person to whom the copy is given accepts these Terms of Use.

Until November 1, 2001 you are hereby given permission to make (and if you wish sell) an unlimited number of copies on paper only from the electronic version (i.e., the pdf file) of this document or from a printed copy of the electronic version of this document provided that:

1. You agree to pay a royalty of either $3.00 Canadian or $2.00 US per copy to the author within 60 days of making the copies or to destroy any copies after 60 days for which you have not paid the royalty of $3.00 Canadian or $2.00 US per copy. Payment can be made either by cheque or money order and should be sent to the author at:

   Professor Michael Sampson
   Department of Economics
   Concordia University
   1455 de Maisonneuve Blvd W.
   Montreal, Quebec
   Canada, H3G 1M8

2. If you intend to make five or more copies, or if you can reasonably expect that five or more copies of the text will be made then you agree to notify the author before making any copies by Email at: [email protected] or by fax at 514-848-4536.

3. You agree to include on each paper copy of this document and at the same page number as this page on the electronic version of the document: 1) the above Copyright Notice, 2) the URL: http://www.loglinear.com and the Email address [email protected]. You may then if you wish remove this Terms of Use from the paper copies you make.

Contents

Preface

1 The Mathematical Method
  1.1 Definitions
  1.2 The Difference Between ‘$=$’ and ‘$\equiv$’
  1.3 Implication
  1.4 Negation
  1.5 Proof by Contradiction
  1.6 Necessary Conditions and Sufficient Conditions
  1.7 Necessary and Sufficient Conditions
  1.8 ‘Or’ and ‘And’
  1.9 The Quantifiers $\exists$ and $\forall$
  1.10 Proof by Counter-Example
  1.11 Proof by Induction
  1.12 Functions
    1.12.1 Integer Exponents
    1.12.2 Polynomials
    1.12.3 Non-integer Exponents
    1.12.4 The Geometric Series

2 Univariate Calculus
  2.1 Derivatives
    2.1.1 Slopes
    2.1.2 Derivatives
    2.1.3 The Use of the Word ‘Marginal’ in Economics
    2.1.4 Elasticities
    2.1.5 The Constant Elasticity Functional Form
    2.1.6 Local and Global Properties
    2.1.7 The Sum, Product and Quotient Rules
    2.1.8 The Chain Rule
    2.1.9 Inverse Functions
    2.1.10 The Derivative of an Inverse Function
    2.1.11 The Elasticity of an Inverse Function
  2.2 Second Derivatives
    2.2.1 Convexity and Concavity
    2.2.2 Economics and ‘Diminishing Marginal ...’
  2.3 Maximization and Minimization
    2.3.1 First-Order Conditions
    2.3.2 Second-Order Conditions
    2.3.3 Sufficient Conditions for a Global Maximum or Minimum
    2.3.4 Profit Maximization
  2.4 Econometrics
    2.4.1 Least Squares Estimation
    2.4.2 Maximum Likelihood
  2.5 Ordinal and Cardinal Properties
    2.5.1 Class Grades
    2.5.2 Ordinal and Cardinal Properties of Functions
    2.5.3 Concavity and Convexity are Cardinal Properties
    2.5.4 Quasi-Concavity and Quasi-Convexity
    2.5.5 New Sufficient Conditions for a Global Maximum or Minimum
  2.6 Exponential Functions and Logarithms
    2.6.1 Exponential Growth and the Rule of 72
  2.7 Taylor Series
    2.7.1 The Error of the Taylor Series Approximation
    2.7.2 The Taylor Series for $e^x$ and $\ln(1 + x)$
    2.7.3 L’Hôpital’s Rule
    2.7.4 Newton’s Method
  2.8 Technical Issues
    2.8.1 Continuity and Differentiability
    2.8.2 Corner Solutions
    2.8.3 Advanced Concavity and Convexity

3 Matrix Algebra
  3.1 Matrix Addition and Subtraction
    3.1.1 The Matrix 0
  3.2 Matrix Multiplication
    3.2.1 The Identity Matrix
  3.3 The Transpose of a Matrix
    3.3.1 Symmetric Matrices
    3.3.2 Proof that $A^{T}A$ is Symmetric
  3.4 The Inverse of a Matrix
    3.4.1 Diagonal Matrices
  3.5 The Determinant of a Matrix
    3.5.1 Determinants of Upper and Lower Triangular Matrices
    3.5.2 Calculating the Inverse of a Matrix with Determinants
  3.6 The Trace of a Matrix
  3.7 Higher Dimensional Spaces
    3.7.1 Vectors as Points in an n Dimensional Space: $\mathbb{R}^n$
    3.7.2 Length and Distance
    3.7.3 Angle and Orthogonality
    3.7.4 Linearly Independent Vectors
  3.8 Solving Systems of Equations
    3.8.1 Cramer’s Rule
  3.9 Eigenvalues and Eigenvectors
    3.9.1 Eigenvalues
    3.9.2 Eigenvectors
    3.9.3 The Relationship $A = C \Lambda C^{-1}$
    3.9.4 Left and Right-Hand Eigenvectors
    3.9.5 Symmetric and Orthogonal Matrices
  3.10 Linear and Quadratic Functions in $\mathbb{R}^{n+1}$
    3.10.1 Linear Functions
    3.10.2 Quadratics
    3.10.3 Positive and Negative Definite Matrices
    3.10.4 Using Determinants to Check for Definiteness
    3.10.5 Using Eigenvalues to Check for Definiteness
    3.10.6 Maximizing and Minimizing Quadratics
  3.11 Idempotent Matrices
    3.11.1 Important Properties of Idempotent Matrices
    3.11.2 The Spectral Representation
  3.12 Positive Matrices
    3.12.1 The Perron-Frobenius Theorem
    3.12.2 Markov Chains
    3.12.3 General Equilibrium and Matrix Algebra

4 Multivariate Calculus
  4.1 Functions of Many Variables
  4.2 Partial Derivatives
    4.2.1 The Gradient
    4.2.2 Interpreting Partial Derivatives
    4.2.3 The Economic Language of Partial Derivatives
    4.2.4 The Use of the Word Marginal
    4.2.5 Elasticities
    4.2.6 The Chain Rule
    4.2.7 A More General Multivariate Chain Rule
    4.2.8 Homogeneous Functions
    4.2.9 Homogeneity and the Absence of Money Illusion
    4.2.10 Homogeneity and the Nature of Technology
  4.3 Second-Order Partial Derivatives
    4.3.1 The Hessian
    4.3.2 Concavity and Convexity
    4.3.3 First and Second-Order Taylor Series
  4.4 Unconstrained Optimization
    4.4.1 First-Order Conditions
    4.4.2 Second-Order Conditions
  4.5 Quasi-Concavity and Quasi-Convexity
    4.5.1 Ordinal and Cardinal Properties
    4.5.2 Sufficient Conditions for a Global Maximum or Minimum
    4.5.3 Indifference Curves and Quasi-Concavity
  4.6 Constrained Optimization
    4.6.1 The Lagrangian
    4.6.2 First-Order Conditions
    4.6.3 Second-Order Conditions
    4.6.4 Sufficient Conditions for a Global Maximum or Minimum
  4.7 Econometrics
    4.7.1 Linear Regression
    4.7.2 Maximum Likelihood


Preface

I would like to thank my students for struggling through earlier versions of this text. In particular I would like to thank Maxime Comeau, Bulent Yurtsever, Patricia Carvajal, Alain Lumbroso and Saif Al-Haroun for pointing out errors and typos.

Here are ‘some points of view’ on economics and mathematics:

It is clear that Economics, if it is to be a science at all, must be a mathematical science. -William Jevons (Jevons was one of the early mathematical economists).

There can be no question, however, that prolonged commitment to mathematical exercises in economics can be damaging. It leads to the atrophy of judgement and intuition. -John Kenneth Galbraith (Galbraith is a famous Canadian economist; an advisor to President Kennedy in the 1960’s; author of many popular books of which our former prime minister Trudeau was a big fan. Gets no respect from academic economists.)

The age of chivalry is gone. That of sophisters, economists and calculators has succeeded. -Edmund Burke.

I advise my students to listen carefully the moment they decide to take no more mathematics courses. They might be able to hear the sound of closing doors. -James Caballero.

The effort of the economist is to ‘see,’ to picture the interplay of economic elements. The more clearly cut these elements appear in his vision, the better; the more elements he can grasp and hold in his mind at once, the better. The economic world is a misty region. The first explorers used unaided vision. Mathematics is the lantern by which what before was dimly visible now looms up in firm, bold outlines. The old phantasmagoria disappear. We see better. We also see further. -Irving Fisher (early 20th century US monetary economist, famous for the Fisher equation: nominal interest rate equals real interest rate plus the rate of inflation).

In mathematics you don’t understand things. You just get used to them. -John von Neumann (One of the great mathematical brains of the 20th century. Famous in economics for developing game theory and for the von Neumann growth model)

One of the big misapprehensions about mathematics that we perpetrate in our classrooms is that the teacher always seems to know the answer to any problem that is discussed. This gives students the idea that there is a book somewhere with all the right answers to all of the interesting questions, and that teachers know those answers. And if one could get hold of the book, one would have everything settled. That’s so unlike the true nature of mathematics. -Leon Henkin.

Mathematics. - Let us introduce the refinement and rigor of mathematics into all sciences as far as this is at all possible, not in the faith that this will lead us to know things but in order to determine our human relation to things. Mathematics is merely the means for general and ultimate knowledge of man. -Friedrich Nietzsche (19th century philosopher, an atheist, famous for his claim that “God is dead”.)

If we have no aptitude or natural taste for geometry, this does not mean that our faculty for attention will not be developed by wrestling with a problem or studying a theorem. On the contrary it is almost an advantage. It does not even matter much whether we succeed in finding the solution or understanding the proof, although it is important to try really hard to do so. Never in any case whatever is a genuine effort of the attention wasted. It always has its effect on the spiritual plane and in consequence on the lower one of the intelligence, for all spiritual light lightens the mind. If we concentrate our attention on trying to solve a problem of geometry, and if at the end of an hour we are no nearer to doing so than at the beginning, we have nevertheless been making progress each minute of that hour in another more mysterious dimension. Without our knowing or feeling it, this apparently barren effort has brought more light into the soul. The result will one day be discovered in prayer. Moreover, it may very likely be felt in some department of the intelligence in no way connected with mathematics. Perhaps he who made the unsuccessful effort will one day be able to grasp the beauty of a line of Racine more vividly on account of it. But it is certain that this effort will bear its fruit in prayer. There is no doubt whatever about that. -Simone Weil (20th century Christian mystic, her brother Andre Weil was one of the great mathematicians of the 20th century).


Chapter 1

The Mathematical Method

1.1 Definitions

Mathematics has no symbols for confused ideas. -Anonymous

“When I use a word,” Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean – neither more nor less.” “The question is,” said Alice, “whether you can make words mean different things.” “The question is,” said Humpty Dumpty, “which is to be master – that’s all.” -Lewis Carroll, Through the Looking Glass

In economics we strive for precise thinking, and one of the ways we do this is by using mathematics. The beginning of this practice is to be clear about what we are talking about, and for this we need definitions.

We begin with some elementary number theory in order to illustrate the mathematical methods that we will later apply to economic models. Suppose then we are interested in the properties of odd and even numbers. Now intuitively you may know that 4 is even and 5 is odd. If however we wish to prove things about odd and even numbers, then we have to be able to define what we mean by an odd and an even number.

Consider then proving that the product of an odd and an even number is always an even number. It is not enough to make a list such as:

$$
\begin{aligned}
4 \times 5 &= 20 \\
2 \times 3 &= 6 \\
12 \times 37 &= 444
\end{aligned}
$$

etc.

and note that 20, 6 and 444 are even numbers. This is not a proof! Nor would it be a proof to make the list even longer because there are an infinite number of odd and even combinations.

Without definitions we have nowhere to begin!

Now one possible definition of even and odd numbers would be:

Definition 1 An integer $m$ is an even number if and only if there exists an integer $n$ such that:

$$m = 2 \times n.$$

Definition 2 An integer $m$ is an odd integer if and only if there exists an integer $n$ such that:

$$m = 2 \times n + 1.$$

For example according to the definition 18 is an even integer because we can write it as $18 = 2 \times n$ where $n = 9$, while 5 is an odd integer because we can write it as $5 = 2 \times n + 1$ where $n = 2$.

Armed with these definitions we can now prove something:

Theorem 3 The product of an odd and an even number is an even number.

Proof. If $a$ is even and $b$ is odd then $a = 2m$ and $b = 2n + 1$ for some integers $m$ and $n$, and:

$$
\begin{aligned}
a \times b &= 2m \times (2n + 1) \\
&= 2 \times (m \times (2n + 1)) \\
&= 2 \times r
\end{aligned}
$$

where $r = m \times (2n + 1)$ is an integer. Thus $a \times b$ is an even number.

Notice the power of this kind of reasoning. In a few short lines we have been able to prove a result that applies to an infinity of numbers! This infinity is a list of numbers which would go past the moon or even past the most distant star and yet we are able to say something quite definite about it. This is the magic of mathematics!
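
To see the definitions at work, here is a minimal Python sketch that checks Theorem 3 on a finite range of integers. The function names and the range from −50 to 50 are illustrative assumptions and are not part of the text. Such a check can only ever cover finitely many cases, which is exactly why it is not a proof.

```python
def is_even(m: int) -> bool:
    # Definition 1: m is even if m = 2 * n for some integer n.
    return m % 2 == 0


def is_odd(m: int) -> bool:
    # Definition 2: m is odd if m = 2 * n + 1 for some integer n.
    return m % 2 == 1


def product_is_even_on(values) -> bool:
    # Check every (even, odd) pair drawn from a finite collection of integers.
    values = list(values)
    evens = [a for a in values if is_even(a)]
    odds = [b for b in values if is_odd(b)]
    return all(is_even(a * b) for a in evens for b in odds)


# A finite spot check only: no list, however long, replaces the proof.
print(product_is_even_on(range(-50, 51)))
```

The check passes on every pair it tries, but only the algebraic argument above tells us it must pass for all integers.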

1.2 The Difference Between ‘$=$’ and ‘$\equiv$’

One day in microeconomics, the professor was writing up the typical “underlying assumptions” in preparation to explain a new model. I turned to my friend and asked, “What would Economics be without assumptions?” He thought for a moment, then replied, “Accounting.”


Sometimes things are equal to each other simply by definition. For example if

$A$ = “the number of bachelors in Montreal”

$B$ = “the number of unmarried men in Montreal”

then $A$ and $B$ are equal to each other by definition. There is nothing to prove here and it says nothing about the world or Montreal.

To emphasize the nature of this kind of equality we use a special kind of equal sign: ‘$\equiv$’ so that for bachelors and unmarried men in Montreal we write:

$$A \equiv B.$$

This says then that $A$ and $B$ are equal by definition or that this is an accounting identity.

When you see this equality sign you can relax! There is nothing to prove, these things are merely different notations that mean the same thing.

In economics a good example of an accounting identity is the GNP identity you learn in macroeconomics:

$$Y \equiv C + I + G + X - M$$

where $Y$ is GNP, $C$ is consumption, $I$ is investment, $G$ is government expenditure, $X$ is exports and $M$ is imports.

On the other hand, sometimes things are equal in a more important way. For example $E = mc^2$ expresses an important fact in physics while $f(x) = x^2$ and $f'(x) = 2x$ give us real information about the function $f(x)$. In these cases we use $=$ as a way of emphasizing that real information is being provided.

1.3 Implication

In mathematical economics we begin with assumptions and from there attempt to deduce true implications of these assumptions. Fundamental to this kind of reasoning is the idea of logical implication; that if $A$ is true then it follows that $B$ must also be true. We write this formally as:

$$A \Longrightarrow B,$$

which is to say that $A$ implies $B$. (Sometimes you will see the notation $A \supset B$ instead.)

Example 1: If

$A$ = “Mr. Smith lives in Montreal”

$B$ = “Mr. Smith lives in the province of Quebec”

then since the city of Montreal is in the province of Quebec it follows that $A \Longrightarrow B$.

We will often be attempting to construct proofs of statements like $A \Longrightarrow B$. Often the link between $A$ and $B$ is not obvious and we need to find a series of intermediate implications so that a proof often takes the form:

$$A \Longrightarrow S_1 \Longrightarrow S_2 \Longrightarrow \cdots \Longrightarrow S_n \Longrightarrow B$$

from which we conclude then that $A \Longrightarrow B$. Thus the general strategy in proving $A \Longrightarrow B$ is to begin with $A$ and to use a series of correct implications to finally obtain the statement $B$.

Example 2: Suppose that:

$A$ = “$a$ is odd and $b$ is odd”

$B$ = “$a + b$ is an even number”

and we wish to prove that:

$$A \Longrightarrow B,$$

that is:

Theorem 4 The sum of two odd numbers is even.

Proof. Given $A$ it follows that $a$ and $b$ are odd so that $a = 2r + 1$ and $b = 2s + 1$ for some integers $r$ and $s$ so that:

$$
\begin{aligned}
A &\Longrightarrow a = 2r + 1, \; b = 2s + 1 \\
&\Longrightarrow a + b = (2r + 1) + (2s + 1) \\
&\Longrightarrow a + b = 2(r + s + 1) \\
&\Longrightarrow a + b = 2t \text{ where } t = r + s + 1 \\
&\Longrightarrow \text{‘}a + b\text{ is an even number’} = B.
\end{aligned}
$$

Note that there is a direction to the arrow $\Longrightarrow$. This is to convey the idea that the truth of statement $A$ is communicated to the truth of the statement $B$ but it is not necessarily the case that the truth of $B$ implies the truth of $A$. It is incorrect to conclude from $A \Longrightarrow B$ that $B \Longrightarrow A$.

Example 1: If $B$ is true so that Mr. Smith lives in the province of Quebec we cannot conclude that $A$ is true, that he lives in the city of Montreal. He may for example live in another city in Quebec, say Sherbrooke. Thus $A \Longrightarrow B$ is true while $B \Longrightarrow A$ is false.

Example 2: If $B$ is “the sum $a + b$ is an even number”, we cannot conclude $A$, that is “$a$ and $b$ are each odd numbers.” For example if $a = 4$ and $b = 6$ then $B$ is true since $4 + 6 = 10$ but $A$ is not true since neither $a$ nor $b$ is odd. Thus $A \Longrightarrow B$ is true while $B \Longrightarrow A$ is false.
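
The direction of the arrow in Example 2 can be explored with a small Python sketch. The helper `counterexamples` and the sample of pairs from 0 to 10 are hypothetical choices made for illustration. The sketch lists every pair in the sample where a premise holds but the conclusion fails.

```python
from itertools import product


def is_odd(m: int) -> bool:
    return m % 2 == 1


def statement_a(a: int, b: int) -> bool:
    # A: "a is odd and b is odd"
    return is_odd(a) and is_odd(b)


def statement_b(a: int, b: int) -> bool:
    # B: "a + b is an even number"
    return (a + b) % 2 == 0


def counterexamples(premise, conclusion, pairs):
    # Pairs where the premise holds but the conclusion fails.
    return [(a, b) for a, b in pairs if premise(a, b) and not conclusion(a, b)]


pairs = list(product(range(0, 11), repeat=2))

# A => B: no pair in the sample violates it.
print(counterexamples(statement_a, statement_b, pairs))

# B => A: violated, for instance by a = 4 and b = 6 as in Example 2.
print((4, 6) in counterexamples(statement_b, statement_a, pairs))
```

A single counterexample is enough to show that $B \Longrightarrow A$ is false, whereas the empty list in the first case is only consistent with Theorem 4 and does not prove it.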


1.4 Negation

Let $\sim A$ denote the negation of $A$; that is “not” $A$ or $A$ is ‘not true’ or $A$ is ‘false’. For example if $A$ is the statement: “Mr. Smith lives in Montreal” then $\sim A$ is the statement: “Mr. Smith does not live in Montreal”. The negation sign acts like a negative sign in arithmetic since:

$$\sim (\sim A) = A.$$

If we have shown $A \Longrightarrow B$ we have seen that we cannot conclude from this that $B \Longrightarrow A$. However we can correctly conclude from $A \Longrightarrow B$ that $\sim B \Longrightarrow \sim A$, or:

$$\text{if } A \Longrightarrow B \text{ then } \sim B \Longrightarrow \sim A.$$

Example 1: In the Montreal/Quebec example we can correctly conclude from $A \Longrightarrow B$ that $\sim B \Longrightarrow \sim A$. If $\sim B$, Mr. Smith does not live in Quebec, then $\sim A$ follows, he does not live in Montreal.

Example 2: In the arithmetic example we can correctly conclude from $\sim B$, that “$a + b$ is not an even number”, then $\sim A$, that “it is not the case that both $a$ and $b$ are odd”.
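
The relationship between an implication, its contrapositive and its converse can be checked mechanically. The sketch below assumes the usual truth-table reading of implication, in which $A \Longrightarrow B$ is false only when $A$ is true and $B$ is false; the function names are illustrative.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    # Assumed truth-table reading: false only when p is true and q is false.
    return (not p) or q


def contrapositive_matches() -> bool:
    # Compare A => B with ~B => ~A on all four truth assignments.
    return all(
        implies(a, b) == implies(not b, not a)
        for a, b in product([True, False], repeat=2)
    )


def converse_matches() -> bool:
    # Compare A => B with B => A on all four truth assignments.
    return all(
        implies(a, b) == implies(b, a)
        for a, b in product([True, False], repeat=2)
    )


print(contrapositive_matches())  # the contrapositive always agrees
print(converse_matches())        # the converse does not always agree
```

Because there are only four combinations of truth values, this exhaustive check settles the question for any statements $A$ and $B$.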

1.5 Proof by Contradiction

Reductio ad absurdum, which Euclid loved so much, is one of a mathematician’s finest weapons. It is a far finer gambit than any chess play: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game. -G. H. Hardy.

Proof by contradiction or ‘Reductio ad absurdum’ involves proving a statement $A$ by assuming the opposite $\sim A$ and deriving a contradiction. Thus if:

$$\sim A \Longrightarrow B \quad \text{and} \quad \sim A \Longrightarrow \sim B$$

then $\sim A$ must be false and hence $A$ must be true.

Example:

He is unworthy of the name of man who is ignorant of the fact that the diagonal of a square is incommensurable with its side. -Plato

Consider proving that:


Theorem 5 $\sqrt{2}$ is irrational; that is there are no integers $a$ and $b$ such that:

$$\sqrt{2} = \frac{a}{b}.$$

Proof. Let us assume, to the contrary, that $\sqrt{2}$ is rational so that there exists an $a$ and $b$ such that:

$$\sqrt{2} = \frac{a}{b}$$

where $a$ and $b$ are integers. We can furthermore assume, without loss of generality, that $a$ and $b$ are not both even since if $a = 2r$ and $b = 2s$ then:

$$\sqrt{2} = \frac{a}{b} = \frac{2r}{2s} = \frac{r}{s}.$$

For example if it were the case that $a = 8$ and $b = 6$ then since $\frac{8}{6} = \frac{4}{3}$ we could use $a = 4$ and $b = 3$ instead. Now we have:

$$
\begin{aligned}
\sqrt{2} = \frac{a}{b} &\Longrightarrow a^2 = 2b^2 \\
&\Longrightarrow a^2 \text{ is an even number} \\
&\Longrightarrow a \text{ is an even number}
\end{aligned}
$$

since if $a$ were odd then $a^2$ would also be odd (you might want to prove this). Therefore we can write $a$ as:

$$a = 2n$$

where $n$ is some integer. Using this in $a^2 = 2b^2$ we have:

$$
\begin{aligned}
2b^2 = a^2 = (2n)^2 = 4n^2 &\Longrightarrow b^2 = 2n^2 \\
&\Longrightarrow b \text{ is an even number.}
\end{aligned}
$$

Thus both $a$ and $b$ are even numbers which contradicts the requirement that $a$ and $b$ cannot both be even. Therefore the original assumption that $\sqrt{2}$ is rational must be false so that $\sqrt{2}$ is irrational. QED.

Remark: You will often see the letters ‘QED’ put at the end of a proof. These letters stand for the Latin phrase: “Quod erat demonstrandum” which means: “That which was to be demonstrated”. This just means that the proof is finished so you should not be looking for further arguments. We use the symbol $\blacksquare$ to indicate that a proof is finished.
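
As a sanity check on Theorem 5, the following Python sketch searches for positive integers $a$ and $b$ with $a^2 = 2b^2$ up to an arbitrary bound, and also spot-checks the parity step used in the proof. The function names and the bounds are assumptions chosen for illustration. An empty search result is consistent with the theorem but, being finite, is no substitute for the proof.

```python
from math import isqrt


def rational_sqrt2_candidates(max_b: int):
    # Look for positive integers a, b with a * a == 2 * b * b and b <= max_b.
    found = []
    for b in range(1, max_b + 1):
        a = isqrt(2 * b * b)  # largest integer a with a * a <= 2 * b * b
        if a * a == 2 * b * b:
            found.append((a, b))
    return found


def square_keeps_parity(limit: int) -> bool:
    # The step "a^2 even => a even" rests on: a odd => a^2 odd.
    return all((a * a) % 2 == a % 2 for a in range(-limit, limit + 1))


print(rational_sqrt2_candidates(10_000))  # an empty list, as Theorem 5 predicts
print(square_keeps_parity(1_000))
```

Integer arithmetic is used throughout so that no floating-point rounding can create a false match.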

1.6 Necessary Conditions and Sufficient Conditions

In mathematics you will often hear of “necessary conditions” and “sufficient conditions”. For example a necessary condition for Mr. Smith to live in Montreal is that he live in the province of Quebec, while a sufficient condition for Mr. Smith to live in the province of Quebec is that he live in Montreal. However, living in Montreal is not a necessary condition for living in Quebec and living in Quebec is not a sufficient condition for living in Montreal. We have:

Definition 6 If

$$\sim B \Longrightarrow \sim A$$

or equivalently $A \Longrightarrow B$, then $B$ is a necessary condition for $A$.

Definition 7 Sufficient Conditions: If

$$A \Longrightarrow B$$

or equivalently $\sim B \Longrightarrow \sim A$, then $A$ is a sufficient condition for $B$.

Remark 1: Thus a necessary condition for $A$ is one which necessarily must be satisfied if $A$ is to be true, while if a sufficient condition for $B$ is satisfied then this guarantees the truth of $B$.

Remark 2: Thus if we have proven a statement of the form

$$A \Longrightarrow B$$

then it follows that $A$ is a sufficient condition for $B$ and $B$ is a necessary condition for $A$.

Remark 3: It is important to realize that if $A$ is sufficient for $B$ it does not follow that $A$ is necessary for $B$. Similarly if $B$ is necessary for $A$ it does not follow that $B$ is sufficient for $A$.

Example: We proved that:

“$a$ is odd and $b$ is odd” $\Longrightarrow$ “$a + b$ is even”.

Thus a sufficient condition that the sum $a + b$ be an even number is that $a$ and $b$ both be odd numbers, while a necessary condition for both $a$ and $b$ to be odd is that $a + b$ be even.

However $a + b$ being an even number is not a sufficient condition for both $a$ and $b$ to be odd numbers since $a + b$ can be even without $a$ and $b$ both being odd; for example, if $a = 2$ and $b = 4$ then $a + b = 6$.

Nor is $a$ and $b$ both being odd a necessary condition for $a + b$ to be even; for example if $a = 2$ and $b = 4$ then $a + b = 6$.
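
The vocabulary of this section can be turned into two small checks over a finite set of cases. In the sketch below, `is_sufficient` and `is_necessary` are hypothetical helper names, and the cases are all pairs of integers from 0 to 10; a `True` result only says that no violation was found in that sample.

```python
from itertools import product


def is_sufficient(a_stmt, b_stmt, cases) -> bool:
    # A is sufficient for B on `cases` if B holds whenever A holds (A => B).
    return all(b_stmt(*c) for c in cases if a_stmt(*c))


def is_necessary(a_stmt, b_stmt, cases) -> bool:
    # A is necessary for B on `cases` if A holds whenever B holds (B => A).
    return is_sufficient(b_stmt, a_stmt, cases)


def both_odd(a: int, b: int) -> bool:
    return a % 2 == 1 and b % 2 == 1


def sum_even(a: int, b: int) -> bool:
    return (a + b) % 2 == 0


cases = list(product(range(0, 11), repeat=2))

print(is_sufficient(both_odd, sum_even, cases))  # both odd is sufficient for an even sum
print(is_necessary(both_odd, sum_even, cases))   # but not necessary: a = 2, b = 4
print(is_necessary(sum_even, both_odd, cases))   # an even sum is necessary for both odd
print(is_sufficient(sum_even, both_odd, cases))  # but not sufficient
```

Note that `is_necessary` is written in terms of `is_sufficient` with the roles swapped, which mirrors Remark 2.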


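1.7 Necessary and Sufficient Conditions

Sometimes it is possible to prove both $A \Longrightarrow B$ and $B \Longrightarrow A$. In this case $A$ is a necessary and sufficient condition for $B$ and $B$ is a necessary and sufficient condition for $A$ since it then is true that:

$$
\begin{aligned}
A \Longrightarrow B \quad &\text{and} \quad \sim A \Longrightarrow \sim B \\
B \Longrightarrow A \quad &\text{and} \quad \sim B \Longrightarrow \sim A.
\end{aligned}
$$

We therefore have:

Definition 8 If

$$A \Longrightarrow B \quad \text{and} \quad B \Longrightarrow A$$

then $A$ is a necessary and sufficient condition for $B$ and we write

$$A \Longleftrightarrow B.$$

Notice that with $A \Longleftrightarrow B$ the arrow points in both directions. This is to indicate that the truth of $A$ is communicated to $B$ just as the truth of $B$ is communicated to $A$.

If you can prove $A \Longleftrightarrow B$ then you have a much stronger statement than either $A \Longrightarrow B$ or $B \Longrightarrow A$ alone.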

Example: Consider

$A$ = “The integers $a$ and $b$ are odd numbers”

$B$ = “The product of the integers $a \times b$ is an odd number”

Theorem 9 We have: $a$ and $b$ are odd numbers if and only if $a \times b$ is an odd number so that $A \Longleftrightarrow B$.

Proof. Suppose $a$ and $b$ are odd so that $a = 2m + 1$ and $b = 2n + 1$. It follows that:

$$
\begin{aligned}
a \times b &= (2m + 1) \times (2n + 1) \\
&= 4mn + 2m + 2n + 1 \\
&= 2(2mn + m + n) + 1
\end{aligned}
$$

so that $a \times b$ is odd. Now suppose that $B$ is true so that:

$$a \times b = 2m + 1$$

and consider a proof by contradiction to show that $A$ is true. Thus suppose $A$ is false, so that at least one of $a$ and $b$ is even. Without loss of generality suppose $a$ is even so that $a = 2n$ and hence:

$$2n \times b = 2m + 1 \Longrightarrow n \times b = m + \frac{1}{2}.$$

Now $n \times b$ is an integer but $m + \frac{1}{2}$ is not an integer, which is a contradiction. It follows then that $A$ is true.
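
Theorem 9 claims that the two statements are true or false together. A brief Python sketch can confirm this pair by pair on a finite range; the name `theorem9_holds_on` and the range from −20 to 20 are illustrative assumptions.

```python
from itertools import product


def is_odd(m: int) -> bool:
    return m % 2 == 1


def theorem9_holds_on(values) -> bool:
    # A: a and b are both odd.  B: a * b is odd.  Check A <=> B pair by pair.
    return all(
        (is_odd(a) and is_odd(b)) == is_odd(a * b)
        for a, b in product(values, repeat=2)
    )


print(theorem9_holds_on(range(-20, 21)))
```

Comparing the two truth values with `==` tests both directions of the arrow at once, which is what distinguishes this check from a one-directional implication.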


1.8 ‘Or’ and ‘And’

“You are sad,” the Knight said in an anxious tone: “let me sing you a song to comfort you.” “Is it very long?” Alice asked, for she had heard a good deal of poetry that day. “It’s long,” said the Knight, “but it’s very, very beautiful. Everybody that hears me sing it – either it brings the tears into their eyes, or else –” “Or else what?” said Alice, for the Knight had made a sudden pause. “Or else it doesn’t, you know.” -Lewis Carroll, Through the Looking Glass

In life and in mathematics we often connect different statements using the word “or” which is denoted by $\vee$.

Example: If $A$ is the statement “$n$ is odd” and $B$ is the statement “$n > 10$” then $A \vee B$ means that either $n$ is odd or $n$ is greater than 10.

If $n = 13$ then $A \vee B$ would be true since $n$ satisfies both $A$ and $B$; it is odd and greater than 10. If $n = 7$ then $A \vee B$ would also be true since $n$ satisfies $A$ and we do not need to satisfy $B$. Similarly if $n = 22$ then $A \vee B$ would be true because $n$ satisfies $B$ and we then do not need to satisfy $A$. The only way $A \vee B$ can be false is if both $A$ and $B$ are false. Thus if $n = 8$ then $A \vee B$ would be false.

Remark: The use of “or” here is the “inclusive or” which is different from the “exclusive or” that your mother used when she said: “You can either have cake or pie”, which meant you can have cake, you can have pie, but you cannot have both cake and pie. If your mother used the inclusive or then you could also have cake and pie.
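
The difference between the inclusive and the exclusive “or” shows up clearly on the four values of $n$ used in the example. In the Python sketch below the helper names are illustrative; Python's own `or` is inclusive, and comparing two truth values with `!=` gives the exclusive version.

```python
def statement_a(n: int) -> bool:
    # A: "n is odd"
    return n % 2 == 1


def statement_b(n: int) -> bool:
    # B: "n > 10"
    return n > 10


def inclusive_or(p: bool, q: bool) -> bool:
    # True if at least one of p, q is true.
    return p or q


def exclusive_or(p: bool, q: bool) -> bool:
    # True if exactly one of p, q is true.
    return p != q


rows = [
    (n,
     inclusive_or(statement_a(n), statement_b(n)),
     exclusive_or(statement_a(n), statement_b(n)))
    for n in (13, 7, 22, 8)
]
for row in rows:
    print(row)
```

The two versions disagree only at $n = 13$, where both $A$ and $B$ are true: the cake-and-pie case.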

Here are some results involving the connector $\vee$:

Theorem 10 The Law of the Excluded Middle: For any statement $A$ the statement:

$$A \vee \sim A$$

is true.

Theorem 11 For any statements $A$ and $B$:

$$(A \Longrightarrow B) \Longleftrightarrow (\sim A \vee B).$$

Another important connector of statements is “and” which is denoted by $\wedge$. Thus in the previous example $A \wedge B$ means $n$ is odd and $n$ is greater than 10. For the statement $A \wedge B$ to be true it must be then that both $A$ and $B$ are true.


Negating statements involving $\lor$ is equivalent to negating each individual statement and changing the $\lor$ to $\land$. Similarly negating statements involving $\land$ is equivalent to negating each individual statement and changing the $\land$ to $\lor$. Thus:

Theorem 12 For any statements $A$ and $B$:

\[ \sim (A \lor B) \iff \sim A \, \land \sim B \]
\[ \sim (A \land B) \iff \sim A \, \lor \sim B. \]
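Because each statement can only be true or false, Theorems 10, 11 and 12 can be checked by running through every combination of truth values. The following is a minimal sketch in Python; the names `IMPLIES` and `check_laws` are illustrative and do not come from the text, and the table for $\implies$ is the usual convention that $A \implies B$ is false only when $A$ is true and $B$ is false.

```python
from itertools import product

# Truth table for "A implies B": false only when A is true and B is false.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

def check_laws() -> bool:
    """Check Theorems 10, 11 and 12 on every combination of truth values."""
    for a, b in product([True, False], repeat=2):
        assert a or (not a)                               # Theorem 10
        assert IMPLIES[(a, b)] == ((not a) or b)          # Theorem 11
        assert (not (a or b)) == ((not a) and (not b))    # Theorem 12, first line
        assert (not (a and b)) == ((not a) or (not b))    # Theorem 12, second line
    return True

print(check_laws())
```

With only two statements there are just four cases, so here an exhaustive check really does amount to a proof.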

1.9 The Quantifiers $\exists$ and $\forall$

Sometimes in mathematics we use the quantifier $\exists$ to say that something exists. For example to express the idea that the integer $a$ is odd we might write:

\[ \exists n \mid (a = 2n + 1) \]

which says that there exists an integer $n$ such that $a = 2n + 1$.

Other times we wish to make a universal statement that all members of some class have a property using the quantifier $\forall$. For example we might write:

\[ \forall n \mid (n > n - 1) \]

which says that all integers $n$ are greater than $n - 1$.

In intermediate mathematics and economics the symbols $\exists$ and $\forall$ are sometimes used as a convenient short-hand but are not that important. They do get used a lot in advanced mathematics and economics.

1.10 Proof by Counter-Example

In mathematics we are often led by our intuition to believe without a proof that something is always true. We therefore form a guess or a conjecture that this statement is always true.

Example: In the seventeenth century the French mathematician Fermat conjectured that if $n$ is a non-negative integer then all numbers of the form:

\[ 2^{2^n} + 1 \]

are prime numbers. Thus with $n = 0, 1, 2, 3, 4$ we have:

\[ 2^{2^0} + 1 = 3, \quad 2^{2^1} + 1 = 5, \quad 2^{2^2} + 1 = 17, \quad 2^{2^3} + 1 = 257, \quad 2^{2^4} + 1 = 65537 \]

and it is a fact that 3, 5, 17, 257 and 65537 are all prime numbers. (An integer $n$ is a prime number if its only divisors are 1 and $n$. Thus 5 is prime because only 1 and 5 divide evenly into 5 while 9 is not prime since $\frac{9}{3} = 3$.)

Since we do not know if a conjecture is true or false, there are two strategies for dealing with a conjecture:


1. Prove the conjecture is true.

2. Use a proof by counter-example to find one case where the conjecture is false.

The first strategy is generally the most difficult since we have to prove that something holds for an infinite number of cases. For Fermat’s conjecture it would very likely be a deep and difficult proof that would show that all numbers of the form $2^{2^n} + 1$ are prime. Of course if the conjecture really is true this is the only strategy that will lead to success.

Often however you will find that no matter how hard you try you cannot prove a conjecture. In this case you might try the second strategy and search for a counter-example. If you are lucky this can be much easier since unlike the first strategy, you only need one counter-example to prove the conjecture false.

Fermat died without being able to prove his conjecture. Later Euler was able to show that Fermat’s conjecture is in fact false since for $n = 5$:

\[ 2^{2^5} + 1 = 4294967297 = 641 \times 6700417 \]

and so $2^{2^5} + 1$ is not prime.
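With a computer the search for a counter-example takes a fraction of a second. The sketch below uses plain trial division, which is of course not how Euler found the factor 641; the function names `smallest_factor` and `fermat_number` are made up for this illustration.

```python
def smallest_factor(m: int) -> int:
    """Return the smallest divisor of m greater than 1 (m itself if m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

def fermat_number(n: int) -> int:
    """The number 2^(2^n) + 1 from Fermat's conjecture."""
    return 2 ** (2 ** n) + 1

for n in range(6):
    f = fermat_number(n)
    p = smallest_factor(f)
    print(n, f, "prime" if p == f else f"= {p} x {f // p}")
```

The loop stops at the first divisor it finds, so a single counter-example is enough to settle the question.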

1.11 Proof by Induction

Just the place for a Snark! I have said it twice:

That alone should encourage the crew.

Just the place for a Snark! I have said it thrice:

What I tell you three times is true.

-Lewis Carroll, The Hunting of the Snark

Often students attempt to prove results by simply listing the first few cases, verifying that the statement is true, and then by putting a ‘...’ or an ‘etc.’ afterwards. For example suppose you wished to prove the following conjecture:

Conjecture: The sum of the first $n$ integers is:

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]

Thus one might write:

\[ 1 = \frac{1(1+1)}{2} = 1 \]
\[ 1 + 2 = \frac{2(2+1)}{2} = 3 \]
\[ 1 + 2 + 3 = \frac{3(3+1)}{2} = 6 \]

etc.


and conclude that the statement is true. As we should now know this is incorrect since it does not exclude the possibility that the conjecture is false for $n = 4$ or at some really huge number like $n = 10^{10^{10}}$.

A correct method to prove these kinds of conjectures is proof by induction. A proof by induction proceeds as follows. We are given a sequence of statements $S_1, S_2, \ldots$ and we want to prove that each $S_i$ is true. For example we wish to prove that $S_n$ is true where $S_n$ is the statement:

\[ S_n = \text{“} 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \text{”}. \]

Proof by induction proceeds in two steps:

Proof by Induction

1. Prove that $S_1$ is true. This is usually trivial, involving nothing more than a mere calculation.

2. Assume that $S_1, S_2, \ldots, S_{n-1}$ are true (this is called the induction hypothesis), and use this to prove that $S_n$ is true.

A proof by induction is very much like setting up an infinite row of dominos. To get every domino to fall over two things are needed. First, one must tip over the first domino to get the chain reaction started. This corresponds to the first step in the proof by induction. Next, the dominos must be spaced so that if one domino falls then its next neighbour must also fall. This corresponds to the second part of the proof. Together they imply that all of the dominos will fall down.

Example: Consider proving:

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]

Proof. The first step is to verify that it is true for $n = 1$, which is easy since:

\[ 1 = \frac{1(1+1)}{2}. \]

Now assume the induction hypothesis, that the statement is true up to $n - 1$ so that in particular:

\[ 1 + 2 + 3 + \cdots + (n-1) = \frac{(n-1)((n-1)+1)}{2} = \frac{n(n-1)}{2}. \]


Now we need to prove that $S_n$ is true. We have:

\[ \underbrace{1 + 2 + 3 + \cdots + (n-1)}_{= \frac{n(n-1)}{2}} + n = \frac{n(n-1)}{2} + n = n \left( \frac{n-1}{2} + 1 \right) = n \left( \frac{(n-1)+2}{2} \right) = \frac{n(n+1)}{2}. \]
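The two steps of the induction can be mirrored numerically. The sketch below is only a sanity check over a finite range and not a substitute for the proof, since no finite list of cases can cover every $n$; the helper name `closed_form` is an assumption made for this illustration.

```python
def closed_form(n: int) -> int:
    """The conjectured formula n(n+1)/2 (always an integer)."""
    return n * (n + 1) // 2

# Step 1 (base case): S_1 is true.
assert closed_form(1) == 1

# Step 2 (inductive step): if the sum up to n-1 is (n-1)n/2,
# then adding n must give n(n+1)/2.
for n in range(2, 10_001):
    assert closed_form(n - 1) + n == closed_form(n)

# Brute-force comparison against the actual sums for the first few hundred n.
assert all(sum(range(1, n + 1)) == closed_form(n) for n in range(1, 500))
print(closed_form(100))
```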

1.12 Functions

The basic mathematical object that we will be working with is a function, defined as:

Definition 13 Function: A function $y = f(x)$ is a rule which assigns a unique number $y$ to every allowed value of $x$.

The key requirement here is that there be a unique $y$ for every $x$. For example the function:

\[ y = f(x) = x^2 \]

assigns to the value $x = 2$ the unique value $y = 2^2 = 4$.

An example of something which is not a function is:

\[ y = f(x) = \sqrt{x} \]

since to $x = 4$ it assigns two values, $y = 2$ and $y = -2$, while to $x = -4$ it assigns no value since $\sqrt{-4}$ is not defined. Similarly without any restrictions on $x$, $f(x) = \frac{1}{x}$ is not a function since $f(0) = \frac{1}{0}$ is not defined.

Implicit in any definition of a function is its domain and range:

Definition 14 Domain: The domain of a function $y = f(x)$ is the set of all values of $x$ for which $f(x)$ is defined.

Definition 15 Range: The range of a function $y = f(x)$ is the set of possible $y$ values over the domain of the function.

Often we can ensure that $f(x)$ is a function by restricting its domain and range.

Example: The problem with

\[ f(x) = \sqrt{x} \]


can be fixed by: 1) restricting the domain to be $x \geq 0$ and 2) restricting the range to be $y \geq 0$, or in other words interpreting $\sqrt{\ }$ as the positive square root (e.g. $\sqrt{4} = 2$ and not $-2$). With these restrictions we have a perfectly good function as can be seen by the plot below:

[Figure: plot of $y = \sqrt{x}$ for $x \geq 0$.]

This is actually an example of a Cobb-Douglas production function, one of the workhorses of economic theory.

Similarly the problem with $f(x) = \frac{1}{x}$ can be fixed by restricting the domain to be $x > 0$ in which case we have:

[Figure: plot of $f(x) = \frac{1}{x}$ for $x > 0$.]

Remark: Quite often we define the range and domain in a way that ensures that the function makes economic sense. If for example

\[ Q = f(P) \]

is a demand function with $P$ the price and $Q$ the quantity demanded, the domain of $f(P)$ will be $P \geq 0$ and the range $Q \geq 0$ since prices and quantities cannot be negative.


1.12.1 Integer Exponents

An important class of functions takes the form:

\[ f(x) = x^n \]

where $n$ is an integer. The meaning of $x^n$ for $n > 0$ is simply $x$ multiplied by itself $n$ times. For example

\[ x^3 = x \times x \times x. \]

In this case we can allow the domain of $f(x)$ to be all $x$, that is

\[ -\infty < x < \infty. \]

We can also allow negative integer exponents (i.e., $-1, -2, -3, \ldots$). By $x^{-n}$ we mean $\frac{1}{x^n}$. For example:

\[ x^{-3} = \frac{1}{x^3} = \frac{1}{x} \times \frac{1}{x} \times \frac{1}{x}. \]

Note that for negative integer exponents we need to exclude $x = 0$ from the domain of the function since $\frac{1}{0}$ is not defined.

Integer exponents obey the following rules, which you might want to prove on your own:

Theorem 16 If $m$ and $n$ are either positive or negative integers then

1. $x^m x^n = x^{m+n}$

2. $(x^m)^n = x^{mn}$

3. $x^0 = 1$

4. $x^{-n} = \frac{1}{x^n}$

5. $(xy)^n = x^n y^n$.

Proof. To prove 1 for example we have:

\[ x^m x^n = \Big( \underbrace{x \times x \times \cdots \times x}_{m} \Big) \Big( \underbrace{x \times x \times \cdots \times x}_{n} \Big) = \underbrace{x \times x \times \cdots \times x}_{m+n} = x^{m+n}. \]

The results 2 and 5 can be proven in a similar manner. To prove $x^0 = 1$ from 1) and 4) we have from 1) with $n = -m$:

\[ x^m x^{-m} = x^{m-m} = x^0 \]


while from 4) we have $x^{-m} = \frac{1}{x^m}$ and so:

\[ x^0 = x^m x^{-m} = x^m \frac{1}{x^m} = 1. \]

Remark: Note that $(x+y)^n \neq x^n + y^n$. For example with $n = 2$

\[ (x+y)^2 = x^2 + 2xy + y^2 \neq x^2 + y^2. \]

1.12.2 Polynomials

A polynomial is a weighted sum of powers $x^n$, defined as:

Definition 17 Polynomial: An $n$th degree polynomial is a function of the form:

\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]

where $a_n \neq 0$.

An important property of a polynomial is its roots:

Definition 18 The roots of a function $f(x)$ are those values $r$ which satisfy $f(r) = 0$.

For a polynomial a root $r$ satisfies:

\[ f(r) = a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 = 0. \]

One of the most important results in mathematics is that a polynomial of degree $n$ has $n$ (possibly complex) roots. This is important enough that it is called the Fundamental Theorem of Algebra. It was first proved by Gauss.

Theorem 19 Fundamental Theorem of Algebra: An $n$th degree polynomial has $n$ roots $r_1, r_2, \ldots, r_n$; that is, $n$ (possibly complex [2]) solutions to the equation

\[ f(r) = a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 = 0. \]

Two important special cases are:

Definition 20 A linear function is a 1st degree polynomial:

\[ y = f(x) = ax + b. \]

Definition 21 A quadratic is a 2nd degree polynomial:

\[ y = f(x) = ax^2 + bx + c. \]

[2] A complex number is of the form $a + bi$ where $i = \sqrt{-1}$.


Example 1: A 1st degree polynomial $f(x) = ax + b$ has one root:

\[ r = -b/a \]

as the solution to $f(r) = ar + b = 0$. Thus

\[ f(x) = 4x + 8 \]

has a single root at $r = -2$ as illustrated below:

[Figure: plot of $f(x) = 4x + 8$.]

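Example 2: We have

Theorem 22 The quadratic

\[ f(x) = ax^2 + bx + c \]

has two roots $r_1$ and $r_2$ given by:

\[ r_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad \text{and} \quad r_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}. \]

Thus the quadratic:

\[ x^2 - 9x + 14 \]

has two roots:

\[ r = \frac{-(-9) \pm \sqrt{(-9)^2 - 4(1)(14)}}{2} \]

or

\[ r_1 = 2 \text{ and } r_2 = 7 \]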


as can also be seen by the graph below where $f(x)$ crosses the $x$ axis:

[Figure: plot of $f(x) = x^2 - 9x + 14$.]
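The formula in Theorem 22 translates directly into a few lines of code. The following is a sketch rather than a robust solver; the function name `quadratic_roots` is illustrative, and Python's standard `cmath` module is used so that a negative $b^2 - 4ac$ gives the complex roots allowed by the Fundamental Theorem of Algebra instead of an error.

```python
import cmath

def quadratic_roots(a: float, b: float, c: float) -> tuple[complex, complex]:
    """Both roots of a*x**2 + b*x + c, taking the '+' root first."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))

r_plus, r_minus = quadratic_roots(1, -9, 14)
print(r_plus, r_minus)

# A quadratic with no real roots: x**2 + 1.
print(quadratic_roots(1, 0, 1))
```

The roots come back as complex numbers even when they are real, in which case the imaginary part is zero.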

An implication of the fundamental theorem of algebra is that a polynomial can always be factored as follows:

Theorem 23 Let $r_1, r_2, \ldots, r_n$ be the $n$ roots of the polynomial

\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]

Then $f(x)$ can be factored as:

\[ f(x) = a_n (x - r_1) \times (x - r_2) \times \cdots \times (x - r_n). \]

Example 1: The quadratic:

\[ f(x) = 3x^2 - 27x + 60 \]

has two roots $r_1 = 5$ and $r_2 = 4$ so that

\[ f(x) = 3(x - 5)(x - 4) \]

which you can verify by multiplying out the second expression.

Example 2: The cubic:

\[ x^3 - 19x^2 + 104x - 140 \]


has roots at $r_1 = 2$, $r_2 = 7$ and $r_3 = 10$ as can be seen by the graph below:

[Figure: plot of $f(x) = x^3 - 19x^2 + 104x - 140$.]

or by noting that it can be factored as:

\[ x^3 - 19x^2 + 104x - 140 = (x - 2)(x - 7)(x - 10). \]
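Multiplying out the factors in Theorem 23 is mechanical and easy to hand to a computer. In the sketch below a polynomial is stored as a list of coefficients with the highest power first; this representation and the name `expand_from_roots` are choices made for this illustration only.

```python
def expand_from_roots(leading: int, roots: list[int]) -> list[int]:
    """Coefficients (highest power first) of leading * (x - r1) * ... * (x - rn)."""
    coeffs = [leading]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        shifted = coeffs + [0]                    # current polynomial times x
        scaled = [0] + [-r * c for c in coeffs]   # current polynomial times (-r)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

print(expand_from_roots(3, [5, 4]))      # the quadratic of Example 1
print(expand_from_roots(1, [2, 7, 10]))  # the cubic of Example 2
```

Comparing the printed coefficients with the polynomials in Examples 1 and 2 confirms both factorizations.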

1.12.3 Non-integer Exponents

In economics we will often want to consider non-integer exponents, that is $f(x) = x^a$ where $a$ is not an integer.

Example: Two functions with non-integer exponents are

\[ f(x) = x^{0.3143} \quad \text{and} \quad Q = f(L) = L^{\frac{1}{2}} \]

where in the first case $a = 0.3143$ and in the second $a = 0.5$. The latter case is an example of a Cobb-Douglas production function.

For non-integer exponent functions $y = x^a$ we run into very difficult and deep mathematical waters if we allow either $x$ or $y$ to be negative. For example with $f(x) = x^{\frac{1}{2}}$ if we allow $x = -1$ then $f(-1) = \sqrt{-1}$ is not defined, while if $x = 4$ and we allow $y < 0$ then both $y = \sqrt{4} = 2$ and $y = -\sqrt{4} = -2$ are possible.

For this reason, whenever we work with $y = x^a$ with an exponent $a$ which is not an integer we always assume that $x > 0$ and that $y > 0$.

With this qualification non-integer exponents obey the same rules as integer exponents. Thus:

Theorem 24 If $x > 0$ and $a$ is any number (integer or non-integer, negative or positive) then $x^a$ is defined and:

1. $x^a > 0$


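2. $x^a x^b = x^{a+b}$

3. $(x^a)^b = x^{ab}$

4. $(xy)^a = x^a y^a$

5. $x^0 = 1$

6. $x^{-a} = \frac{1}{x^a}$.

Often we will need to find the unique positive root of the function:

\[ f(x) = A x^b - c \]

for $x > 0$ and where $b$ is not an integer. We have:

Theorem 25 The unique positive root of $f(r) = A r^b - c = 0$ is given by:

\[ r = \left( \frac{c}{A} \right)^{\frac{1}{b}}. \]

Proof. Since $r$ satisfies:

\[ A r^b = c \]

we have:

\[ r^b = \frac{c}{A}. \]

To get $r$ by itself we take both sides to the power $\frac{1}{b}$ to get:

\[ \left( r^b \right)^{\frac{1}{b}} = \left( \frac{c}{A} \right)^{\frac{1}{b}} \]

and since $\left( r^b \right)^{\frac{1}{b}} = r^{b \cdot \frac{1}{b}} = r^1 = r$ we have:

\[ r = \left( \frac{c}{A} \right)^{\frac{1}{b}}. \]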

Example: Given:

\[ f(x) = 10 x^{7.3} - 23 \]

where $A = 10$, $b = 7.3$ and $c = 23$ we find that:

\[ r = \left( \frac{23}{10} \right)^{\frac{1}{7.3}} \approx 1.121. \]


This can also be seen in the plot below:

[Figure: plot of $f(x) = 10 x^{7.3} - 23$, with $x$ on the horizontal axis and $y$ on the vertical axis.]
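The root in Theorem 25 is a one-line calculation, and substituting it back into $f(x)$ is a quick check on the arithmetic. In the sketch below the function name `positive_root` is illustrative, and it is assumed that $A > 0$, $c > 0$ and $b \neq 0$ so that the root exists and is positive.

```python
def positive_root(A: float, b: float, c: float) -> float:
    """Unique positive root of A*x**b - c, assuming A > 0, c > 0 and b != 0."""
    return (c / A) ** (1.0 / b)

r = positive_root(10, 7.3, 23)
residual = 10 * r ** 7.3 - 23   # should be zero up to rounding error
print(round(r, 4), residual)
```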

1.12.4 The Geometric Series

An important result in economics is the geometric series:

Theorem 26 Finite Geometric Series: If $x \neq 1$ then

\[ 1 + x + x^2 + x^3 + \cdots + x^{n-1} = \frac{1 - x^n}{1 - x}. \]

Proof. Let $S$ be given by:

\[ S = 1 + x + x^2 + x^3 + \cdots + x^{n-1}. \]

If we multiply $S$ by $x$ we obtain:

\[ xS = x + x^2 + x^3 + \cdots + x^n \]

and if we subtract these two equations from each other we obtain:

\[ S - xS = (1 - x) S = 1 + x + x^2 + x^3 + \cdots + x^{n-1} - \left( x + x^2 + x^3 + \cdots + x^n \right) = 1 - x^n \]

so that assuming $x \neq 1$ and solving for $S$ yields the finite geometric series.

Now consider letting $n \to \infty$. If $-1 < x < 1$ then $x^n \to 0$ so that:

Theorem 27 Infinite Geometric Series: If $-1 < x < 1$ then

\[ \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots. \]


Example 1: If $x = \frac{1}{2}$ then:

\[ 2 = \frac{1}{1 - \frac{1}{2}} = 1 + \frac{1}{2} + \left( \frac{1}{2} \right)^2 + \left( \frac{1}{2} \right)^3 + \cdots. \]

Thus if you have 2 pies in the fridge and each day you eat $\frac{1}{2}$ of the pie in the fridge, eating 1 pie the first day, $\frac{1}{2}$ of a pie the second, $\frac{1}{4}$ of a pie the third etc., you will eventually eat all the pie in the fridge.

Example 2: Suppose you have a bond that pays $\$a$ a year forever and the interest rate is $r > 0$. The price of the bond is then the present discounted value:

\[ P_B = \frac{a}{1+r} + \frac{a}{(1+r)^2} + \frac{a}{(1+r)^3} + \cdots. \]

We have:

\[ P_B = \frac{a}{1+r} \Bigg( 1 + \overbrace{\frac{1}{1+r}}^{x} + \overbrace{\frac{1}{(1+r)^2}}^{x^2} + \cdots \Bigg) = \frac{a}{1+r} \left( 1 + x + x^2 + \cdots \right) \]

so the term in brackets is just the geometric series with $x = \frac{1}{1+r}$ and $0 < x < 1$. Then from the geometric series:

\[ P_B = \frac{a}{1+r} \cdot \frac{1}{1 - x} = \frac{a}{1+r} \cdot \frac{1}{1 - \frac{1}{1+r}} = \frac{a}{1+r} \cdot \frac{1+r}{r} = \frac{a}{r}. \]

Thus with an interest rate of $r = 0.05$/year (or 5% per year), a bond that paid $a = \$20$ per year forever would be worth

\[ P_B = \frac{\$20/\text{year}}{0.05/\text{year}} = \$400. \]
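To see how quickly the finite series approaches the infinite one, the bond can be priced with payments that stop after a given number of years. This is a sketch under that assumption; the function names `geometric_sum` and `bond_price_truncated` are hypothetical and simply apply Theorem 26 with $x = \frac{1}{1+r}$.

```python
def geometric_sum(x: float, n: int) -> float:
    """1 + x + ... + x**(n-1), via the closed form of Theorem 26 (requires x != 1)."""
    return (1 - x ** n) / (1 - x)

def bond_price_truncated(a: float, r: float, years: int) -> float:
    """Present value of payments a/(1+r)**t for t = 1, ..., years."""
    x = 1 / (1 + r)
    return a * x * geometric_sum(x, years)

for years in (10, 50, 100, 1000):
    print(years, bond_price_truncated(20, 0.05, years))
```

As the number of years grows the price climbs towards the perpetuity value $a/r = \$400$.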


Chapter 2

Univariate Calculus

2.1 Derivatives

2.1.1 Slopes

You can think of a function $f(x)$ as a system of mountains and valleys with $x$ denoting your position along the $x$ axis (say how far east or west from some point) and $y$ your height above the $x$ axis (or how high you are above sea level). This is illustrated below:

[Figure: Mountains and Valleys, with $x$ on the horizontal axis and $y$ on the vertical axis.]

An important consideration for both a hiker and an economist then is the slope. Hikers clearly care if they are going uphill or downhill, and economists care if a function is upward sloping (as a supply curve) or downward sloping (as a demand curve).

The slope at any point $x$ can be measured by moving $\Delta x$ (say $\Delta x = 5$, or 5 feet to the right), measuring the change in elevation by $\Delta y$ (say $\Delta y = -20$, or 20 feet down) and taking the ratio $\frac{\Delta y}{\Delta x}$ to get the slope (here $\frac{\Delta y}{\Delta x} = \frac{-20}{5} = -4$, so for every foot forward you fall 4 feet, with the negative indicating a downward slope). This leads to the following definition:


Definition 28 Slope: The slope of $f(x)$ at $x$ for a given change in $x$, $\Delta x$, is denoted by $\frac{\Delta y}{\Delta x}$ and is:

\[ \frac{\Delta y}{\Delta x} \equiv \frac{f(x + \Delta x) - f(x)}{\Delta x}. \]

If $\frac{\Delta y}{\Delta x} > 0$ the function is upward sloping so that increasing (decreasing) $x$ leads to an increase (decrease) in $y$. If $\frac{\Delta y}{\Delta x} < 0$ the function is downward sloping so that increasing (decreasing) $x$ leads to a decrease (increase) in $y$.

Example: If $f(x) = x^2$ and we want to measure the slope at $x = 1$ with $\Delta x = 2$ then we obtain:

\[ \frac{\Delta y}{\Delta x} = \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \frac{(1 + 2)^2 - 1^2}{2} = 4. \]

On the other hand if we use $\Delta x = 0.25$ we obtain:

\[ \frac{\Delta y}{\Delta x} = \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \frac{(1 + 0.25)^2 - 1^2}{0.25} = 2.25 \]

while if we use $\Delta x = 0.001$ we obtain:

\[ \frac{\Delta y}{\Delta x} = \frac{(1 + 0.001)^2 - 1^2}{0.001} = 2.001. \]

Note that as we make $\Delta x$ smaller the slope appears to be approaching 2. In general we have:

Theorem 29 For $f(x) = x^2$ the slope is:

\[ \frac{\Delta y}{\Delta x} = 2x + \Delta x. \]
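The three slopes computed above, and Theorem 29 itself, can be reproduced in a few lines. The sketch below uses Python's `fractions.Fraction` so that the arithmetic is exact and the comparison with $2x + \Delta x$ is not blurred by rounding; the function names `slope` and `square` are illustrative.

```python
from fractions import Fraction

def slope(f, x, dx):
    """Slope of f at x for a given step dx: (f(x + dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx

def square(x):
    return x * x

x = Fraction(1)
for dx in (Fraction(2), Fraction(1, 4), Fraction(1, 1000)):
    s = slope(square, x, dx)
    # Last column: does the slope equal 2x + dx, as Theorem 29 states?
    print(float(dx), float(s), s == 2 * x + dx)
```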

2.1.2 Derivatives

And what are these derivatives? ... They are neither finite quantities, nor quantities infinitely small, nor yet nothing. May we not call them ghosts of departed quantities?

-George Berkeley


A problem with slopes is that we may get a different slope depending on which $\Delta x$ we choose. For example with $x^2$ at $x = 1$ the slope is $2 + \Delta x$ and so we obtained slopes of 2.25 and 2.001 for $\Delta x = 0.25$ and $\Delta x = 0.001$.

Since we get different slopes for different $\Delta x$'s, one might wonder then whether there is a best choice for $\Delta x$? This question has a surprising answer. It turns out that the best choice is to make $\Delta x$ zero so that we let:

\[ \Delta x \to 0. \]

Remark: Sometimes rather than 0 it is better to follow the earlier inventors of calculus and imagine that:

\[ \Delta x \to dx \]

where $dx$ (note that $\Delta$, delta, is the Greek letter for d), known as an infinitesimal, is a quantity that is as close to zero as possible without actually being 0. As we make our step $\Delta x$ infinitesimally small, the amount by which we rise or fall will also get infinitesimally small so that:

\[ \Delta y \to dy. \]

The ratio however of the two, or the slope, will approach something sensible, the derivative of the function:

\[ \frac{dy}{dx} = f'(x). \]

The use of infinitesimals was frowned upon by many such as the English philosopher Berkeley. It was not until over a hundred years after the invention of calculus that Cauchy was able to provide foundations for calculus that did not require the use of infinitesimals. Infinitesimals are nevertheless a real aid to intuition and especially in applied work they get used all the time.

Example: If $y = x^2$ we have from Theorem 29 that as $\Delta x \to 0$:

\[ \frac{\Delta y}{\Delta x} = 2x + \Delta x \to 2x + dx = 2x \]

where we ignore the $dx$ because it is so small. This gives the well-known result obtained by multiplying by the exponent and subtracting one from the exponent that:

\[ \frac{dy}{dx} = \frac{d}{dx} \left( x^2 \right) = 2x. \]

Thus at $x = 1$ we obtain a slope of

\[ \frac{dy}{dx} = f'(1) = 2. \]

In general a derivative is defined as:


Definition 30 Derivative: The derivative of a function $y = f(x)$, denoted by $f'(x)$ or $\frac{dy}{dx}$, is the limit of the slope as $\Delta x \to 0$ or:

\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \]

A graphical depiction of the difference between a slope and a derivative is given below:

The first rule you learn in calculus is how to calculate the derivative of $x^n$ as:


Theorem 31 Given $f(x) = x^n$ then:

\[ f'(x) = n x^{n-1}. \]

Example: Given $f(x) = x^7$ then:

\[ f'(x) = 7 x^{7-1} = 7 x^6. \]
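Theorem 31 can be compared with a slope computed at a very small step. The sketch below uses a central difference, $\frac{f(x + \Delta x) - f(x - \Delta x)}{2 \Delta x}$, which is a swapped-in variant of the one-sided slope of Definition 28 chosen because it is more accurate numerically; the function names and the step size `1e-6` are assumptions for this illustration.

```python
def numerical_derivative(f, x: float, dx: float = 1e-6) -> float:
    """Central-difference approximation to f'(x); dx is an assumed small step."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

def power_rule(n: float, x: float) -> float:
    """Theorem 31: the derivative of x**n is n * x**(n - 1)."""
    return n * x ** (n - 1)

x = 1.5
approx = numerical_derivative(lambda t: t ** 7, x)
exact = power_rule(7, x)
print(approx, exact)
```

The two numbers agree to many decimal places, although the numerical one is only ever an approximation.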

2.1.3 The Use of the Word ‘Marginal’ in Economics

In economics we often use the word marginal; for example the marginal product of labour, the marginal utility of apples, the marginal propensity to consume and so on.

The original meaning of ‘marginal’ in say the marginal product of labour was the effect of adding one more unit of labour $L$, at the margin, to output $Q$. Translating this into mathematics, if we write the production function as $Q = f(L)$ then the marginal product of labour is:

\[ MP_L \equiv \frac{\Delta Q}{\Delta L} = \frac{f(L + 1) - f(L)}{1} \]

where $\Delta L = 1$ and $\Delta Q = f(L + 1) - f(L)$. Thus the marginal product of labour is the slope $\frac{\Delta Q}{\Delta L}$ when $\Delta L = 1$.

In advanced economics we want to have the tools of calculus at our disposal. For this reason it is much more convenient to use derivatives rather than slopes to measure the marginal product of labour. Consequently instead of setting $\Delta L = 1$ and using $\frac{\Delta Q}{\Delta L}$ we let $\Delta L \to 0$ and use the derivative $\frac{dQ}{dL} = f'(L)$.

This refinement of the notion of marginal extends now to all marginal concepts so that today:

Definition 32 In economics when we refer to marginal concepts we mean the derivative.

Example 1: Given the Cobb-Douglas production function:

\[ Q = f(L) = L^{\frac{1}{2}} \]

the marginal product of labour is the derivative of $f(L)$ or:

\[ MP_L(L) \equiv f'(L) = \frac{1}{2} L^{-\frac{1}{2}}. \]

Example 2: Given a utility function for say apples $Q$:

\[ U(Q) = Q^{\frac{2}{3}} \]

the marginal utility of apples is:

\[ MU(Q) \equiv U'(Q) = \frac{2}{3} Q^{-\frac{1}{3}}. \]
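The two meanings of ‘marginal’ can be put side by side for the production function of Example 1. In the sketch below the names `output`, `mpl_discrete` and `mpl_derivative` are illustrative; the first measure adds one whole unit of labour and the second uses the derivative.

```python
import math

def output(L: float) -> float:
    """Cobb-Douglas production function Q = L**(1/2) from Example 1."""
    return math.sqrt(L)

def mpl_discrete(L: float) -> float:
    """Original meaning: extra output from one more unit of labour."""
    return output(L + 1) - output(L)

def mpl_derivative(L: float) -> float:
    """Calculus meaning: f'(L) = (1/2) * L**(-1/2)."""
    return 0.5 * L ** -0.5

for L in (1, 4, 100, 10_000):
    print(L, mpl_discrete(L), mpl_derivative(L))
```

The two measures differ noticeably when $L$ is small, where one unit of labour is a large step, and become very close when $L$ is large.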


2.1.4 Elasticities

Often economists work with elasticities rather than derivatives. The problem with derivatives is that they depend on the units in which $x$ and $y$ are measured. Suppose for example we have a demand curve:

\[ Q = 100 - 3 P_{\$} \]

where the price $P_{\$}$ is measured in dollars and the derivative is $\frac{dQ}{dP} = -3$. If we decide to measure the price instead in cents $P_{c}$ we have, using $P_{\$} = \frac{P_{c}}{100}$:

\[ Q = 100 - \frac{3}{100} P_{c} \]

so now $\frac{dQ}{dP} = -\frac{3}{100}$. Thus a change in units causes the derivative $\frac{dQ}{dP}$ to change from $-3$ to $-\frac{3}{100}$.

Elasticities avoid this problem by working with percentage changes. While the slope is the change in $y$ divided by the change in $x$, or $\frac{\Delta y}{\Delta x}$, the elasticity $\eta$ is the percentage change in $y$, $\frac{\Delta y}{y} \times 100\%$, divided by the percentage change in $x$, $\frac{\Delta x}{x} \times 100\%$, or:

\[ \eta \equiv \frac{\frac{\Delta y}{y} \times 100\%}{\frac{\Delta x}{x} \times 100\%} = \frac{\Delta y}{\Delta x} \frac{x}{y}. \]

Notice that here the elasticity $\eta$ is the slope $\frac{\Delta y}{\Delta x}$ multiplied by $\frac{x}{y}$. This is known as an arc elasticity and is typically used in elementary economics.

In more advanced economics we let $\Delta x \to 0$ and use the derivative $\frac{dy}{dx}$ in the elasticity rather than the slope $\frac{\Delta y}{\Delta x}$. This leads to the point elasticity or simply the elasticity:

Definition 33 Elasticity: The elasticity of the function $y = f(x)$ at $x$, denoted by $\eta(x)$, is:

\[ \eta(x) \equiv \frac{dy}{dx} \frac{x}{y} \equiv f'(x) \frac{x}{f(x)}. \]

An easy way to remember the formula for the elasticity is to use the following recipe:

Elasticity Recipe

1. Write down the derivative as:

\[ \frac{dy}{dx}. \]


2. Note that with the derivative $y$ is upstairs and $x$ is downstairs. To obtain the elasticity put the $y$ downstairs and the $x$ upstairs as:

\[ \frac{x}{y} \]

and multiply this with the derivative from step 1 as:

\[ \eta = \frac{dy}{dx} \times \frac{x}{y}. \]

3. Now to obtain the elasticity as a function of $x$ replace $y$ with $f(x)$ to obtain:

\[ f'(x) \frac{x}{f(x)}. \]

Remark 1: In economics typically $x > 0$ and $y > 0$. This means that the derivative $f'(x)$ and the elasticity $\eta(x)$ always have the same sign. Thus if the elasticity of demand is negative this is equivalent to saying the demand curve slopes downwards.

Remark 2: If $\eta(x) = -2$ then a 1% increase in $x$ leads to (approximately) a 2% decrease in $y$. If $\eta(x) = 3$ then a 1% increase in $x$ leads to (approximately) a 3% increase in $y$.

Remark 3: An elasticity can be calculated for any function $y = f(x)$, not just for demand curves.

Example 1: If

\[ y = 4 - 2x \]

(a demand curve perhaps if $y = Q$ and $x = P$) then following the recipe:

1. We first write down the slope as:

\[ \frac{dy}{dx} = -2. \]

2. Since there is a $y$ upstairs and an $x$ downstairs we multiply this by $\frac{x}{y}$ as:

\[ \eta = -2 \frac{x}{y}. \]

3. To obtain the elasticity as a function of $x$ replace $y$ with $4 - 2x$ as:

\[ \eta(x) = -2 \frac{x}{y} = -2 \frac{x}{4 - 2x}. \]


Thus at $x = \frac{1}{2}$ we have:

\[ \eta \left( \frac{1}{2} \right) = \left. \left( \frac{-2x}{4 - 2x} \right) \right|_{x = \frac{1}{2}} = -\frac{1}{3} \]

so that at $x = \frac{1}{2}$ a 1% increase in $x$ leads to a 0.33% decrease in $y$.

Notice that while the derivative is $-2$ for all $x$, the elasticity decreases as $x$ increases as shown in the plot below:

[Figure: plot of $\eta(x) = \frac{-2x}{4 - 2x}$.]
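The recipe amounts to a single formula, $f'(x) \frac{x}{f(x)}$, which can be wrapped in a small function and applied to the demand curve of Example 1. The names `elasticity`, `demand` and `demand_slope` below are illustrative only, and the derivative is supplied by hand rather than computed.

```python
def elasticity(f, fprime, x: float) -> float:
    """Point elasticity eta(x) = f'(x) * x / f(x) (Definition 33)."""
    return fprime(x) * x / f(x)

def demand(x: float) -> float:
    """The demand curve y = 4 - 2x of Example 1."""
    return 4 - 2 * x

def demand_slope(x: float) -> float:
    """Its derivative, which is -2 for every x."""
    return -2.0

for x in (0.5, 1.0, 1.5, 1.9):
    print(x, elasticity(demand, demand_slope, x))
```

The derivative stays at $-2$ throughout, while the elasticity becomes more and more negative as $x$ approaches 2.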

Example 2: If

\[ y = x^2 + 5 \]

then following the recipe:

1. We first write down the slope as:

\[ \frac{dy}{dx} = 2x. \]

2. Since there is a $y$ upstairs and an $x$ downstairs we multiply this by $\frac{x}{y}$ as:

\[ \eta = 2x \times \frac{x}{y}. \]

3. To obtain the elasticity as a function of $x$ replace $y$ with $x^2 + 5$ as:

\[ \eta(x) = 2x \times \frac{x}{x^2 + 5} = \frac{2x^2}{x^2 + 5}. \]


2.1.5 The Constant Elasticity Functional Form

Generally speaking the elasticity $\eta(x)$ changes with $x$. Now with slopes we know there exists a function

\[ f(x) = ax + b \]

where the derivative $f'(x) = a$ does not change with $x$ although the elasticity:

\[ \eta(x) = f'(x) \frac{x}{y} = \frac{ax}{ax + b} \]

does change with $x$ (provided $b \neq 0$).

Given the importance of elasticities in economics, a natural question to ask is whether there is a functional form which has the property that the elasticity $\eta(x)$ does not change with $x$. This is often convenient since it means, for example, that a demand curve has the same elasticity no matter what the price.

The functional form which has this property is

\[ f(x) = A x^b. \]

In fact we have an even stronger result:

Theorem 34 A function $f(x)$ has the same elasticity for all $x$ if and only if it can be written as:

\[ f(x) = A x^b. \]

Proof. To prove that $f(x) = A x^b$ has a constant elasticity note that:

\[ f'(x) = b A x^{b-1} \]

and consequently:

\[ \eta(x) = f'(x) \frac{x}{f(x)} = b A x^{b-1} \frac{x}{A x^b} = b. \]

To prove that $\eta(x) = b \implies f(x) = A x^b$ requires either integral calculus or differential equations and so we omit it here.

Example 1: The demand curve

\[ Q = 1000 P^{-3} \]

has the functional form $A x^b$ and hence the elasticity of demand is $\eta(P) = -3$, the exponent on $P$.

Example 2: If we add a constant 10 to the demand curve in Example 1 as:

\[ Q = 1000 P^{-3} + 10 \]


then the demand curve no longer has the functional form $A x^b$ and so does not have the same elasticity for all $P$. In fact:

\[ \eta(P) = \frac{-3000 P^{-3}}{1000 P^{-3} + 10} = \frac{-3}{1 + \frac{P^3}{100}} \]

and so changing $P$ changes the elasticity as illustrated in the plot below:

[Figure: plot of $\eta(P) = \frac{-3}{1 + \frac{P^3}{100}}$.]
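The contrast between the two demand curves shows up clearly when the elasticity is estimated numerically at several prices. The sketch below approximates the derivative with a central difference, so the results are approximate; the function names and the relative step `1e-6` are assumptions for this illustration.

```python
def numerical_elasticity(f, x: float, rel_step: float = 1e-6) -> float:
    """Approximate f'(x) * x / f(x) using a central difference (assumed step size)."""
    h = x * rel_step
    derivative = (f(x + h) - f(x - h)) / (2 * h)
    return derivative * x / f(x)

def constant_elasticity_demand(P: float) -> float:
    """Q = 1000 P**(-3), the demand curve of Example 1."""
    return 1000 * P ** -3

def shifted_demand(P: float) -> float:
    """Q = 1000 P**(-3) + 10, the demand curve of Example 2."""
    return 1000 * P ** -3 + 10

for P in (1.0, 2.0, 4.0):
    print(P,
          numerical_elasticity(constant_elasticity_demand, P),
          numerical_elasticity(shifted_demand, P))
```

The first column of elasticities stays at about $-3$ for every price, while the second moves with $P$ in line with $\frac{-3}{1 + P^3/100}$.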

2.1.6 Local and Global Properties

When a hiker says: “just after the stream the trail climbs” he is talking about a local property of the trail since it may be that later on the trail descends. On the other hand if he says: “it rained today and the entire trail is muddy” he is making a global statement, one that applies to the entire trail.

When we refer to functions we also will want to distinguish between local and global properties. We have:

Definition 35 Local Properties: We say $f(x)$ has some property locally at $x_0$ if there is a neighborhood around $x_0$ (perhaps very small) where $f(x)$ has that property.

Definition 36 Global Properties: We say $f(x)$ has some property globally if the function has that property for all $x$ in the domain of $f(x)$.

Making a global statement is always stronger than making a local statement. If the trail is globally muddy, then it follows that it is locally muddy just after the stream. However if it is locally muddy just after the stream, it does not follow that it is globally muddy. This clearly also holds for functions and so:

Theorem 37 If $A$ is the statement “$f(x)$ has property $P$ globally” and $B$ is the statement “$f(x)$ has property $P$ locally” then

\[ A \implies B \]

but it is not true that $B \implies A$.


A function can be either locally or globally increasing or decreasing according to the following definitions:

Definition 38 Locally Increasing: If $f'(x_0) > 0$ the function is locally increasing (upward sloping) at $x = x_0$.

Definition 39 Globally Increasing (or Monotonic): If $f'(x) > 0$ for all $x$ in the domain of $f(x)$ then the function is globally increasing or monotonic.

Definition 40 Locally Decreasing: If $f'(x_0) < 0$ the function is locally decreasing (or downward sloping) at $x = x_0$.

Definition 41 Globally Decreasing: If $f'(x) < 0$ for all $x$ in the domain of $f(x)$ then the function is globally decreasing.

Example 1: Demand curves are globally downward sloping while supply curves are globally upward sloping or monotonic.

Example 2: Consider the function graphed below:

[Figure: plot of the function against $x$.]

This function is locally increasing at, say, $x_0 = 4$ and in general for any $x < 6$. It is locally decreasing at $x_0 = 8$, and in general for any $x > 6$. Since it increases for some $x$ and decreases for others, it is neither globally increasing nor globally decreasing.


Example 3: Consider the function graphed below:

[Figure: plot of the function against $x$.]

You can verify from the graph that this function is locally increasing at $x_0 = 1$ and at $x_0 = 3$. In fact it is increasing for all $x$ and so this function is globally increasing or monotonic.

2.1.7 The Sum, Product and Quotient Rules

There are a small number of rules to remember when calculating derivatives. Three of the more important rules are the sum, product and quotient rules given below:

Theorem 42 Sum Rule: If $h(x) = a f(x) + b g(x)$ where $a$ and $b$ are constants then:

\[ h'(x) = a f'(x) + b g'(x). \]

Theorem 43 Product Rule: If $h(x) = f(x) g(x)$ then

\[ h'(x) = f'(x) g(x) + f(x) g'(x). \]

Theorem 44 Quotient Rule: If $h(x) = \frac{f(x)}{g(x)}$ then

\[ h'(x) = \frac{g(x) f'(x) - f(x) g'(x)}{g(x)^2}. \]

Example 1: Given

\[ f(x) = 3x^5 + 4x^3 \]

we have from the sum rule that:

\[ f'(x) = 3 \frac{d}{dx} \left( x^5 \right) + 4 \frac{d}{dx} \left( x^3 \right) = 15x^4 + 12x^2. \]


Example 2: Given

h (x) = x2f (x) :

then from the product rule we have:

h0 (x) = 2xf (x) + x2f 0 (x) :

Example 3: Suppose that $P(Q)$ is the inverse demand curve that a monopolist faces with

\[ P'(Q) < 0 \]

so that the inverse demand curve slopes downwards. Total revenue (or sales) as a function of $Q$ is then equal to

\[ R(Q) = P(Q) \times Q. \]

Marginal revenue is then defined by:

\[ MR(Q) \equiv R'(Q). \]

Using the product rule we obtain:

\[ MR(Q) = \overbrace{P'(Q) Q}^{-} + P(Q) \]

where the first term is negative since $P'(Q) < 0$ and $Q > 0$. It follows that:

\[ MR(Q) < P(Q) \]

so that marginal revenue is always less than price. This divergence between price and marginal revenue is the reason why a monopolist produces a lower level of output than is socially optimal (more precisely, Pareto optimal).

For a perfectly competitive firm, on the other hand, $P$ does not depend on $Q$ and hence is a constant. Since the derivative of a constant is 0 it follows that $P'(Q) = 0$ and hence:

\[ MR(Q) = P. \]
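To make the gap between price and marginal revenue concrete, here is a minimal Python sketch. The linear inverse demand curve $P(Q) = 10 - 2Q$ and the function names are hypothetical choices for illustration only; any downward-sloping $P(Q)$ would serve.

```python
def price(quantity):
    """Hypothetical linear inverse demand P(Q) = 10 - 2Q."""
    return 10.0 - 2.0 * quantity

def price_slope(quantity):
    """P'(Q) = -2 for the demand curve above."""
    return -2.0

def marginal_revenue(quantity):
    """MR(Q) = P'(Q) Q + P(Q), from the product rule."""
    return price_slope(quantity) * quantity + price(quantity)

quantities = [0.5, 1.0, 2.0, 3.0]
# P - MR equals -P'(Q) Q, which is positive whenever Q > 0.
gaps = [price(q) - marginal_revenue(q) for q in quantities]

for q, gap in zip(quantities, gaps):
    print(f"Q = {q}: P = {price(q)}, MR = {marginal_revenue(q)}, P - MR = {gap}")
```

For this assumed demand curve the gap is $2Q$, so it widens as output grows.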

Example 4: Suppose we want to calculate the derivative of

\[ h(x) = \frac{f(x)}{x^2}. \]

Then from the quotient rule we have

\[ h'(x) = \frac{x^2 f'(x) - 2x f(x)}{\left(x^2\right)^2}. \]


Example 5: If $C(Q)$ is the firm's cost function, then marginal cost is given by:

\[ MC(Q) \equiv C'(Q) \]

while average cost is

\[ AC(Q) \equiv \frac{C(Q)}{Q}. \]

Differentiating $AC(Q)$ and using the quotient rule we find that:

\[ AC'(Q) = \frac{Q C'(Q) - C(Q)}{Q^2} = \frac{1}{Q}\left(C'(Q) - \frac{C(Q)}{Q}\right) = \frac{MC(Q) - AC(Q)}{Q}. \]

From this we see that, since $Q > 0$:

\[ AC'(Q) > 0 \iff MC(Q) > AC(Q) \]

\[ AC'(Q) < 0 \iff MC(Q) < AC(Q) \]

so that the average cost curve is increasing, or $AC'(Q) > 0$, when marginal cost exceeds average cost, and the average cost curve is decreasing, or $AC'(Q) < 0$, when marginal cost is less than average cost.

Example 6: If

\[ C(Q) = 10Q^2 + 20 \]

then:

\[ MC(Q) = 20Q, \quad AC(Q) = \frac{10Q^2 + 20}{Q} = 10Q + \frac{20}{Q} \]

so that:

\[ AC'(Q) = 10 - \frac{20}{Q^2} = \frac{10}{Q^2}\left(Q^2 - 2\right) = \frac{10}{Q^2}\left(Q + \sqrt{2}\right)\left(Q - \sqrt{2}\right) \]

so that $AC(Q)$ is falling for $Q < \sqrt{2}$ and hence $AC > MC$, $AC(Q)$ is increasing for $Q > \sqrt{2}$ and hence $AC < MC$, and marginal and average cost are equal when $Q = \sqrt{2}$, the minimum point of the average cost curve. You can see these relationships below where the straight line is $MC(Q)$:

[Figure: $MC(Q)$ and $AC(Q)$ plotted against $Q$ from 1 to 5, with the vertical axis running from 20 to 100.]
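The relationship between marginal and average cost for $C(Q) = 10Q^2 + 20$ can also be checked numerically. The sketch below is only an illustration: the grid of quantities and the function names are assumptions introduced here.

```python
import math

def cost(quantity):
    """C(Q) = 10 Q^2 + 20."""
    return 10 * quantity ** 2 + 20

def marginal_cost(quantity):
    """MC(Q) = C'(Q) = 20 Q."""
    return 20 * quantity

def average_cost(quantity):
    """AC(Q) = C(Q) / Q."""
    return cost(quantity) / quantity

q_min = math.sqrt(2)  # analytical minimum of average cost

# Crude grid search over Q = 0.01, 0.02, ..., 5.00 for the minimum of AC.
grid = [0.01 * k for k in range(1, 501)]
q_grid_min = min(grid, key=average_cost)

print(f"analytical minimum of AC at Q = {q_min:.4f}")
print(f"grid-search minimum of AC at Q = {q_grid_min:.2f}")
print(f"MC - AC at sqrt(2): {marginal_cost(q_min) - average_cost(q_min):.2e}")
```

The grid search should land within one grid step of $\sqrt{2}$, and $MC - AC$ should be zero there up to rounding.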

2.1.8 The Chain Rule

We will often be working with a function of a function. For example consider the function:

\[ h(x) = \frac{1}{1 + x^2}. \]

We can think of $h(x)$ as consisting of two functions: an outside function $f(x) = \frac{1}{x}$ and an inside function $g(x) = 1 + x^2$; that is:

\[ h(x) = f(g(x)) = \frac{1}{g(x)} = \frac{1}{1 + x^2}. \]

In general we have:

Definition 45 Given:

\[ h(x) = f(g(x)) \]

we call $f(x)$ the outside function and $g(x)$ the inside function.

At the moment we have no rule for finding $h'(x)$. Suppose however that we know how to calculate the derivative of the outside function $f(x)$ and the inside function $g(x)$. The chain rule then allows us to calculate the derivative of $h(x) = f(g(x))$ as:

Theorem 46 Chain Rule: If $h(x) = f(g(x))$ then

\[ h'(x) = f'(g(x)) \, g'(x). \]


In the beginning it is common for students to have trouble with the chain rule. It should eventually become second nature, but until then you might be better off being very systematic. The chain rule can be broken down into a recipe as follows:

A Recipe for the Chain Rule

1. Identify the outside function $f(x)$ and the inside function $g(x)$. (If you are not sure, verify by putting the inside function $g(x)$ inside $f(x)$ as $f(g(x))$ and make sure you get $h(x)$.)

2. Take the derivative of the outside function: $f'(x)$.

3. Replace $x$ in $f'(x)$ from step 2 with the inside function $g(x)$ to obtain $f'(g(x))$.

4. Take the derivative of the inside function: $g'(x)$.

5. Multiply the result in step 3 by that in step 4 to get $h'(x) = f'(g(x)) \, g'(x)$.

Remark: It is important to correctly identify the outside and inside functions. If instead we were to put $f(x)$ inside $g(x)$ as $g(f(x))$ we would obtain a different function. For example, with $f(x) = \frac{1}{x}$ and $g(x) = 1 + x^2$, if instead of $f(g(x))$ one calculated:

\[ g(f(x)) = 1 + f(x)^2 = 1 + \left(\frac{1}{x}\right)^2 = 1 + \frac{1}{x^2} \]

one would obtain a function which is not the same as $f(g(x)) = \frac{1}{1 + x^2}$.

Example 1: For $h(x) = \frac{1}{1 + x^2}$, following the recipe we have:

1. The outside function is $f(x) = \frac{1}{x}$ and the inside function is $g(x) = 1 + x^2$.

2. Taking the derivative of the outside function we obtain: $f'(x) = -\frac{1}{x^2}$.

3. Putting the inside function inside the result in step 2 we obtain:

\[ f'(g(x)) = -\frac{1}{g(x)^2} = -\frac{1}{\left(1 + x^2\right)^2}. \]

4. Taking the derivative of the inside function we obtain: $g'(x) = 2x$.

5. Multiplying the results of steps 3 and 4 we obtain:

\[ h'(x) = \underbrace{\left(-\frac{1}{\left(1 + x^2\right)^2}\right)}_{\text{from step 3}} \times \underbrace{2x}_{\text{from step 4}} = -\frac{2x}{\left(1 + x^2\right)^2}. \]

Example 2: For $h(x) = \sqrt{1 + x^4}$ we have:

1. The outside function is $f(x) = \sqrt{x} = x^{1/2}$ and the inside function is $g(x) = 1 + x^4$. We verify this as: $f(g(x)) = \sqrt{g(x)} = \sqrt{1 + x^4}$.

2. Taking the derivative of the outside function we obtain: $f'(x) = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$.

3. Putting the inside function inside the result in step 2 we obtain:

\[ f'(g(x)) = \frac{1}{2\sqrt{g(x)}} = \frac{1}{2\sqrt{1 + x^4}}. \]

4. Taking the derivative of the inside function we obtain: $g'(x) = 4x^3$.

5. Multiplying the results of steps 3 and 4 we obtain:

\[ h'(x) = \underbrace{\left(\frac{1}{2\sqrt{1 + x^4}}\right)}_{\text{from step 3}} \times \underbrace{4x^3}_{\text{from step 4}} = \frac{4x^3}{2\sqrt{1 + x^4}} = \frac{2x^3}{\sqrt{1 + x^4}}. \]
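The recipe translates almost line for line into code. The Python sketch below applies steps 3 to 5 to the two examples above and compares the results with a finite-difference derivative. The helper names `derivative` and `chain_rule` and the evaluation point `x0 = 1.5` are assumptions made for this illustration.

```python
import math

def derivative(func, x, step=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

def chain_rule(outside_prime, inside, inside_prime, x):
    """Steps 3-5 of the recipe: evaluate f'(g(x)) and multiply by g'(x)."""
    return outside_prime(inside(x)) * inside_prime(x)

x0 = 1.5  # arbitrary evaluation point

# Example 1: h(x) = 1 / (1 + x^2), with f(u) = 1/u and g(x) = 1 + x^2.
h1 = lambda x: 1 / (1 + x ** 2)
h1_prime = chain_rule(lambda u: -1 / u ** 2,
                      lambda x: 1 + x ** 2,
                      lambda x: 2 * x, x0)

# Example 2: h(x) = sqrt(1 + x^4), with f(u) = sqrt(u) and g(x) = 1 + x^4.
h2 = lambda x: math.sqrt(1 + x ** 4)
h2_prime = chain_rule(lambda u: 1 / (2 * math.sqrt(u)),
                      lambda x: 1 + x ** 4,
                      lambda x: 4 * x ** 3, x0)

print(f"Example 1: recipe = {h1_prime:.6f}, numerical = {derivative(h1, x0):.6f}")
print(f"Example 2: recipe = {h2_prime:.6f}, numerical = {derivative(h2, x0):.6f}")
```

Passing the outside derivative as a function of its own argument `u` mirrors step 3, where $x$ in $f'(x)$ is replaced by $g(x)$.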

2.1.9 Inverse Functions

Given a function $y = f(x)$ we will often want to reverse $x$ and $y$ and make $x$ the dependent variable and $y$ the independent variable.

For example we usually think of a demand curve as having $Q$, the quantity, as the dependent variable and $P$, the price, as the independent variable, so we write $Q = Q(P)$. In some applications however it is easier to make $P$ the dependent variable and $Q$ the independent variable and write $P = P(Q)$, which is the inverse demand curve.

Example: Suppose we have the function $y = f(x) = 6 - 3x$ or, changing the notation:

\[ Q = Q(P) = 6 - 3P \]

and so we think of this as an ordinary demand curve. This demand curve treats quantity $Q$ as the dependent variable and price $P$ as the independent variable. Suppose we wanted instead to have $P$ as the dependent variable, that is, the inverse demand curve. By putting $P$ on the left-hand side as:

\[ Q = 6 - 3P \implies 3P = 6 - Q \]

we have:

\[ P = P(Q) = 2 - \frac{1}{3} Q. \]

Translating back into the $x, y$ notation the inverse demand curve takes the form $y = g(x) = 2 - \frac{1}{3} x$.

The essence of an inverse function then is that we reverse the role of the independent variable $x$ and the dependent variable $y$. Visually one obtains an inverse function by taking a graph and flipping it around to put the $y$ axis below and the $x$ axis above. Thus a function and its inverse really express the same relationship between $y$ and $x$.

Of course in order to prove things about inverse functions we need to define them. To see how this is done consider the demand curve and the inverse demand curve in the $x, y$ notation:

\[ f(x) = 6 - 3x, \quad g(x) = 2 - \frac{1}{3} x. \]

Suppose we put $g(x)$ inside $f(x)$. Then we obtain a remarkable result:

\[ f(g(x)) = 6 - 3\left(2 - \frac{1}{3} x\right) = x. \]

If instead we put $f(x)$ inside $g(x)$ then we get the same remarkable result:

\[ g(f(x)) = 2 - \frac{1}{3}(6 - 3x) = x. \]

In both cases we obtain $x$. This is in fact the basis for the definition of an inverse function:

Definition 47 Inverse Function: Given a function $f(x)$, if there exists another function $g(x)$ such that

\[ f(g(x)) = g(f(x)) = x \]

then we say that $g(x)$ is the inverse function of $f(x)$ and $f(x)$ is the inverse function of $g(x)$.

Remark: If you think of $x$ as trapped inside $f(x)$, then applying the inverse function $g(x)$ liberates $x$ from $f(x)$ since:

\[ g(f(x)) = x. \]

Similarly $f(x)$ liberates $x$ from $g(x)$ since:

\[ f(g(x)) = x. \]

Often when attempting to solve equations we will want to do just this: to get $x$ outside by itself. In this case inverse functions are the tool we need.

We have:

Theorem 48 If $f(x) = x^n$ for $x > 0$ then the inverse function of $f(x)$ is $g(x) = x^{1/n}$.

Proof. If $f(x) = x^n$ then

\[ g(f(x)) = g(x^n) = \left(x^n\right)^{1/n} = x^{n \times \frac{1}{n}} = x^1 = x. \]

Example: If $f(x) = x^7$ with $x > 0$ then the inverse function is $g(x) = x^{1/7}$. Suppose you wish to solve the equation:

\[ f(x) = x^7 = 3 \]

for $x$; that is, you wish to get $x$ by itself. Using the inverse function to free $x$ we find that:

\[ g(f(x)) = g(3) \implies x = 3^{1/7}. \]
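The same steps can be followed in a few lines of Python. This is only a sketch of the example above; the names `f`, `g` and `solution` simply mirror the notation of the text.

```python
def f(x):
    """f(x) = x^7 for x > 0."""
    return x ** 7

def g(x):
    """Inverse function g(x) = x^(1/7)."""
    return x ** (1 / 7)

# Solve x^7 = 3 by applying the inverse function to both sides.
solution = g(3)

print(f"x = 3^(1/7) is approximately {solution:.6f}")
print(f"check: f(x) = {f(solution):.6f}")
print(f"g(f(2)) = {g(f(2.0)):.6f}")
```

Because of floating-point rounding, `f(solution)` and `g(f(2.0))` match 3 and 2 only up to a very small error.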

Not all functions have an inverse function; in fact only globally increasing or decreasing functions have inverses:

Theorem 49 Existence of an Inverse Function: The inverse function for $f(x)$ exists if and only if $f(x)$ is either globally increasing or globally decreasing.

Example: The function

\[ f(x) = (x - 1)^2 \]

does not have an inverse function since $f'(x) < 0$ for $x < 1$ and $f'(x) > 0$ for $x > 1$, as illustrated below:

[Figure: plot of $f(x) = (x - 1)^2$ for $x$ from $-1$ to 3, with $y$ from 0 to 4.]

The problem with this function appears if we flip the graph around and make $x$ the dependent variable:

[Figure: the same graph with the axes reversed; the horizontal axis runs from 1 to 5 and the vertical axis from $-1$ to 3.]

Note that associated with each $y$ there are not one but two $x$'s, and so this is not a proper function.

2.1.10 The Derivative of an Inverse Function

The question we now address is the relationship between the derivatives of two inverse functions $f(x)$ and $g(x)$. We have:

Theorem 50 Derivative of an Inverse Function: If $f(x)$ has an inverse function $g(x)$ then $g'(x)$ is given by:

\[ g'(x) = \frac{1}{f'(g(x))}. \]

Proof. Taking the derivative of both sides of

\[ f(g(x)) = x \]

and using the chain rule we obtain:

\[ f'(g(x)) \, g'(x) = 1. \]

Solving for $g'(x)$ then gives the result.

Thus the slope of the inverse function $g(x)$ is the inverse of the slope of the original function (with $x$ replaced with $g(x)$).

Example 1: We saw that the demand and inverse demand curves written in $x, y$ notation:

\[ f(x) = 6 - 3x, \quad g(x) = 2 - \frac{1}{3} x \]

are inverses of each other. We have $f'(x) = -3$ and $g'(x) = -\frac{1}{3}$, so that the two functions have derivatives which are inverses of each other.

Example 2: If

\[ f(x) = x^2 \]

then $f'(x) = 2x$. The inverse function of $f(x)$ (for $x > 0$) is

\[ g(x) = x^{1/2} \]

and so $g'(x) = \frac{1}{2} x^{-1/2}$. Alternatively we could calculate $g'(x)$ from $f(x)$ as:

\[ g'(x) = \frac{1}{f'(g(x))} = \frac{1}{2 \times g(x)} = \frac{1}{2 \times x^{1/2}} = \frac{1}{2} x^{-1/2}. \]
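A short numerical check of Theorem 50 for this example is sketched below in Python. The evaluation point `x0 = 4` and the finite-difference helper `derivative` are assumptions chosen for illustration.

```python
import math

def derivative(func, x, step=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

f_prime = lambda x: 2 * x  # derivative of f(x) = x^2
g = math.sqrt              # inverse function g(x) = x^(1/2), for x > 0

x0 = 4.0  # arbitrary evaluation point

slope_from_theorem = 1 / f_prime(g(x0))  # 1 / f'(g(x))
slope_direct = 0.5 * x0 ** -0.5          # (1/2) x^(-1/2)
slope_numeric = derivative(g, x0)        # finite-difference check

print(slope_from_theorem, slope_direct, round(slope_numeric, 6))
```

All three numbers should agree (up to the finite-difference error in the last one), which is the content of the theorem at this point.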

2.1.11 The Elasticity of an Inverse Function

Suppose that $f(x)$ has an inverse function $g(x)$ and that the corresponding elasticities are:

\[ \eta_f(x) \equiv \frac{f'(x) x}{f(x)}, \quad \eta_g(x) \equiv \frac{g'(x) x}{g(x)}. \]

We have:

Theorem 51 If $f(x)$ has an inverse function $g(x)$ then:

\[ \eta_g(x) = \frac{1}{\eta_f(g(x))}. \]

Proof. Since $g(x)$ is the inverse function of $f(x)$ we have:

\[ x = f(g(x)). \]

Replacing $x$ by $f(g(x))$ we obtain:

\[ \eta_g(x) = \frac{g'(x) x}{g(x)} = \frac{g'(x) f(g(x))}{g(x)}. \]

Now replacing $g'(x)$ by $\frac{1}{f'(g(x))}$ from Theorem 50 we obtain:

\[ \eta_g(x) = \frac{g'(x) f(g(x))}{g(x)} = \frac{f(g(x))}{f'(g(x)) \, g(x)} = \frac{1}{\left(\frac{f'(g(x)) \, g(x)}{f(g(x))}\right)} = \frac{1}{\eta_f(g(x))}. \]

Example 1: Consider the function $f(x)$ and its inverse $g(x)$ given by:

\[ f(x) = x^3, \quad g(x) = x^{1/3}. \]

Since both $f(x)$ and $g(x)$ are of the form $Ax^b$ the elasticities of each are given by the exponents on $x$. Thus $\eta_f(x) = 3$ and $\eta_g(x) = \frac{1}{3} = \frac{1}{\eta_f(x)}$.

Example 2: Suppose a monopolist faces a demand curve $Q = Q(P)$ that has an elasticity $\eta_Q(P)$ and that the inverse demand curve $P = P(Q)$ has an elasticity $\eta_P(Q)$. Then from Theorem 51:

\[ \eta_P(Q) = \frac{1}{\eta_Q(P(Q))}. \]

Now revenue for the monopolist as a function of $Q$ is given by:

\[ R(Q) = Q \times P(Q) \]

so that marginal revenue is, from the product rule:

\[ MR(Q) = R'(Q) = P(Q) + Q \times P'(Q) = P(Q)\left(1 + \frac{P'(Q) \times Q}{P(Q)}\right) = P(Q)\left(1 + \eta_P(Q)\right) = P(Q)\left(1 + \frac{1}{\eta_Q(P(Q))}\right). \]

Since the monopolist chooses $Q$ where $MR = MC$, and since $MC > 0$, it follows (using $P(Q) > 0$ and the fact that a downward-sloping demand curve has $\eta_Q < 0$) that

\[ P(Q)\left(1 + \frac{1}{\eta_Q(P(Q))}\right) > 0 \implies \eta_Q(P(Q)) < -1 \]

so the monopolist always acts on the elastic part of the demand curve.
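The marginal revenue formula can be illustrated with a constant-elasticity demand curve. In the Python sketch below the demand curve $Q(P) = P^{-2}$ (so that $\eta_Q = -2$), the quantity `q0` and all function names are hypothetical choices, not taken from the text.

```python
def derivative(func, x, step=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

DEMAND_ELASTICITY = -2.0  # assumed constant elasticity of Q(P) = P**(-2)

def inverse_demand(quantity):
    """P(Q) = Q**(1/eta), the inverse of Q(P) = P**eta."""
    return quantity ** (1 / DEMAND_ELASTICITY)

def revenue(quantity):
    """R(Q) = Q * P(Q)."""
    return quantity * inverse_demand(quantity)

def marginal_revenue(quantity, elasticity=DEMAND_ELASTICITY):
    """MR(Q) = P(Q) * (1 + 1/eta_Q)."""
    return inverse_demand(quantity) * (1 + 1 / elasticity)

q0 = 4.0
mr_formula = marginal_revenue(q0)
mr_numeric = derivative(revenue, q0)

# With inelastic demand, e.g. eta_Q = -0.5, the factor (1 + 1/eta_Q) is negative,
# so MR < 0 and MR = MC > 0 could not hold.
sign_if_inelastic = 1 + 1 / (-0.5)

print(f"MR from formula: {mr_formula:.6f}, numerical R'(Q): {mr_numeric:.6f}")
print(f"(1 + 1/eta) when eta = -0.5: {sign_if_inelastic}")
```

The last line shows why a profit-maximizing monopolist with positive marginal cost would not operate where demand is inelastic.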

2.2 Second Derivatives

He who can digest a second or third derivative ... need not, we think, be squeamish about any point of divinity. -George Berkeley

Since the derivative $f'(x)$ is a function, it too has a derivative, which is the second derivative of $f(x)$. We have then:

Definition 52 Second Derivative: The second derivative of $f(x)$, denoted by $f''(x)$ or $\frac{d^2 y}{dx^2}$ or $\frac{d}{dx}\left(f'(x)\right)$, is the first derivative of $f'(x)$.

Example: Consider the function:

\[ f(x) = x^3 - x \implies f'(x) = 3x^2 - 1. \]

The second derivative $f''(x)$ is then the first derivative of the first derivative, or:

\[ f'(x) = 3x^2 - 1 \implies f''(x) = 6x. \]

2.2.1 Convexity and Concavity

Alice didn’t dare to argue the point, but went on: “and I thought I’d try and find my way to the top of that hill.” “When you say ‘hill’,” the Queen interrupted, “I could show you hills, in comparison with which you’d call that a valley.” “No, I shouldn’t,” said Alice, surprised into contradicting her at last: “a hill can’t be a valley, you know. That would be nonsense.”

-Lewis Carroll, Through the Looking Glass


While the sign of $f'(x)$ tells you if the function is upward or downward sloping, the sign of $f''(x)$ tells you whether you are standing on a mountain or in a valley or, in mathematical jargon, whether the function is concave or convex. We have:

Definition 53 Local Concavity: The function $f(x)$ is locally concave at $x_0$ (or locally mountain-like) if and only if $f''(x_0) < 0$.

Definition 54 (Global) Concavity: The function $f(x)$ is (globally) concave (or globally mountain-like) if and only if $f''(x) < 0$ for all $x$ in the domain of $f(x)$.

Definition 55 Local Convexity: The function $f(x)$ is locally convex at $x_0$ (or locally valley-like) if and only if $f''(x_0) > 0$.

Definition 56 (Global) Convexity: The function $f(x)$ is (globally) convex (or globally valley-like) if and only if $f''(x) > 0$ for all $x$ in the domain of $f(x)$.

Concavity and convexity are fundamental concepts and so we will be referring to them often. It will quickly become tedious if we always have to qualify concavity and convexity with either ‘local’ or ‘global’. For this reason we will adopt the following convention:

Convention: When we say a function is concave without saying ‘global’ or ‘local’, we mean the function is globally concave. Similarly if we say a function is convex without saying ‘global’ or ‘local’, we mean the function is globally convex.

Example 1: Given

\[ f(x) = x^3 - x \]

if $x_0 = -1$ then

\[ f'(-1) = 2 > 0 \quad \text{and} \quad f''(-1) = -6 < 0 \]

and so at $x_0 = -1$, $f(x)$ is locally increasing (or upward sloping) and locally concave (locally mountain-like). At $x_0 = 1$

\[ f'(1) = 2 > 0 \quad \text{and} \quad f''(1) = 6 > 0 \]

and so at $x_0 = 1$, $f(x)$ is locally increasing and locally convex (or locally valley-like).

More generally, since $f''(x) = 6x$ it follows that $f''(x) < 0$ for $x < 0$ and hence $f(x)$ is locally concave (or locally mountain-like) for $x < 0$. Similarly for $x > 0$, $f''(x) > 0$ and so $f(x)$ is locally convex (or locally valley-like).
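The local definitions can be applied mechanically. The Python sketch below estimates $f''(x)$ for $f(x) = x^3 - x$ with a second difference and reports the local curvature. The helper names `second_derivative` and `curvature` and the step size are assumptions made for this illustration.

```python
def f(x):
    """f(x) = x^3 - x."""
    return x ** 3 - x

def second_derivative(func, x, step=1e-4):
    """Second-difference approximation to func''(x)."""
    return (func(x + step) - 2 * func(x) + func(x - step)) / step ** 2

def curvature(func, x):
    """Apply the local definitions: the sign of f''(x) decides the label."""
    value = second_derivative(func, x)
    if value < 0:
        return "locally concave"
    if value > 0:
        return "locally convex"
    return "neither (second derivative is zero)"

for x0 in (-1.0, 1.0, 2.5):
    print(f"x0 = {x0}: f'' is about {second_derivative(f, x0):.4f}, {curvature(f, x0)}")
```

A numerical estimate very close to zero should be treated with care, since rounding error alone could then decide the sign.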


Example 2: Given:

\[ f(x) = \frac{1}{x} = x^{-1} \quad \text{for } x > 0 \]

we have

\[ f'(x) = -x^{-2} = -\frac{1}{x^2} < 0 \]

for all $x$. Hence $f(x)$ is globally decreasing. Furthermore:

\[ f''(x) = 2x^{-3} = \frac{2}{x^3} > 0 \]

for all $x > 0$, and so $f(x)$ is globally convex. These properties of $f(x)$ are illustrated in the plot below:

[Figure: plot of $f(x) = \frac{1}{x}$ for $x$ from 1 to 5, with $y$ from 1 to 5.]

Example 3: Given the function:

\[ f(x) = -\frac{1}{x} = -x^{-1} \quad \text{for } x > 0 \]

we have

\[ f'(x) = x^{-2} = \frac{1}{x^2} > 0 \]

for all $x$. Hence $f(x)$ is (globally) increasing or monotonic. Furthermore:

\[ f''(x) = -2x^{-3} = -\frac{2}{x^3} < 0 \]

for all $x > 0$, and so $f(x)$ is globally concave or globally mountain-like. This is illustrated in the plot below:

[Figure: plot of $f(x) = -\frac{1}{x}$ for $x$ from 1 to 5, with $y$ from $-5$ to $-1$.]

2.2.2 Economics and ‘Diminishing Marginal ...’

In economics one often hears the expression: ‘diminishing marginal ...’. Recall that the marginal is the derivative $f'(x)$. Thus if the marginal is decreasing, the first derivative of $f'(x)$, or $f''(x)$, must be negative:

\[ f''(x) = \frac{d}{dx}\left(f'(x)\right) < 0. \]

Thus stating that the marginal is decreasing is equivalent to stating that $f''(x) < 0$ or that the function is concave (mountain-like).

Example: The Cobb-Douglas production function:

\[ Q = f(L) = \sqrt{L} \]

plotted below:

[Figure: The Production Function $Q = \sqrt{L}$, plotted for $L$ from 0 to 25 with $Q$ from 1 to 5.]

has a marginal product of labour:

\[ MP_L(L) = f'(L) = \frac{1}{2\sqrt{L}} \]

which, although positive, decreases as $L$ increases, as plotted below:

[Figure: $MP_L(L) = \frac{1}{2\sqrt{L}}$, plotted for $L$ from 5 to 25.]

Equivalently

\[ MP_L'(L) = f''(L) = \frac{-1}{4\left(\sqrt{L}\right)^3} < 0 \]

and so a diminishing marginal product of labour is equivalent to the production function being concave or mountain-like.

2.3 Maximization and Minimization

2.3.1 First-Order Conditions

The cornerstone of economic thinking is that people are assumed to be rational. Rational behavior generally means maximizing or minimizing something. Thus rational households maximize utility and rational firms maximize profits. Both profit maximizers and utility maximizers must minimize costs.

A maximum (minimum) is found on the top of a mountain (bottom of a valley) at a point where the mountain (valley) is flat. If it were not flat then you could always go a little higher (lower) by moving either up or down the slope.

This intuition leads to the first-order conditions for a maximum or minimum:

Theorem 57 First-Order Condition for a Maximum. If $f(x)$ is maximized at $x = x^*$ then $f'(x^*) = 0$.

Theorem 58 First-Order Condition for a Minimum. If $f(x)$ is minimized at $x = x^*$ then $f'(x^*) = 0$.

Remark: Calculating first-order conditions is one of the most basic skills that an economist must have. Although this is often straightforward, there are nevertheless common problems that occur when students do not derive them systematically. For this reason you may wish at the beginning to use the following recipe:

Recipe for Calculating First-Order Conditions

Given a function $f(x)$ to be maximized or minimized:

1. Calculate the first derivative $f'(x)$.

2. Replace all occurrences of $x$ in step 1 with $x^*$ and set the resulting expression equal to 0.

3. If possible solve the expression in step 2 for $x^*$ or, if this is not possible, try to learn something about $x^*$ from the first-order conditions.

Example 1: Consider the function:

\[ f(x) = x^3 - x. \]

Following the recipe we have:

1. Calculate the derivative of $f(x)$:

\[ f'(x) = 3x^2 - 1. \]

2. Put a $*$ on $x$ in step 1 and set the result equal to 0. Thus:

\[ f'(x^*) = 3\left(x^*\right)^2 - 1 = 0. \]

3. Solving the expression in step 2 we find that:

\[ 3\left(x^*\right)^2 - 1 = 0 \implies \left(x^*\right)^2 = \frac{1}{3} \implies x_1^* = \frac{1}{\sqrt{3}} \text{ and } x_2^* = -\frac{1}{\sqrt{3}}. \]

Example 2: Consider the function

\[ f(x) = x^{1/2} - x \]

with domain $x \geq 0$. Following the recipe we have:

1. Calculate the derivative of $f(x)$:

\[ f'(x) = \frac{1}{2} x^{-1/2} - 1. \]

2. Put a $*$ on $x$ in step 1 and set the result equal to 0. Thus:

\[ f'(x^*) = \frac{1}{2}\left(x^*\right)^{-1/2} - 1 = 0. \]

3. Solving the expression in step 2 we find that:

\[ \frac{1}{2}\left(x^*\right)^{-1/2} - 1 = 0 \implies \left(x^*\right)^{-1/2} = 2 \implies x^* = 2^{-2} = \frac{1}{4}. \]

2.3.2 Second-Order Conditions

The first-order conditions for a maximum and a minimum are identical and so just from $f'(x^*) = 0$ we have no way of knowing whether $x^*$ is a maximum or a minimum. This is then where the second derivative $f''(x)$ and the second-order conditions become useful. The basic principle is that mountains (concave functions with $f''(x) < 0$) have tops or maxima and valleys (convex functions with $f''(x) > 0$) have bottoms or minima.

For the moment we will only deal with local maxima and minima. If $x^*$ satisfies the first-order conditions $f'(x^*) = 0$ and at $x^*$ the function is locally mountain-like, then $x^*$ must be a local maximum. If at $x^*$ the function is locally valley-like, then $x^*$ must be a local minimum.

Consequently we have:

Theorem 59 Second-Order Conditions for a Local Maximum. If $f'(x^*) = 0$ and $f''(x^*) < 0$ (i.e. $f(x)$ is locally concave or mountain-like at $x^*$) then $x^*$ is a local maximum.

Theorem 60 Second-Order Conditions for a Local Minimum. If $f'(x^*) = 0$ and $f''(x^*) > 0$ (i.e. $f(x)$ is locally convex or valley-like at $x^*$) then $x^*$ is a local minimum.

Example 1 (continued): For the function:

\[ f(x) = x^3 - x \]

the solutions to the first-order conditions are $x_1^* = \frac{1}{\sqrt{3}}$ and $x_2^* = -\frac{1}{\sqrt{3}}$. Calculating $f''(x)$ we have:

\[ f'(x) = 3x^2 - 1 \implies f''(x) = 6x. \]

To begin, consider $x_1^* = \frac{1}{\sqrt{3}}$. We find that:

\[ f''\left(\frac{1}{\sqrt{3}}\right) = 6 \times \frac{1}{\sqrt{3}} = 3.4641 > 0 \]

so that $f(x)$ is locally convex at $x_1^*$ and hence from the second-order conditions $x_1^* = \frac{1}{\sqrt{3}}$ is a local minimum.

For $x_2^* = -\frac{1}{\sqrt{3}}$ we find that

\[ f''\left(-\frac{1}{\sqrt{3}}\right) = 6 \times \frac{-1}{\sqrt{3}} = -3.4641 < 0 \]

so that $f(x)$ is locally concave at $x_2^*$ and hence from the second-order conditions $x_2^* = -\frac{1}{\sqrt{3}}$ is a local maximum.

Example 2 (continued): For the function:

\[ f(x) = x^{1/2} - x \]

the solution to the first-order conditions is $x^* = \frac{1}{4}$. Now since

\[ f''(x) = -\frac{x^{-3/2}}{4} \implies f''\left(\frac{1}{4}\right) = -\frac{\left(\frac{1}{4}\right)^{-3/2}}{4} = -2 < 0 \]

it follows that $f(x)$ is locally concave (or mountain-like) at $x^* = \frac{1}{4}$ and hence $x^* = \frac{1}{4}$ is a local maximum.
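The second-order conditions amount to a sign test on $f''(x^*)$. The following Python sketch applies that test to the critical points of the two examples; the function `classify` and the dictionary layout are assumptions introduced for illustration.

```python
import math

def classify(second_derivative_value):
    """Second-order conditions at a point where f'(x*) = 0."""
    if second_derivative_value < 0:
        return "local maximum"
    if second_derivative_value > 0:
        return "local minimum"
    return "inconclusive"  # f''(x*) = 0: the conditions above say nothing

# Example 1: f(x) = x^3 - x, with f''(x) = 6x.
cubic_second = lambda x: 6 * x
x1, x2 = 1 / math.sqrt(3), -1 / math.sqrt(3)

# Example 2: f(x) = x^(1/2) - x, with f''(x) = -(1/4) x^(-3/2).
root_second = lambda x: -0.25 * x ** -1.5
x_star = 0.25

results = {
    "x^3 - x at 1/sqrt(3)": classify(cubic_second(x1)),
    "x^3 - x at -1/sqrt(3)": classify(cubic_second(x2)),
    "x^(1/2) - x at 1/4": classify(root_second(x_star)),
}
for label, verdict in results.items():
    print(f"{label}: {verdict}")
```

The "inconclusive" branch is a reminder that the theorems above are silent when $f''(x^*) = 0$.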

2.3.3 Sufficient Conditions for a Global Maximum or Minimum

In economics we are usually only interested in global maxima or minima. A profit maximizing firm would not choose a local profit maximum if it were not a global maximum. If $x^*$ satisfies the first and second-order conditions then all we can say is that $x^*$ is a local maximum or a local minimum. We do not know if we are at a global maximum or a global minimum.

If a function $f(x)$ is globally concave then it is everywhere mountain-like; essentially $f(x)$ is one mountain. Now if you find a flat spot on this one mountain it must be a global maximum; there can be no higher point on the mountain.

Similarly if a function $f(x)$ is globally convex then it is everywhere valley-like, so that essentially $f(x)$ is one valley. Now if you find a flat spot on this one valley it must be a global minimum.

Thus local concavity or convexity ensures that if $f'(x^*) = 0$ then $x^*$ is a local maximum or minimum. Global concavity or convexity on the other hand ensures that $x^*$ is a global maximum or minimum.

Theorem 61 If a function $f(x)$ is globally concave so that $f''(x) < 0$ for all $x$, and $x^*$ satisfies the first-order conditions $f'(x^*) = 0$, then $x^*$ is a unique global maximum.

Theorem 62 If a function $f(x)$ is globally convex so that $f''(x) > 0$ for all $x$, and $x^*$ satisfies the first-order conditions $f'(x^*) = 0$, then $x^*$ is a unique global minimum.

Note that this is a sufficient condition for $x^*$ to be a global maximum (minimum); it is not a necessary condition for a global maximum (minimum). As we shall see there are functions which are not globally concave or convex which nevertheless have a unique global maximum or minimum.

Example 1: The function:

\[ f(x) = x^3 - x \]

actually has no global maximum or minimum since $f(x) \to \pm\infty$ as $x \to \pm\infty$, as shown in the plot below:

[Figure: plot of $f(x) = x^3 - x$ for $x$ from $-2$ to 2, with $y$ from $-6$ to 6.]

However if we restrict the domain to be $x > 0$ then, since $x > 0$:

\[ f''(x) = 6x > 0 \]

and so the function is globally convex.

We saw there were two solutions to the first-order conditions: $x_1^* = \frac{1}{\sqrt{3}}$ and $x_2^* = -\frac{1}{\sqrt{3}}$. Since $x_2^*$ is negative it is not now in the domain of $f(x)$, so that $x_1^* = \frac{1}{\sqrt{3}}$ is the unique global minimum. This is illustrated in the plot below:

[Figure: plot of $f(x) = x^3 - x$ for $x > 0$, with $x$ from 0.5 to 2 and $y$ from 0 to 6.]

Example 2: Consider the function

\[ f(x) = x^{1/2} - x \]

with domain $x \geq 0$, which is plotted below:

[Figure: plot of $f(x) = x^{1/2} - x$ for $x$ from 0.5 to 2, with $y$ from $-0.6$ to 0.2.]

Now since

\[ f''(x) = -\frac{1}{4} x^{-3/2} < 0 \]

for all $x > 0$ (since $x^{-3/2} > 0$) it follows that $f(x)$ is globally concave and $x^* = \frac{1}{4}$ is the unique global maximum.


2.3.4 Profit Maximization

Then I looked on all the works that my hands had wrought, and on the labour that I had laboured to do; and, behold, all was vanity and a striving after wind, and there was no profit under the sun. -Ecclesiastes 2:11

One should always generalize. -Karl Jacobi

Consider the problem of a firm maximizing profits in the short-run. We are going to look at this problem from various levels of generality, beginning with a very simple special case and from there increasing the level of generality. Often in textbooks one sees the most general case first and then one looks at specific examples. Knowledge in real life does not usually progress in this manner; rather new ideas typically begin with special cases from which a researcher develops some understanding and curiosity. From there he or she then attempts to generalize the results of the special case.

Example 1: Consider the least general case of a firm with a Cobb-Douglas production function where:

\[ Q = f(L) = L^{1/2}. \]

Profits for the firm are:

\[ \pi(L) = P f(L) - WL = P L^{1/2} - WL \]

where $P$ is the price the firm receives and $W$ is the nominal wage. Differentiating with respect to $L$ we find that:

\[ \pi'(L) = \frac{1}{2} P L^{-1/2} - W \]

so that putting a $*$ on $L$ and setting the derivative equal to 0 yields the first-order condition for profit maximization:

\[ \frac{1}{2} P \left(L^*\right)^{-1/2} - W = 0. \]

If we define $w = \frac{W}{P}$ as the real wage then this implies that:

\[ \frac{1}{2}\left(L^*\right)^{-1/2} = w \]

or:

\[ L^* = \frac{1}{4} w^{-2}. \]

This gives the profit maximizing $L^*$ as a function of the real price of labour $w$ and so is the firm's labour demand curve. Since $L^*$ has the functional form $Ax^b$ the elasticity of demand is the exponent on $w$, or $-2$.

[Figure: the labour demand curve $L^*(w) = \frac{1}{4} w^{-2}$, plotted for $w$ from 0.2 to 2 with $L$ from 0 to 6.]

Furthermore $L^*$ is a global maximum since:

\[ \pi''(L) = -\frac{1}{4} P L^{-3/2} < 0 \]

and hence $\pi(L)$ is globally concave.

If $P = 4$ and $W = 2$ then $w = \frac{2}{4} = \frac{1}{2}$ and the firm would hire

\[ L^* = \frac{1}{4} w^{-2} = \frac{1}{4}\left(\frac{1}{2}\right)^{-2} = 1 \]

or one worker. If there were a 100% inflation so that $P$ and $W$ doubled to $P = 8$ and $W = 4$ then the real wage remains the same at $w = \frac{1}{2}$ and so $L^*$ stays at $L^* = 1$. This reflects the fact that a rational firm only cares about the real wage when hiring.

The firm's supply curve is found by substituting the optimal amount of labour $L^* = \frac{1}{4} w^{-2}$ into the production function $Q = f(L) = L^{1/2}$ so that:

\[ Q^* = \left(\frac{1}{4} w^{-2}\right)^{1/2} = \frac{1}{2} w^{-1} = \frac{1}{2}\left(\frac{W}{P}\right)^{-1} = \frac{1}{2} p \]

where $p = \frac{P}{W}$ is the real price of the good $Q$. Thus $\frac{dQ^*}{dp} = \frac{1}{2} > 0$ and the supply curve slopes upwards. Note that the supply curve has the form $Ax^b$ and so the elasticity of supply is the exponent on $p$, or 1.
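The numbers in this example are easy to reproduce. The Python sketch below evaluates the labour demand and supply formulas for $P = 4$, $W = 2$ and for the doubled prices, and confirms the maximum with a crude grid search over $L$. The function names and the grid are assumptions introduced here.

```python
def profit(labour, price, wage):
    """pi(L) = P L^(1/2) - W L."""
    return price * labour ** 0.5 - wage * labour

def labour_demand(price, wage):
    """L* = (1/4) w^(-2) with real wage w = W / P."""
    real_wage = wage / price
    return 0.25 * real_wage ** -2

def supply(price, wage):
    """Q* = (L*)^(1/2), which equals (1/2) p with p = P / W."""
    return labour_demand(price, wage) ** 0.5

l_star = labour_demand(4.0, 2.0)           # P = 4, W = 2
l_star_inflated = labour_demand(8.0, 4.0)  # both doubled: same real wage

# Grid search over L = 0.001, 0.002, ..., 5.000 as an independent check.
grid = [0.001 * k for k in range(1, 5001)]
l_grid = max(grid, key=lambda labour: profit(labour, 4.0, 2.0))

print(f"L* = {l_star}, after 100% inflation L* = {l_star_inflated}")
print(f"grid-search maximizer: {l_grid:.3f}")
print(f"Q* = {supply(4.0, 2.0)}")
```

Only the ratio $W/P$ enters `labour_demand`, which is the code-level counterpart of the statement that the firm cares only about the real wage.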


Example 2: We have found that the firm's labour demand curve slopes downwards and its supply curve slopes upwards, that both depend on the real prices $w$ and $p$, and both have a constant elasticity. Suppose you were the first to actually derive these results. If you are curious you would then ask yourself if these results are merely a coincidence or indicative of more general results.

We therefore seek to generalize. To do this we could replace the exponent $\frac{1}{2}$ in $Q = L^{1/2}$ with, say, $\frac{1}{3}$ and redo the analysis. Once this is done we could then replace $\frac{1}{3}$ with $\frac{2}{5}$ and so on. The problem with this approach is that there are an infinite number of possible exponents on $L$ we could use, so we would never arrive at any firm conclusions.

Suppose instead we replace $\frac{1}{2}$ not with a number but with a letter $\alpha$ and write the firm's production function as:

\[ Q = f(L) = A L^{\alpha} \]

where $A > 0$ and where we assume that:

\[ 0 < \alpha < 1. \]

By doing this we will be able to analyze an infinite number of possible exponents at once!

The assumptions on $\alpha$ ensure a positive and diminishing marginal product of labour since:

\[ MP_L(L) \equiv f'(L) = \alpha A L^{\alpha - 1} > 0 \]

and:

\[ f''(L) = \frac{d MP_L(L)}{dL} = \alpha \overbrace{(\alpha - 1)}^{-} A L^{\alpha - 2} < 0 \]

since $\alpha < 1$.

Profits for the firm are given by:

\[ \pi(L) = P f(L) - WL = P A L^{\alpha} - WL \]

so that:

\[ \pi'(L) = \alpha P A L^{\alpha - 1} - W. \]

Putting a $*$ on $L$ and setting the result equal to 0 yields the first-order condition for profit maximization:

\[ \alpha P A \left(L^*\right)^{\alpha - 1} - W = 0. \]

Solving for $L^*$ we obtain the labour demand curve:

\[ L^* = (\alpha A)^{\frac{1}{1 - \alpha}} \, w^{\frac{1}{\alpha - 1}} \]


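where $w$ is the real wage. This has the form $Ax^b$ and so the elasticity of demand is the exponent on $w$, or:

\[ \frac{1}{\alpha - 1} < 0 \]

since $\alpha < 1$. It follows then that the labour demand curve slopes downwards. Note this includes the results of the previous example as a special case since if $\alpha = \frac{1}{2}$ the elasticity becomes

\[ \frac{1}{\frac{1}{2} - 1} = -2. \]

This is a global profit maximum since:

\[ \pi''(L) = P \alpha \overbrace{(\alpha - 1)}^{-} A L^{\alpha - 2} < 0 \]

so that $\pi(L)$ is globally concave.

The supply curve is found by substituting $L^*$ into $Q = f(L)$ so that:

\[ Q^* = f(L^*) = A\left((\alpha A)^{\frac{1}{1 - \alpha}} w^{\frac{1}{\alpha - 1}}\right)^{\alpha} = A^{\frac{1}{1 - \alpha}} \alpha^{\frac{\alpha}{1 - \alpha}} w^{\frac{\alpha}{\alpha - 1}} = A^{\frac{1}{1 - \alpha}} \alpha^{\frac{\alpha}{1 - \alpha}} \left(\frac{W}{P}\right)^{\frac{\alpha}{\alpha - 1}} = A^{\frac{1}{1 - \alpha}} \alpha^{\frac{\alpha}{1 - \alpha}} p^{\frac{\alpha}{1 - \alpha}} = B p^{\frac{\alpha}{1 - \alpha}} \]

where $B = A^{\frac{1}{1 - \alpha}} \alpha^{\frac{\alpha}{1 - \alpha}}$ and $p = \frac{P}{W}$ is the real price of $Q$. This also has the form $Ax^b$ and so the elasticity of supply is the exponent on $p$:

\[ \frac{\alpha}{1 - \alpha} > 0 \]

and so the supply curve slopes upwards. This includes the previous result as a special case since if $\alpha = \frac{1}{2}$ the elasticity of supply is

\[ \frac{\frac{1}{2}}{1 - \frac{1}{2}} = 1. \]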
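The general formulas can be checked against the first-order condition for any admissible exponent. In the Python sketch below the parameter values $\alpha = 0.3$, $A = 2$ and $w = 0.7$ are arbitrary assumptions, as are the function names; the elasticity is estimated numerically with a finite difference.

```python
def labour_demand(real_wage, alpha, scale):
    """L* = (alpha A)^(1/(1-alpha)) * w^(1/(alpha-1))."""
    return (alpha * scale) ** (1 / (1 - alpha)) * real_wage ** (1 / (alpha - 1))

def supply(real_price, alpha, scale):
    """Q* = B p^(alpha/(1-alpha)), B = A^(1/(1-alpha)) alpha^(alpha/(1-alpha))."""
    b = scale ** (1 / (1 - alpha)) * alpha ** (alpha / (1 - alpha))
    return b * real_price ** (alpha / (1 - alpha))

def elasticity(func, x, step=1e-6):
    """Numerical elasticity f'(x) x / f(x) using a central difference."""
    slope = (func(x + step) - func(x - step)) / (2 * step)
    return slope * x / func(x)

alpha, scale, real_wage = 0.3, 2.0, 0.7  # arbitrary illustrative values
l_star = labour_demand(real_wage, alpha, scale)

# First-order condition in real terms: alpha A (L*)^(alpha-1) = w.
foc_gap = alpha * scale * l_star ** (alpha - 1) - real_wage

demand_elasticity = elasticity(lambda w: labour_demand(w, alpha, scale), real_wage)
supply_elasticity = elasticity(lambda p: supply(p, alpha, scale), 1 / real_wage)

print(f"L* = {l_star:.6f}, first-order condition gap = {foc_gap:.2e}")
print(f"demand elasticity = {demand_elasticity:.6f} (formula {1 / (alpha - 1):.6f})")
print(f"supply elasticity = {supply_elasticity:.6f} (formula {alpha / (1 - alpha):.6f})")
```

Setting `alpha = 0.5` and `scale = 1.0` reproduces the special case of the previous example, with elasticities of $-2$ and 1.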

Example 3: We see from the previous example that the elasticities change when we change $\alpha$ from $\frac{1}{2}$ but that the firm's labour demand curve still slopes downward and the supply curve still slopes upward. If however we allowed $\alpha > 1$ then the expression for the labour demand curve elasticity would become positive; that is:

\[ \frac{1}{\alpha - 1} > 0. \]


However $\alpha > 1$ would also mean that we would no longer have a diminishing marginal product of labour and the profit function would no longer be concave but instead would be convex. All these clues point to the fact that it is the diminishing marginal product of labour that is the key requirement in obtaining a downward sloping labour demand curve and an upward sloping supply curve.

We now attempt to generalize even further. Consider replacing $Q = A L^{\alpha}$ with:

\[ Q = f(L) \]

where the only assumptions we now make are that the marginal product of labour is positive and diminishing, so that $MP_L(L) = f'(L) > 0$ and $MP_L'(L) = f''(L) < 0$.

Profits as a function of $L$ are then given by:

\[ \pi(L) = P f(L) - WL. \]

Differentiating with respect to $L$ we find that:

\[ \pi'(L) = P f'(L) - W \]

so that putting a $*$ on $L$ and setting the derivative equal to 0 yields the first-order condition for profit maximization:

\[ \pi'(L^*) = 0 \implies P f'(L^*) - W = 0 \implies MP_L(L^*) = w. \]

This result shows that the inverse labour demand curve is in fact the marginal product of labour curve. Furthermore it shows that labour demand $L^*$ is a function of the real wage $w$.

Since

\[ \pi''(L) = P f''(L) < 0 \]

for all $L$, the profit function is globally concave and $L^*$ is a global maximum.

Consider now the problem of showing that the labour demand curve $L^* = L^*(w)$ is downward sloping. The labour demand curve is implicitly defined by:

\[ MP_L(L^*(w)) = w. \]

We would like to find the derivative $\frac{dL^*(w)}{dw}$ and show that $\frac{dL^*(w)}{dw} < 0$; that is, that the demand for labour slopes downwards. The problem is that $L^*(w)$ is trapped inside the marginal product of labour $MP_L(L)$ and so we cannot get at it directly. We can however use the chain rule to get an expression for $\frac{dL^*(w)}{dw}$. Here $MP_L(L)$ is the outside function and $L^*(w)$ is the inside function. Thus differentiating both sides with respect to the real wage $w$ we obtain:

\[ \frac{d}{dw} MP_L(L^*(w)) = \frac{d}{dw} w \implies MP_L'(L^*(w)) \frac{dL^*(w)}{dw} = 1. \]


Note that the chain rule forces $\frac{dL^*(w)}{dw}$ outside where we can now work with it. Since we assume a diminishing marginal product of labour, or $MP_L'(L) < 0$, we conclude that:

$$\frac{dL^*(w)}{dw} = \frac{1}{MP_L'(L^*(w))} < 0$$

which shows then that the labour demand curve is downward sloping.

The firm's supply curve is given by replacing $L$ with $L^*(w)$ in the production function as:

$$Q^* = f(L^*(w)) = f\left(L^*\left(p^{-1}\right)\right)$$

where $p = \frac{P}{W}$ and hence $p^{-1} = \frac{W}{P} = w$. Consider now showing that the supply curve slopes upwards, or $\frac{dQ^*}{dp} > 0$. Here $p$ is buried deep inside $f(L)$, $L^*(w)$ and $p^{-1}$, so we will need to use the chain rule three times! We have:

$$\frac{dQ^*}{dp} = f'(L^*(w)) \, \frac{dL^*(w)}{dw} \, \frac{dp^{-1}}{dp} = \underbrace{f'(L^*(w))}_{+} \times \underbrace{\frac{dL^*(w)}{dw}}_{-} \times \underbrace{\left(-\frac{1}{p^2}\right)}_{-} > 0.$$

Thus the firm's supply curve is upward sloping.

This problem illustrates how one can obtain very general results with very minimal assumptions. We have shown that, given only a positive and diminishing marginal product of labour, labour demand must slope downwards and supply must slope upwards. Furthermore demand and supply are functions of the real prices $w$ and $p$.
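The two slope results can be checked numerically. The sketch below is only an illustration: it assumes the specific production function $f(L) = L^{1/2}$ (any $f$ with $f' > 0$ and $f'' < 0$ would serve), and the helper names `marginal_product`, `labour_demand` and `supply` are hypothetical. It solves $MP_L(L^*) = w$ by bisection and tabulates $L^*$ against $w$ and $Q^*$ against $p$.

```python
# Illustrative only: f(L) = sqrt(L) is an assumed production function
# with f'(L) > 0 and f''(L) < 0; it is not taken from the text.

def marginal_product(L):
    """MP_L(L) = f'(L) for the assumed f(L) = L ** 0.5."""
    return 0.5 * L ** -0.5

def labour_demand(w, lo=1e-9, hi=1e9):
    """Solve MP_L(L*) = w by bisection (MP_L is strictly decreasing)."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if marginal_product(mid) > w:
            lo = mid   # marginal product still above the real wage
        else:
            hi = mid
    return (lo + hi) / 2.0

def supply(p):
    """Q*(p) = f(L*(1/p)), where p = P/W so that w = 1/p."""
    return labour_demand(1.0 / p) ** 0.5

real_wages = [0.25, 0.5, 1.0, 2.0]
demand = [labour_demand(w) for w in real_wages]   # should fall as w rises

prices = [0.5, 1.0, 2.0, 4.0]
output = [supply(p) for p in prices]              # should rise as p rises

print(list(zip(real_wages, demand)))
print(list(zip(prices, output)))
```

For this assumed $f$ the closed forms are $L^* = 1/(4w^2)$ and $Q^* = p/2$, which the numerical values should reproduce.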

2.4 Econometrics

Econometrics is the bridge between theory and real life. -James Ramsey

2.4.1 Least Squares Estimation

Estimation of a Constant: $\mu$

Consider the simple linear regression model:

$$Y_i = \mu + e_i, \quad i = 1, 2, \ldots, n.$$

Here $e_i$ is random noise. If it were not for this random noise each $Y_i$ would be identical as: $Y_i = \mu$. Since the data is corrupted however we do not get to directly observe $\mu$ but only a sample of the $Y_i$'s. Our problem is to guess what $\mu$ is.

Our guess is $\hat{\mu}$, the least squares estimator of $\mu$, which minimizes the sum of squares function:

$$S(\mu) = \sum_{i=1}^{n} (Y_i - \mu)^2 = (Y_1 - \mu)^2 + (Y_2 - \mu)^2 + \cdots + (Y_n - \mu)^2.$$

Here $\mu$ plays the role of the $x$ variable and $\hat{\mu}$ is $x^*$. The function $S(\mu)$ is in fact just a quadratic.

Using the sum and chain rules to differentiate $S(\mu)$ we have:

$$S'(\mu) = -2(Y_1 - \mu) - 2(Y_2 - \mu) - \cdots - 2(Y_n - \mu) = -2(Y_1 + Y_2 + \cdots + Y_n - n\mu).$$

It follows then that the first-order conditions require:

$$\begin{aligned}
S'(\hat{\mu}) = 0 &= -2(Y_1 + Y_2 + \cdots + Y_n - n\hat{\mu}) \\
&\implies n\hat{\mu} = Y_1 + Y_2 + \cdots + Y_n \\
&\implies \hat{\mu} = \frac{Y_1 + Y_2 + \cdots + Y_n}{n} = \bar{Y}.
\end{aligned}$$

Thus our best guess of $\mu$ is the sample mean of the $Y_i$'s.

$S(\mu)$ is globally convex since:

$$S''(\mu) = \underbrace{2 + 2 + \cdots + 2}_{n \text{ times}} = 2n > 0$$

and so $\hat{\mu}$ is in fact a global minimum of $S(\mu)$.

Example: Suppose one has data on the consumption of $n = 4$ families:

i      1    2    3    4
Y_i    72   58   63   55

Here each family consumes a different amount than $\mu$ because of random noise $e_i$ (e.g., unexpected dental bills). To find $\hat{\mu}$ we construct the sum of squares function:

$$S(\mu) = (72 - \mu)^2 + (58 - \mu)^2 + (63 - \mu)^2 + (55 - \mu)^2$$

which is plotted below:

[Figure: plot of $S(\mu)$ against $\mu$ for $\mu$ between 20 and 100.]

As illustrated in the graph, the minimum of $S(\mu)$ occurs at the sample mean:

$$\hat{\mu} = \bar{Y} = \frac{1}{4}(72 + 58 + 63 + 55) = 62.$$
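As a quick check of this example, the following sketch computes the sample mean and compares it with a brute-force grid search over $S(\mu)$. The function name `sum_of_squares` and the grid spacing of 0.1 are choices made here for illustration, not part of the text.

```python
# Consumption data for the four families in the example.
Y = [72, 58, 63, 55]

def sum_of_squares(mu, data):
    """S(mu) = sum over i of (Y_i - mu)^2."""
    return sum((y - mu) ** 2 for y in data)

# Closed-form least squares estimate: the sample mean.
mu_hat = sum(Y) / len(Y)

# Crude grid search over mu = 20.0, 20.1, ..., 100.0 as an independent check.
grid = [20 + k / 10 for k in range(801)]
mu_grid = min(grid, key=lambda mu: sum_of_squares(mu, Y))

print(mu_hat, mu_grid, sum_of_squares(mu_hat, Y))
```

Both approaches should agree on $\hat{\mu} = 62$; the grid search is only a sanity check and would not be used in practice when the closed form is available.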

Linear Regression

Now suppose $Y_i$ varied systematically with another variable $X_i$, called a regressor, as:

$$Y_i = X_i \beta + e_i, \quad i = 1, 2, \ldots, n.$$

For example if $Y_i$ is the consumption of family $i$ then $X_i$ might be their income, so that $\beta$ would be the marginal propensity to consume since $\beta = \frac{dY_i}{dX_i}$. Again we cannot directly observe $\beta$ from the data because the data is corrupted by the random noise $e_i$. Instead we estimate $\beta$ from a set of $n$ observations on $Y_i$ and $X_i$ using the least squares estimator $\hat{\beta}$ which minimizes:

$$S(\beta) = \sum_{i=1}^{n} (Y_i - X_i\beta)^2 = (Y_1 - X_1\beta)^2 + (Y_2 - X_2\beta)^2 + \cdots + (Y_n - X_n\beta)^2.$$

Using the sum and chain rules to differentiate $S(\beta)$ we have:

$$S'(\beta) = -2X_1(Y_1 - X_1\beta) - 2X_2(Y_2 - X_2\beta) - \cdots - 2X_n(Y_n - X_n\beta)$$

so that the first-order conditions require:

$$\begin{aligned}
S'(\hat{\beta}) = 0 &= -2X_1(Y_1 - X_1\hat{\beta}) - 2X_2(Y_2 - X_2\hat{\beta}) - \cdots - 2X_n(Y_n - X_n\hat{\beta}) \\
&\implies X_1(Y_1 - X_1\hat{\beta}) + X_2(Y_2 - X_2\hat{\beta}) + \cdots + X_n(Y_n - X_n\hat{\beta}) = 0 \\
&\implies \left(X_1^2 + X_2^2 + \cdots + X_n^2\right)\hat{\beta} = X_1Y_1 + X_2Y_2 + \cdots + X_nY_n \\
&\implies \hat{\beta} = \frac{X_1Y_1 + X_2Y_2 + \cdots + X_nY_n}{X_1^2 + X_2^2 + \cdots + X_n^2}.
\end{aligned}$$

$S(\beta)$ is globally convex as long as $X_i \neq 0$ for at least one $i$ (i.e., at least one family has a non-zero income) since:

$$S''(\beta) = 2X_1^2 + 2X_2^2 + \cdots + 2X_n^2 > 0.$$

It follows that the least squares estimator $\hat{\beta}$ is a global minimum.

Example: Suppose one has data on the consumption of $n = 4$ families along with their income as:

i      1    2    3    4
Y_i    72   58   63   55
X_i    98   80   91   73

so that for example family 2 has consumption of 58 and an income of 80. We seek the best line $Y = \beta X$ which goes through the data plotted below:

[Figure: scatter plot of consumption and income for the four families.]

The sum of squares is then:

$$S(\beta) = (72 - 98\beta)^2 + (58 - 80\beta)^2 + (63 - 91\beta)^2 + (55 - 73\beta)^2$$

which is plotted below:

[Figure: plot of $S(\beta)$ against $\beta$ for $\beta$ between 0.4 and 1.]

As illustrated in the graph, the minimum of $S(\beta)$ occurs at:

$$\hat{\beta} = \frac{X_1Y_1 + X_2Y_2 + \cdots + X_nY_n}{X_1^2 + X_2^2 + \cdots + X_n^2} = \frac{98 \times 72 + 80 \times 58 + 91 \times 63 + 73 \times 55}{98^2 + 80^2 + 91^2 + 73^2} = 0.724.$$

Thus the estimated marginal propensity to consume from this data set is $\hat{\beta} = 0.724$.
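The arithmetic in this example is easy to reproduce. The sketch below implements the no-intercept least squares formula derived above; the function name `beta_hat` is a label chosen here, and the guard against an all-zero regressor mirrors the condition needed for global convexity.

```python
# Consumption (Y) and income (X) for the four families in the example.
Y = [72, 58, 63, 55]
X = [98, 80, 91, 73]

def beta_hat(x, y):
    """Least squares slope for Y = X * beta (no intercept):
    sum(X_i * Y_i) / sum(X_i ** 2)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("at least one X_i must be non-zero")
    return sxy / sxx

def sum_of_squares(beta, x, y):
    """S(beta) = sum of (Y_i - X_i * beta)^2."""
    return sum((yi - xi * beta) ** 2 for xi, yi in zip(x, y))

b = beta_hat(X, Y)
print(round(b, 3))
```

Evaluating `sum_of_squares` slightly to either side of `b` gives a larger value, consistent with $\hat{\beta}$ being the global minimum.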

2.4.2 Maximum Likelihood

Maximum likelihood is a very general technique used in econometrics which can be applied to almost any problem, including those where linear regression fails.

The basic approach is to calculate the likelihood $L(\theta)$ where $\theta$ is a parameter of interest. The maximum likelihood estimator of $\theta$ is then that $\theta$ which maximizes $L(\theta)$. This is traditionally denoted by $\hat{\theta}$ and hence solves the first-order conditions:

$$\frac{dL(\hat{\theta})}{d\theta} = 0.$$

It is usually easier to maximize the log-likelihood defined by $l(\theta) = \ln(L(\theta))$, which gives the same result since $\ln(x)$ is monotonic; that is:

$$\frac{dL(\hat{\theta})}{d\theta} = 0 \iff \frac{dl(\hat{\theta})}{d\theta} = \frac{1}{L(\hat{\theta})} \frac{dL(\hat{\theta})}{d\theta} = 0.$$

Once $\hat{\theta}$ is found from the first-order conditions, we often wish to construct a confidence interval which will then indicate how accurate our guess is. Traditionally one constructs a 95% confidence interval for the unknown $\theta$, which takes the form:

$$\hat{\theta} \pm 1.96 \times \sqrt{\delta}$$

where $\delta$ is the variance of $\hat{\theta}$. This formula says that $\theta$ will lie within the interval:

$$\hat{\theta} - 1.96 \times \sqrt{\delta} \leq \theta \leq \hat{\theta} + 1.96 \times \sqrt{\delta}$$

95 times out of 100, or equivalently, 19 times out of 20.

To construct our confidence interval calculate $\delta$ using:

$$\delta = \left(-\frac{d^2 l(\hat{\theta})}{d\theta^2}\right)^{-1}.$$

Note that since we are maximizing the log-likelihood, from the second-order conditions $\frac{d^2 l(\hat{\theta})}{d\theta^2} < 0$ so that $\delta > 0$.

Example: Suppose an unknown proportion $\theta$ of the population favours some policy while $1 - \theta$ are against this policy. You decide to conduct a survey of $n$ randomly chosen people to estimate the unknown $\theta$. Suppose $m_i = 1$ if the $i$th person says he supports the policy, and $m_i = 0$ if he says he does not support the policy. Since $\theta$ is the probability that $m_i = 1$ and $1 - \theta$ is the probability that $m_i = 0$, the probability of $m_i$ is given by:

$$\Pr[m_i] = \theta^{m_i} (1 - \theta)^{1 - m_i}.$$

Since each person is chosen independently, the likelihood is the product of these probabilities:

$$L(\theta) = \theta^{m_1}(1-\theta)^{1-m_1} \times \theta^{m_2}(1-\theta)^{1-m_2} \times \cdots \times \theta^{m_n}(1-\theta)^{1-m_n} = \theta^{m}(1-\theta)^{n-m}$$

where $m$ is the number of people in your survey who favour the policy and $n - m$ is the number of people in the survey against the policy. The log-likelihood is then:

$$l(\theta) = \ln(L(\theta)) = m \ln(\theta) + (n - m)\ln(1 - \theta).$$

Using the chain rule we find that:

$$\frac{dl(\theta)}{d\theta} = \frac{m}{\theta} - \frac{n-m}{1-\theta} \implies \frac{dl(\hat{\theta})}{d\theta} = 0 = \frac{m}{\hat{\theta}} - \frac{n-m}{1-\hat{\theta}} \implies \hat{\theta} = \frac{m}{n}.$$


Thus if $m = 525$ say they are in favour and $n = 2000$ are interviewed then the log-likelihood is:

$$l(\theta) = 525 \ln(\theta) + (2000 - 525)\ln(1 - \theta)$$

which is plotted below:

[Figure: plot of $l(\theta)$ against $\theta$ for $\theta$ between 0.1 and 0.9.]

As illustrated in the graph, the maximum occurs at:

$$\hat{\theta} = \frac{525}{2000} = 0.2625 \approx 0.26$$

which says that the best guess about $\theta$, based on the poll, is that about 26 percent of the population are in favour of the policy.

To calculate a confidence interval around $\hat{\theta} = 0.2625$ we use:

$$\delta = \left(-\frac{d^2 l(\hat{\theta})}{d\theta^2}\right)^{-1} = \left(\frac{\overbrace{m}^{= n\hat{\theta}}}{\hat{\theta}^2} + \frac{\overbrace{n - m}^{= n(1-\hat{\theta})}}{(1 - \hat{\theta})^2}\right)^{-1} = \frac{\hat{\theta}(1 - \hat{\theta})}{n}$$

so that a 95% confidence interval for the unknown $\theta$ takes the form:

$$\hat{\theta} \pm 1.96 \times \sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}}.$$

Thus if $m = 525$ out of $n = 2000$ people polled are in favour of the policy, then:

$$\delta = \frac{0.2625\,(1 - 0.2625)}{2000}$$

and the confidence interval is:

$$0.2625 \pm 1.96 \times \sqrt{\frac{0.2625 \times (1 - 0.2625)}{2000}}$$

or approximately $0.26 \pm 0.019$. Thus the poll would be accurate to within 1.9 percentage points 19 times out of 20 (or 95 times out of 100).
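The poll calculation can be reproduced in a few lines. In the sketch below the function names `bernoulli_mle` and `log_likelihood` are illustrative labels, and the constant 1.96 is the usual 95% multiplier used in the text.

```python
import math

def bernoulli_mle(m, n):
    """Return (theta_hat, half_width) for m successes out of n trials.

    theta_hat = m / n, variance delta = theta_hat * (1 - theta_hat) / n,
    and the 95% interval is theta_hat +/- 1.96 * sqrt(delta).
    """
    theta_hat = m / n
    delta = theta_hat * (1 - theta_hat) / n
    return theta_hat, 1.96 * math.sqrt(delta)

def log_likelihood(theta, m, n):
    """l(theta) = m ln(theta) + (n - m) ln(1 - theta)."""
    return m * math.log(theta) + (n - m) * math.log(1 - theta)

theta_hat, half_width = bernoulli_mle(525, 2000)
print(theta_hat, round(half_width, 3))
```

Comparing `log_likelihood` at `theta_hat` with nearby values such as 0.25 or 0.28 shows the log-likelihood is lower away from $\hat{\theta}$, as the plot suggests.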


2.5 Ordinal and Cardinal Properties

2.5.1 Class Grades

Consider a class of 4 students: John, Mary, Joe, and Sue. Suppose the instructor gives an A to the student with the highest grade, a B to the next highest and so on. The numerical and letter grades might then look like this:

John    Mary    Joe     Sue
75 B    50 D    65 C    85 A

Now suppose instead that the instructor adjusts (or bells) the grades by applying a monotonic function $g(x)$ (with $g'(x) > 0$) to each grade. This could be $g(x) = x - 3.75$, which would ensure that the class average is 65 and which satisfies $g'(x) = 1 > 0$, or something crazy like $g(x) = x^2$ which yields:

John             Mary             Joe              Sue
75^2 = 5625 B    50^2 = 2500 D    65^2 = 4225 C    85^2 = 7225 A

Notice that the numerical grades change when $g(x)$ is applied (for example Joe's grade changes from 65 to 4225) but that the letter grades do not change (for example Joe received a C before the grades were adjusted and he receives a C after the grades are adjusted).

If instead the instructor used $g(x) = x^3$ we would find that:

John               Mary               Joe                Sue
75^3 = 421875 B    50^3 = 125000 D    65^3 = 274625 C    85^3 = 614125 A

and all letter grades still remain unchanged.

Adjusting the grades with some monotonic $g(x)$ is known as a monotonic transformation. Letter grades are an example of what is known as an ordinal property, one which does not change no matter what monotonic transformation is applied. Cardinal properties, on the other hand, are properties which do change when a monotonic transformation is applied. Thus the students' numerical grades are cardinal properties.

It is important that we restrict ourselves to monotonic transformations; that is, we do require that $g'(x) > 0$. To see why suppose instead that the instructor used $g(x) = x^{-1}$. Then we obtain:

John                 Mary                Joe                  Sue
75^-1 = 0.0133 C     50^-1 = 0.020 A     65^-1 = 0.0154 B     85^-1 = 0.0118 D

Now it is Mary who receives an A and Sue who receives a D, and the letter grades do change here. The problem here is that:

$$g'(x) = -x^{-2} = -\frac{1}{x^2} < 0$$

and so $g(x) = \frac{1}{x}$ is not a monotonic transformation.
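The grading example amounts to saying that a ranking survives any increasing transformation. A minimal sketch, with hypothetical helper names `ranking` and `transform`, makes this concrete using the four grades from the text:

```python
grades = {"John": 75, "Mary": 50, "Joe": 65, "Sue": 85}

def ranking(scores):
    """Names ordered from highest to lowest score (A, B, C, D)."""
    return sorted(scores, key=scores.get, reverse=True)

def transform(scores, g):
    """Apply g to every numerical grade."""
    return {name: g(x) for name, x in scores.items()}

original = ranking(grades)
squared = ranking(transform(grades, lambda x: x ** 2))     # g'(x) > 0 for x > 0
shifted = ranking(transform(grades, lambda x: x - 3.75))   # g'(x) = 1 > 0
inverted = ranking(transform(grades, lambda x: 1 / x))     # g'(x) < 0

print(original, squared, shifted, inverted)
```

The first three rankings coincide, while the decreasing transformation $1/x$ reverses the order, which is exactly why the definition insists on $g'(x) > 0$.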


2.5.2 Ordinal and Cardinal Properties of Functions

Let us now turn our attention to ordinal and cardinal properties of a function $f(x)$. We have:

Definition 63 Monotonic Transformation: If $g(x)$ is monotonic (i.e., $g'(x) > 0$ for all $x$) then applying $g(x)$ to $f(x)$ as $g(f(x))$ is called a monotonic transformation of $f(x)$.

We then have:

Definition 64 Ordinal Property: An ordinal property of a function $y = f(x)$ is one which does not change when any monotonic transformation $g(x)$ is applied to $f(x)$.

Definition 65 Cardinal Property: A cardinal property of a function $y = f(x)$ is one which does change when at least one monotonic transformation $g(x)$ is applied to $f(x)$.

With class grades the student with the highest grade (Sue) and the student with the lowest grade (Mary) are always the same no matter what monotonic $g(x)$ is applied. For functions Sue and Mary correspond to the global maximum and minimum $x^*$ and so we have:

Theorem 66 The global maximum (global minimum) $x^*$ of $f(x)$ is an ordinal property. If $f(x) = g(h(x))$ with $g'(x) > 0$ then $x^*$ is a global maximum (minimum) of $f(x)$ if and only if $x^*$ is a global maximum (minimum) of $h(x)$.

Example: Consider:

$$f(x) = x - \frac{1}{2}x^2$$

for $0 < x < 2$. You can easily show that $f(x)$ has a global maximum at $x^* = 1$. Now suppose we apply the monotonic transformation $g(x) = x^2$ (monotonic here because $f(x) > 0$ on $0 < x < 2$, so only positive arguments of $g$ are used) which leads us to:

$$h(x) = g(f(x)) = \left(x - \frac{1}{2}x^2\right)^2.$$

According to the theorem $x^* = 1$ is also a global maximum of $h(x)$. This can be seen from the plot of $f(x)$ and $h(x)$ below:

[Figure: plot of $f(x) = x - \frac{1}{2}x^2$ and $h(x) = \left(x - \frac{1}{2}x^2\right)^2$ for $x$ between 0 and 2.]

2.5.3 Concavity and Convexity are Cardinal Properties

We have seen that a very important property of a function $f(x)$ is whether it is mountain-like or concave, or whether it is valley-like or convex. Now we might ask, is global concavity or convexity an ordinal or a cardinal property of $f(x)$? In other words if $f(x)$ is globally concave (convex) does it follow that $g(f(x))$ is globally concave (convex) when $g(x)$ is monotonic? Surprisingly the answer is no!

Theorem 67 Concavity and Convexity are Cardinal properties. If $f(x)$ is concave (convex) then it does not follow that a monotonic transformation $h(x) = g(f(x))$ is concave (convex).

Proof. Here we use proof by counter-example. Suppose for $x > 0$ that $f(x) = x^{\frac{1}{2}}$. Then $f(x)$ is globally concave since:

$$f''(x) = -\frac{1}{4}x^{-\frac{3}{2}} < 0.$$

Now suppose we let $g(x) = x^4$ so that $g'(x) = 4x^3 > 0$ and so $g(x)$ is monotonic. Then:

$$h(x) = g(f(x)) = \left(x^{\frac{1}{2}}\right)^4 = x^2.$$

But $h(x) = x^2$ is globally convex (since $h''(x) = 2 > 0$). Thus while $f(x)$ is concave, its monotonic transformation $h(x)$ is not concave (in fact it is convex).


More generally note that if $h(x) = g(f(x))$ then using the chain rule:

$$h'(x) = g'(f(x)) \, f'(x) \implies h''(x) = \underbrace{g''(f(x))}_{?} \, \underbrace{(f'(x))^2}_{+} + \underbrace{g'(f(x))}_{+} \, f''(x).$$

We cannot show that $h''(x)$ and $f''(x)$ have the same sign because we do not know the sign of $g''(f(x))$; that is, we only make assumptions about the first derivative of $g(x)$ and not about the second derivative. This then is the basic reason why concavity and convexity are cardinal and not ordinal properties of a function.
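A numerical version of the counter-example $f(x) = x^{1/2}$, $g(x) = x^4$ may help. The sketch below approximates second derivatives with a central second difference; the helper name `second_difference`, the step size and the sample points are arbitrary choices made for illustration.

```python
def second_difference(func, x, step=1e-3):
    """Central finite-difference approximation to func''(x)."""
    return (func(x + step) - 2 * func(x) + func(x - step)) / step ** 2

def f(x):
    return x ** 0.5          # concave for x > 0

def g(y):
    return y ** 4            # increasing for y > 0

def h(x):
    return g(f(x))           # equals x ** 2, which is convex

points = [0.5, 1.0, 2.0, 5.0]
f_curvature = [second_difference(f, x) for x in points]
h_curvature = [second_difference(h, x) for x in points]

print(f_curvature)   # all negative
print(h_curvature)   # all close to 2
```

The sign of the curvature flips even though $g$ is increasing, because $g''(f(x)) \, (f'(x))^2$ is positive and large enough to outweigh $g'(f(x)) \, f''(x)$.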

2.5.4 Quasi-Concavity and Quasi-Convexity

Since concavity and convexity are cardinal properties, let us define a new kind of concavity and convexity, called quasi-concavity and quasi-convexity, which are ordinal properties of a function:

Definition 68 Quasi-Concavity: A function $f(x)$ is quasi-concave if and only if it is a monotonic transformation of a concave function; that is:

$$f(x) = g(h(x))$$

where $g'(x) > 0$ for all $x$ and $h(x)$ is globally concave.

Definition 69 Quasi-Convexity: A function $f(x)$ is quasi-convex if and only if it is a monotonic transformation of a convex function; that is:

$$f(x) = g(h(x))$$

where $g'(x) > 0$ and $h(x)$ is globally convex.

If $f(x)$ is convex (concave) then it is also quasi-convex (quasi-concave) since we can always let $g(x) = x$ (with $g'(x) = 1 > 0$) in which case $f(x) = h(x)$. Thus:

Theorem 70 All convex functions are quasi-convex but not all quasi-convex functions are convex.

Theorem 71 All concave functions are quasi-concave but not all quasi-concave functions are concave.

If $g(x)$ is monotonic in

$$f(x) = g(h(x)) \qquad (2.1)$$

then it follows that $g(x)$ has an inverse function, $\tilde{g}(x)$. If we apply $\tilde{g}(x)$ to both sides of (2.1) we free $h(x)$ from inside $g(x)$ and obtain:

$$h(x) = \tilde{g}(f(x)).$$

Thus if $f(x)$ is quasi-concave (quasi-convex) there exists a monotonic transformation of $f(x)$ which makes it concave (convex). We therefore have:


Theorem 72 A function $f(x)$ is quasi-concave (quasi-convex) if and only if there exists a monotonic transformation $\tilde{g}(x)$ such that:

$$h(x) = \tilde{g}(f(x))$$

is concave (convex).

Remark: There are thus two methods for showing that a function $f(x)$ is quasi-concave (quasi-convex). We can either 1) show that $f(x)$ is a monotonic transformation of a concave (convex) function or 2) show that a monotonic transformation of $f(x)$ is concave (convex).

Example: Consider the function:

$$f(x) = \frac{1}{1 + x^2}.$$

This function is not globally concave since:

$$f''(x) = \frac{2\left(3x^2 - 1\right)}{\left(1 + x^2\right)^3}$$

so that $f(x)$ is convex, or $f''(x) > 0$, for $|x| > \frac{1}{\sqrt{3}}$.

We can however show that $f(x)$ is quasi-concave.

Using the first method we have:

$$f(x) = g(h(x))$$

with monotonic transformation $g(x) = \frac{1}{1-x}$ (with $g'(x) = \frac{1}{(1-x)^2} > 0$) and concave function $h(x) = -x^2$ (with $h''(x) = -2 < 0$) since:

$$f(x) = g(h(x)) = \frac{1}{1 - h(x)} = \frac{1}{1 - (-x^2)} = \frac{1}{1 + x^2}.$$

Using the second method let $g(x) = -\frac{1}{x}$ be the monotonic transformation (since $g'(x) = \frac{1}{x^2} > 0$) that we apply to $f(x)$. We then obtain:

$$h(x) = g(f(x)) = -\frac{1}{\frac{1}{1+x^2}} = -\left(1 + x^2\right)$$

where $h(x) = -\left(1 + x^2\right)$ is globally concave since $h''(x) = -2 < 0$.

2.5.5 New Sufficient Conditions for a Global Maximum or Minimum

Suppose we have a function $f(x)$ that is quasi-concave (quasi-convex) so that $f(x) = g(h(x))$ where $h(x)$ is concave (convex). Suppose further that we have a solution to the first-order conditions $f'(x^*) = 0$. From the chain rule, and since $g'(h(x^*)) > 0$, this implies that:

$$f'(x^*) = g'(h(x^*)) \, h'(x^*) = 0 \implies h'(x^*) = 0.$$

Since $x^*$ is also a solution to the first-order conditions for $h(x)$ and since $h(x)$ is concave (convex) it follows that $x^*$ is a global maximum (minimum) for $h(x)$. Since a global maximum (minimum) is an ordinal property from Theorem 66, it follows that $x^*$ is a global maximum (minimum) for $f(x)$ as well! Thus we have the following sufficient conditions for a global maximum (minimum):

Theorem 73 If $f(x)$ is quasi-concave and $x^*$ satisfies the first-order conditions $f'(x^*) = 0$ then $x^*$ is the unique global maximum of $f(x)$.

Theorem 74 If $f(x)$ is quasi-convex and $x^*$ satisfies the first-order conditions $f'(x^*) = 0$ then $x^*$ is the unique global minimum of $f(x)$.

Remark: Since all concave (convex) functions are quasi-concave (quasi-convex) but not all quasi-concave (quasi-convex) functions are concave (convex), these sufficient conditions for a global maximum (minimum) are more widely applicable than the earlier sufficient conditions that relied on concavity (convexity).

Example: We have seen that the function

$$f(x) = \frac{1}{1 + x^2}$$

is quasi-concave. From the first-order conditions we have:

$$f'(x^*) = -\frac{2x^*}{\left(1 + x^{*2}\right)^2} = 0 \implies x^* = 0.$$

Since $f(x)$ is quasi-concave we conclude that $x^* = 0$ is a global maximum. This is illustrated in the plot below:

[Figure: plot of $f(x) = \frac{1}{1+x^2}$ for $x$ between $-4$ and 4.]
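The claim that $f(x) = \frac{1}{1+x^2}$ and its concave transform share the same maximizer can be checked by brute force. In this sketch the grid from $-4$ to $4$ in steps of 0.01 is an arbitrary choice, and $h$ is built with the transformation $g(y) = -1/y$ used in the second method.

```python
def f(x):
    return 1 / (1 + x ** 2)

def h(x):
    """h(x) = g(f(x)) with g(y) = -1/y, which equals -(1 + x**2)."""
    return -1 / f(x)

# Grid of x values from -4.00 to 4.00 in steps of 0.01.
grid = [k / 100 for k in range(-400, 401)]

argmax_f = max(grid, key=f)
argmax_h = max(grid, key=h)

print(argmax_f, argmax_h, f(argmax_f), h(argmax_h))
```

A grid search only examines finitely many points, so it illustrates rather than proves the theorem; the proof rests on the chain-rule argument and Theorem 66.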


2.6 Exponential Functions and Logarithms

Almost all of the functions we have considered so far involve terms of the form $x^a$ for some value of $a$. For some $a > 0$ consider reversing $a$ and $x$ to obtain a new kind of function:

Definition 75 Exponential Function: An exponential function takes the form:

$$f(x) = a^x$$

where $a > 0$ is referred to as the base.

Example: If we reverse the 2 and the $x$ in $x^2$ we obtain:

$$f(x) = 2^x$$

with $f(3) = 2^3 = 8$ and $f(-3) = 2^{-3} = \frac{1}{8}$. The exponential $f(x) = 2^x$ is illustrated below:

[Figure: plot of $f(x) = 2^x$ for $x$ between $-2$ and 5.]

Note from the graph that this function is non-negative, monotonic and convex.

In mathematics, as in economics, it turns out that there is a best base $a$ for exponentials $a^x$. This is the number $e$ defined by:

$$e \equiv 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots \quad \text{or} \quad e \equiv \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n.$$

One can show that the two definitions are equivalent and lead to:

$$e \approx 2.718281828.$$


Remark: The second definition of $e$ has an economic interpretation in terms of compound interest. If you put $1 in the bank at $r = 1$ or 100% interest compounded annually then after one year you will have $(1 + r)^1 = \$2$. Now suppose interest is compounded every $\frac{1}{n}$th of a year so that for $n = 2$, 3 and 365 (i.e., interest every 6 months, every 4 months and daily) you would have respectively:

$$\left(1 + \frac{r}{2}\right)^2 = \left(1 + \frac{1}{2}\right)^2 = 2.25, \quad \left(1 + \frac{r}{3}\right)^3 = \left(1 + \frac{1}{3}\right)^3 = 2.3704, \quad \left(1 + \frac{r}{365}\right)^{365} = 2.7146.$$

Thus as interest is compounded more and more often the amount of money you have after one year converges on $e$ dollars, or approximately $2.72, so that $e$ is the amount you would have from continuous compounding.

We have:

Definition 76 The exponential function to the base $e$, denoted as $f(x) = e^x$ or $f(x) = \exp(x)$, is defined as:

$$e^x \equiv \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n.$$

Remark 1: If you get confused with $e^x$ when using say the chain rule, try rewriting the problem with $e^x$ replaced by $\exp(x)$ and think of the letters exp taking the place of $f$ as in $f(x)$.

Remark 2: It follows from the definition that $e^r$ is the amount of money you would obtain from investing $1 at interest rate $r$ when interest is compounded continuously. This is why in economics you will often see expressions like $e^r$ for discounting and compound interest. For example one dollar at 10% interest or $r = 0.1$ compounded continuously will give you after 1 year:

$$e^{0.1} = 1.1052.$$
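The compound interest figures above are easy to reproduce. In the sketch below `compound` is a hypothetical helper name, and the value $n = 1{,}000{,}000$ is an arbitrary stand-in for "very frequent" compounding.

```python
import math

def compound(r, n):
    """Value after one year of $1 at annual rate r, compounded n times."""
    return (1 + r / n) ** n

# r = 1 (100% interest) with increasingly frequent compounding.
values = {n: compound(1.0, n) for n in (1, 2, 3, 365, 1_000_000)}

# Continuous compounding at r = 0.1, as in Remark 2.
continuous_10pct = math.exp(0.1)

for n, value in values.items():
    print(n, round(value, 4))
print(round(continuous_10pct, 4))
```

As $n$ grows the values approach `math.e`, in line with the limit definition of $e$.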

Mathematically the most important reason for choosing $e$ as the base is that the derivative of $f(x) = e^x$ is also $e^x$ so that:

Theorem 77 If $f(x) = e^x$ then $f'(x) = e^x$.

Proof. (Informal) From the definition we have that $e_n(x) \to e^x$ as $n \to \infty$ where:

$$e_n(x) = \left(1 + \frac{x}{n}\right)^n.$$

To find the derivative of $e^x$ differentiate $e_n(x)$ and then let $n \to \infty$ so that:

$$e_n'(x) = \left(1 + \frac{x}{n}\right)^{n-1} = \frac{e_n(x)}{\left(1 + \frac{x}{n}\right)}.$$

Now since $\lim_{n \to \infty} \left(1 + \frac{x}{n}\right) = 1$ we have:

$$\frac{de^x}{dx} = \lim_{n \to \infty} e_n'(x) = \frac{\lim_{n \to \infty} e_n(x)}{\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)} = e^x.$$

Since $e^x > 0$ it follows that $f'(x) = e^x > 0$ and so the function $f(x)$ is monotonic. Furthermore since $f''(x) = e^x > 0$ it follows that $e^x$ is globally convex. These properties are illustrated in the plot below:

[Figure: plot of $f(x) = e^x$ for $x$ between $-2$ and 3.]

Theorem 78 The function $f(x) = e^x$ has the following properties:

1. $e^x > 0$ for all $x$

2. $e^x$ is defined for all $x$ (it has an unrestricted domain)

3. $e^x$ is globally increasing (i.e., $f'(x) = e^x > 0$)

4. $e^x$ is globally convex (i.e., $f''(x) = e^x > 0$)

5. $e^0 = 1$

6. $e^x e^y = e^{x+y}$

7. $\left(e^x\right)^y = e^{xy}$

8. $e^{-x} = \frac{1}{e^x}$.

Since $f(x) = e^x$ is monotonic, it follows that it has an inverse function, which is the logarithm to the base $e$ or $\ln(x)$, defined as:


Definition 79 The function $\ln(x)$ is the inverse function of $e^x$ so that:

$$e^{\ln(x)} = x, \quad \ln(e^x) = x.$$

The function $\ln(x)$ is plotted below:

[Figure: plot of $\ln(x)$ for $x$ between 0 and 10.]

Remark: Note from the graph that $\ln(x)$ is not defined for $x \leq 0$.

We can use the fact that $\ln(x)$ is the inverse function of $e^x$ to prove that:

Theorem 80 The derivative of $\ln(x)$ is $\frac{d\ln(x)}{dx} = \frac{1}{x}$.

Proof. Since $\ln(x)$ is the inverse function of $e^x$ we have: $e^{\ln(x)} = x$. Differentiating both sides with respect to $x$ and using the chain rule we have:

$$\begin{aligned}
\frac{d}{dx} e^{\ln(x)} = \frac{d}{dx} x &\implies \underbrace{e^{\ln(x)}}_{=x} \, \frac{d}{dx}\ln(x) = 1 \\
&\implies x \, \frac{d}{dx}\ln(x) = 1 \\
&\implies \frac{d}{dx}\ln(x) = \frac{1}{x}.
\end{aligned}$$

We also have:

Theorem 81 The function $\ln(x)$ has the following properties:

1. $\ln(xy) = \ln(x) + \ln(y)$

2. $\ln\left(x^y\right) = y\ln(x)$

3. $\ln(x)$ is defined only for $x > 0$ (it has a restricted domain)

4. $\ln(x)$ can take on both negative and positive values (it has an unrestricted range)

5. $\ln(x)$ is globally increasing

6. $\ln(x)$ is globally concave

7. $\ln(1) = 0$

8. $\ln\left(\frac{1}{x}\right) = -\ln(x)$.

Proof. The first follows from $e^{\ln(x) + \ln(y)} = e^{\ln(x)}e^{\ln(y)} = xy$ and then taking the $\ln(\,)$ of both sides. The second follows from $x^y = \left(e^{\ln(x)}\right)^y = e^{y\ln(x)}$ and then taking the $\ln(\,)$ of both sides. The third follows from $e^x > 0$ for all $x$, and so if $x < 0$ we would have the contradiction $e^{\ln(x)} = x < 0$. The fifth follows since $\frac{d\ln(x)}{dx} = \frac{1}{x} > 0$ for $x > 0$. The sixth follows since $\frac{d^2\ln(x)}{dx^2} = -\frac{1}{x^2} < 0$. To show the seventh note that $e^0 = 1 = e^{\ln(1)}$, so that $\ln(1) = 0$. The final result follows from: $\ln\left(x \times \frac{1}{x}\right) = \ln(x) + \ln\left(\frac{1}{x}\right) = \ln(1) = 0$.

Remark: The function $\ln(x)$ gets used a lot in economics. For example in applied econometrics, rather than working directly with a price $P$ one usually works with $\ln(P)$. One of the reasons for this is that $\ln(x)$ converts multiplication into addition, and converts powers into multiplication.

Example 1: Suppose you had data on $Q$ and $P$ and wished to estimate a constant elasticity of demand curve $Q = AP^{\beta}$. Since this is a non-linear function you cannot directly apply linear regression to your data. However using the properties of the $\ln(\,)$ function we obtain:

$$Q = AP^{\beta} \implies \ln(Q) = \ln\left(AP^{\beta}\right) = \ln(A) + \ln\left(P^{\beta}\right) = \ln(A) + \beta\ln(P) \implies q = \alpha + \beta p$$

where $q = \ln(Q)$, $\alpha = \ln(A)$ and $p = \ln(P)$. You now have a linear relationship between $q$ and $p$ which can be estimated by linear regression. Furthermore the coefficient on the regressor $p$ is the elasticity of demand $\beta$.
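To see the log-linearization at work, the sketch below generates synthetic, noise-free data from $Q = AP^{\beta}$ and recovers $\beta$ and $A$ by regressing $q$ on $p$. The values $A = 50$ and $\beta = -1.5$ and the price list are made up purely for illustration, and the slope-and-intercept formulas used are the standard least squares formulas for a regression with a constant, which go slightly beyond the no-intercept case derived earlier.

```python
import math

# Synthetic, noise-free data from Q = A * P**beta with assumed values
# A = 50 and beta = -1.5 (illustrative numbers only, not from the text).
A_true, beta_true = 50.0, -1.5
prices = [1.0, 2.0, 4.0, 8.0, 16.0]
quantities = [A_true * P ** beta_true for P in prices]

# Take logs: q = alpha + beta * p with alpha = ln(A).
p = [math.log(P) for P in prices]
q = [math.log(Q) for Q in quantities]

# Ordinary least squares with an intercept.
n = len(p)
p_bar = sum(p) / n
q_bar = sum(q) / n
beta_hat = (sum((pi - p_bar) * (qi - q_bar) for pi, qi in zip(p, q))
            / sum((pi - p_bar) ** 2 for pi in p))
alpha_hat = q_bar - beta_hat * p_bar
A_hat = math.exp(alpha_hat)

print(beta_hat, A_hat)
```

With real data the points would not lie exactly on a line, and $\hat{\beta}$ would only approximate the elasticity.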

Example 2: Consider the function

$$y = f(x) = x^3 e^{-x}$$

for $x \geq 0$ which is plotted below:

[Figure: plot of $y = x^3 e^{-x}$ for $x$ between 0 and 10.]

To find the maximum of this function take the first-order conditions (using the product and chain rules) to obtain:

$$\begin{aligned}
f'(x) &= 3x^2 e^{-x} - x^3 e^{-x} = x^2 e^{-x}(3 - x) \\
&\implies f'(x^*) = x^{*2} e^{-x^*}(3 - x^*) = 0 \\
&\implies (3 - x^*) = 0
\end{aligned}$$

which has a solution $x^* = 3$. Note that $x^{*2} > 0$ for $x^* > 0$ and $e^{-x^*} > 0$ since $e^x > 0$ for all $x$.

Here we cannot show that $x^* = 3$ is a global maximum by showing global concavity since $f(x)$ is not globally concave. This follows since:

$$f''(x) = x e^{-x}\left(x^2 - 6x + 6\right) = x e^{-x}\left(x - \left(3 - \sqrt{3}\right)\right)\left(x - \left(3 + \sqrt{3}\right)\right)$$

and so $f(x)$ is concave in the interval $3 - \sqrt{3} < x < 3 + \sqrt{3}$ and convex outside this interval.

We can show however that $f(x)$ is quasi-concave since:

$$f(x) = g(h(x))$$

where:

$$g(x) = e^x$$

is monotonic and:

$$h(x) = 3\ln(x) - x$$

is globally concave since:

$$h''(x) = -\frac{3}{x^2} < 0.$$

Here $g(h(x)) = e^{3\ln(x) - x} = x^3 e^{-x}$ as required. It follows that $x^* = 3$ is a global maximum for both $h(x)$ and $f(x)$.
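A brute-force check of this example: the sketch below evaluates both $f(x) = x^3 e^{-x}$ and $h(x) = 3\ln(x) - x$ on a grid and locates the maximizer of each. The grid (0.01 to 10 in steps of 0.01) is an arbitrary choice made here.

```python
import math

def f(x):
    return x ** 3 * math.exp(-x)

def h(x):
    """Concave function with f(x) = exp(h(x)) for x > 0."""
    return 3 * math.log(x) - x

# Grid of x values 0.01, 0.02, ..., 10.00 (x > 0 so that ln(x) is defined).
grid = [k / 100 for k in range(1, 1001)]

x_star_f = max(grid, key=f)
x_star_h = max(grid, key=h)

print(x_star_f, x_star_h, f(x_star_f))
```

Both searches should land on $x = 3$, illustrating that the monotonic transformation $e^x$ leaves the location of the maximum unchanged.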

Example 3: The standard normal distribution, easily the most important probability distribution, has a probability density given by:

$$p(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$$

which is plotted below:

[Figure: plot of $p(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$ for $x$ between $-4$ and 4.]

Note from the graph that $p(x)$ appears to be symmetric around 0. This is in fact the case since $p(-x) = p(x)$ as:

$$p(-x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(-x)^2} = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2} = p(x).$$

The mode of $p(x)$ (the value of $x$ at which $p(x)$ reaches its maximum) is at $x^* = 0$. To show this we use the chain rule to obtain the first-order conditions as:

$$\begin{aligned}
p'(x) &= \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2} \times (-x) \\
&\implies p'(x^*) = 0 \implies \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^{*2}} \times (-x^*) = 0 \\
&\implies x^* = 0.
\end{aligned}$$

We might try to show that $x^* = 0$ is a global maximum by showing that $p(x)$ is globally concave. We have however that:

$$p''(x) = -\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2} + \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2} \times x^2 = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\left(x^2 - 1\right)$$

from which it follows that $p(x)$ is concave for $-1 < x < 1$ but convex for $x > 1$ or $x < -1$. It follows then that $p(x)$ is not globally concave.

We can however show that $p(x)$ is quasi-concave since

$$p(x) = \exp\left(\ln\left(\frac{1}{\sqrt{2\pi}}\right) - \frac{1}{2}x^2\right)$$

with monotonic function $g(x) = \exp(x)$ and

$$h(x) = \ln\left(\frac{1}{\sqrt{2\pi}}\right) - \frac{1}{2}x^2, \quad h''(x) = -1 < 0$$

so that $h(x)$ is globally concave. It follows then that $p(x)$ is quasi-concave so that $x^* = 0$ is a global maximum.

It will sometimes occur that you are confronted with exponential functions which do not have $e$ as the base, and less frequently logarithms which are not to the base $e$, such as $\log_{10}(x)$. Given such problems the best strategy is to convert the problem from base $a$ to base $e$ using:

Theorem 82 The functions $a^x$ or $\log_a(x)$ can be converted to base $e$ using:

1. $a^x = e^{\ln(a)x}$

2. $\log_a(x) = \frac{\ln(x)}{\ln(a)}$.

Proof. Since $a = e^{\ln(a)}$ we have: $a^x = \left(e^{\ln(a)}\right)^x = e^{\ln(a)x}$. To derive the second result use: $x = a^{\log_a(x)} = \left(e^{\ln(a)}\right)^{\log_a(x)} = e^{\ln(a)\log_a(x)}$ and take the $\ln(\,)$ of both sides.

Example 1: Given the function

$$y = 2^x$$

we can convert to base $e$ using

$$2^x = \left(e^{\ln(2)}\right)^x = e^{\ln(2)x}.$$

It then follows that:

$$\frac{d}{dx} 2^x = \ln(2) \, e^{\ln(2)x} = \ln(2) \, 2^x$$

using the chain rule.
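The base-conversion rules and the derivative of $2^x$ can be verified numerically. In this sketch the function names `exp_base`, `log_base` and `d_exp_base` are illustrative, and the derivative is compared against a central finite difference at the arbitrarily chosen point $x = 3$.

```python
import math

def exp_base(a, x):
    """a**x computed as exp(ln(a) * x)."""
    return math.exp(math.log(a) * x)

def log_base(a, x):
    """log_a(x) computed as ln(x) / ln(a)."""
    return math.log(x) / math.log(a)

def d_exp_base(a, x):
    """Derivative of a**x with respect to x, namely ln(a) * a**x."""
    return math.log(a) * a ** x

# Central finite-difference approximation to d/dx 2**x at x = 3.
step = 1e-6
numeric = (2 ** (3 + step) - 2 ** (3 - step)) / (2 * step)

print(exp_base(2, 3), log_base(10, 1000), d_exp_base(2, 3), numeric)
```

Python's `math.log` also accepts a base as an optional second argument, but writing the ratio out mirrors the theorem more directly.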

Example 2: If you recall we never defined $x^a$ for a non-integer $a$. In fact it is defined using $e^x$ and $\ln(x)$ as:

$$x^a \equiv e^{a\ln(x)}.$$

Thus the reason $x^a$ is not defined for $x < 0$ is that $\ln(x)$ is not defined there. Using this definition we can prove that:

Theorem 83 For $x > 0$ and for any $a$ we have:

$$\frac{dx^a}{dx} = ax^{a-1}.$$

Proof. Using the definition of $x^a$ and the chain rule we find that:

$$\frac{d}{dx}x^a = \frac{d}{dx}e^{a\ln(x)} = a\,\frac{1}{x}\,e^{a\ln(x)} = ax^{-1}x^a = ax^{a-1}.$$

Example 3: For the function:

$$f(x) = x^x$$

we now have $x$ in the base and the exponent! We have no direct rule for calculating derivatives of this function. We can however change from base $x$ to base $e$ as:

$$f(x) = x^x = \left(e^{\ln(x)}\right)^x = e^{x\ln(x)}.$$

Therefore using the chain and product rules yields:

$$\begin{aligned}
f'(x) &= (1 + \ln(x)) \, e^{x\ln(x)} \\
&\implies f'(x^*) = 0 = (1 + \ln(x^*)) \, e^{x^*\ln(x^*)} \\
&\implies 1 + \ln(x^*) = 0 \quad \text{since } e^{x^*\ln(x^*)} > 0 \\
&\implies \ln(x^*) = -1 \\
&\implies x^* = e^{-1} = 0.36788.
\end{aligned}$$

Furthermore $f(x)$ is globally convex since:

$$f''(x) = e^{x\ln(x)}\left((1 + \ln(x))^2 + \frac{1}{x}\right) > 0$$

and hence $x^* = e^{-1}$ is a global minimum. This is illustrated in the plot below:

[Figure: plot of $f(x) = x^x$ for $x$ between 0 and 1.]
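As a numerical companion to this example, the sketch below evaluates the derivative formula at $x^* = e^{-1}$ and compares the analytic minimizer with a grid search. The grid spacing of 0.001 on $(0, 1]$ is an arbitrary choice.

```python
import math

def f(x):
    return x ** x

def f_prime(x):
    """Derivative of x**x: (1 + ln(x)) * exp(x * ln(x))."""
    return (1 + math.log(x)) * math.exp(x * math.log(x))

x_star = math.exp(-1)

# Grid search over x = 0.001, 0.002, ..., 1.000.
grid = [k / 1000 for k in range(1, 1001)]
x_grid = min(grid, key=f)

print(round(x_star, 5), x_grid, f_prime(x_star))
```

The grid minimizer can only agree with $e^{-1}$ up to the grid spacing, which is why the comparison is approximate.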

Example 4: On your calculator you will see $\log_{10}(x)$, which is the logarithm to the base 10 instead of base $e$. To find its derivative we use:

$$\log_{10}(x) = \frac{1}{\ln(10)}\ln(x)$$

so that:

$$\frac{d\log_{10}(x)}{dx} = \frac{1}{\ln(10)\,x}.$$

2.6.1 Exponential Growth and the Rule of 72

Eighty percent of rules of thumb only apply 20 percent of the time-David Gunn

Suppose we replace x by t; and think of t as time, and imagine that y issome variable (population, GNP etc.) that grows with time so that y = f (t) :

Theorem 84 The growth rate of y per unit period of t (e.g. growth per year)is:

lim¢t!0

f (t+¢t)¡ f (t)f (t)¢t

=f 0 (t)f (t)

:

Many economic variables appear to growth at approximately the same rateover time. For example since the industrial revolution many advanced economieshave grown at an average of around 2% a year.There is a functional form which has the property that the growth rate

remains constant over time. We have:

Page 91: Introduction to Mathematical Economics Part 1 - Loglinear Publications

CHAPTER 2. UNIVARIATE CALCULUS 83

Theorem 85 The function f (t) = Ae¹t grows at a constant rate ¹ for all t:

Proof. Using the chain rule and the properties of ex we have:

f 0 (t)f (t)

=¹Ae¹t

Ae¹t= ¹:

Example: Thus if t is measured in years and ¹ = 0:03; then y = Ae0:03t growsat 3% every year.

One way of understanding the implications of different growth rates is the time it takes for $y$ to double. Let this be $\Delta t$, which then satisfies $f(t + \Delta t) = 2 f(t)$ or

$$A e^{\mu (t + \Delta t)} = 2 A e^{\mu t}.$$

Solving for $\Delta t$ we find that

$$\Delta t = \frac{\ln(2)}{\mu} = \frac{0.69315}{\mu} \approx \frac{72}{\mu \times 100\%}$$

which gives the rule of 72, where 72 is chosen because it is a nice number with lots of divisors that is not too far away from 69.

Thus if GNP grows at 2% a year, it will double approximately every $\frac{72}{2}$ or 36 years. On the other hand if GNP grows at 4% a year it will double every $\frac{72}{4}$ or 18 years.

This can make a huge difference since the economy that doubles every 18 years will be 4 times as large after 36 years while the economy which grows at 2% will only be twice as large. Thus imagine two countries with identical GNP at $t = 0$ (say in 1945) but where one grows at 2%/year and the other grows at 4%/year:

[Figure: plot of $e^{0.02t}$ and $e^{0.04t}$ for $0 \le t \le 50$.]

Small differences in growth rates make a huge difference! As the above graph illustrates, after 55 years (the time from 1945 to today, 2000) the country that grows at 4%/year will have an economy three times as large as the economy that grows at 2%/year.
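The doubling-time arithmetic can be checked numerically. The following is only an illustrative sketch: the function names `doubling_time` and `rule_of_72` are made up for this example, and continuous growth of the form $Ae^{\mu t}$ is assumed throughout.

```python
import math

def doubling_time(mu):
    """Exact doubling time ln(2)/mu for continuous growth at rate mu."""
    return math.log(2) / mu

def rule_of_72(percent):
    """Rule-of-thumb doubling time: 72 divided by the growth rate in percent."""
    return 72 / percent

for percent in (2, 4):
    mu = percent / 100
    print(percent, round(doubling_time(mu), 2), rule_of_72(percent))

# Ratio of the two economies after 55 years: e^(0.04*55) / e^(0.02*55) = e^1.1
ratio_after_55 = math.exp(0.04 * 55) / math.exp(0.02 * 55)
print(round(ratio_after_55, 2))
```

The exact doubling times are slightly shorter than the rule of 72 suggests, which is expected since $\ln(2) \times 100 \approx 69.3$ rather than 72.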


2.7 Taylor Series

Although you may not be aware of it, calculus is really a method for approximating functions with polynomials. For example a derivative corresponds to the slope of a tangent line, this tangent line being a first degree polynomial. Second derivatives basically involve approximating a function with a second degree polynomial or quadratic.

The key concept which links these polynomial approximations with derivatives is the Taylor Series.

Theorem 86 Taylor Series: A function $f(x)$ can be approximated at $x = x_0$ by an $n$th order polynomial $\tilde{f}(x)$, called a Taylor Series, given by:

$$\tilde{f}(x) = f(x_0) + f^{1}(x_0)(x - x_0) + \frac{f^{2}(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{n}(x_0)}{n!}(x - x_0)^n$$

where $f^{n}(x_0)$ is the $n$th derivative of $f(x)$ evaluated at $x = x_0$.

Remark: The approximation of $\tilde{f}(x)$ to $f(x)$ gets better the closer $x$ is to $x_0$. When $x = x_0$ the approximation becomes exact, that is $\tilde{f}(x_0) = f(x_0)$, since the terms $(x - x_0)^n$ become 0.

Example 1: The first-order Taylor Series:

$$\tilde{f}(x) = f(x_0) + f'(x_0)(x - x_0)$$

approximates an arbitrary function $f(x)$ by a line.

Consider:

$$f(x) = x^2 + 9$$

and let us construct a first-order Taylor series around $x_0 = 1$. We need to calculate two numbers, $f(x_0)$ and $f'(x_0)$, as:

$$f(x) = x^2 + 9 \Longrightarrow f(x_0) = f(1) = 1^2 + 9 = 10$$

$$f'(x) = 2x \Longrightarrow f'(x_0) = f'(1) = 2 \times 1 = 2$$

so that:

$$\tilde{f}(x) = f(1) + f'(1)(x - 1) = 10 + 2(x - 1) = 8 + 2x.$$

To see how the approximation works, consider an $x$ close to $x_0 = 1$, say $x = 1.2$. Then we have:

$$f(1.2) = (1.2)^2 + 9 = 10.44$$

while the Taylor series approximation gives:

$$\tilde{f}(1.2) = 10 + 2(1.2 - 1) = 10.4.$$

On the other hand if $x$ is far from $x_0 = 1$, say $x = 10$, then:

$$f(10) = 10^2 + 9 = 109$$

$$\tilde{f}(10) = 10 + 2(10 - 1) = 28$$

and $\tilde{f}(x)$ does a poor job of approximating $f(x)$.

A plot of $f(x) = x^2 + 9$ and its straight-line Taylor series approximation $\tilde{f}(x) = 10 + 2(x - 1)$ is given below:

[Figure: plot of $f(x)$ and $\tilde{f}(x)$ for $-4 \le x \le 4$.]

Example 2: The second-order Taylor Series is given by:

$$\tilde{f}(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)(x - x_0)^2}{2}$$

which approximates the function $f(x)$ at $x_0$ by a quadratic.

If

$$f(x) = x^3 + 9$$

then in order to calculate a second-order Taylor series around $x_0 = 1$ we need to calculate three numbers: $f(x_0)$, $f'(x_0)$ and $f''(x_0)$, which are given by:

$$f(x_0) = 1^3 + 9 = 10$$

$$f'(x) = 3x^2 \Longrightarrow f'(x_0) = f'(1) = 3 \times 1^2 = 3$$

$$f''(x) = 6x \Longrightarrow f''(x_0) = f''(1) = 6 \times 1 = 6$$

and so the second-order Taylor series is:

$$\tilde{f}(x) = f(1) + f'(1)(x - 1) + \frac{f''(1)}{2}(x - 1)^2 = 10 + 3(x - 1) + \frac{6}{2}(x - 1)^2 = 3x^2 - 3x + 10.$$

To see how the approximation does, let us first pick an $x$ close to $x_0 = 1$, say $x = 1.2$. We then have:

$$f(1.2) = (1.2)^3 + 9 = 10.728$$

while the Taylor Series approximation gives

$$\tilde{f}(1.2) = 10 + 3(1.2 - 1) + \frac{6}{2}(1.2 - 1)^2 = 10.72.$$

On the other hand if we choose an $x$ far from $x_0 = 1$, say $x = 7$, we obtain:

$$f(7) = (7)^3 + 9 = 352$$

$$\tilde{f}(7) = 10 + 3(7 - 1) + \frac{6}{2}(7 - 1)^2 = 136$$

and so we obtain a poor approximation.

A plot of the cubic $x^3 + 9$ and its quadratic second-order Taylor series approximation around $x_0 = 1$ is given below.

[Figure: plot of $f(x)$ and $\tilde{f}(x)$ for $-2 \le x \le 4$.]
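The two worked examples can be reproduced with a few lines of code. This is a minimal sketch, not part of the original text: the helper `taylor_poly` is a hypothetical name, and the derivatives at $x_0$ are typed in by hand rather than computed symbolically.

```python
from math import factorial

def taylor_poly(derivs_at_x0, x0, x):
    """Evaluate sum_k f^k(x0)/k! * (x - x0)^k.

    derivs_at_x0[k] is the k-th derivative of f at x0 (k = 0 is f itself).
    """
    return sum(d / factorial(k) * (x - x0) ** k
               for k, d in enumerate(derivs_at_x0))

# Example 1: f(x) = x^2 + 9 around x0 = 1, with f(1) = 10 and f'(1) = 2
first_order_near = taylor_poly([10, 2], 1, 1.2)   # compare with f(1.2) = 10.44
first_order_far = taylor_poly([10, 2], 1, 10)     # compare with f(10) = 109

# Example 2: f(x) = x^3 + 9 around x0 = 1, with f(1) = 10, f'(1) = 3, f''(1) = 6
second_order_near = taylor_poly([10, 3, 6], 1, 1.2)  # compare with f(1.2) = 10.728
second_order_far = taylor_poly([10, 3, 6], 1, 7)     # compare with f(7) = 352

print(first_order_near, first_order_far, second_order_near, second_order_far)
```

In both examples the approximation is good near $x_0 = 1$ and poor far from it.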

2.7.1 The Error of the Taylor Series Approximation

A natural question to ask is how well $\tilde{f}(x)$ approximates $f(x)$. The French mathematician Lagrange showed that the error of an $n$th order Taylor series approximation is equal to the $(n+1)$th term with $x_0$ replaced by $\bar{x}$ in the derivative, where $\bar{x}$ lies between $x$ and $x_0$. Thus:

Theorem 87 The error of the $n$th order Taylor series approximation is given by:

$$\frac{f^{n+1}(\bar{x})\,(x - x_0)^{n+1}}{(n+1)!}$$

so that:

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \cdots + \frac{f^{n}(x_0)}{n!}(x - x_0)^n + \frac{f^{n+1}(\bar{x})\,(x - x_0)^{n+1}}{(n+1)!}$$

where $\bar{x}$ lies between $x_0$ and $x$.

Example: For the first-order Taylor series we have:

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(\bar{x})}{2!}(x - x_0)^2$$

where $\bar{x}$ lies between $x_0$ and $x$.

To see how this can be used, let us now prove that a concave (convex) function has a unique global maximum (minimum) at $x^*$.

Proof. Suppose that $x^*$ solves the first-order conditions, $f'(x^*) = 0$, and that $f''(x) < 0$ for all $x$. A first-order Taylor series (with the error term) of $f(x)$ around $x_0 = x^*$ takes the form:

$$f(x) = f(x^*) + \overbrace{f'(x^*)}^{=0}(x - x^*) + \frac{f''(\bar{x})}{2!}(x - x^*)^2 = f(x^*) + \frac{f''(\bar{x})}{2!}(x - x^*)^2.$$

Now since $f(x)$ is concave (convex), $f''(x) < 0$ for all $x$ ($f''(x) > 0$ for all $x$), and it follows that $f''(\bar{x}) < 0$ ($f''(\bar{x}) > 0$). If $x \neq x^*$ then $(x - x^*)^2 > 0$ and hence, in the concave case, we have:

$$f(x) = f(x^*) + \overbrace{\frac{f''(\bar{x})}{2!}}^{-}(x - x^*)^2 < f(x^*).$$

This says that for any $x \neq x^*$ we have $f(x) < f(x^*)$, so that $x^*$ is a global maximum. In the convex case the inequality is reversed and $x^*$ is a global minimum.


2.7.2 The Taylor Series for ex and ln (1 + x)

Consider calculating a Taylor series for $e^x$ around $x_0 = 0$. The $n$th term is of the form:

$$\frac{f^{n}(x_0)}{n!}(x - x_0)^n = \frac{x^n}{n!}$$

since if $f(x) = e^x$ then $f^{n}(x) = e^x$ and hence $f^{n}(0) = e^0 = 1$. It turns out that by letting $n \to \infty$ we obtain an exact result, which is often used to define $e^x$ as follows:

Theorem 88 The infinite order Taylor series for $e^x$ around $x_0 = 0$ is exact for all $x$ and is given by:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.$$

As an exercise take the derivative of both sides and show that $f'(x) = e^x$.

Another important result is the Taylor series for $\ln(1 + x)$:

Theorem 89 The Taylor series of $\ln(1 + x)$ around $x_0 = 0$:

$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$

is exact for $|x| < 1$.

Example: From the first-order Taylor for $\ln(1 + x)$ we have:

$$\ln(1 + x) \approx x$$

for $x$ small. For example:

$$\ln(1 + 0.1) = 0.09531 \approx 0.1.$$
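Both series are easy to check numerically by summing a finite number of terms. The sketch below uses hypothetical helper names (`exp_series`, `log1p_series`) and compares the partial sums with Python's built-in `math.exp` and `math.log`.

```python
import math

def exp_series(x, n_terms):
    """Partial sum 1 + x + x^2/2! + ... using the first n_terms terms."""
    return sum(x ** k / math.factorial(k) for k in range(n_terms))

def log1p_series(x, n_terms):
    """Partial sum x - x^2/2 + x^3/3 - ... using n_terms terms (assumes |x| < 1)."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n_terms + 1))

# e^1 from 15 terms of the series versus the library value
print(exp_series(1.0, 15), math.exp(1.0))

# ln(1.1): the one-term (first-order) approximation, ten terms, and the library value
print(log1p_series(0.1, 1), log1p_series(0.1, 10), math.log(1.1))
```

With one term the logarithm series returns just $x$, which is the first-order approximation $\ln(1 + x) \approx x$ used in the example above.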

The Taylor series of $\ln(1 + x)$ can be used to define an alternative measure of percentage that is very useful in economics. Suppose you want to calculate the percentage change from $x_1$ to $x_2$. The normal way of doing this would be as:

$$\frac{x_2 - x_1}{x_1} \times 100\%.$$

Thus if $x_2 = 110$ and $x_1 = 100$ the percentage change so defined is 10%.

All definitions of percentage suffer from the fact that the choice of the base is to some extent arbitrary. Thus instead of using 100 as the base we could equally well have used 110, or indeed any number between 100 and 110 such as the midpoint 105. If for example we had used 110 as the base we would have as our definition of percentage:

$$\frac{x_2 - x_1}{x_2} \times 100\%$$

which would lead to a percentage change of 9.0909%.

Now consider calculating the definition of percentage as:

$$(\ln(x_2) - \ln(x_1)) \times 100\%$$

or equivalently:

$$\ln\left(\frac{x_2}{x_1}\right) \times 100\%.$$

Using this definition we would get 9.531%, which is intermediate between the two other definitions of percentage.

To show why this third definition is sensible, use a first-order Taylor series approximation of $\ln(1 + x) \approx x$, noting that:

$$\ln\left(\frac{x_2}{x_1}\right) = \ln\left(1 + \frac{x_2 - x_1}{x_1}\right) \approx \frac{x_2 - x_1}{x_1}.$$
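The three definitions of percentage change can be compared side by side. This is a small illustrative sketch; the function names are invented for the example and are not from the text.

```python
import math

def pct_change_base_x1(x1, x2):
    """Percentage change using the starting value x1 as the base."""
    return (x2 - x1) / x1 * 100

def pct_change_base_x2(x1, x2):
    """Percentage change using the ending value x2 as the base."""
    return (x2 - x1) / x2 * 100

def pct_change_log(x1, x2):
    """Logarithmic percentage change: (ln(x2) - ln(x1)) * 100."""
    return (math.log(x2) - math.log(x1)) * 100

x1, x2 = 100, 110
print(pct_change_base_x1(x1, x2))   # base 100
print(pct_change_base_x2(x1, x2))   # base 110
print(pct_change_log(x1, x2))       # lies between the other two
```

For small changes the three measures are close to one another; they drift apart as the change becomes large.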

2.7.3 L’Hôpital’s Rule

Consider the following problem. Suppose two functions $f(x)$ and $g(x)$ have the property that $f(x_0) = 0$ and $g(x_0) = 0$. We wish to find out what happens to $\frac{f(x)}{g(x)}$ as $x \to x_0$. In general the ratio $\frac{0}{0}$ is indeterminate so it is not clear what the limit is. Consider using a first-order Taylor series for $f(x)$ and $g(x)$ around $x_0$, an approximation which will get better as $x \to x_0$.

Since $f(x_0) = g(x_0) = 0$ we have:

$$\frac{f(x)}{g(x)} \approx \frac{f(x_0) + f'(x_0)(x - x_0)}{g(x_0) + g'(x_0)(x - x_0)} = \frac{f'(x_0)(x - x_0)}{g'(x_0)(x - x_0)} = \frac{f'(x_0)}{g'(x_0)}.$$

This yields L’Hôpital’s rule:

Theorem 90 L’Hôpital’s Rule I If $f(x_0) = g(x_0) = 0$ then:

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}.$$

Another version of L’Hôpital’s rule is:

Theorem 91 L’Hôpital’s Rule II If $f(x_0) = g(x_0) = \infty$ then:

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}.$$

Remark: L’Hôpital’s rule does not work if either $f(x)$ or $g(x)$ does not approach 0 or $\infty$.

Example: Suppose $f(x) = x^2 - 1$ and $g(x) = x - 1$ so that $f(1) = g(1) = 0$. Since $f'(x) = 2x$ and $g'(x) = 1$ we have:

$$\lim_{x \to 1} \frac{x^2 - 1}{x - 1} = \lim_{x \to 1} \frac{2x}{1} = 2.$$
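A quick numerical check of this example: evaluate the ratio at points approaching $x_0 = 1$ and compare with the ratio of derivatives. This is only a sketch for illustration; it checks one example and does not prove the rule.

```python
def f(x):
    return x ** 2 - 1

def g(x):
    return x - 1

def f_prime(x):
    return 2 * x

def g_prime(x):
    return 1.0

# The ratio f(x)/g(x) is 0/0 at x = 1 itself, so approach 1 from the right.
for h in (1e-1, 1e-3, 1e-6):
    x = 1 + h
    print(h, f(x) / g(x))

# L'Hopital's rule: the limit equals f'(1)/g'(1)
lhopital_limit = f_prime(1) / g_prime(1)
print(lhopital_limit)
```

The printed ratios approach 2 as $h$ shrinks, matching the value given by the ratio of derivatives.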

2.7.4 Newton’s Method

Consider the problem of finding a root of a function $f(x)$; that is, we wish to calculate an $x^+$ which satisfies

$$f(x^+) = 0.$$

This is a very common problem in economics and econometrics. For example suppose we wish to minimize or maximize a function $g(x)$. Then we would want to calculate the root of $g'(x)$, that is the $x^+ = x^*$ which satisfies the first-order conditions:

$$g'(x^*) = 0.$$

Solving for roots is easy to do for linear functions, quadratics and certain other special functions. Generally speaking functions for which a formula exists for calculating a root are the exception.

Although in general there exists no formula, there do exist numerical methods for calculating $x^+$. These numerical methods, combined with the use of computers, make the solving of these sorts of problems routine today.

The basic method was invented by Newton and involves approximating $f(x)$ with a first-order Taylor series. The first step is to make an educated guess what the root $x^+$ might be. Call this guess $x_0$ and approximate $f(x)$ around $x_0$ using a first-order Taylor series so that:

$$f(x) \approx \tilde{f}(x) = f(x_0) + f'(x_0)(x - x_0).$$

Although we cannot solve $f(x^+) = 0$, it is easy to solve $\tilde{f}(x) = 0$ since $\tilde{f}(x)$ is just a linear function. Let $x_1$ be the value of $x$ that solves $\tilde{f}(x) = 0$ so that:

$$\tilde{f}(x_1) = 0 \Longrightarrow f(x_0) + f'(x_0)(x_1 - x_0) = 0 \Longrightarrow x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.$$

While $x_1$ is not a root of $f(x)$ it will generally be closer to $x^+$ than $x_0$. To get an even better estimate of $x^+$ we apply the same method again but now using $x_1$, that is we use:

$$f(x) \approx \tilde{f}(x) = f(x_1) + f'(x_1)(x - x_1)$$

so that solving for $\tilde{f}(x_2) = 0$ we obtain:

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}.$$

The new guess $x_2$ will generally be closer to $x^+$ than $x_1$.

This procedure can be repeated again and again using:

$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}$$

until $x_n$ is close enough to $x^+$. This is Newton’s method, which is represented graphically below.


Example: Suppose you wish to find a root of:

$$f(x) = x^7 - 3x + 1$$

so that $x^+$ satisfies

$$x^7 - 3x + 1 = 0.$$

To apply Newton’s method we will need:

$$f'(x) = 7x^6 - 3.$$

Let $x_0 = 1$ be our initial guess. Then since

$$f(1) = 1^7 - 3 \times 1 + 1 = -1, \quad f'(1) = 7 \times 1^6 - 3 = 4$$

we have:

$$\tilde{f}(x) = -1 + 4(x - 1)$$

so that:

$$\tilde{f}(x_1) = 0 \Longrightarrow -1 + 4(x_1 - 1) = 0 \Longrightarrow x_1 = 1.25.$$

Thus $x_1 = 1.25$ is our new guess of what $x^+$ is. Repeating the procedure with $x_1 = 1.25$ we find that

$$f(1.25) = 2.0184, \quad f'(1.25) = 23.703$$

so that:

$$\tilde{f}(x) = 2.0184 + 23.703(x - 1.25)$$

$$\tilde{f}(x_2) = 0 \Longrightarrow x_2 = 1.1648.$$

Repeating this again with $x_2 = 1.1648$ we find that:

$$\tilde{f}(x) = f(1.1648) + f'(1.1648)(x - 1.1648)$$

$$\tilde{f}(x_3) = 0 \Longrightarrow x_3 = 1.1362.$$

To obtain more precision we iterate one more time with $x_3 = 1.1362$ so that:

$$\tilde{f}(x) = 0.035863 + 12.06(x - 1.1362)$$

$$\tilde{f}(x_4) = 0 \Longrightarrow x_4 = 1.1332.$$

The actual root, to 5 significant figures, is $x^+ = 1.1332$, so our solution $x_4$ already agrees with it to the precision shown. For many purposes this is close enough although further accuracy can be obtained by iterating further.

You might want to try and find the other two real roots $x = -1.2492$ and $x = 0.33349$, which can be found by using other starting values (try the starting values $x_0 = -1.5$ and $x_0 = 0$ respectively). You can see all three roots graphically below:

[Figure: plot of $f(x) = x^7 - 3x + 1$ for $-1.5 \le x \le 1.5$, with $y$ on the vertical axis.]
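The iteration $x_n = x_{n-1} - f(x_{n-1})/f'(x_{n-1})$ is short enough to code directly. The sketch below is a minimal illustration, not a robust root finder: the function name `newton`, the tolerance and the iteration cap are all assumptions made for this example, and it does not guard against $f'(x) = 0$.

```python
def newton(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Repeat x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1}) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x = x - step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

def f(x):
    return x ** 7 - 3 * x + 1

def f_prime(x):
    return 7 * x ** 6 - 3

# Three starting guesses, one for each real root of x^7 - 3x + 1
roots = [newton(f, f_prime, x0) for x0 in (1.0, -1.5, 0.0)]
print([round(r, 4) for r in roots])
```

Which root the method finds depends on the starting guess, which is why different values of $x_0$ are needed to locate all three roots.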

2.8 Technical Issues

2.8.1 Continuity and Differentiability

Nothing is accomplished all at once, and it is one of my great maxims... that nature makes no leaps.... This law of continuity declares that we pass from the small to the great - and the reverse - through the medium, in degree as well as in parts. -Leibniz

Natura non facit saltum (nature does not make a jump).

-On the title page of Alfred Marshall’s 1890 Principles of Economics

Not all functions are continuous, and not all functions have a derivative. These functions can often be ignored for the purposes of intermediate economics as mathematical freaks. Nevertheless there is some virtue in having in the back of your mind the idea that sometimes continuity and differentiability are issues.

Example 1: The function:

$$f(x) = \begin{cases} x^2 & \text{if } x \ge 2 \\ -5x & \text{if } x < 2 \end{cases}$$

plotted below:

[Figure: A Discontinuous Function, plotted for $-4 \le x \le 4$ with $y$ on the vertical axis.]

is not continuous at $x = 2$ and so does not have a derivative or a slope at $x = 2$. Thus at $x = 2$ we really cannot say if $f(x)$ is increasing or decreasing.

Example 2: In order to have a derivative a function must be continuous, but there are continuous functions that do not have derivatives. For example the function:

$$f(x) = -|x - 1|^{1/2}$$

which is plotted below:

[Figure: plot of $f(x) = -|x - 1|^{1/2}$ for $-4 \le x \le 8$, with $y$ on the vertical axis.]

is continuous but does not have a derivative at $x = 1$. In particular $f'(x) = \frac{1}{2\sqrt{|x - 1|}}$ for $x < 1$ and $f'(x) = -\frac{1}{2\sqrt{|x - 1|}}$ for $x > 1$, so that $f'(x) \to +\infty$ as $x \to 1$ from the left and $f'(x) \to -\infty$ as $x \to 1$ from the right. The problem is that the function has a kink at $x = 1$, and so its derivative does not exist at this point.


2.8.2 Corner Solutions

In a more advanced treatment of the first-order conditions we would have stated that if $x^*$ is a maximum and $x^*$ does not lie on the boundary of the domain of $f(x)$ then $f'(x^*) = 0$. If $x^*$ lies on the boundary of the domain we have a corner solution and it is not necessarily the case that $f'(x^*) = 0$.

To see why this might matter, note that since prices and quantities in economics are not negative, we often require that the domain of $f(x)$ have $x \ge 0$. It follows then that 0 is on the boundary of such a domain. Sometimes it occurs that $x^* = 0$, as when a firm decides not to hire any labour or when a household does not buy any of the good. In this case the first-order conditions no longer require $f'(x^*) = 0$.

Example: Consider the problem of maximizing

$$f(x) = 10 - (x + 1)^2$$

where we restrict the domain of $f(x)$ to be $x \ge 0$. If we use the first-order conditions we find that:

$$f'(x) = -2(x + 1) \Longrightarrow f'(x^*) = 0 \Longrightarrow x^* = -1.$$

Note however that $x^* = -1$ is not in the domain of $f(x)$ since we require $x \ge 0$. So what value of $x$ maximizes $f(x)$ for $x \ge 0$? Consider the plot of $f(x)$ below:

[Figure: plot of $f(x) = 10 - (x + 1)^2$ for $0 \le x \le 1.4$, with $y$ on the vertical axis.]

Note that $f(x)$ is maximized at $x^* = 0$; that is, we have a maximum on the boundary of the domain of $f(x)$. From the graph or from:

$$f'(0) = -2(0 + 1) = -2 < 0$$

we see that at $x^* = 0$ we have $f'(x) < 0$.

A more systematic treatment of the first-order conditions at corner solutions leads to the Kuhn-Tucker conditions which are better left for more advanced study.
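The corner solution in this example can be confirmed with a crude numerical check. The sketch below is specific to this one concave function on $x \ge 0$ and is not a general Kuhn-Tucker solver; the grid, its upper limit of 5, and the variable names are assumptions made for illustration.

```python
def f(x):
    return 10 - (x + 1) ** 2

def f_prime(x):
    return -2 * (x + 1)

interior_candidate = -1.0   # the solution of f'(x) = 0
lower_bound = 0.0           # the domain is x >= 0

if interior_candidate >= lower_bound:
    x_star = interior_candidate
else:
    # f'(x) < 0 for every x >= 0, so f falls as x rises and the
    # best feasible point is the boundary of the domain.
    x_star = lower_bound

# Crude confirmation on a grid of feasible points 0, 0.01, ..., 5
grid = [i / 100 for i in range(0, 501)]
best_on_grid = max(grid, key=f)

print(x_star, f(x_star), f_prime(x_star), best_on_grid)
```

At the corner the derivative is negative rather than zero, which is exactly the situation the first-order condition $f'(x^*) = 0$ fails to cover.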


2.8.3 Advanced Concavity and Convexity

If you go on and do more advanced work you will find that our definitions of concavity and convexity are not entirely adequate for all purposes.

Consider for example the function:

$$f(x) = x^4$$

which is plotted below:

[Figure: plot of $y = x^4$ for $-1 \le x \le 1$.]

From the plot $f(x)$ certainly looks valley-like for all $x$ and hence we would like to say that $x^4$ is globally convex. If we check the second derivative we have

$$f''(x) = 12x^2 \ge 0$$

but for $x = 0$ we have $f''(0) = 0$. Thus according to our definition of convexity $x^4$ is not convex since we require that $f''(x) > 0$ for all $x$.

Another problem arises with the absolute value function

$$y = |x|$$

which is plotted below:

[Figure: plot of $y = |x|$ for $-4 \le x \le 4$.]

Again this function clearly looks valley-like and hence we would like to say that it is convex. However $f''(x) = 0$ for $x \neq 0$ and $f''(0)$ is not defined.

These problems can be dealt with by using more sophisticated definitions of concavity and convexity. In particular we have:

Definition 92 A function $f(x)$ is convex if and only if it is the case for all $x_1$ and $x_2$ in the domain of $f(x)$ that for all $0 \le \lambda \le 1$:

$$x_3 = \lambda x_1 + (1 - \lambda) x_2$$

is in the domain of $f(x)$ and

$$f(x_3) \le \lambda f(x_1) + (1 - \lambda) f(x_2).$$

Definition 93 A function $f(x)$ is concave if and only if it is the case for all $x_1$ and $x_2$ in the domain of $f(x)$ that for all $0 \le \lambda \le 1$:

$$x_3 = \lambda x_1 + (1 - \lambda) x_2$$

is in the domain of $f(x)$ and

$$f(x_3) \ge \lambda f(x_1) + (1 - \lambda) f(x_2).$$

Remark 1: The definition says that if you draw a line between any two points on the graph of a convex function $f(x)$ then the line falls everywhere on or above the graph of $f(x)$, while if you draw a line between two points of a concave function the line falls on or below the graph. This is illustrated in the diagram below:

Remark 2: Note that both $f(x) = |x|$ and $f(x) = x^4$ are convex according to the more advanced definition.

Finally in more advanced work one makes the distinction between strictly convex (concave) functions which have no linear segments, and convex (concave) functions which are allowed to have linear segments. Thus $f(x) = |x|$ is convex but not strictly convex because it is linear to the right and left of 0 while $f(x) = x^2$, which has no linear segments, is strictly convex. In particular:

Definition 94 A function $f(x)$ is strictly convex if and only if it is the case for all $x_1$ and $x_2$ in the domain of $f(x)$ such that $x_1 \neq x_2$ that for all $0 < \lambda < 1$:

$$x_3 = \lambda x_1 + (1 - \lambda) x_2$$

is in the domain of $f(x)$ and

$$f(x_3) < \lambda f(x_1) + (1 - \lambda) f(x_2).$$

Definition 95 A function $f(x)$ is strictly concave if and only if it is the case for all $x_1$ and $x_2$ in the domain of $f(x)$ such that $x_1 \neq x_2$ that for all $0 < \lambda < 1$:

$$x_3 = \lambda x_1 + (1 - \lambda) x_2$$

is in the domain of $f(x)$ and

$$f(x_3) > \lambda f(x_1) + (1 - \lambda) f(x_2).$$

Note the more severe requirements in these definitions that $x_1 \neq x_2$ and that $\lambda = 0$ and $\lambda = 1$ are not allowed. This means that all strictly convex (concave) functions are also convex (concave) but a convex (concave) function is not necessarily strictly convex (concave).

It also turns out that our definitions of quasi-concavity and quasi-convexity are flawed. For completeness you might also want to see the advanced definitions:

Definition 96 A function $f(x)$ is quasi-concave if and only if for all $x_1, x_2$ in the domain of $f(x)$ and for $0 \le \lambda \le 1$:

$$x_3 = \lambda x_1 + (1 - \lambda) x_2$$

is also in the domain of $f(x)$ and:

$$f(x_3) \ge \min[f(x_1), f(x_2)].$$

Definition 97 A function $f(x)$ is quasi-convex if and only if for all $x_1, x_2$ in the domain of $f(x)$ and for $0 \le \lambda \le 1$:

$$x_3 = \lambda x_1 + (1 - \lambda) x_2$$

is also in the domain of $f(x)$ and:

$$f(x_3) \le \max[f(x_1), f(x_2)].$$
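The chord inequality in the advanced definition of convexity can be tested on a grid of points. This is a sketch only: a finite grid can reveal a violation but cannot prove convexity, and the function name `violates_convexity`, the grid and the tolerance are assumptions made for this illustration.

```python
import itertools

def violates_convexity(f, points, lambdas, tol=1e-12):
    """Return True if f(x3) > lam*f(x1) + (1-lam)*f(x2) for some grid choice,
    where x3 = lam*x1 + (1-lam)*x2, i.e. if some chord dips below the graph."""
    for x1, x2 in itertools.product(points, repeat=2):
        for lam in lambdas:
            x3 = lam * x1 + (1 - lam) * x2
            if f(x3) > lam * f(x1) + (1 - lam) * f(x2) + tol:
                return True
    return False

points = [i / 4 for i in range(-8, 9)]    # -2, -1.75, ..., 2
lambdas = [i / 10 for i in range(11)]     # 0, 0.1, ..., 1

abs_violates = violates_convexity(abs, points, lambdas)
quartic_violates = violates_convexity(lambda x: x ** 4, points, lambdas)
concave_violates = violates_convexity(lambda x: -x ** 2, points, lambdas)

print(abs_violates, quartic_violates, concave_violates)
```

No violation is found for $|x|$ or $x^4$ on this grid, consistent with Remark 2, while the concave function $-x^2$ fails the convexity inequality.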


Chapter 3

Matrix Algebra

Such is the advantage of a well constructed language that its simplified notation often becomes the source of profound theories. -Pierre-Simon Laplace

A matrix is basically just a table of numbers. For example the matrix $A$ given by:

$$A = \begin{bmatrix} 50 & 75 \\ 35 & 65 \\ 65 & 85 \end{bmatrix}$$

could be the grades of three students over two exams. We implicitly work with matrices all the time when we work with data.

Matrix algebra is the art of manipulating matrices in a manner similar to manipulating ordinary numbers in ordinary algebra. Thus we will learn to add, subtract, multiply and divide matrices. It is even possible to calculate $e^A$ or $\ln(A)$ where $A$ is a matrix.

In many ways matrix algebra is nothing more than a convenient notation. It is always possible in principle to avoid matrix algebra by working directly with the ordinary numbers inside the matrices. This is however rather like walking from Los Angeles to New York rather than flying! There are for example derivations in econometrics that might require five pages without matrix algebra but which can be performed in only a few lines using matrix algebra. Matrix algebra is a profound notation, one that allows you to see things that you would never see otherwise. Along with calculus, it is one of the two fundamental mathematical skills that a student of economics must acquire.

The cost of the power of matrix algebra is danger! Many of your instincts from ordinary algebra will lead you astray when you work with matrices. The classic example of this is that for matrices $A \times B$ and $B \times A$ are no longer the same! For this reason you need to be careful in the beginning (and even later on!) until you have developed reliable instincts.

We begin by defining a matrix:

Definition 98 Matrix: An $m \times n$ matrix $A$ with $m$ rows and $n$ columns takes the form:

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

where $a_{ij}$ is the element in the $i$th row and the $j$th column of $A$.

Example: The $3 \times 2$ matrix:

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = \begin{bmatrix} 5 & 4 \\ 3 & 1 \\ 6 & 2 \end{bmatrix}$$

has $a_{12} = 4$, $a_{21} = 3$ and $a_{32} = 2$.

The case of square matrices and their diagonal elements will be of particular importance:

Definition 99 Square Matrix: An $m \times n$ matrix $A$ is a square matrix if $m = n$.

Definition 100 The Diagonal of a Square Matrix: Given an $n \times n$ square matrix $A = [a_{ij}]$ the diagonal elements are those elements $a_{ij}$ for which $i = j$.

Example: A $2 \times 2$ square matrix is:

$$\begin{bmatrix} 5 & 4 \\ 3 & 1 \end{bmatrix}.$$

The diagonal elements are $a_{11} = 5$ and $a_{22} = 1$.

Remark: Note that the diagonal goes from the top left-hand corner to the bottom right-hand corner as:

$$\begin{bmatrix} \ddots & \\ & \ddots \end{bmatrix}.$$

For our purposes there is nothing particularly interesting about the ‘other diagonal’, the one that goes from the top right-hand corner to the bottom left-hand corner.

Also of special importance in matrix algebra are vectors, which come in two flavors, and scalars:

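Definition 101 Row Vector: A row vector $x = [x_i]$ is a $1 \times n$ matrix.

Definition 102 Column Vector: A column vector $x = [x_i]$ is an $n \times 1$ matrix.

Definition 103 Scalar: A scalar is a $1 \times 1$ matrix or just an ordinary number.

Example: Below $x$ is a $3 \times 1$ column vector, $y$ is a $1 \times 3$ row vector and $z$ is a $1 \times 1$ scalar:

$$x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad y = \begin{bmatrix} 5 & 4 & 2 \end{bmatrix}, \quad z = 3.$$

Remark: Any $m \times n$ matrix $A$ can be usefully thought of as consisting of $n$ column vectors of dimension $m \times 1$ or $m$ row vectors of dimension $1 \times n$.

Example: The $3 \times 2$ matrix:

$$A = \begin{bmatrix} 5 & 4 \\ 3 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 5 \\ 3 \\ 6 \end{bmatrix} & \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 5 & 4 \end{bmatrix} \\ \begin{bmatrix} 3 & 1 \end{bmatrix} \\ \begin{bmatrix} 6 & 2 \end{bmatrix} \end{bmatrix}$$

is made up of two column vectors:

$$\begin{bmatrix} 5 \\ 3 \\ 6 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix}$$

or three row vectors:

$$\begin{bmatrix} 5 & 4 \end{bmatrix}, \quad \begin{bmatrix} 3 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 6 & 2 \end{bmatrix}.$$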

3.1 Matrix Addition and Subtraction

Your instincts from ordinary algebra are probably quite reliable for the addition and subtraction of matrices. The rules are very simple. If $A$ and $B$ are both $m \times n$ matrices (of the same order) then:

Definition 104 If $A = [a_{ij}]$ and $B = [b_{ij}]$ are both $m \times n$ matrices and $C = A + B$, then $C$ is an $m \times n$ matrix with $C = [a_{ij} + b_{ij}]$.

Definition 105 If $A = [a_{ij}]$ and $B = [b_{ij}]$ are both $m \times n$ matrices and $C = A - B$, then $C$ is an $m \times n$ matrix with $C = [a_{ij} - b_{ij}]$.

Remark: The only way you are likely to go wrong here is if you try to add or subtract two matrices of a different order.

Example 1:

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} + \begin{bmatrix} 5 & 3 \\ 8 & 3 \\ 9 & 1 \end{bmatrix} = \begin{bmatrix} 8 & 7 \\ 6 & 4 \\ 15 & 3 \end{bmatrix}$$

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} - \begin{bmatrix} 5 & 3 \\ 8 & 3 \\ 9 & 1 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ -10 & -2 \\ -3 & 1 \end{bmatrix}.$$

Example 2: The sum:

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \end{bmatrix} + \begin{bmatrix} 5 & 3 \\ 8 & 3 \\ 9 & 1 \end{bmatrix}$$

is not defined since the two matrices are not of the same order.

It is also possible to multiply any matrix by any scalar. Again the rule is very simple:

Definition 106 If $C = \alpha A$ where $\alpha$ is a scalar and $A = [a_{ij}]$ is an $m \times n$ matrix then $C$ is an $m \times n$ matrix with $C = [\alpha a_{ij}]$.

Example:

$$6 \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 6 \times 3 & 6 \times 4 \\ 6 \times -2 & 6 \times 1 \\ 6 \times 6 & 6 \times 2 \end{bmatrix} = \begin{bmatrix} 18 & 24 \\ -12 & 6 \\ 36 & 12 \end{bmatrix}.$$

3.1.1 The Matrix 0

We will often come across matrices which have all elements equal to zero. We have:

Definition 107 In matrix algebra when we write $A = 0$ we mean that all elements of $A$ are equal to 0.

Definition 108 In matrix algebra when we write $A \neq 0$ we mean that $A$ is not the 0 matrix; that is there exists at least one element of $A$ which is not zero.

Example 1: Given the $3 \times 2$ matrix $A$,

$$A = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix}$$

if we subtract it from itself we obtain:

$$A - A = 0$$

or:

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} - \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

Note that here 0 is not the ordinary number 0 but a $3 \times 2$ matrix of 0's; that is:

$$0 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

Thus if we were just to write $A - A = 0$ under the assumption that $A$ is $3 \times 2$, it is left implicit that the dimension of the 0 matrix is also $3 \times 2$.

Example 2: If

$$A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 5 & 0 \end{bmatrix}$$

then we can legitimately write $A \neq 0$ since $a_{31} = 5 \neq 0$.

3.2 Matrix Multiplication

Unlike addition and subtraction, matrix multiplication is tricky and your instincts from ordinary algebra are likely unreliable. We begin with the simplest case where we multiply a row and a column vector. We have:

Definition 109 Let $a = [a_i]$ be a $1 \times n$ row vector and let $b = [b_i]$ be an $n \times 1$ column vector. Then the product $ab$ is a scalar given by:

$$ab \equiv \begin{bmatrix} a_1 & a_2 & a_3 & \cdots & a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{bmatrix} \equiv a_1 b_1 + a_2 b_2 + \cdots + a_n b_n. \qquad (3.1)$$

Example: Given:

$$a = \begin{bmatrix} 1 & 3 & 6 \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} 2 \\ 4 \\ 7 \end{bmatrix},$$

then the product of these two matrices is:

$$ab = \begin{bmatrix} 1 & 3 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \\ 7 \end{bmatrix} = 1 \times 2 + 3 \times 4 + 6 \times 7 = 56.$$

Remark: Here the order is important; that is $ab$ and $ba$ are not equal! In the example above while $ab$ is the scalar 56, we shall see that $ba$ is in fact a $3 \times 3$ matrix given by:

$$ba = \begin{bmatrix} 2 \\ 4 \\ 7 \end{bmatrix} \begin{bmatrix} 1 & 3 & 6 \end{bmatrix} = \begin{bmatrix} 2 & 6 & 12 \\ 4 & 12 & 24 \\ 7 & 21 & 42 \end{bmatrix}.$$

Now consider calculating $AB$ where $A$ and $B$ are not vectors. Part of the trick is to think of $A$ as a collection of row vectors and $B$ as a collection of column vectors. The elements of $AB$ are then found by multiplying the row vectors of $A$ with the column vectors of $B$ in the manner we have just learned.
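The difference between a row times a column and a column times a row can be seen in a short sketch. Vectors are stored here as plain Python lists, and the function names `row_times_column` and `column_times_row` are invented for this illustration.

```python
def row_times_column(a, b):
    """Product of a 1 x n row vector and an n x 1 column vector: a scalar."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of elements")
    return sum(ai * bi for ai, bi in zip(a, b))

def column_times_row(b, a):
    """Product of an n x 1 column vector and a 1 x n row vector: an n x n matrix."""
    return [[bi * aj for aj in a] for bi in b]

a = [1, 3, 6]   # the row vector a
b = [2, 4, 7]   # the column vector b

ab = row_times_column(a, b)
ba = column_times_row(b, a)

print(ab)
for row in ba:
    print(row)
```

The first product collapses to a single number while the second expands to a full $3 \times 3$ table, which is why the order of multiplication matters.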

Definition 110 If $A$ is an $m \times n$ matrix and $B$ is an $n \times s$ matrix then to obtain $AB$ write $A$ as a collection of $m$ row vectors and $B$ as a collection of $s$ column vectors as:

$$A = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}, \quad B = \begin{bmatrix} b_1 & b_2 & \cdots & b_s \end{bmatrix}$$

where the $1 \times n$ row vector $a_i$ is the $i$th row of $A$ and the $n \times 1$ column vector $b_j$ is the $j$th column of $B$. The product $C = AB$ is then an $m \times s$ matrix defined as:

$$C \equiv \begin{bmatrix} a_1 b_1 & a_1 b_2 & \cdots & a_1 b_s \\ a_2 b_1 & a_2 b_2 & \cdots & a_2 b_s \\ \vdots & \vdots & \ddots & \vdots \\ a_m b_1 & a_m b_2 & \cdots & a_m b_s \end{bmatrix}.$$

In order for the product $AB$ to be defined the number of columns in $A$ must equal the number of rows in $B$. A recipe for determining if $AB$ is defined, and then computing $AB$, is found below:

Recipe for Matrix Multiplication

Given two matrices: $A$ which is $m \times n$ and $B$ which is $r \times s$, write the dimensions of the matrices in the order you wish to multiply them. Thus for $AB$ we would write

$$m \times n \,|\, r \times s.$$

We have:

1. The product $AB$ is defined if and only if the two numbers in the middle are the same; that is if $n = r$ or

$$m \times \underbrace{n \,|\, r}_{n = r} \times s.$$

2. If 1. is satisfied so that $AB$ is defined, then the dimension of $AB$ is found by eliminating the two identical numbers $n$ and $r$ in the middle as:

$$m \times \underbrace{n \,|\, r}_{\text{eliminate}} \times s \Longrightarrow m \times s$$

so that $AB$ is an $m \times s$ matrix.

3. Write $A$ as a collection of $m$ row vectors and $B$ as a collection of $s$ column vectors. The $i, j$th element of $C = AB = [c_{ij}]$ is then found by multiplying the $i$th row vector in $A$ with the $j$th column vector in $B$ so that $c_{ij} = a_i b_j$.

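Example 1: Consider calculating $AB$ for the two matrices:

$$A = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 6 & 7 \\ 8 & 4 \\ 6 & 3 \end{bmatrix}.$$

Following the recipe we have:

1. Writing out the dimensions of $AB$ as:

$$3 \times 2 \,|\, 3 \times 2$$

we see that the two inside numbers do not match (i.e., $2 \neq 3$) and so the product $AB$ is not defined. There is therefore no $AB$ to calculate!

Example 2: Consider:

$$A = \begin{bmatrix} 3 & 4 \\ 2 & 1 \\ 6 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 & 2 & 1 \\ 3 & 3 & 4 \end{bmatrix}.$$

Following the recipe we have:

1. Writing out the dimensions of $AB$ as:

$$3 \times 2 \,|\, 2 \times 3$$

we see that the two middle numbers match and so the product $AB$ is defined.

2. Deleting the two middle numbers we find that:

$$3 \times 2 \,|\, 2 \times 3 \Longrightarrow 3 \times 3$$

so that $AB$ is a $3 \times 3$ matrix.

3. To calculate $AB$ we write $A$ as a collection of 3 row vectors and $B$ as a collection of 3 column vectors as:

$$A = \begin{bmatrix} \begin{bmatrix} 3 & 4 \end{bmatrix} \\ \begin{bmatrix} 2 & 1 \end{bmatrix} \\ \begin{bmatrix} 6 & 2 \end{bmatrix} \end{bmatrix}, \quad B = \begin{bmatrix} \begin{bmatrix} 5 \\ 3 \end{bmatrix} & \begin{bmatrix} 2 \\ 3 \end{bmatrix} & \begin{bmatrix} 1 \\ 4 \end{bmatrix} \end{bmatrix}.$$

Carrying out the multiplication we find that:

$$AB = \begin{bmatrix} \begin{bmatrix} 3 & 4 \end{bmatrix} \\ \begin{bmatrix} 2 & 1 \end{bmatrix} \\ \begin{bmatrix} 6 & 2 \end{bmatrix} \end{bmatrix} \begin{bmatrix} \begin{bmatrix} 5 \\ 3 \end{bmatrix} & \begin{bmatrix} 2 \\ 3 \end{bmatrix} & \begin{bmatrix} 1 \\ 4 \end{bmatrix} \end{bmatrix}$$

$$= \begin{bmatrix} \begin{bmatrix} 3 & 4 \end{bmatrix} \begin{bmatrix} 5 \\ 3 \end{bmatrix} & \begin{bmatrix} 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} & \begin{bmatrix} 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} \\ \begin{bmatrix} 2 & 1 \end{bmatrix} \begin{bmatrix} 5 \\ 3 \end{bmatrix} & \begin{bmatrix} 2 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} & \begin{bmatrix} 2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} \\ \begin{bmatrix} 6 & 2 \end{bmatrix} \begin{bmatrix} 5 \\ 3 \end{bmatrix} & \begin{bmatrix} 6 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} & \begin{bmatrix} 6 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 27 & 18 & 19 \\ 13 & 7 & 6 \\ 36 & 18 & 14 \end{bmatrix}.$$

For example to find the $2, 3$ element of $AB$ we multiply

$$\begin{bmatrix} 2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = 2 \times 1 + 1 \times 4 = 6$$

while to obtain the $1, 1$ element we multiply:

$$\begin{bmatrix} 3 & 4 \end{bmatrix} \begin{bmatrix} 5 \\ 3 \end{bmatrix} = 3 \times 5 + 4 \times 3 = 27.$$

You should repeat the calculation of the remaining elements on your own.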

Page 117: Introduction to Mathematical Economics Part 1 - Loglinear Publications

CHAPTER 3. MATRIX ALGEBRA 109

Example 3: Consider reversing the order of the multiplication in the previousexample and calculating BA; where A and B are given above. Following therecipe we have:

1. Since B is 2£ 3 and A is 3£ 2 we have2£ 3j3£ 2

so that the two inside numbers match and BA is de…ned.

2. Eliminating the two inside numbers we …nd that

2£ 3j3£ 2 =) 2£ 2so the resulting matrix will be a 2£ 2 matrix.

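3. Writing B as a collection of 2 row vectors and A as a collection of 2 column vectors as:

$$B = \begin{bmatrix} \begin{bmatrix} 5 & 2 & 1 \end{bmatrix} \\ \begin{bmatrix} 3 & 3 & 4 \end{bmatrix} \end{bmatrix}, \quad A = \begin{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix} \end{bmatrix}$$

we have:

$$BA = \begin{bmatrix}
\begin{bmatrix} 5 & 2 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 5 & 2 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix} \\
\begin{bmatrix} 3 & 3 & 4 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 3 & 3 & 4 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix}
\end{bmatrix}
= \begin{bmatrix} 25 & 24 \\ 39 & 23 \end{bmatrix}.$$

Note that BA is $2 \times 2$ while AB is $3 \times 3$. This illustrates the important fact that, even when both products exist, in general $AB \neq BA$.

Note that neither AA nor BB is defined.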
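The three-step recipe can be sketched in a few lines of Python. This is only an illustration, not part of the original text: matrices are assumed to be stored as lists of rows, and the helper names `shape` and `matmul` are made up for this sketch.

```python
# Illustrative helpers (hypothetical names); matrices are lists of rows.
def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(M), len(M[0])


def matmul(A, B):
    """Multiply A (m x n) by B (n x p) using the row-times-column recipe."""
    m, n = shape(A)
    n2, p = shape(B)
    # Step 1: the two inside dimensions must match.
    if n != n2:
        raise ValueError(f"{m}x{n} | {n2}x{p}: inside dimensions do not match")
    # Steps 2 and 3: the result is m x p; entry (i, j) is row i of A times column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[3, 4], [2, 1], [6, 2]]
B = [[5, 2, 1], [3, 3, 4]]

AB = matmul(A, B)  # 3 x 3
BA = matmul(B, A)  # 2 x 2
print(AB)  # [[27, 18, 19], [13, 7, 6], [36, 18, 14]]
print(BA)  # [[25, 24], [39, 23]]
```

Calling `matmul(A, A)` raises a `ValueError`, matching the observation that AA is not defined.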

3.2.1 The Identity Matrix

The identity matrix I in matrix algebra plays the same role as 1 in ordinary algebra.

Definition 111 Identity Matrix: The identity matrix I is an $n \times n$ square matrix with ones along the diagonal and zeros on the off-diagonal.


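Note that just as the number 1 has the property:

$$1 \times 5 = 5 \times 1 = 5$$

the identity matrix has the same property for matrices; that is:

Theorem 112 For all $n \times n$ matrices $A$:

$$IA = AI = A.$$

Example: The $3 \times 3$ identity matrix is given by:

$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

and you can verify that:

$$\begin{bmatrix} 27 & 18 & 19 \\ 13 & 7 & 6 \\ 36 & 18 & 14 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 27 & 18 & 19 \\ 13 & 7 & 6 \\ 36 & 18 & 14 \end{bmatrix}.$$

3.3 The Transpose of a Matrix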

It is very common in matrix algebra to reverse the rows and columns of a matrix, which results in the transpose of a matrix:

Definition 113 Transpose: If $A = [a_{ij}]$ is an $m \times n$ matrix then the transpose of A, denoted by $A^T$, is an $n \times m$ matrix where the i, jth element is $a_{ji}$ or:

$$A^T \equiv [a_{ji}].$$

Remark: A seemingly trivial but remarkably useful fact is that the transpose of a scalar is itself. For example: $5^T = 5$.

Example:

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix}^T = \begin{bmatrix} 3 & -2 & 6 \\ 4 & 1 & 2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 5 & 2 & 1 \\ -3 & 3 & 4 \end{bmatrix}^T = \begin{bmatrix} 5 & -3 \\ 2 & 3 \\ 1 & 4 \end{bmatrix}.$$

Note that in the first case the transpose causes a $3 \times 2$ matrix to become a $2 \times 3$ matrix while in the second it causes a $2 \times 3$ matrix to become a $3 \times 2$ matrix. We have:


Theorem 114 Transposes satisfy:

1. If AB is defined then $(AB)^T = B^T A^T$

2. $\left(A^T\right)^T = A$

3. $(A + B)^T = A^T + B^T$

Remark: The first of these results is the trickiest. The key step in finding $(AB)^T$ is that you must first reverse the order of multiplication before applying $T$ to A and B.
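As a quick numerical check of the first two rules, here is a minimal Python sketch. It assumes matrices stored as lists of rows; `transpose` and `matmul` are illustrative helper names, and the two matrices are the ones from the transpose example above.

```python
# Illustrative helpers (hypothetical names); matrices are lists of rows.
def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Row-times-column product; assumes A and B are conformable (no check)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[3, 4], [-2, 1], [6, 2]]    # 3 x 2
B = [[5, 2, 1], [-3, 3, 4]]      # 2 x 3

lhs = transpose(matmul(A, B))             # (AB)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T, order reversed
print(lhs == rhs)                         # True
print(transpose(transpose(A)) == A)       # True
```

Note that `matmul(transpose(A), transpose(B))`, without the reversal, gives a $2 \times 2$ matrix here and so cannot equal the $3 \times 3$ matrix $(AB)^T$.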

3.3.1 Symmetric Matrices

Recall that for square matrices the diagonal goes from the top left-hand corner to the bottom right-hand corner. For example with:

$$A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix}$$

the diagonal consists of the elements 1, 3 and 4. The off-diagonal elements are then those elements above and below the diagonal. Notice that in this example the elements above the diagonal mirror the elements below the diagonal. We call such matrices symmetric matrices.

The precise definition of a symmetric matrix is:

Definition 115 Symmetric Matrix: A matrix A is symmetric if and only if $A = A^T$.

Remark: Only square matrices can be symmetric since if A is $m \times n$ then $A^T$ is $n \times m$ and so $A = A^T$ implies that $m = n$.

Example: The matrix A above is symmetric since $A = A^T$ or

$$A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix}.$$

However the matrices B and C below are not symmetric since:

$$B = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 7 & 6 & 4 \end{bmatrix} \neq B^T = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 7 & 6 & 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 7 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix}$$

$$C = \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 5 & 6 \end{bmatrix} \neq C^T = \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \end{bmatrix}.$$


3.3.2 Proof that $A^TA$ is Symmetric

In general it is not possible to take powers of matrices. Thus if A is an $m \times n$ matrix then the square of A or $A^2 = AA$ is not defined unless $n = m$. Squares only exist for square matrices. However we can always square A as $A^TA$, which is an $n \times n$ matrix, or as $AA^T$, which is an $m \times m$ matrix.

Matrices such as $A^TA$ and $AA^T$ turn out to be important in econometrics. These matrices are always symmetric! Thus:

Theorem 116 The matrices $A^TA$ and $AA^T$ are symmetric.

Proof. In general to prove symmetry we begin with $C^T$ and try to show that it is equal to $C$. Thus if $C = A^TA$ then:

$$\begin{aligned}
C^T &= \left(A^TA\right)^T && \text{(definition)} \\
&= A^T\left(A^T\right)^T && \text{(since } (DE)^T = E^TD^T\text{)} \\
&= A^TA && \text{(since } \left(D^T\right)^T = D\text{)} \\
&= C && \text{(definition)}
\end{aligned}$$

so that $C$ is symmetric.

You can show that $AA^T$ is symmetric in the same way, or use the above result: if $D = AA^T$ then $D = B^TB$ where $B = A^T$, and so $D$ has the form $A^TA$ and hence is symmetric.
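Theorem 116 can also be checked numerically. The sketch below is an illustration only: it assumes matrices stored as lists of rows, uses made-up helper names (`transpose`, `matmul`, `is_symmetric`), and reuses the $3 \times 2$ matrix from the transpose example in Section 3.3.

```python
# Illustrative helpers (hypothetical names); matrices are lists of rows.
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Row-times-column product; assumes A and B are conformable (no check)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def is_symmetric(M):
    """A matrix is symmetric if and only if it equals its own transpose."""
    return M == transpose(M)


A = [[3, 4], [-2, 1], [6, 2]]    # 3 x 2, not square, so not symmetric

AtA = matmul(transpose(A), A)    # 2 x 2
AAt = matmul(A, transpose(A))    # 3 x 3
print(AtA)  # [[49, 22], [22, 21]]
print(AAt)  # [[25, -2, 26], [-2, 5, -10], [26, -10, 40]]
print(is_symmetric(A), is_symmetric(AtA), is_symmetric(AAt))  # False True True
```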

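Example: Given:

$$A = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix}$$

we have that $A^TA$ and $AA^T$ are symmetric, as predicted by the Theorem, since:

$$A^TA = \begin{bmatrix} 3 & -2 & 6 \\ 4 & 1 & 2 \end{bmatrix} \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 49 & 22 \\ 22 & 21 \end{bmatrix}$$

and

$$AA^T = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} 3 & -2 & 6 \\ 4 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 25 & -2 & 26 \\ -2 & 5 & -10 \\ 26 & -10 & 40 \end{bmatrix}.$$

3.4 The Inverse of a Matrix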

Just as with ordinary numbers we will want to divide with matrices. With ordinary numbers we can express division $a \div b$ using multiplication and inverse as $a \div b \equiv a \times b^{-1}$. Now replacing $a$ and $b$ with two matrices A and B, we already know how to multiply them as $A \times B$, so if we can find the analogue of $b^{-1}$, that is $B^{-1}$ or the inverse of a matrix, we will be able to extend division to matrices as

$$A \div B \equiv A \times B^{-1}.$$

Returning to ordinary numbers, the inverse of 3 is $\frac{1}{3}$, which satisfies $3 \times \frac{1}{3} = \frac{1}{3} \times 3 = 1$. Now in matrix algebra the role of 1 is played by the identity matrix I and so we have:

Definition 117 Matrix Inverse: The inverse of a square $n \times n$ matrix A is an $n \times n$ matrix, denoted by $A^{-1}$, which satisfies:

$$A^{-1}A = AA^{-1} = I.$$

Remark 1: Generally in matrix algebra we write $A \times B^{-1}$ rather than $A \div B$.

Remark 2: With ordinary numbers we often express division as $a \div b \equiv \frac{a}{b}$. The notation $\frac{a}{b}$ works for ordinary numbers since the order of multiplication does not matter; that is $a \times b^{-1} = b^{-1} \times a \equiv \frac{a}{b}$. For matrices the order of multiplication does matter since in general $A \times B^{-1} \neq B^{-1} \times A$. Thus $\frac{A}{B}$ is bad notation for matrices since it does not indicate whether you mean $A \times B^{-1}$ or $B^{-1} \times A$. Thus for two matrices A and B do not write $\frac{A}{B}$.

Remark 3: A matrix must be square to have an inverse. For example:

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix}^{-1}$$

is not defined.

Remark 4: Not all square matrices have an inverse. For example the scalar 0 does not have an inverse, nor does any $n \times n$ square matrix $0$ of zeros, since $0A = A0 = 0$ for all matrices A, but if A were the inverse of $0$ we would have $A0 = I$, a contradiction.

There are also square matrices with non-zero elements which do not have an inverse. We use the following terminology:

Definition 118 Non-Singular Matrices: If a matrix A has an inverse we say that A is non-singular or invertible.

Definition 119 Singular Matrices: If a matrix A does not have an inverse we say A is singular or non-invertible.


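Example 1: An example of a non-singular matrix is:

$$\begin{bmatrix} 49 & 22 \\ 22 & 21 \end{bmatrix}$$

which has an inverse

$$\begin{bmatrix} 49 & 22 \\ 22 & 21 \end{bmatrix}^{-1} = \frac{1}{545} \begin{bmatrix} 21 & -22 \\ -22 & 49 \end{bmatrix}$$

since:

$$\frac{1}{545} \begin{bmatrix} 21 & -22 \\ -22 & 49 \end{bmatrix} \times \begin{bmatrix} 49 & 22 \\ 22 & 21 \end{bmatrix} = \frac{1}{545} \begin{bmatrix} 545 & 0 \\ 0 & 545 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

which you can verify on your own by carrying out the multiplication.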

Example 2: A matrix with non-zero elements which does not have an inverse (or is singular or non-invertible) is:

$$A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}.$$

Proof. We use proof by contradiction. Assume to the contrary that a matrix $B = A^{-1}$ exists. Since $BA = I$ by the definition of an inverse we have:

$$\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

Carrying out the multiplication we find from multiplying the first row of B with the first column of A that:

$$b_{11} + b_{12} = 1$$

while multiplying the first row of B with the second column of A gives:

$$2b_{11} + 2b_{12} = 0 \implies b_{11} + b_{12} = 0.$$

Combining these two results we obtain the contradiction $1 = 0$. Thus A does not have an inverse.

Here are some useful results for inverses:

Theorem 120 If A has an inverse then it is unique.

Theorem 121 If $A^{-1}$ exists then $\left(A^{-1}\right)^{-1} = A$.

Theorem 122 If $A^{-1}$ exists then $\left(A^T\right)^{-1} = \left(A^{-1}\right)^T$.

Theorem 123 If A and B are non-singular matrices of the same order then

$$(AB)^{-1} = B^{-1}A^{-1}.$$


Theorem 124 If A is a $2 \times 2$ matrix then its inverse exists if and only if

$$a_{11}a_{22} - a_{12}a_{21} \neq 0$$

and is given by:

$$A^{-1} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}.$$

Example: The matrix:

$$A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$$

does not have an inverse (or is singular) since:

$$a_{11}a_{22} - a_{12}a_{21} = 1 \times 2 - 2 \times 1 = 0.$$

Remark 1: Note the similarity of $(AB)^T = B^TA^T$ and $(AB)^{-1} = B^{-1}A^{-1}$, where for both the order is reversed before applying either $T$ or $-1$ to the individual matrices.

Remark 2: Later on we will see that for $n = 2$ the scalar $a_{11}a_{22} - a_{12}a_{21}$ is the determinant of A or:

$$\det[A] = a_{11}a_{22} - a_{12}a_{21}.$$

Remark 3: Note from Theorem 122 that we can always reverse the order of the transpose $T$ and inverse $-1$. A consequence of this is that:

Theorem 125 If A is symmetric and $A^{-1}$ exists, then $A^{-1}$ is symmetric.

Proof. If A is symmetric then $A^T = A$. Now:

$$\left(A^{-1}\right)^T = \left(A^T\right)^{-1} = A^{-1}$$

and so $A^{-1}$ is symmetric.

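Example 1: The symmetric matrix:

$$A = \begin{bmatrix} 9 & 3 \\ 3 & 2 \end{bmatrix}$$

has an inverse since:

$$a_{11}a_{22} - a_{12}a_{21} = 9 \times 2 - 3 \times 3 = 9 \neq 0$$

and $A^{-1}$ is given by:

$$\begin{bmatrix} 9 & 3 \\ 3 & 2 \end{bmatrix}^{-1} = \frac{1}{9} \begin{bmatrix} 2 & -3 \\ -3 & 9 \end{bmatrix} = \begin{bmatrix} \frac{2}{9} & -\frac{1}{3} \\ -\frac{1}{3} & 1 \end{bmatrix}$$

which is also symmetric.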
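The $2 \times 2$ formula of Theorem 124 is easy to put into code. The following is a minimal sketch, assuming integer entries so that Python's `fractions.Fraction` keeps the arithmetic exact; the function name `inverse_2x2` is made up for this illustration.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix with integer entries (hypothetical helper)."""
    (a11, a12), (a21, a22) = A
    d = a11 * a22 - a12 * a21
    if d == 0:
        # Theorem 124: no inverse exists when a11*a22 - a12*a21 = 0.
        raise ValueError("matrix is singular")
    return [[Fraction(a22, d), Fraction(-a12, d)],
            [Fraction(-a21, d), Fraction(a11, d)]]


inv = inverse_2x2([[9, 3], [3, 2]])
print([[str(v) for v in row] for row in inv])  # [['2/9', '-1/3'], ['-1/3', '1']]
```

Calling `inverse_2x2([[1, 2], [1, 2]])` raises a `ValueError`, in line with the singular example above.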


3.4.1 Diagonal Matrices

Generally speaking multiplying and inverting matrices is difficult and best left to computers. There is at least one important special case for which multiplication and inversion are easy. We have:

Definition 126 Diagonal Matrix: If $A = [a_{ij}]$ is an $n \times n$ matrix with $a_{ij} = 0$ for $i \neq j$ or

$$A = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix}$$

then A is a diagonal matrix.

Example: For the matrices below:

$$\begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \quad \begin{bmatrix} 3 & 0 & 7 \\ 4 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \quad \begin{bmatrix} 3 & 0 & 6 \\ 4 & 2 & 0 \\ 3 & 0 & 4 \end{bmatrix}$$

the first is a diagonal matrix while the second and third are not.

Diagonal matrices are easy to multiply: you just multiply the corresponding diagonal elements. Thus:

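Theorem 127 If A and B are diagonal matrices of the same order then:

$$A \times B = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix} \begin{bmatrix} b_{11} & 0 & \cdots & 0 \\ 0 & b_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & \cdots & 0 & b_{nn} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} & 0 & \cdots & 0 \\ 0 & a_{22}b_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & \cdots & 0 & a_{nn}b_{nn} \end{bmatrix}.$$

Remark: Note that for diagonal matrices $AB = BA$!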

Example: Given:

$$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 7 \end{bmatrix}$$

we have:

$$AB = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix} \begin{bmatrix} 5 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 7 \end{bmatrix} = \begin{bmatrix} 2 \times 5 & 0 & 0 \\ 0 & 3 \times 6 & 0 \\ 0 & 0 & 4 \times 7 \end{bmatrix} = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 18 & 0 \\ 0 & 0 & 28 \end{bmatrix}.$$

Finding the inverse of a diagonal matrix is also very easy: you merely take the inverse of each element along the diagonal. Thus:

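Theorem 128 A diagonal matrix A is non-singular if and only if all its diagonal elements are non-zero, in which case:

$$A^{-1} = \begin{bmatrix} \frac{1}{a_{11}} & 0 & \cdots & 0 \\ 0 & \frac{1}{a_{22}} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & \cdots & 0 & \frac{1}{a_{nn}} \end{bmatrix}.$$

Example 1:

$$\begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{4} \end{bmatrix}.$$

Example 2: Since the identity matrix I is diagonal with 1's along the diagonal, it follows that $I^{-1} = I$.

Example 3: The diagonal matrix:

$$\begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

is singular (or non-invertible or it does not have an inverse) since the third diagonal element is zero.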

3.5 The Determinant of a Matrix

An important characteristic of a square matrix is its determinant. If A is an $n \times n$ matrix then we write its determinant as $|A|$ or $\det[A]$.

To begin, the determinant when A is a $1 \times 1$ scalar or a $2 \times 2$ matrix is given by:

Definition 129 If A is a $1 \times 1$ scalar then $\det[A] = A$, while if $A = [a_{ij}]$ is a $2 \times 2$ matrix:

$$\det[A] = \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$


Example: $\det[5] = 5$, $\det[-3] = -3$ while:

$$\det \begin{bmatrix} 5 & 1 \\ 4 & 3 \end{bmatrix} = 5 \times 3 - 1 \times 4 = 11.$$

To define determinants properly for $n \geq 3$ is somewhat complicated since it involves the concept of a permutation. Rather than going into this we will instead use the Laplace expansion to reduce the calculation of an $n$th order determinant to a series of $(n-1)$th order determinants. These $(n-1)$th order determinants are called minors and are found by removing one row and one column from a matrix.

Definition 130 Minors: The i, jth minor of a matrix A, denoted by $m_{ij}$, is given by:

$$m_{ij} = \det[A_{ij}]$$

where $A_{ij}$ is the $(n-1) \times (n-1)$ matrix obtained by removing the $i$th row and the $j$th column of A.

We then define the i, jth cofactor as either $m_{ij}$ if $i + j$ is even, or $-m_{ij}$ if $i + j$ is odd. Thus:

Definition 131 Cofactors: The i, jth cofactor of a matrix A, denoted by $c_{ij}$, is given by:

$$c_{ij} = (-1)^{i+j} m_{ij}$$

where $m_{ij}$ is the i, jth minor of A.

Example: Consider the $3 \times 3$ matrix:

$$A = \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix}.$$

The 1, 1 minor $m_{11}$ is obtained by removing the first row and first column so that

$$m_{11} = \det \begin{bmatrix} 2 & 6 \\ 1 & 8 \end{bmatrix} = 10.$$

Since $1 + 1 = 2$ is even, the 1, 1 cofactor is

$$c_{11} = (-1)^{1+1} \times 10 = 10.$$

To calculate the 3, 2 minor $m_{32}$ we remove the third row and the second column of A to obtain:

$$m_{32} = \det \begin{bmatrix} 3 & 4 \\ 1 & 6 \end{bmatrix} = 14$$

and since $3 + 2 = 5$ is odd, the cofactor is the negative of $m_{32}$:

$$c_{32} = (-1)^{3+2} \times 14 = -14.$$

Remark: When calculating the cofactors $c_{ij}$ there is a pattern of alternating 1's and $-1$'s that are applied to the minors $m_{ij}$ that looks like this:

$$\begin{bmatrix} 1 & -1 & 1 & \cdots \\ -1 & 1 & -1 & \cdots \\ 1 & -1 & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$

Notice that the diagonal elements always have 1. For example with a $4 \times 4$ matrix this pattern is:

$$\begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}.$$

The Laplace expansion then states that $\det[A]$ can be found by moving across any row or down any column of A, multiplying each element $a_{ij}$ in that row or column by its cofactor $c_{ij}$, and then summing.

Theorem 132 Laplace Expansion: Given an $n \times n$ matrix $A = [a_{ij}]$ with cofactors $c_{ij}$, $\det[A]$ is given either as the sum of the products of the elements of the $i$th row with their cofactors as:

$$\det[A] = a_{i1}c_{i1} + a_{i2}c_{i2} + a_{i3}c_{i3} + \cdots + a_{in}c_{in}$$

or as the sum of the products of the elements of the $j$th column with their cofactors as:

$$\det[A] = a_{1j}c_{1j} + a_{2j}c_{2j} + a_{3j}c_{3j} + \cdots + a_{nj}c_{nj}.$$

Here is a recipe for calculating a determinant:

Recipe for Calculating $\det[A]$

1. Pick any row or column of A and move along that row or column.

2. When you get to a particular element $a_{ij}$, delete the corresponding row and column, take the determinant of what is left over to obtain the minor $m_{ij}$, and multiply the two as $a_{ij} \times m_{ij}$.

3. Multiply the result in step 2 by either $-1$ or $1$ depending on whether $i + j$ is odd or even.

4. Continue to the next element in the row or column and add together all the terms you obtained in step 3.

Remark: In general calculating determinants is difficult and best left to computers, which use more efficient algorithms than the Laplace expansion. Unless A has some special properties, you are unlikely to have to calculate by hand determinants larger than $4 \times 4$.

Example: Consider the $3 \times 3$ matrix:

$$A = \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix}.$$

To calculate $\det[A]$ let us begin by going across the first row. Coming to the first element $a_{11} = 3$ we remove the first row and first column, take the determinant of what is left over and multiply by $a_{11} = 3$ to obtain:

$$(-1)^{1+1} \times 3 \times \det \begin{bmatrix} 2 & 6 \\ 1 & 8 \end{bmatrix} = 30.$$

Since the sum of the row and column indices of $a_{11}$ is $1 + 1 = 2$, which is even, we have $(-1)^{1+1} = 1$ and so this term does nothing to the result.

We now move across the row to the next element $a_{12} = 1$. Removing the corresponding row and column and taking the determinant we obtain:

$$(-1)^{1+2} \times 1 \times \det \begin{bmatrix} 1 & 6 \\ 3 & 8 \end{bmatrix} = 10.$$

Since $1 + 2 = 3$ is an odd number, the term $(-1)^{1+2} = -1$ and so changes the sign of the result.

Finally we come to the last element of the row $a_{13} = 4$. Removing the corresponding row and column we obtain:

$$(-1)^{1+3} \times 4 \times \det \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} = -20.$$

Since $1 + 3 = 4$ is even, the term $(-1)^{1+3} = 1$ and this term does nothing to the result.

Thus adding all these results together we find that:

$$\det[A] = a_{11}c_{11} + a_{12}c_{12} + a_{13}c_{13} = 30 + 10 - 20 = 20.$$

Notice the pattern of pluses and minuses here is: $1, -1, 1$.


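We could also have calculated $\det[A]$ above by going across the second row as:

$$\begin{aligned}
\det[A] &= a_{21}c_{21} + a_{22}c_{22} + a_{23}c_{23} \\
&= -1 \times (1) \times \det \begin{bmatrix} 1 & 4 \\ 1 & 8 \end{bmatrix} + (2) \times \det \begin{bmatrix} 3 & 4 \\ 3 & 8 \end{bmatrix} - 1 \times (6) \times \det \begin{bmatrix} 3 & 1 \\ 3 & 1 \end{bmatrix} = 20
\end{aligned}$$

(here the pattern of pluses and minuses is: $-1, 1, -1$) or by going down the third column as:

$$\begin{aligned}
\det[A] &= a_{13}c_{13} + a_{23}c_{23} + a_{33}c_{33} \\
&= 4 \times \det \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} - 6 \times \det \begin{bmatrix} 3 & 1 \\ 3 & 1 \end{bmatrix} + 8 \times \det \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} = 20.
\end{aligned}$$

Notice the pattern of pluses and minuses here is: $1, -1, 1$.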
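The recipe translates directly into a short recursive routine. The sketch below is illustrative only: it assumes matrices stored as lists of rows, the helper names (`minor_matrix`, `det_by_row`, `det_by_column`) are made up, and indices are 0-based, which leaves the sign $(-1)^{i+j}$ unchanged because shifting both indices by one adds 2 to $i + j$.

```python
# Illustrative helpers (hypothetical names); matrices are lists of rows.
def minor_matrix(A, i, j):
    """Delete row i and column j (0-based) from A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det_by_row(A, i=0):
    """Determinant by Laplace expansion across row i (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i][j] * det_by_row(minor_matrix(A, i, j))
               for j in range(n))


def det_by_column(A, j=0):
    """Determinant by Laplace expansion down column j (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i][j] * det_by_row(minor_matrix(A, i, j))
               for i in range(n))


A = [[3, 1, 4], [1, 2, 6], [3, 1, 8]]
print(det_by_row(A, 0))     # 20  (first row)
print(det_by_row(A, 1))     # 20  (second row)
print(det_by_column(A, 2))  # 20  (third column)
```

This recursion does roughly $n!$ multiplications, which is why it is only practical for small matrices.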

Although determinants are hard to calculate numerically, there are a number of results which make theoretical manipulations of determinants quite easy. In particular:

Theorem 133 If A and B are square $n \times n$ matrices then

1. $\det[AB] = \det[A]\det[B]$.

2. $\det\left[A^T\right] = \det[A]$.

3. If A is non-singular then $\det\left[A^{-1}\right] = \frac{1}{\det[A]}$.

4. If B is obtained by switching any two rows or any two columns of A then $\det[B] = -\det[A]$.

5. If B is obtained by adding one row of A to another row of A, or one column of A to another column of A, then $\det[B] = \det[A]$.

6. If $\alpha$ is a scalar and A is an $n \times n$ matrix then $\det[\alpha A] = \alpha^n \det[A]$.

One of the reasons we are interested in determinants is that they tell us whether or not a matrix A has an inverse; in particular a necessary and sufficient condition for the inverse to exist is that the determinant not be 0 or:

Theorem 134 Given an $n \times n$ matrix A the inverse $A^{-1}$ exists if and only if $\det[A] \neq 0$.

Theorem 135 Given an $n \times n$ matrix A the inverse $A^{-1}$ does not exist if and only if $\det[A] = 0$.


Remark: The only scalar that does not have an inverse is 0. While there are square matrices with non-zero elements that do not have an inverse, they nevertheless have a zero-like quality; in particular their determinant must be zero. Later we will see that non-invertible matrices must also have a 0 eigenvalue.

Remark: From result 1 of Theorem 133 it follows that if either A or B is singular then AB is also singular; that is, if either $\det[A] = 0$ or $\det[B] = 0$ then $\det[AB] = \det[A]\det[B] = 0$.

Example: For the matrix:

$$A = \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix}$$

we showed that $\det[A] = 20$. It follows then that $A^{-1}$ exists. Without actually calculating $A^{-1}$ we know that

$$\det\left[A^{-1}\right] = \frac{1}{\det[A]} = \frac{1}{20}.$$

We also know that $\det\left[A^T\right] = 20$. Suppose we multiplied every element of A by 2 so that:

$$B = 2A = 2 \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 2 & 8 \\ 2 & 4 & 12 \\ 6 & 2 & 16 \end{bmatrix}.$$

Then since A is $3 \times 3$ it follows that:

$$\det[B] = 2^3 \det[A] = 8 \times 20 = 160.$$

3.5.1 Determinants of Upper and Lower Triangular Matrices

Determinants are in general difficult to compute. Two types of matrices for which determinants are easy to compute are upper and lower triangular matrices:

Definition 136 Upper Triangular Matrix: An $n \times n$ matrix $A = [a_{ij}]$ is an upper triangular matrix if it has all zeros below the diagonal or:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & a_{n-1,n} \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix}$$


Definition 137 Lower Triangular Matrix: An $n \times n$ matrix $A = [a_{ij}]$ is a lower triangular matrix if it has all zeros above the diagonal or

$$A = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ a_{n1} & \cdots & a_{n,n-1} & a_{nn} \end{bmatrix}.$$

Remark: A diagonal matrix is both upper and lower triangular.

Determinants of triangular matrices are easy to calculate. We have:

Theorem 138 For either upper or lower triangular matrices the determinant is the product of the diagonal elements.

From this it follows that:

Theorem 139 A lower or upper triangular matrix is non-singular if and only if all diagonal elements are non-zero.

Example 1: Given:

$$A = \begin{bmatrix} 3 & 1 & 4 \\ 0 & 2 & 6 \\ 0 & 0 & 8 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 0 & 0 \\ 1 & 2 & 0 \\ 3 & 1 & 8 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{bmatrix}$$

then A is upper triangular, B is lower triangular and C is both upper and lower triangular, that is C is a diagonal matrix. We have:

$$\det[A] = \det \begin{bmatrix} 3 & 1 & 4 \\ 0 & 2 & 6 \\ 0 & 0 & 8 \end{bmatrix} = 3 \times 2 \times 8 = 48,$$

$$\det[B] = \det \begin{bmatrix} 3 & 0 & 0 \\ 1 & 2 & 0 \\ 3 & 1 & 8 \end{bmatrix} = 3 \times 2 \times 8 = 48,$$

$$\det[C] = \det \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{bmatrix} = 3 \times 2 \times 8 = 48.$$

Example 2: The matrix:

$$D = \begin{bmatrix} 3 & 1 & 4 \\ 0 & 2 & 6 \\ 0 & 0 & 0 \end{bmatrix}$$

does not have an inverse since $\det[D] = 0$.

Example 3: Since the identity matrix is a diagonal matrix it follows that $\det[I] = 1$. We can use this to prove that if $\det[A] = 0$ then A does not have an inverse. The proof is by contradiction. Suppose then that $A^{-1}$ existed and $\det[A] = 0$. Then:

$$1 = \det[I] = \det\left[AA^{-1}\right] = \underbrace{\det[A]}_{=0} \det\left[A^{-1}\right] = 0$$

so that $1 = 0$, a contradiction. It follows then that $A^{-1}$ does not exist if $\det[A] = 0$.

3.5.2 Calculating the Inverse of a Matrix with Determinants

Determinants can be used to calculate the inverse of a matrix using the cofactor and adjoint matrices defined as:

Definition 140 Cofactor Matrix: Let A be an $n \times n$ square matrix and define the $n \times n$ cofactor matrix $C = [c_{ij}]$ where $c_{ij}$ is the i, jth cofactor of A.

Definition 141 Adjoint Matrix: The adjoint matrix of A, written as $\operatorname{adj}[A]$, is defined as the transpose of the cofactor matrix $C$ or:

$$\operatorname{adj}[A] = C^T.$$

The following result holds for the adjoint matrix:

Theorem 142 For any square matrix A:

$$\operatorname{adj}[A] \times A = A \times \operatorname{adj}[A] = \det[A]\, I.$$

Remark 1: If you carry out the matrix multiplication $A \times \operatorname{adj}[A]$ for the $i$th row of A and the $i$th column of $\operatorname{adj}[A]$ and equate this to the i, i element of $\det[A]\, I$, which is just $\det[A]$, you will see that this is just the Laplace expansion for $\det[A]$. The result states further that the $i$th row of A is orthogonal to the $j$th column of $\operatorname{adj}[A]$ whenever $i \neq j$.

The adjoint matrix $\operatorname{adj}[A]$ is nearly the inverse $A^{-1}$ since $AA^{-1} = I$ while $A \times \operatorname{adj}[A] = \det[A]\, I$. We thus have:

Theorem 143 If $\det[A] \neq 0$ then:

$$A^{-1} = \frac{1}{\det[A]} \operatorname{adj}[A].$$


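Example 1: For the case of $2 \times 2$ matrices:

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

since $\det[A] = a_{11}a_{22} - a_{12}a_{21}$, and the cofactor and adjoint matrices are:

$$C = \begin{bmatrix} a_{22} & -a_{21} \\ -a_{12} & a_{11} \end{bmatrix} \quad \text{and} \quad \operatorname{adj}[A] = C^T = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$

it follows that:

$$A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}.$$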

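Example 2: Consider:

$$A = \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix}$$

which we showed earlier had a determinant of $\det[A] = 20$. The cofactor matrix is given by:

$$C = \begin{bmatrix} 10 & 10 & -5 \\ -4 & 12 & 0 \\ -2 & -14 & 5 \end{bmatrix}.$$

For example the 3, 2 element is calculated as the cofactor:

$$c_{32} = (-1)^{3+2} \det \begin{bmatrix} 3 & 4 \\ 1 & 6 \end{bmatrix} = -14.$$

The adjoint matrix is then found by taking the transpose of $C$ so that:

$$\operatorname{adj}[A] = \begin{bmatrix} 10 & 10 & -5 \\ -4 & 12 & 0 \\ -2 & -14 & 5 \end{bmatrix}^T = \begin{bmatrix} 10 & -4 & -2 \\ 10 & 12 & -14 \\ -5 & 0 & 5 \end{bmatrix}.$$

Note that $A \times \operatorname{adj}[A] = \det[A]\, I$ is satisfied since:

$$\begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix} \begin{bmatrix} 10 & -4 & -2 \\ 10 & 12 & -14 \\ -5 & 0 & 5 \end{bmatrix} = \begin{bmatrix} 20 & 0 & 0 \\ 0 & 20 & 0 \\ 0 & 0 & 20 \end{bmatrix} = 20 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Thus the inverse of A is:

$$A^{-1} = \frac{1}{20} \begin{bmatrix} 10 & -4 & -2 \\ 10 & 12 & -14 \\ -5 & 0 & 5 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{5} & -\frac{1}{10} \\ \frac{1}{2} & \frac{3}{5} & -\frac{7}{10} \\ -\frac{1}{4} & 0 & \frac{1}{4} \end{bmatrix}.$$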
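The cofactor, adjoint and inverse calculations above can be reproduced with a short Python sketch. It is an illustration under stated assumptions: matrices are lists of rows with integer entries, $n \geq 2$, exact arithmetic comes from `fractions.Fraction`, and all helper names are made up here.

```python
from fractions import Fraction


# Illustrative helpers (hypothetical names); matrices are lists of rows.
def minor_matrix(A, i, j):
    """Delete row i and column j (0-based) from A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by Laplace expansion across the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j))
               for j in range(len(A)))


def cofactor_matrix(A):
    """C = [c_ij] with c_ij = (-1)^(i+j) m_ij; assumes n >= 2."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor_matrix(A, i, j)) for j in range(n)]
            for i in range(n)]


def adjoint(A):
    """adj[A] is the transpose of the cofactor matrix."""
    return [list(col) for col in zip(*cofactor_matrix(A))]


def inverse(A):
    """A^{-1} = adj[A] / det[A], valid only when det[A] != 0."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(entry, d) for entry in row] for row in adjoint(A)]


A = [[3, 1, 4], [1, 2, 6], [3, 1, 8]]
print(cofactor_matrix(A))  # [[10, 10, -5], [-4, 12, 0], [-2, -14, 5]]
print(adjoint(A))          # [[10, -4, -2], [10, 12, -14], [-5, 0, 5]]
print([[str(v) for v in row] for row in inverse(A)])
# [['1/2', '-1/5', '-1/10'], ['1/2', '3/5', '-7/10'], ['-1/4', '0', '1/4']]
```

As the earlier remark on the Laplace expansion suggests, this approach is mainly of theoretical interest; for large matrices other algorithms are far more efficient.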


3.6 The Trace of a Matrix

Besides determinants another important characteristic of square matrices, especially in econometrics, is the sum of the diagonal elements or the trace, defined as:

Definition 144 Trace: If A is a square matrix then the trace of A, denoted by $\operatorname{tr}[A]$, is

$$\operatorname{tr} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & a_{n-1,n} \\ a_{n1} & \cdots & a_{n,n-1} & a_{nn} \end{bmatrix} = a_{11} + a_{22} + \cdots + a_{nn}.$$

Example:

$$\operatorname{tr} \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix} = 3 + 2 + 8 = 13.$$

Two important results to remember when manipulating traces are:

Theorem 145 $\operatorname{tr}[A + B] = \operatorname{tr}[A] + \operatorname{tr}[B]$

Theorem 146 $\operatorname{tr}[AB] = \operatorname{tr}[BA]$

Remark: The second property is often very useful in econometrics. We know that for matrices in general $AB \neq BA$. Inside the trace operator however we are free to reverse the order of multiplication.

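Example 1: Note that:

$$\begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} 5 & 2 & 1 \\ -3 & 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 18 & 19 \\ -13 & -1 & 2 \\ 24 & 18 & 14 \end{bmatrix}$$

has a trace of $3 + (-1) + 14 = 16$ while:

$$\begin{bmatrix} 5 & 2 & 1 \\ -3 & 3 & 4 \end{bmatrix} \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 17 & 24 \\ 9 & -1 \end{bmatrix}$$

has a trace of $17 + (-1) = 16$. Thus while the two matrix products AB and BA are different, their traces, or the sums of their diagonal elements, are the same.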
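The same check takes only a few lines of Python. This is a sketch, assuming matrices stored as lists of rows; `trace` and `matmul` are illustrative helper names.

```python
# Illustrative helpers (hypothetical names); matrices are lists of rows.
def matmul(A, B):
    """Row-times-column product; assumes A and B are conformable (no check)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def trace(M):
    """Sum of the diagonal elements of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))


A = [[3, 4], [-2, 1], [6, 2]]    # 3 x 2
B = [[5, 2, 1], [-3, 3, 4]]      # 2 x 3

AB = matmul(A, B)  # 3 x 3
BA = matmul(B, A)  # 2 x 2
print(trace(AB), trace(BA))  # 16 16
```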

Example 2: If $X$ is an $n \times p$ matrix then an important matrix in econometrics is the $n \times n$ matrix:

$$P = X\left(X^TX\right)^{-1}X^T.$$

Note that $\left(X^TX\right)$ and $\left(X^TX\right)^{-1}$ are $p \times p$ matrices. We have:

$$\operatorname{tr}[P] = \operatorname{tr}\Big[\overbrace{X\left(X^TX\right)^{-1}}^{A}\ \overbrace{X^T}^{B}\Big] = \operatorname{tr}\Big[X^TX\left(X^TX\right)^{-1}\Big] = \operatorname{tr}[I] = p$$

since $I$ is the $p \times p$ identity matrix.

3.7 Higher Dimensional Spaces

3.7.1 Vectors as Points in an n Dimensional Space: $\mathbb{R}^n$

This work is dedicated by a humble native of Flatland in the hope that, even as he was initiated into the mysteries Of THREE Dimensions, having been previously conversant with ONLY TWO, so the citizens of that celestial region may aspire yet higher and higher to the secrets of FOUR FIVE OR EVEN SIX dimensions, thereby contributing to the enlargement of THE IMAGINATION and the possible development of that most rare and excellent gift of MODESTY among the superior races of SOLID HUMANITY. - Edwin Abbott, Flatland

Edwin Abbott's book is about the inhabitants of Flatland, a world that, unlike our three dimensional world, has only two dimensions (forwards and backwards, right and left, but no up and down). In the book a native of Flatland communicates with someone from our three dimensional world who tries to convince him that, besides the two dimensions he experiences, there is a third dimension: up and down. The difficulties the Flatlander experiences grasping this third dimension then mirror our own difficulties in trying to understand the possibility of, say, a four dimensional space.

As economists we work with higher dimensional spaces all the time. For example in econometrics if you have 100 observations of data, then this is represented as a point in a 100 dimensional space. Fortunately we do not have to visually imagine such a space; instead we simply write down our data as a $100 \times 1$ column vector.

To see why this makes sense think of a point in one dimension; that is along a line or, say, along a particular street that runs north/south. Someone asks you where your favourite cafe is and you tell them it's 3 blocks north of here. This number 3 can then be thought of as a $1 \times 1$ column vector $[3]$, as can any point along the street, with negative numbers used to indicate points south.


Now consider a two-dimensional space, say the location in a city on any street. Now someone asks you where your favourite cafe is and you say: "Go 3 blocks north of here and 4 blocks east." This can now be represented by a $2 \times 1$ column vector:

$$\begin{bmatrix} 3 \\ 4 \end{bmatrix}.$$

Now consider three dimensional space. Suppose the cafe is on the 10th floor of a building. You now say "Go 3 blocks north of here and 4 blocks east and go up 10 floors". This can be represented by a $3 \times 1$ column vector:

$$\begin{bmatrix} 3 \\ 4 \\ 10 \end{bmatrix}.$$

Let us now try to imagine two four-dimensional beings where one tells his friend how to get to his favourite cafe. Just as we would give directions with three numbers, he would have to give directions with four numbers: one for each dimension. Although we cannot visually imagine it, we could easily write down the $4 \times 1$ column vector he would give; for example it might be:

$$\begin{bmatrix} 3 \\ 4 \\ 10 \\ 2 \end{bmatrix}$$

where 2 would represent how far you would have to go in the extra fourth direction.

Thus while we cannot visualize spaces of four dimensions or higher, we can easily write down vectors of any dimension and so we are actually able to investigate spaces of any dimension. We thus have:

Definition 147 A point in an $n$ dimensional space is represented by an $n \times 1$ column vector. This $n$ dimensional space or Euclidean space is denoted by $\mathbb{R}^n$.

3.7.2 Length and Distance

Once we make this leap to higher dimensional spaces, it is natural to ask which of the properties of 3 dimensional space that we are familiar with can be extended to $n$ dimensional spaces.

The first important characteristic is length or distance. No doubt a 4 dimensional person would also want to know how far away his favourite cafe is! We have:

Definition 148 The length of an $n \times 1$ vector $x$ is:

$$\|x\| = \sqrt{x^Tx} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$


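Definition 149 The distance between two vectors $x$ and $y$ is $\|x - y\|$.

Example: If

$$x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad y = \begin{bmatrix} 5 \\ 6 \\ 7 \end{bmatrix}$$

then the length of $x$ and $y$ and the distance between $x$ and $y$ are given by:

$$\begin{aligned}
\|x\| &= \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14} = 3.74 \\
\|y\| &= \sqrt{5^2 + 6^2 + 7^2} = \sqrt{110} = 10.49 \\
\|x - y\| &= \sqrt{(1-5)^2 + (2-6)^2 + (3-7)^2} = \sqrt{48} = 6.93.
\end{aligned}$$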
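These formulas work unchanged in any number of dimensions, which is easy to see in code. The sketch below assumes vectors stored as plain Python lists; `length` and `distance` are illustrative names.

```python
from math import sqrt


# Illustrative helpers (hypothetical names); vectors are plain lists.
def length(x):
    """Euclidean length: square root of x'x."""
    return sqrt(sum(xi * xi for xi in x))


def distance(x, y):
    """Distance between x and y: the length of x - y."""
    return length([xi - yi for xi, yi in zip(x, y)])


x = [1, 2, 3]
y = [5, 6, 7]
print(round(length(x), 2))       # 3.74
print(round(length(y), 2))       # 10.49
print(round(distance(x, y), 2))  # 6.93

# The same functions handle a point in a 100 dimensional space.
print(length([1] * 100))         # 10.0
```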

Two important results for advanced work are:

Theorem 150 $\|x\| = 0$ if and only if $x = 0$; that is $x$ is an $n \times 1$ vector of zeros.

Theorem 151 Triangle Inequality: $\|x + y\| \leq \|x\| + \|y\|$.

Remark: The triangle inequality basically states that if you walk along a straight line to the point $x + y$, then you walk a shorter distance than if you walk first to $x$ or $y$ and then to $x + y$; in other words the shortest distance between two points in an $n$ dimensional space is still a straight line!

Example: Given $x$ and $y$ above the triangle inequality is satisfied since:

$$\|x + y\| = \sqrt{(1+5)^2 + (2+6)^2 + (3+7)^2} = \sqrt{200} = 14.14 < \|x\| + \|y\| = \sqrt{14} + \sqrt{110} = 14.23.$$

3.7.3 Angle and Orthogonality

The second important basic concept for higher dimensional spaces is angle. The angle between two vectors $x$ and $y$ can be sensibly defined as follows:

Definition 152 Angle: Given two $n \times 1$ vectors $x$ and $y$ the angle between $x$ and $y$ is $\theta$ defined by:

$$\cos(\theta) = \frac{x^Ty}{\|x\|\,\|y\|}.$$


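Example: If

$$x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad y = \begin{bmatrix} 6 \\ 1 \\ 7 \end{bmatrix}$$

then you can verify that:

$$x^Ty = 29, \quad \|x\| = \sqrt{14}, \quad \|y\| = \sqrt{86}$$

so that:

$$\cos(\theta) = \frac{x^Ty}{\|x\|\,\|y\|} = \frac{29}{\sqrt{14}\sqrt{86}} = 0.836.$$

Using the inverse function of $\cos(\theta)$, $\cos^{-1}$ from your calculator, we can liberate $\theta$ as:

$$\theta = \cos^{-1}(0.836) = 33.3$$

so that the angle between $x$ and $y$ is 33.3 degrees.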
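The calculator steps can be mirrored with Python's `math` module. This is a minimal sketch, assuming vectors stored as lists; `dot`, `length` and `angle_degrees` are made-up helper names, and the clamp guards against rounding pushing the cosine slightly outside $[-1, 1]$.

```python
from math import acos, degrees, sqrt


# Illustrative helpers (hypothetical names); vectors are plain lists.
def dot(x, y):
    """Inner product x'y."""
    return sum(xi * yi for xi, yi in zip(x, y))


def length(x):
    return sqrt(dot(x, x))


def angle_degrees(x, y):
    """Angle between two non-zero vectors, in degrees."""
    c = dot(x, y) / (length(x) * length(y))
    c = max(-1.0, min(1.0, c))  # clamp against floating-point rounding
    return degrees(acos(c))


x = [1, 2, 3]
y = [6, 1, 7]
print(dot(x, y))                     # 29
print(round(angle_degrees(x, y), 1)) # 33.3
```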

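Corresponding to the requirement in trigonometry that $|\cos(\theta)| \leq 1$ we have:

Theorem 153 Cauchy-Schwarz Inequality:

$$\left|x^Ty\right| \leq \|x\|\,\|y\| = \sqrt{x^Tx}\sqrt{y^Ty}.$$

Proof. If $x = 0$ both sides are zero, so assume $x \neq 0$. Let $\alpha$ be a scalar and define $f(\alpha)$ by $f(\alpha) = \|\alpha x - y\|^2 \geq 0$. Now:

$$f(\alpha) = (\alpha x - y)^T(\alpha x - y) = \alpha^2\|x\|^2 - 2\alpha x^Ty + \|y\|^2.$$

Now the global minimum of $f(\alpha)$ occurs at $\alpha^*$ where $f'(\alpha^*) = 0$, since $f''(\alpha) = 2\|x\|^2 > 0$, so that:

$$2\alpha^*\|x\|^2 - 2x^Ty = 0 \implies \alpha^* = \frac{x^Ty}{\|x\|^2}.$$

Thus:

$$\begin{aligned}
f(\alpha^*) \geq 0 &\implies \alpha^{*2}\|x\|^2 - 2\alpha^* x^Ty + \|y\|^2 \geq 0 \\
&\implies \left(\frac{x^Ty}{\|x\|^2}\right)^2\|x\|^2 - 2\left(\frac{x^Ty}{\|x\|^2}\right)x^Ty + \|y\|^2 \geq 0 \\
&\implies \|y\|^2 \geq \frac{\left(x^Ty\right)^2}{\|x\|^2} \\
&\implies \left(x^Ty\right)^2 \leq \|x\|^2\|y\|^2 \\
&\implies \left|x^Ty\right| \leq \sqrt{x^Tx}\sqrt{y^Ty}.
\end{aligned}$$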


Remark: The equality $\left|x^Ty\right| = \|x\|\,\|y\|$ occurs only if $y = \delta x$ where $\delta$ is some scalar. In this case, if $\delta > 0$, the angle between $x$ and $y$ is 0 so that $\cos(0) = 1$ (if $\delta < 0$ the two vectors point in opposite directions and $\cos(\theta) = -1$).

Example: As an illustration of the Cauchy-Schwarz inequality note that in the example above:

$$\left|x^Ty\right| = 29 < \|x\|\,\|y\| = \sqrt{14} \times \sqrt{86} = 34.7.$$

The most important angle that we will be concerned with is where two vectors are at right-angles, or $\theta = 90^\circ$, in which case $\cos(90^\circ) = 0$ and hence $x^Ty = 0$.

Definition 154 Orthogonality: If $x^Ty = 0$ we say that $x$ and $y$ are orthogonal to each other, or are at right-angles to each other. Sometimes this is denoted as $x \perp y$.

Orthogonality and non-orthogonality in $\mathbb{R}^2$ are illustrated below:


Remark: Since $x^Ty$ is a scalar it follows that

$$x^Ty = \left(x^Ty\right)^T = y^T\left(x^T\right)^T = y^Tx$$

so that:

$$x^Ty = 0 \iff y^Tx = 0$$

and so you can check for orthogonality either by calculating $x^Ty$ or $y^Tx$.

Example 1: In the previous example $x$ and $y$ are not orthogonal since $x^Ty = 29 \neq 0$.


Example 2: Two orthogonal vectors are:

\[ x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \text{ and } y = \begin{bmatrix} 6 \\ -3 \\ 0 \end{bmatrix} \]

since

\[ x^T y = 1 \times 6 + 2 \times (-3) + 3 \times 0 = 0. \]

Suppose $x$ and $y$ are orthogonal so the angle between $x$ and $y$ is $90^{\circ}$. Then $x$ and $y$ form a right-angled triangle with $x$ on one side, $y$ on the other and the sum $x + y$ on the hypotenuse. You may recall from geometry that for right-angled triangles the Pythagorean relationship

\[ a^2 + b^2 = c^2 \]

holds. It turns out that the same holds for $x$ and $y$ in an $n$ dimensional space. In particular:

Theorem 155 Pythagorean Relationship: If $x$ and $y$ are orthogonal $n \times 1$ vectors then:

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2. \]

Proof. If $x$ and $y$ are orthogonal then $x^T y = y^T x = 0$ and so:

\[
\begin{aligned}
\|x + y\|^2 &= (x + y)^T (x + y) = x^T x + y^T y + \underbrace{x^T y}_{=0} + \underbrace{y^T x}_{=0} \\
&= x^T x + y^T y \\
&= \|x\|^2 + \|y\|^2.
\end{aligned}
\]
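As a numerical illustration, the Pythagorean relationship can be checked for the two orthogonal vectors of Example 2. The helper `dot` is a name introduced here for the sketch; it is not part of the text.

```python
def dot(u, v):
    """Inner product u'v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

x = [1, 2, 3]
y = [6, -3, 0]
s = [xi + yi for xi, yi in zip(x, y)]   # the sum x + y

is_orthogonal = dot(x, y) == 0
lhs = dot(s, s)                # ||x + y||^2
rhs = dot(x, x) + dot(y, y)    # ||x||^2 + ||y||^2
print(is_orthogonal, lhs, rhs)
```

Both sides equal 59 here, as the theorem requires for orthogonal vectors.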

This is illustrated in the diagram below:


3.7.4 Linearly Independent Vectors

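Consider a two dimensional space with the two vectors:

\[ a_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]

These two vectors form a basis for $\mathbb{R}^2$; that is, you can describe any point in $\mathbb{R}^2$ as a linear combination of $a_1$ and $a_2$. For example given the vector $x$ below:

\[ x = \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 4 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 3a_1 + 4a_2. \]

Most but not all pairs of vectors will form a basis for $\mathbb{R}^2$. For example the two vectors:

\[ b_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 3 \\ 0 \end{bmatrix} \]

do not form a basis for $\mathbb{R}^2$. The key requirement here is that the two vectors be linearly independent of each other, or that they point in different directions.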


Thus the vector $a_1$ points 1 block north while $a_2$ points 1 block east and so they are linearly independent. The two vectors $b_1$ and $b_2$ both point in the same direction, north, and so are linearly dependent since $b_2 = 3b_1$.

These ideas are extended to higher dimensions as follows:

Definition 156 Linear Independence: Given $n$ vectors $a_1, a_2, \ldots, a_n$, we say that they are linearly independent if for any scalars $x_1, x_2, \ldots, x_n$:

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \implies x_1 = 0,\ x_2 = 0,\ \ldots,\ x_n = 0. \]

Definition 157 Given $n$ vectors $a_1, a_2, \ldots, a_n$, we say that they are linearly dependent if there exist $x_1, x_2, \ldots, x_n$, at least one of which is not 0, such that:

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0. \]

This idea can be written more compactly if we think of the column vectors $a_1, a_2, \ldots, a_n$ as the columns of a matrix $A$. We have:

Definition 158 The $n$ columns of the $m \times n$ matrix $A$: $a_1, a_2, \ldots, a_n$ are linearly independent if

\[ Ax = 0 \implies x = 0 \]

where $x$ is an $n \times 1$ column vector.

Definition 159 If $Ax = 0$ for some $x \neq 0$, the columns of $A$ are linearly dependent.

Since non-zero vectors which are orthogonal to each other must point in different directions they must be linearly independent and so:

Theorem 160 If $a_1, a_2, \ldots, a_n$ are non-zero and mutually orthogonal, so that $a_i^T a_j = 0$ for $i \neq j$, then they are linearly independent.

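Example 1: The vectors:

\[ a_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

are linearly independent since:

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 0 \\ 1 \end{bmatrix} x_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

can only be satisfied when $x_1 = 0$ and $x_2 = 0$. Alternatively putting the two vectors in a $2 \times 2$ matrix we have:

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

and so $a_1$ and $a_2$ are linearly independent. Alternatively since:

\[ a_1^T a_2 = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 0 \]

it follows that $a_1$ and $a_2$ are orthogonal and hence are linearly independent.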

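Example 2: The vectors:

\[ b_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 3 \\ 0 \end{bmatrix} \]

are linearly dependent since:

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 3 \\ 0 \end{bmatrix} x_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

can be satisfied for non-zero $x_1$ and $x_2$; for example if $x_1 = 1$ and $x_2 = -\frac{1}{3}$. Alternatively putting $b_1$ and $b_2$ into a matrix we have, for a vector $x \neq 0$:

\[ \begin{bmatrix} 1 & 3 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -\frac{1}{3} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

so that $b_1$ and $b_2$ are linearly dependent.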

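Example 3: Suppose that

\[ a_1 = \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix} \text{ and } a_2 = \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix}. \]

You may verify that:

\[ a_1 x_1 + a_2 x_2 = \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix} x_1 + \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix} x_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

or:

\[ \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

can only be satisfied when $x_1 = x_2 = 0$. Consequently these two vectors are linearly independent.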


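Example 4: On the other hand:

\[ \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix} x_1 + \begin{bmatrix} -6 \\ 4 \\ -12 \end{bmatrix} x_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

is satisfied when $x_1 = 2$ and $x_2 = 1$. Consequently these two vectors are linearly dependent.

The notion of linear independence leads to the rank of a matrix.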

Definition 161 Rank of a Matrix: The rank of a matrix, denoted by $\operatorname{rank}[A]$, is the maximum number of linearly independent column vectors of $A$.

Definition 162 Full Rank: If an $n \times n$ matrix $A$ has $\operatorname{rank}[A] = n$ we say that $A$ has full rank.

We have:

Theorem 163 Properties of the Rank of a Matrix:

1. $\operatorname{rank}[A] = \operatorname{rank}[A^T]$, so the number of linearly independent column vectors equals the number of linearly independent row vectors in $A$.

2. If $A$ is an $m \times n$ matrix then $\operatorname{rank}[A] \le m$ and $\operatorname{rank}[A] \le n$.

3. If $A$ is a square $n \times n$ matrix then $A$ is non-singular (that is, $A^{-1}$ exists) if and only if $\operatorname{rank}[A] = n$.

4. If $A$ is a square $n \times n$ matrix then $A$ is non-singular (that is, $A^{-1}$ exists) if and only if $Ax = 0$ implies that $x = 0$, where $x$ is an $n \times 1$ vector.

5. If $A$ is a square $n \times n$ matrix then $A$ is singular (that is, $A^{-1}$ does not exist) if and only if there exists an $n \times 1$ vector $x \neq 0$ such that $Ax = 0$.

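Example 1: Consider the square matrix:

\[ A = \begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix}. \]

You can verify on your own that $A$ here has a rank of 2; that is, the vectors

\[ \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 4 \end{bmatrix} \]

are linearly independent and consequently $\operatorname{rank}[A] = 2$ and $A^{-1}$ exists, as given by:

\[ \begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{2}{5} & -\frac{1}{5} \\ -\frac{1}{10} & \frac{3}{10} \end{bmatrix}. \]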


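Example 2: The matrix $B$ given by:

\[ B = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix} \]

has a rank of 1 since:

\[ \begin{bmatrix} 3 \\ 1 \end{bmatrix} \times (-2) + \begin{bmatrix} 6 \\ 2 \end{bmatrix} \times 1 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

or there exists a non-zero $x$ such that $Bx = 0$ where:

\[ x = \begin{bmatrix} -2 \\ 1 \end{bmatrix} \]

since:

\[ \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]

Consequently $B^{-1}$ here does not exist, or $B$ is singular.

It is possible to calculate the rank of a matrix using determinants.

Theorem 164 Given any $m \times n$ matrix $A$ suppose that $r$ is the order of the largest $r \times r$ sub-matrix $\tilde{A}$ of $A$ such that $\det[\tilde{A}] \neq 0$. Then $\operatorname{rank}[A] = r$.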

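Example: For the matrix

\[ A = \begin{bmatrix} 1 & 2 & 4 & 5 \\ 2 & 5 & 2 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix} \]

examples of $2 \times 2$ and $3 \times 3$ sub-matrices of $A$ would be:

\[ \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix}, \quad \begin{bmatrix} 1 & 2 & 4 \\ 2 & 5 & 2 \\ 1 & 2 & 3 \end{bmatrix}. \]

Since $A$ is $3 \times 4$ we cannot obtain any larger sub-matrix from $A$ than $3 \times 3$. From the theorem we have $\operatorname{rank}[A] = 3$ since for the $3 \times 3$ sub-matrix:

\[ \det \begin{bmatrix} 1 & 2 & 4 \\ 2 & 5 & 2 \\ 1 & 2 & 3 \end{bmatrix} = -1 \neq 0. \]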
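The ranks quoted in these examples can be cross-checked numerically. The sketch below does not use the determinant method of Theorem 164; it swaps in Gaussian elimination (counting pivots), which gives the same rank and is easier to code. Exact `Fraction` arithmetic avoids rounding problems; the function name `rank` is an assumption of this sketch.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination with exact Fraction arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Find a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

A = [[1, 2, 4, 5], [2, 5, 2, 1], [1, 2, 3, 4]]
B = [[3, 6], [1, 2]]
print(rank(A), rank(B))
```

The same function can be used to test linear independence: the columns of a matrix are linearly independent exactly when the rank equals the number of columns, as with the two vectors of Example 3 in the previous section.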


3.8 Solving Systems of Equations

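Matrix algebra is important for solving systems of linear equations. For example in the demand and supply model:

\[ \text{demand}: \quad Q = 6 - \frac{3}{2} P \]

\[ \text{supply}: \quad Q = 2 + \frac{1}{2} P \]

plotted below:

[Figure: Supply and Demand, with $Q$ on the vertical axis and $P$ on the horizontal axis.]

we wish to find the equilibrium price and quantity: $Q$ and $P$ where the demand and supply curves intersect. Now if we set $x_1 = Q$ and $x_2 = P$ we can rewrite the demand and supply curves as:

\[ Q = 6 - \frac{3}{2} P \implies x_1 = 6 - \frac{3}{2} x_2 \implies 2x_1 + 3x_2 = 12 \]

\[ Q = 2 + \frac{1}{2} P \implies x_1 = 2 + \frac{1}{2} x_2 \implies 2x_1 - x_2 = 4 \]

or as the system of equations:

\[
\begin{aligned}
2x_1 + 3x_2 &= 12 \\
2x_1 - x_2 &= 4.
\end{aligned}
\]

This can in turn be written in matrix notation as:

\[ \begin{bmatrix} 2 & 3 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 12 \\ 4 \end{bmatrix} \]

which is in the form $Ax = b$ where:

\[ A = \begin{bmatrix} 2 & 3 \\ 2 & -1 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad b = \begin{bmatrix} 12 \\ 4 \end{bmatrix}. \]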


In the above example we have two equations and two unknowns. In general, in order to have a unique solution one needs as many equations as unknowns. Thus consider a system of $n$ equations in $n$ unknowns as:

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\ \ \vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\]

which can be written in matrix notation in the form $Ax = b$ as:

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}.
\]

A special property of linear systems of equations is that the number of possible solutions is limited. In particular:

Theorem 165 Systems of linear equations have either no solution, one solution, or an infinite number of solutions.

Generally we are interested in the case where one solution exists so that our model predicts that one thing and only one thing happens. We have:

Theorem 166 The system of equations $Ax = b$, with $A$ an $n \times n$ matrix, has a unique solution if and only if $A^{-1}$ exists, in which case the unique solution is:

\[ x = A^{-1} b. \]

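Example 1: You can verify that the system of equations in the supply and demand example above:

\[ \begin{bmatrix} 2 & 3 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 12 \\ 4 \end{bmatrix} \]

has a unique solution:

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 2 & -1 \end{bmatrix}^{-1} \begin{bmatrix} 12 \\ 4 \end{bmatrix} = \begin{bmatrix} \frac{1}{8} & \frac{3}{8} \\ \frac{1}{4} & -\frac{1}{4} \end{bmatrix} \begin{bmatrix} 12 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix} \]

and so the unique solution is $x_1 = 3$ and $x_2 = 2$, so that $Q = 3$ and $P = 2$.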
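The calculation $x = A^{-1} b$ for the supply and demand model can be reproduced as follows. This is a minimal sketch restricted to the $2 \times 2$ case, using the standard closed-form inverse of a $2 \times 2$ matrix; the helper names `inverse_2x2` and `mat_vec` are introduced here for illustration.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of [[a, b], [c, d]]; requires a non-zero determinant."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Matrix-vector product m v."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

A = [[2, 3], [2, -1]]
b = [12, 4]
A_inv = inverse_2x2(A)
Q, P = mat_vec(A_inv, b)   # x1 = Q, x2 = P
print(A_inv, Q, P)
```

Calling `inverse_2x2` on a singular matrix such as the one with rows (3, 2) and (6, 4) raises an error, which corresponds to the case where no unique solution exists.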


Example 2: For the system of equations:

\[
\begin{aligned}
3x_1 + 2x_2 &= 7 \\
6x_1 + 4x_2 &= 14
\end{aligned}
\]

the second equation is just the first multiplied by 2 and so there is really only one equation. This means that there are an infinite number of solutions of the form:

\[ x_2 = \frac{7 - 3x_1}{2} \]

where $x_1$ can be any number. Another way of seeing that there is a problem here is that the matrix $A$ is a singular matrix; that is:

\[ \det \begin{bmatrix} 3 & 2 \\ 6 & 4 \end{bmatrix} = 0 \]

and so $A^{-1}$ does not exist.

Example 3: The system of equations:

\[
\begin{aligned}
3x_1 + 2x_2 &= 7 \\
6x_1 + 4x_2 &= 13
\end{aligned}
\]

has no solutions since if we divide both sides of the second equation by 2 we obtain:

\[
\begin{aligned}
3x_1 + 2x_2 &= 7 \\
3x_1 + 2x_2 &= 6.5
\end{aligned}
\]

which implies that $7 = 6.5$. Again another way of seeing that there is a problem here is that the matrix $A$ is a singular matrix since:

\[ \det \begin{bmatrix} 3 & 2 \\ 6 & 4 \end{bmatrix} = 0 \]

and so $A^{-1}$ does not exist.

Since we have seen that there are a number of different necessary and sufficient conditions for $A^{-1}$ to exist, we can state the above result more generally as:

Theorem 167 If $A$ is an $n \times n$ matrix the following statements are equivalent in the sense that if one statement holds all the rest hold as well:

1. $A^{-1}$ exists.

2. $\operatorname{rank}[A] = n$.

3. $\det[A] \neq 0$.

4. $Ax = b$ has a unique solution (i.e., $x = A^{-1} b$).

5. $Ax = 0 \implies x = 0$.

Since these results are necessary and sufficient, we can restate this result using the negation of the above five statements as:

Theorem 168 If $A$ is an $n \times n$ matrix the following statements are equivalent in the sense that if one statement holds all the rest hold as well:

1. $A^{-1}$ does not exist.

2. $\operatorname{rank}[A] < n$.

3. $\det[A] = 0$.

4. $Ax = b$ either has no solution or an infinite number of solutions.

5. There exists an $n \times 1$ vector $x \neq 0$ such that $Ax = 0$.

3.8.1 Cramer’s Rule

Cramer's rule, a method for solving systems of equations using determinants, is used a lot in economics. Given a system of equations $Ax = b$ suppose we want to calculate the $i$th component $x_i$. The key operation is replacing the $i$th column of $A$ with $b$.

Definition 169 Given an $n \times n$ matrix $A$ and an $n \times 1$ column vector $b$ define $A_i(b)$ as the $n \times n$ matrix obtained by replacing the $i$th column of $A$ with $b$.

Example: Given:

\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad b = \begin{bmatrix} 5 \\ 6 \end{bmatrix} \]

then we obtain $A_1(b)$ by putting $b$ in the first column to obtain:

\[ A_1(b) = \begin{bmatrix} 5 & 2 \\ 6 & 4 \end{bmatrix}, \]

and we obtain $A_2(b)$ by putting $b$ in the second column to obtain:

\[ A_2(b) = \begin{bmatrix} 1 & 5 \\ 3 & 6 \end{bmatrix}. \]

Cramer's rule then is:


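Theorem 170 Cramer's Rule: Given the system of equations $Ax = b$ with $\det[A] \neq 0$ then:

\[ x_i = \frac{\det[A_i(b)]}{\det[A]}. \]

Example 1: The system of equations:

\[
\begin{aligned}
3x_1 + 2x_2 &= 7 \\
5x_1 - 2x_2 &= 10
\end{aligned}
\]

can be rewritten in matrix form as:

\[ \begin{bmatrix} 3 & 2 \\ 5 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \end{bmatrix}. \]

Using Cramer's rule we have:

\[
x_1 = \frac{\det \begin{bmatrix} 7 & 2 \\ 10 & -2 \end{bmatrix}}{\det \begin{bmatrix} 3 & 2 \\ 5 & -2 \end{bmatrix}} = \frac{-34}{-16} = \frac{17}{8}, \quad
x_2 = \frac{\det \begin{bmatrix} 3 & 7 \\ 5 & 10 \end{bmatrix}}{\det \begin{bmatrix} 3 & 2 \\ 5 & -2 \end{bmatrix}} = \frac{-5}{-16} = \frac{5}{16}.
\]

Example 2: Given the system of equations:

\[ \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \\ 7 \end{bmatrix} \]

to find $x_2$ using Cramer's rule we replace the second column of $A$ with $b$ so that:

\[
x_2 = \frac{\det \begin{bmatrix} 3 & 5 & 4 \\ 1 & 6 & 6 \\ 3 & 7 & 8 \end{bmatrix}}{\det \begin{bmatrix} 3 & 1 & 4 \\ 1 & 2 & 6 \\ 3 & 1 & 8 \end{bmatrix}} = \frac{24}{20} = \frac{6}{5}.
\]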
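Cramer's rule translates almost directly into code. The sketch below is one possible implementation, not the only one: it computes determinants by Laplace expansion along the first row (fine for small matrices, slow for large ones), and the function names `det`, `replace_column` and `cramer` are assumptions of this sketch. Columns are indexed from 0 in the code, so `replace_column(A, 0, b)` corresponds to $A_1(b)$.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def replace_column(A, i, b):
    """A_i(b): copy of A with column i (0-based) replaced by b."""
    return [row[:i] + [bi] + row[i + 1:] for row, bi in zip(A, b)]

def cramer(A, b):
    """Solve Ax = b via x_i = det[A_i(b)] / det[A]."""
    d = det(A)
    if d == 0:
        raise ValueError("det[A] = 0, Cramer's rule does not apply")
    return [Fraction(det(replace_column(A, i, b)), d) for i in range(len(b))]

solution_1 = cramer([[3, 2], [5, -2]], [7, 10])
solution_2 = cramer([[3, 1, 4], [1, 2, 6], [3, 1, 8]], [5, 6, 7])
print(solution_1, solution_2)
```

The first call reproduces $x_1 = 17/8$ and $x_2 = 5/16$ from Example 1, and the second component of the second call reproduces $x_2 = 6/5$ from Example 2.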

3.9 Eigenvalues and Eigenvectors

3.9.1 Eigenvalues

Suppose we have a square $n \times n$ matrix $A$ and we multiply it by an $n \times 1$ column vector $x$ to obtain:

\[ y = Ax. \]

Note that $y$ is itself an $n \times 1$ column vector. There are very special vectors $x$, called eigenvectors, which have the property that $y = \lambda x$, in which case $\lambda$ is an eigenvalue. These turn out to be of fundamental importance in understanding matrices.

Definition 171 Eigenvalues and Eigenvectors: Let $A$ be a square $n \times n$ matrix and suppose that:

\[ Ax = \lambda x \]

where $x \neq 0$ is an $n \times 1$ column vector and $\lambda$ is a scalar. Then we say that $x$ is an eigenvector of $A$ and $\lambda$ is an eigenvalue.

An $n \times n$ matrix $A$ will in general have $n$ eigenvalues which are the roots of the characteristic polynomial associated with $A$:

Definition 172 Characteristic Polynomial: Given an $n \times n$ matrix $A$, the characteristic polynomial of $A$ is:

\[ f(\lambda) = \det[A - \lambda I] = \alpha_0 \lambda^n + \alpha_1 \lambda^{n-1} + \alpha_2 \lambda^{n-2} + \cdots + \alpha_n \]

where the coefficients $\alpha_j$ depend on the elements of the matrix $A$.

Theorem 173 An $n \times n$ matrix $A$ has $n$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ which are the roots of the characteristic polynomial of $A$:

\[ f(\lambda_i) = 0 \text{ for } i = 1, 2, \ldots, n. \]

Proof. Since $x = Ix$ we can rewrite $Ax = \lambda x$ as:

\[ Ax = \lambda I x \]

or as:

\[ (A - \lambda I) x = 0. \]

Since this equation is of the form $Bx = 0$ where $B = (A - \lambda I)$, and since we require that $x \neq 0$, it follows from Theorem 163 that this can only hold if $B$ is singular so that:

\[ \det[B] = f(\lambda) = \det[A - \lambda I] = 0 \]

and so $\lambda$ is a root of $f(\lambda)$. Since the characteristic polynomial $f(\lambda)$ is an $n$th degree polynomial, by Theorem 19, the fundamental theorem of algebra, $f(\lambda)$ has $n$ roots, which are the $n$ eigenvalues of $A$.

Example: The $2 \times 2$ matrix $A$:

\[ A = \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix} \]

has a characteristic polynomial which is a quadratic given by:

\[ f(\lambda) = \det \left[ \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right] = \det \begin{bmatrix} 5 - \lambda & -2 \\ -2 & 8 - \lambda \end{bmatrix} = \lambda^2 - 13\lambda + 36. \]

Note that the coefficient $-13$ on $\lambda$ is $-\operatorname{tr}[A]$ and the constant term 36 is $\det[A]$, which is always the case for $2 \times 2$ matrices.

To find the eigenvalues of $A$ we need to find the roots of this quadratic, or the solutions to:

\[ \lambda^2 - 13\lambda + 36 = 0 \]

which are:

\[ \lambda_{1,2} = \frac{-(-13) \pm \sqrt{(-13)^2 - 4 \times 36}}{2} \]

or $\lambda_1 = 4$ and $\lambda_2 = 9$.
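For a $2 \times 2$ matrix the eigenvalue calculation above reduces to the quadratic formula applied to $\lambda^2 - \operatorname{tr}[A]\lambda + \det[A] = 0$. A minimal sketch, assuming real roots only (the function name `eigenvalues_2x2` is introduced here for illustration):

```python
import math

def eigenvalues_2x2(A):
    """Real roots of lambda^2 - tr[A]*lambda + det[A] = 0 for a 2x2 matrix A."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace ** 2 - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues; not handled in this sketch")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

lam1, lam2 = eigenvalues_2x2([[5, -2], [-2, 8]])
print(lam1, lam2)
```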

In the example above note that $\operatorname{tr}[A] = 13 = \lambda_1 + \lambda_2 = 4 + 9$ and $\det[A] = 36 = \lambda_1 \lambda_2 = 4 \times 9$, so that the trace is equal to the sum of the eigenvalues and the determinant is equal to the product of the eigenvalues. This turns out always to be the case so that:

Theorem 174 Given any $n \times n$ matrix $A$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$:

\[
\begin{aligned}
\det[A] &= \lambda_1 \times \lambda_2 \times \cdots \times \lambda_n \\
\operatorname{tr}[A] &= \lambda_1 + \lambda_2 + \cdots + \lambda_n.
\end{aligned}
\]

Since $\det[A]$ is the product of the eigenvalues we have:

Theorem 175 $A^{-1}$ exists if and only if none of the eigenvalues of $A$ is equal to 0.

An important fact about eigenvalues is that:

Theorem 176 $A$ and $A^T$ have the same eigenvalues.

Proof. Since $\det[B] = \det[B^T]$ we have:

\[ f(\lambda) = \det[A - \lambda I] = \det\left[(A - \lambda I)^T\right] = \det\left[A^T - \lambda I\right] \]

and so $A^T$ and $A$ have the same characteristic polynomial and hence the same eigenvalues.

As you might expect, calculating eigenvalues for upper and lower triangular matrices (as well as diagonal matrices) is very easy. We have:


Theorem 177 If $A = [a_{ij}]$ is an upper or lower triangular matrix or a diagonal matrix, then the eigenvalues of $A$ are the diagonal elements of $A$.

Proof. Given the assumptions about $A$ the characteristic polynomial of $A$ is:

\[ f(\lambda) = \det[A - \lambda I] = (a_{11} - \lambda)(a_{22} - \lambda) \times \cdots \times (a_{nn} - \lambda) \]

since $A - \lambda I$ is upper or lower triangular and the determinant of such a matrix is the product of the diagonal elements. Therefore if $\lambda = a_{ii}$ then $f(\lambda) = 0$ and $\lambda$ is an eigenvalue.

Example: The $3 \times 3$ matrix $A$ below

\[ A = \begin{bmatrix} 4 & 77 & 99 \\ 0 & 5 & 55 \\ 0 & 0 & 6 \end{bmatrix} \]

is upper triangular so that its characteristic polynomial is:

\[ f(\lambda) = (4 - \lambda)(5 - \lambda)(6 - \lambda) \]

and so the eigenvalues are the diagonal elements: $\lambda_1 = 4$, $\lambda_2 = 5$, $\lambda_3 = 6$.

3.9.2 Eigenvectors

For an $n \times n$ matrix $A$, associated with the $n$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ will be $n$ eigenvectors $x_1, x_2, \ldots, x_n$ which satisfy:

\[ A x_i = \lambda_i x_i. \]

We have:

Theorem 178 Eigenvectors associated with distinct eigenvalues are linearly independent.

Generally an $n \times n$ matrix will have $n$ distinct eigenvalues so that there will be $n$ linearly independent eigenvectors. This in turn means that:

Theorem 179 If all eigenvalues of an $n \times n$ matrix $A$ are distinct then the matrix of eigenvectors $C$ given by:

\[ C = [x_1, x_2, \ldots, x_n] \]

has $\operatorname{rank}[C] = n$ so that $C^{-1}$ exists.

Remark: Complications can arise when there are repeated eigenvalues. For example if the characteristic polynomial of a $3 \times 3$ matrix $A$ were:

\[ f(\lambda) = (\lambda - 2)^2 (\lambda - 6) \]

then the eigenvalues would be $\lambda_1 = 2$, $\lambda_2 = 2$, $\lambda_3 = 6$ and so the eigenvalue 2 would be repeated twice. In this case there might only be 2 linearly independent eigenvectors rather than 3.

Another complication with eigenvectors is that unlike eigenvalues they are not uniquely defined. In particular if $x_i$ is an eigenvector associated with the eigenvalue $\lambda_i$, then any non-zero scalar multiple of $x_i$ will also be an eigenvector; that is, if $\alpha \neq 0$ is any scalar then:

\[ Ax = \lambda x \implies A(\alpha x) = \lambda (\alpha x). \]

For example if $x$ is an eigenvector then $A(3x) = \lambda(3x)$ and so $3x$ is also an eigenvector.

To pin down an eigenvector one needs to adopt some convention. This convention changes with the application according to what is convenient. Often for example we adopt the convention that the eigenvectors have unit length, so that if $x$ is an arbitrary eigenvector we work with $\tilde{x} = \frac{1}{\|x\|} x$, which then satisfies $\|\tilde{x}\| = 1$.

Example: Consider the matrix:

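\[ A = \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix} \]

which we have seen has eigenvalues $\lambda_1 = 4$ and $\lambda_2 = 9$. The associated eigenvectors are:

\[ x_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \leftrightarrow \lambda_1 = 4, \quad x_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix} \leftrightarrow \lambda_2 = 9. \]

For example:

\[ \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = 4 \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]

You can verify that $x_1$ and $x_2$ are linearly independent since:

\[ C = [x_1, x_2] = \begin{bmatrix} 2 & 1 \\ 1 & -2 \end{bmatrix} \implies \det[C] = -5 \neq 0. \]

Here $x_1$ and $x_2$ are not unique. Instead of $x_1$ we could equally well use the eigenvector $3x_1$ given by:

\[ 3x_1 = \begin{bmatrix} 6 \\ 3 \end{bmatrix} \implies \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix} \begin{bmatrix} 6 \\ 3 \end{bmatrix} = 4 \begin{bmatrix} 6 \\ 3 \end{bmatrix}. \]

We can normalize $x_1$ and $x_2$ so that $\|x_1\| = 1$ and $\|x_2\| = 1$ using:

\[ \|x_1\| = \sqrt{2^2 + 1^2} = \sqrt{5}, \quad \|x_2\| = \sqrt{1^2 + (-2)^2} = \sqrt{5} \]

and so the normalized eigenvectors would be:

\[ \tilde{x}_1 = \frac{1}{\sqrt{5}} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{bmatrix}, \quad \tilde{x}_2 = \frac{1}{\sqrt{5}} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \end{bmatrix}. \]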
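The claims in this example (that $Ax = \lambda x$ holds, that $3x$ is also an eigenvector, and that normalizing gives unit length) can be checked mechanically. A short sketch with illustrative helper names `mat_vec` and `normalize`:

```python
import math

def mat_vec(m, v):
    """Matrix-vector product m v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize(v):
    """Scale v to unit length: v / ||v||."""
    length = math.sqrt(sum(vi * vi for vi in v))
    return [vi / length for vi in v]

A = [[5, -2], [-2, 8]]
pairs = [(4, [2, 1]), (9, [1, -2])]   # (eigenvalue, eigenvector) from the example

for lam, x in pairs:
    # A x = lambda x
    assert mat_vec(A, x) == [lam * xi for xi in x]
    # 3x is an eigenvector for the same eigenvalue
    assert mat_vec(A, [3 * xi for xi in x]) == [lam * 3 * xi for xi in x]

unit_x1 = normalize([2, 1])
unit_x2 = normalize([1, -2])
print(unit_x1, unit_x2)
```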


3.9.3 The Relationship $A = C \Lambda C^{-1}$

We have seen that diagonal matrices are much easier to work with. It turns out that almost all matrices can be transformed into diagonal matrices with the eigenvalues along the diagonal. More precisely:

Theorem 180 If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors then it can be written as:

\[ A = C \Lambda C^{-1} \]

where $\Lambda$ is a diagonal matrix with the eigenvalues of $A$ along the diagonal as:

\[
\Lambda = \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & \lambda_n
\end{bmatrix}
\]

and the $i$th column of the $n \times n$ matrix $C$ is the $i$th eigenvector $x_i$ as:

\[ C = [x_1, x_2, \ldots, x_n]. \]

Proof. We have:

\[
\begin{aligned}
AC &= [Ax_1, Ax_2, \ldots, Ax_n] \\
&= [\lambda_1 x_1, \lambda_2 x_2, \ldots, \lambda_n x_n] \\
&= C \Lambda.
\end{aligned}
\]

Since the eigenvectors are linearly independent, $\operatorname{rank}[C] = n$ and so $C^{-1}$ exists. Post-multiplying both sides by $C^{-1}$ then yields $A = C \Lambda C^{-1}$.

Remark: There are some matrices which cannot be written as $A = C \Lambda C^{-1}$, but these are in some sense very rare. An example of such a matrix is:

\[ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \]

These exceptional matrices have two characteristics: 1) they have repeated eigenvalues and 2) they are not symmetric. Thus in the example above, since the matrix is upper triangular, we have the repeated eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 1$. For such matrices one can use the Jordan representation, which we do not discuss here.

Given $A = C \Lambda C^{-1}$ suppose we multiply $A$ by itself as $A^2 = A \times A$. Using the representation $A = C \Lambda C^{-1}$ we have:

\[ A^2 = C \Lambda \underbrace{C^{-1} C}_{=I} \Lambda C^{-1} = C \Lambda \Lambda C^{-1} = C \Lambda^2 C^{-1} \]

where since $\Lambda$ is diagonal:

\[
\Lambda^2 = \begin{bmatrix}
\lambda_1^2 & 0 & \cdots & 0 \\
0 & \lambda_2^2 & \cdots & 0 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & \lambda_n^2
\end{bmatrix}.
\]

That is, we just square the eigenvalues along the diagonal of $\Lambda$. This means that the eigenvalues of $A^2$ are just the squares of the eigenvalues of $A$. In general we have:

Theorem 181 Given an $n \times n$ matrix $A$ written as $A = C \Lambda C^{-1}$ then:

\[ A^n = C \Lambda^n C^{-1}. \]

Proof. (by induction). The theorem is true for $n = 1$. Assuming it is true for $n - 1$ we have:

\[
\begin{aligned}
A^n &= A^{n-1} \times A \\
&= C \Lambda^{n-1} \underbrace{C^{-1} C}_{=I} \Lambda^1 C^{-1} \\
&= C \Lambda^{n-1} \Lambda^1 C^{-1} \\
&= C \Lambda^n C^{-1}.
\end{aligned}
\]

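Theorem 182 If $A^{-1}$ exists then:

\[ A^{-1} = C \Lambda^{-1} C^{-1}. \]

Proof. Given that $A^{-1}$ exists, all eigenvalues are non-zero and so $\Lambda^{-1}$ exists. Therefore:

\[
\begin{aligned}
C \Lambda^{-1} C^{-1} A &= C \Lambda^{-1} \underbrace{C^{-1} C}_{=I} \Lambda C^{-1} \\
&= C \Lambda^{-1} \Lambda C^{-1} \\
&= C C^{-1} = I.
\end{aligned}
\]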

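Example 1: The matrix:

\[ A = \begin{bmatrix} \frac{7}{3} & -\frac{1}{3} \\ -\frac{2}{3} & \frac{8}{3} \end{bmatrix} \]

has eigenvalues and eigenvectors given by:

\[ \lambda_1 = 2 \leftrightarrow x_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \lambda_2 = 3 \leftrightarrow x_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix} \]

so that:

\[ \Lambda = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}. \]

The representation $A = C \Lambda C^{-1}$ then takes the form:

\[
\begin{bmatrix} \frac{7}{3} & -\frac{1}{3} \\ -\frac{2}{3} & \frac{8}{3} \end{bmatrix}
= \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & -\frac{1}{3} \end{bmatrix}
\]

which you can verify by carrying out the multiplication.

To calculate $A^2$ directly we have:

\[
A^2 = \begin{bmatrix} \frac{7}{3} & -\frac{1}{3} \\ -\frac{2}{3} & \frac{8}{3} \end{bmatrix} \times \begin{bmatrix} \frac{7}{3} & -\frac{1}{3} \\ -\frac{2}{3} & \frac{8}{3} \end{bmatrix} = \begin{bmatrix} \frac{17}{3} & -\frac{5}{3} \\ -\frac{10}{3} & \frac{22}{3} \end{bmatrix}
\]

while with $A^2 = C \Lambda^2 C^{-1}$ we have:

\[
\begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 2^2 & 0 \\ 0 & 3^2 \end{bmatrix} \begin{bmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & -\frac{1}{3} \end{bmatrix} = \begin{bmatrix} \frac{17}{3} & -\frac{5}{3} \\ -\frac{10}{3} & \frac{22}{3} \end{bmatrix}.
\]

To calculate $A^{-1}$ from $A^{-1} = C \Lambda^{-1} C^{-1}$ we have:

\[
\begin{bmatrix} \frac{7}{3} & -\frac{1}{3} \\ -\frac{2}{3} & \frac{8}{3} \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 2^{-1} & 0 \\ 0 & 3^{-1} \end{bmatrix} \begin{bmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & -\frac{1}{3} \end{bmatrix}
= \begin{bmatrix} \frac{4}{9} & \frac{1}{18} \\ \frac{1}{9} & \frac{7}{18} \end{bmatrix}.
\]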
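The three calculations in this example ($A$, $A^2$ and $A^{-1}$ from the same $C$, $\Lambda$ and $C^{-1}$) differ only in the power applied to the eigenvalues, which makes them easy to reproduce in code. The sketch below hard-codes $C$, $C^{-1}$ and the eigenvalues from the example and uses exact fractions; `power_via_eigen` and `mat_mul` are names introduced for this illustration.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

C = [[F(1), F(1)], [F(1), F(-2)]]
C_inv = [[F(2, 3), F(1, 3)], [F(1, 3), F(-1, 3)]]
eigenvalues = [F(2), F(3)]

def power_via_eigen(p):
    """C * Lambda^p * C^{-1}; p may be negative because no eigenvalue is 0."""
    lam_p = [[eigenvalues[0] ** p, F(0)], [F(0), eigenvalues[1] ** p]]
    return mat_mul(mat_mul(C, lam_p), C_inv)

A = power_via_eigen(1)
A_squared = power_via_eigen(2)
A_inverse = power_via_eigen(-1)
print(A, A_squared, A_inverse)
```

Multiplying `A` by `A_inverse` gives the identity matrix, which is a useful independent check on the entries of $A^{-1}$.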

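Example 2: We can use the representation $A = C \Lambda C^{-1}$ to prove Theorem 174. We have:

\[
\begin{aligned}
\det[A] &= \det\left[C \Lambda C^{-1}\right] \\
&= \det[C] \det[\Lambda] \det\left[C^{-1}\right] \\
&= \det[C] \frac{1}{\det[C]} \det[\Lambda] \\
&= \det[\Lambda] = \lambda_1 \times \lambda_2 \times \cdots \times \lambda_n
\end{aligned}
\]

since the determinant of a diagonal matrix is the product of the diagonal elements. Similarly since $\operatorname{tr}[AB] = \operatorname{tr}[BA]$ we have:

\[
\begin{aligned}
\operatorname{tr}[A] &= \operatorname{tr}\left[C \Lambda C^{-1}\right] \\
&= \operatorname{tr}\left[\Lambda C^{-1} C\right] \\
&= \operatorname{tr}[\Lambda] \\
&= \lambda_1 + \lambda_2 + \cdots + \lambda_n.
\end{aligned}
\]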

Example 3: We can use the representation $A = C \Lambda C^{-1}$ to prove the matrix version of the geometric series:

Theorem 183 Given an $n \times n$ matrix $A$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ which all satisfy $|\lambda_i| < 1$ then:

\[ (I - A)^{-1} = I + A + A^2 + A^3 + \cdots. \]

For example with

\[ A = \begin{bmatrix} 0.3 & 0.65 \\ 0.2 & 0.72 \end{bmatrix} \]

the two eigenvalues of $A$ are $\lambda_1 = 0.92725$ and $\lambda_2 = 0.0927$. Since these both satisfy $|\lambda_i| < 1$ we have:

\[
\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.3 & 0.65 \\ 0.2 & 0.72 \end{bmatrix} \right)^{-1}
= \begin{bmatrix} 4.24 & 9.85 \\ 3.03 & 10.61 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0.3 & 0.65 \\ 0.2 & 0.72 \end{bmatrix} + \begin{bmatrix} 0.3 & 0.65 \\ 0.2 & 0.72 \end{bmatrix}^2 + \cdots.
\]
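The convergence of the matrix geometric series can be seen numerically by comparing a long partial sum with the direct inverse of $I - A$. The sketch below is specific to the $2 \times 2$ example; the choice of 500 terms is arbitrary (an assumption of this sketch), chosen so that $0.92725^{500}$ is negligible.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0.3, 0.65], [0.2, 0.72]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Direct inverse of I - A using the 2x2 inverse formula.
(a, b), (c, d) = [[identity[i][j] - A[i][j] for j in range(2)] for i in range(2)]
det = a * d - b * c
direct = [[d / det, -b / det], [-c / det, a / det]]

# Partial sum I + A + A^2 + ... + A^500.
series = [row[:] for row in identity]
term = [row[:] for row in identity]
for _ in range(500):
    term = mat_mul(term, A)
    series = [[series[i][j] + term[i][j] for j in range(2)] for i in range(2)]

print([[round(v, 2) for v in row] for row in direct])
print([[round(v, 2) for v in row] for row in series])
```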

3.9.4 Left and Right-Hand Eigenvectors

Although $A$ and $A^T$ have the same eigenvalues, they do not in general share the same eigenvectors. Let $y_i$ be the eigenvector of $A^T$ corresponding to eigenvalue $\lambda_i$ and let $x_i$ be the eigenvector of $A$. Since $y_i$ satisfies:

\[ A^T y_i = \lambda_i y_i \]

by taking transposes of both sides we obtain:

\[ y_i^T A = \lambda_i y_i^T. \]

For this reason $y_i^T$ is referred to as a left-hand eigenvector of $A$ while $x_i$ is the right-hand eigenvector. We thus have:

Definition 184 The left and right-hand eigenvectors of $A$ corresponding to eigenvalue $\lambda_i$ are defined respectively as:

\[
\begin{aligned}
y_i^T A &= \lambda_i y_i^T \\
A x_i &= \lambda_i x_i.
\end{aligned}
\]

We then have the following result:

Theorem 185 The left and right-hand eigenvectors of a matrix $A$ corresponding to different eigenvalues are orthogonal to each other.

Proof. Let $y_j$ be the left-hand eigenvector corresponding to the eigenvalue $\lambda_j$ and let $x_i$ be the right-hand eigenvector corresponding to the eigenvalue $\lambda_i$ with $\lambda_i \neq \lambda_j$. Then:

\[
\begin{aligned}
y_j^T A = \lambda_j y_j^T &\implies y_j^T A x_i = \lambda_j y_j^T x_i \\
A x_i = \lambda_i x_i &\implies y_j^T A x_i = \lambda_i y_j^T x_i
\end{aligned}
\]

so that:

\[
\begin{aligned}
y_j^T A x_i = \lambda_i y_j^T x_i = \lambda_j y_j^T x_i &\implies (\lambda_i - \lambda_j)\, y_j^T x_i = 0 \\
&\implies y_j^T x_i = 0
\end{aligned}
\]

where the last line follows from $\lambda_i \neq \lambda_j$. Since $y_j^T x_i = 0$ it follows that $y_j$ and $x_i$ are orthogonal.

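Example: We have seen above that the matrix:

\[ A = \begin{bmatrix} \frac{7}{3} & -\frac{1}{3} \\ -\frac{2}{3} & \frac{8}{3} \end{bmatrix} \]

has right-hand eigenvectors given by:

\[ \lambda_1 = 2 \leftrightarrow x_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \lambda_2 = 3 \leftrightarrow x_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}. \]

The left-hand eigenvectors are the eigenvectors calculated from $A^T$ as:

\[ y_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \leftrightarrow \lambda_1 = 2, \quad y_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \leftrightarrow \lambda_2 = 3 \]

since for example:

\[ A^T y_1 = \begin{bmatrix} \frac{7}{3} & -\frac{2}{3} \\ -\frac{1}{3} & \frac{8}{3} \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 2 \begin{bmatrix} 2 \\ 1 \end{bmatrix} = 2 y_1. \]

As predicted by the theorem, the eigenvectors $x_1$ and $y_2$ are orthogonal since:

\[ x_1^T y_2 = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = 1 \times (-1) + 1 \times 1 = 0 \]

and the eigenvectors $x_2$ and $y_1$ are orthogonal since:

\[ x_2^T y_1 = \begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = 1 \times 2 + (-2) \times 1 = 0. \]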
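The left-hand and right-hand eigenvector relations of this example, and the orthogonality predicted by Theorem 185, can be verified with exact arithmetic. The helper names below (`right_mul`, `left_mul`, `dot`) are illustrative assumptions.

```python
from fractions import Fraction as F

A = [[F(7, 3), F(-1, 3)], [F(-2, 3), F(8, 3)]]

def right_mul(M, x):
    """Column result of M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def left_mul(y, M):
    """Row result of y' M."""
    return [sum(y[i] * M[i][j] for i in range(len(y))) for j in range(len(M[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

right = {2: [1, 1], 3: [1, -2]}     # A x = lambda x
left = {2: [2, 1], 3: [-1, 1]}      # y' A = lambda y'

for lam in (2, 3):
    assert right_mul(A, right[lam]) == [lam * v for v in right[lam]]
    assert left_mul(left[lam], A) == [lam * v for v in left[lam]]

# Left and right eigenvectors belonging to different eigenvalues.
cross_products = (dot(left[3], right[2]), dot(left[2], right[3]))
print(cross_products)
```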

3.9.5 Symmetric and Orthogonal Matrices

A nice property for a matrix to have is that its transpose equals its inverse. Such matrices are called orthogonal matrices:

Definition 186 If $C^{-1} = C^T$ then $C$ is an orthogonal matrix.

Remark 1: If $C$ is orthogonal and is written as a collection of column vectors as $C = [c_1, c_2, \ldots, c_n]$, then the columns of $C$ are orthogonal to each other, that is $c_i^T c_j = 0$ for $i \neq j$, and have unit length; that is $\|c_i\| = \sqrt{c_i^T c_i} = 1$.

Remark 2: If $C$ is orthogonal then it preserves length. That is, given any $n \times 1$ vector $x$ and $y = Cx$ we have:

\[ \|y\| = \sqrt{y^T y} = \sqrt{(Cx)^T (Cx)} = \sqrt{x^T C^T C x} = \sqrt{x^T C^{-1} C x} = \sqrt{x^T x} = \|x\|. \]

Remark 3: The only scalars that have the property $x = x^{-1}$ are $1$ and $-1$. Analogously, any real eigenvalue of an orthogonal matrix must equal $1$ or $-1$ (more generally, every eigenvalue has absolute value 1). This follows from Remark 2: if $Cx = \lambda x$ with $x \neq 0$ then $\|x\| = \|Cx\| = |\lambda| \, \|x\|$, so that $|\lambda| = 1$.

It turns out that when a matrix $A$ is symmetric the representation $A = C \Lambda C^{-1}$ always exists. Furthermore the matrix $C$ is orthogonal; that is, $C^{-1} = C^T$ so that:

Theorem 187 If $A$ is a symmetric matrix then it can be written as:

\[ A = C \Lambda C^{-1} = C \Lambda C^T \]

where $C$ is an orthogonal matrix, that is $C^T = C^{-1}$.

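Example: The symmetric matrix $A$ given by:

\[ A = \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix} \]

has eigenvalues $\lambda_1 = 3$ and $\lambda_2 = 7$. The representation $A = C \Lambda C^{-1} = C \Lambda C^T$ takes the form:

\[
A = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 3 & 0 \\ 0 & 7 \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\]

where:

\[
C = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}, \quad
C^{-1} = C^T = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}, \quad
\Lambda = \begin{bmatrix} 3 & 0 \\ 0 & 7 \end{bmatrix}.
\]

We then have:

\[
A^2 = \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix} \times \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix} = \begin{bmatrix} 29 & -20 \\ -20 & 29 \end{bmatrix}
= \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 3^2 & 0 \\ 0 & 7^2 \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\]

and:

\[
A^{-1} = \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{5}{21} & \frac{2}{21} \\ \frac{2}{21} & \frac{5}{21} \end{bmatrix}
= \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 3^{-1} & 0 \\ 0 & 7^{-1} \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}.
\]

The matrix $C$ is orthogonal since:

\[
C = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\implies C^{-1} = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}^{-1}
= \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} = C^T.
\]

Note that the column vectors of $C$ are orthogonal to each other and have a length of 1.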
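Both properties used in this example, $C^T C = I$ and $A = C \Lambda C^T$, can be checked numerically. Because $1/\sqrt{2}$ is irrational the sketch uses floating-point arithmetic and rounds the results for display; the helper names are illustrative.

```python
import math

s = 1 / math.sqrt(2)
C = [[s, -s], [s, s]]          # columns are the unit-length eigenvectors
Lam = [[3, 0], [0, 7]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def mat_mul(X, Y):
    """Matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

CtC = mat_mul(transpose(C), C)                  # should be the identity
A = mat_mul(mat_mul(C, Lam), transpose(C))      # should reproduce A
print([[round(v, 10) for v in row] for row in CtC])
print([[round(v, 10) for v in row] for row in A])
```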

3.10 Linear and Quadratic Functions in $\mathbb{R}^{n+1}$

In this section we begin our treatment of multivariate functions. This topic will be treated in more generality in the next chapter. Here we emphasize linear algebra concepts, in particular the multivariate generalization of the linear and quadratic functions.

3.10.1 Linear Functions

A line in the $(x, y)$ plane, that is in the two dimensional space $\mathbb{R}^2$, can be represented by the linear function $y = ax + b$. Suppose we try to generalize this to the case where $x$ is an $n \times 1$ vector. In this case we have the equivalent of a line in an $n + 1$ dimensional space $\mathbb{R}^{n+1}$, with $n$ dimensions for $x$ and 1 dimension for $y$.

Definition 188 A linear function in an $n + 1$ dimensional space $\mathbb{R}^{n+1}$ takes the form:

\[ y = a^T x + b \]

where $a$ and $x$ are $n \times 1$ vectors and $b$ is a scalar.

Example: Consider the case where $n = 2$ and:

\[ a = \begin{bmatrix} 2 \\ -3 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad b = 10 \]

then the linear function is:

\[
\begin{aligned}
y &= a^T x + b \\
&= \begin{bmatrix} 2 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + 10 \\
&= 2x_1 - 3x_2 + 10.
\end{aligned}
\]

Since $n = 2$ this describes a plane in a $2 + 1 = 3$ dimensional space $\mathbb{R}^3$ with two dimensions for the $x$'s and 1 for $y$ as shown below:

[Figure: the plane $y = 2x_1 - 3x_2 + 10$, with axes $x_1$, $x_2$ and $y$.]

3.10.2 Quadratics

Next to the linear function the next most basic function in the $(x, y)$ plane is the quadratic $y = ax^2$. Consider the problem of generalizing this to where $x$ is an $n \times 1$ vector. If $x$ is a vector we cannot write $ax^2$ since $x^2 = x \times x$ is not defined! However if we rewrite this as $ax^2 = x^T a x$, and replace $a$ by an $n \times n$ matrix $A$, then this is defined when we let $x$ be an $n \times 1$ vector. This turns out to be the most useful generalization.

Definition 189 Quadratic Form: If $x$ is an $n \times 1$ vector and $A$ is a symmetric $n \times n$ matrix then a quadratic form in $\mathbb{R}^{n+1}$ is defined as:

\[ x^T A x. \]

Remark: There is no loss in generality in assuming that $A$ is symmetric since if $A$ is not symmetric we could replace $A$ with $B = (A + A^T)/2$, which is symmetric, and:

\[ x^T B x = \frac{x^T (A + A^T) x}{2} = \frac{1}{2} \left( x^T A x + x^T A^T x \right) = x^T A x \]

since $x^T A x$ is a scalar and consequently $x^T A x = (x^T A x)^T = x^T A^T x$.


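Example 1: If $n = 2$ then

\[
y = x^T A x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\implies y = x_1^2 a_{11} + 2 x_1 x_2 a_{12} + x_2^2 a_{22}.
\]

Example 2: If $n = 2$ and

\[ A = \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix} \]

then

\[
y = x^T A x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\implies y = 2x_1^2 - 2x_1 x_2 + 3x_2^2.
\]

This quadratic form describes a valley-like quadratic in 3 dimensional space as shown below:

[Figure: the surface $y = 2x_1^2 - 2x_1 x_2 + 3x_2^2$, with axes $x_1$, $x_2$ and $y$.]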
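A quadratic form is just a double sum $\sum_i \sum_j x_i a_{ij} x_j$, which the sketch below evaluates directly. It also illustrates the earlier remark that replacing a non-symmetric matrix by $(A + A^T)/2$ leaves the quadratic form unchanged. The test point $x = (1, 2)$ and the non-symmetric matrix `N` are hypothetical values chosen for this illustration; they do not come from the text.

```python
def quadratic_form(A, x):
    """Evaluate x'Ax for a square matrix A and vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def symmetrize(A):
    """Return B = (A + A')/2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]

A = [[2, -1], [-1, 3]]
x = [1, 2]                                  # hypothetical test point
value = quadratic_form(A, x)                # 2*1 - 2*1*2 + 3*4

N = [[2, 0], [-2, 3]]                       # hypothetical non-symmetric matrix
same = quadratic_form(N, x) == quadratic_form(symmetrize(N), x)
print(value, symmetrize(N), same)
```

Here `symmetrize(N)` returns the matrix $A$ of Example 2, so both matrices generate the same quadratic form.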

Now to generalize from $y = ax^2 + bx + c$ we replace $ax^2$ with the quadratic form $x^T A x$ and we replace the linear function $bx + c$ with its multivariate generalization $b^T x + c$ to obtain:

Definition 190 Quadratic: A quadratic in $\mathbb{R}^{n+1}$ takes the form:

\[ y = x^T A x + b^T x + c \]

where $A$ is a symmetric $n \times n$ matrix, $b$ and $x$ are $n \times 1$ column vectors and $y$ and $c$ are scalars.

Example 1: If $n = 2$ and

\[ A = \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}, \quad b = \begin{bmatrix} 4 \\ 5 \end{bmatrix}, \quad c = 10 \]

then

\[ y = x^T A x + b^T x + c \implies y = 2x_1^2 - 2x_1 x_2 + 3x_2^2 + 4x_1 + 5x_2 + 10. \]

This describes a valley-like quadratic in 3 dimensional space or $\mathbb{R}^3$ as shown in the plot below:

[Figure: the surface $y = 2x_1^2 - 2x_1 x_2 + 3x_2^2 + 4x_1 + 5x_2 + 10$, with axes $x_1$, $x_2$ and $y$.]

Example 2: If we replace $A$ with $-A$ in Example 1 then the quadratic becomes:

\[ y = -2x_1^2 + 2x_1 x_2 - 3x_2^2 + 4x_1 + 5x_2 + 10 \]

which describes a mountain-like function in three dimensions or $\mathbb{R}^3$:

[Figure: the surface $y = -2x_1^2 + 2x_1 x_2 - 3x_2^2 + 4x_1 + 5x_2 + 10$, with axes $x_1$, $x_2$ and $y$.]

3.10.3 Positive and Negative Definite Matrices

One of the reasons quadratics are so important is their close relationship with the notions of concavity and convexity. For the ordinary quadratic $f(x) = ax^2$, if $a > 0$ then $f''(x) = 2a > 0$ and $f(x)$ is globally convex, while if $a < 0$ then $f''(x) = 2a < 0$ and $f(x)$ is globally concave.

Now instead of $ax^2$ a multivariate quadratic takes the form $f(x) = x^T A x$. It turns out that if $A > 0$, or $A$ is positive definite, then $f(x)$ is convex. In Example 1 above it turns out that $A > 0$ and from the plot it appears that $f(x)$ is indeed valley-like or convex. Similarly if $A < 0$, or $A$ is negative definite, then $f(x)$ is concave or mountain-like. In Example 2 above the matrix of the quadratic form, $-A$, turns out to be negative definite and from the plot it appears that $f(x)$ is indeed mountain-like or concave.

Now it is not obvious what we mean if we write $A > 0$ or $A \ge 0$ or $A < 0$ or $A \le 0$ when $A$ is a symmetric matrix. The key to extending these inequalities to matrices is the quadratic form. If we write for a scalar $a$ that $a \ge 0$, this is equivalent to saying that $ax^2 \ge 0$ for all $x$ (since $x^2 \ge 0$). Similarly if we say $a > 0$ this is equivalent to saying that $ax^2 > 0$ for all $x \neq 0$. In general we have:

\[
\begin{aligned}
a > 0 &\iff \left( ax^2 > 0 \text{ for all } x \neq 0 \right) \\
a \ge 0 &\iff \left( ax^2 \ge 0 \text{ for all } x \right) \\
a < 0 &\iff \left( ax^2 < 0 \text{ for all } x \neq 0 \right) \\
a \le 0 &\iff \left( ax^2 \le 0 \text{ for all } x \right).
\end{aligned}
\]

To generalize to matrices we replace $ax^2$ with the quadratic form $x^T A x$. Thus $A \ge 0$ if $x^T A x \ge 0$ for all $x$. This leads to the following definitions where $x$ is an $n \times 1$ vector:

Definition 191 We say that $A$ is positive definite or $A > 0$ if and only if

\[ x^T A x > 0 \]

for all $x$ except $x = 0$.

Definition 192 We say that $A$ is positive semi-definite or $A \ge 0$ if and only if

\[ x^T A x \ge 0 \]

for all $x$.

Definition 193 We say that $A$ is negative definite or $A < 0$ if and only if

\[ x^T A x < 0 \]

for all $x$ except $x = 0$.

Definition 194 We say that $A$ is negative semi-definite or $A \le 0$ if and only if

\[ x^T A x \le 0 \]

for all $x$.

Remark 1: If $x = 0$ then $x^TAx = 0$ no matter what $A$ is. This is the reason why this case is excluded in the definitions of positive and negative definite matrices.

Remark 2: If $A$ is positive (negative) definite it follows from the definition that the quadratic $x^TAx$ has a unique global minimum (maximum) at $x^* = 0$. This is because $y = x^TAx$ describes an $n+1$ dimensional valley (mountain) or convex (concave) function with $x^* = 0$ the minimum (maximum) of the valley (mountain). As we shall see later, it is a matrix of second derivatives (called the Hessian) being positive or negative definite which determines whether any function in $\mathbb{R}^{n+1}$ is concave or convex.

Example 1: Consider the case where $n = 2$ and

$$A = \begin{bmatrix} 2 & -5 \\ -5 & 13 \end{bmatrix}$$

in which case:

$$x^TAx = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 2 & -5 \\ -5 & 13 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 2x_1^2 - 10x_1x_2 + 13x_2^2.$$

We can show that $A$ is positive semi-definite or $A \ge 0$ since for all $x_1, x_2$:

$$x^TAx = 2x_1^2 - 10x_1x_2 + 13x_2^2 = (x_1 - 2x_2)^2 + (x_1 - 3x_2)^2 \ge 0$$

since the sum of two squares can never be negative (you can verify that the second expression is correct by expanding the two squared terms and showing that the result equals the first expression). We therefore conclude that for all $x$:

$$x^TAx \ge 0$$

and so by definition $A$ is positive semi-definite.

We can however prove more: that in fact $A$ is positive definite. Suppose $x^TAx = 0$. This could only occur if $(x_1 - 2x_2)^2 = 0$ and $(x_1 - 3x_2)^2 = 0$, which in turn implies that $x_1 = 2x_2$ and $x_1 = 3x_2$, which can only occur if $x_1 = x_2 = 0$ since otherwise:

$$x_1 = 2x_2 \text{ and } x_1 = 3x_2 \implies 2x_2 = 3x_2 \implies 2 = 3$$

which is a contradiction. Thus $x^TAx = 0$ only occurs when $x = 0$ so that:

$$x^TAx > 0 \text{ for all } x \text{ except } x = 0$$

and so by definition the matrix $A$ is positive definite.
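As a numerical companion to this example, the following Python sketch evaluates the quadratic form on a small grid of integer points and compares it with the sum-of-squares expression. The helper name `quad_form` and the choice of grid are illustrative assumptions; a finite grid can only spot-check the algebra and is not a proof of definiteness.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


A = [[2, -5], [-5, 13]]

# Integer grid of test points (an arbitrary choice made for illustration).
grid = [(x1, x2) for x1 in range(-5, 6) for x2 in range(-5, 6)]
values = {x: quad_form(A, x) for x in grid}

# Check that x^T A x agrees with (x1 - 2 x2)^2 + (x1 - 3 x2)^2 at every grid point.
matches = all(
    values[(x1, x2)] == (x1 - 2 * x2) ** 2 + (x1 - 3 * x2) ** 2
    for (x1, x2) in grid
)

# Smallest value of the quadratic form over the non-zero grid points.
min_nonzero = min(v for x, v in values.items() if x != (0, 0))

print(matches, min_nonzero)
```

On this grid the form is strictly positive away from the origin, which is consistent with the argument above that $A$ is positive definite.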

Example 2: Consider the case where $n = 2$ and

$$A = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}$$

in which case:

$$x^TAx = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = -2x_1^2 + 4x_1x_2 - 2x_2^2.$$

We can show that $A$ is negative semi-definite or $A \le 0$ since for all $x_1, x_2$:

$$x^TAx = -2x_1^2 + 4x_1x_2 - 2x_2^2 = -2(x_1 - x_2)^2 \le 0.$$

We cannot prove that $A < 0$, or that $A$ is negative definite, since there does exist an $x \ne 0$ such that $x^TAx = 0$. In particular if $x_1 = x_2 = 1$ then:

$$x^TAx = -2(1 - 1)^2 = 0$$

and so $A$ is not negative definite.

The inequality $a > 0$ ($a < 0$) is a strong inequality while $a \ge 0$ ($a \le 0$) is a weak inequality. This is because if $a > 0$ ($a < 0$) then it follows immediately that $a \ge 0$ ($a \le 0$), but one cannot conclude from $a \ge 0$ ($a \le 0$) that $a > 0$ ($a < 0$) (since if $a \ge 0$ it is possible that $a = 0$, in which case $a > 0$ would be false). Thus knowing that $a > 0$ is a stronger result than knowing $a \ge 0$, just as knowing that $a \ge 0$ is a weaker result than knowing that $a > 0$.

These same relationships hold also for matrices. In particular:

Theorem 195 If $A > 0$ then $A \ge 0$, so that a positive definite matrix is always positive semi-definite; but a positive semi-definite matrix is not necessarily positive definite.

Theorem 196 If $A < 0$ then $A \le 0$, so that a negative definite matrix is always negative semi-definite; but a negative semi-definite matrix is not necessarily negative definite.

Example: In Example 2 above the matrix

$$A = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}$$

is negative semi-definite, and so $A \le 0$, but it is not negative definite.

Definiteness and the existence of $A^{-1}$

Recall that for scalars if a number $a$ satisfies $a \ge 0$ but not $a > 0$ then it must be that $a = 0$. For matrices if $A \ge 0$ but not $A > 0$ ($A$ is positive semi-definite but not positive definite), then it does not follow that $A = 0$, that is, that all elements of $A$ are zero. However it does follow that $A$ has certain zero-like properties, in particular that its determinant is 0 and hence it does not have an inverse. This is summarized below:

Theorem 197 If $A > 0$ so that $A$ is positive definite then it is non-singular or $A^{-1}$ exists.

Theorem 198 If $A \ge 0$ or $A$ is positive semi-definite but $A$ is not positive definite then $A$ is singular or $A^{-1}$ does not exist.

Theorem 199 If $A < 0$ so that $A$ is negative definite then it is non-singular or $A^{-1}$ exists.

Theorem 200 If $A \le 0$ or $A$ is negative semi-definite but $A$ is not negative definite then $A$ is singular and $A^{-1}$ does not exist.

Example: Consider the two matrices:

$$\begin{bmatrix} 2 & -5 \\ -5 & 13 \end{bmatrix}, \quad \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}.$$

The first we showed is positive definite while the second is negative semi-definite but not negative definite. Note that

$$\det \begin{bmatrix} 2 & -5 \\ -5 & 13 \end{bmatrix} = 1, \quad \det \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix} = 0$$

so that the second matrix is singular and so does not have an inverse, while the first is non-singular and has an inverse.

A positive definite matrix must have positive diagonal elements, but the off-diagonal elements can be either positive or negative. In general we have:

Theorem 201 If $A > 0$ or $A$ is positive definite all diagonal elements must be greater than 0 (or $a_{ii} > 0$ for $i = 1, 2, \ldots, n$).

Theorem 202 If $A \ge 0$ or $A$ is positive semi-definite all diagonal elements must be greater than or equal to 0 (or $a_{ii} \ge 0$ for $i = 1, 2, \ldots, n$).

Theorem 203 If $A < 0$ or $A$ is negative definite all diagonal elements must be less than 0 (or $a_{ii} < 0$ for $i = 1, 2, \ldots, n$).

Theorem 204 If $A \le 0$ or $A$ is negative semi-definite all diagonal elements must be less than or equal to 0 (or $a_{ii} \le 0$ for $i = 1, 2, \ldots, n$).

Remark 1: The signs of the diagonal elements provide necessary conditions but not sufficient conditions. For example it turns out that the matrix:

$$\begin{bmatrix} 1 & 4 \\ 4 & 2 \end{bmatrix}$$

is not positive definite even though the diagonal elements are both positive. For the matrix:

$$A = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}$$

we can immediately conclude that it is not positive definite (nor positive semi-definite) since it has a negative diagonal element $-2$. We can also conclude that it is not negative definite (nor negative semi-definite) since it also has a positive diagonal element. Note that this last example shows that while for ordinary scalars it is always the case that either $a \ge 0$ or $a \le 0$, this is not true for matrices; that is, for $A$ above it is not the case that $A \ge 0$ and it is not the case that $A \le 0$.

As usual, it is much easier to analyze diagonal matrices for definiteness. We have:

Theorem 205 If $A$ is a diagonal matrix then $A > 0$ or $A$ is positive definite if and only if all diagonal elements are greater than 0 (or $a_{ii} > 0$ for $i = 1, 2, \ldots, n$).

Theorem 206 If $A$ is a diagonal matrix then $A \ge 0$ or $A$ is positive semi-definite if and only if all diagonal elements are greater than or equal to 0 (or $a_{ii} \ge 0$ for $i = 1, 2, \ldots, n$).

Theorem 207 If $A$ is a diagonal matrix then $A < 0$ or $A$ is negative definite if and only if all diagonal elements are less than 0 (or $a_{ii} < 0$ for $i = 1, 2, \ldots, n$).

Theorem 208 If $A$ is a diagonal matrix then $A \le 0$ or $A$ is negative semi-definite if and only if all diagonal elements are less than or equal to 0 (or $a_{ii} \le 0$ for $i = 1, 2, \ldots, n$).

Example: The diagonal matrices:

$$\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}$$

are respectively positive definite, positive semi-definite, negative definite and negative semi-definite. The diagonal matrix:

$$\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}$$

is neither positive definite, positive semi-definite, negative definite nor negative semi-definite.

The Negative of a Positive Definite Matrix

With scalars if you multiply a positive number by $-1$ you get a negative number. The same result holds for matrices: if you multiply a positive definite matrix by $-1$ you get a negative definite matrix. In general:

Theorem 209 $A$ is positive definite (or $A > 0$) if and only if $-A$ is negative definite (or $-A < 0$).

Theorem 210 $A$ is positive semi-definite (or $A \ge 0$) if and only if $-A$ is negative semi-definite (or $-A \le 0$).

This means that if you have, say, a positive definite matrix, then you can obtain a negative definite matrix by multiplying all of its elements by $-1$.

Example: Since we have already seen that:

$$\begin{bmatrix} 2 & -5 \\ -5 & 13 \end{bmatrix}$$

is positive definite it follows that:

$$\begin{bmatrix} -2 & 5 \\ 5 & -13 \end{bmatrix}$$

is negative definite.

The Form $A = B^TB$

Recall that if a scalar is of the form $a = b^2$ then we immediately obtain the weak inequality $a \ge 0$, since a square is always non-negative. Given the additional information $b \ne 0$ we can obtain the strong inequality $a > 0$. Now suppose that the matrix $A$ takes a similar form: $A = B^TB$. We then obtain the weak inequality $A \ge 0$, or that $A$ is positive semi-definite. Furthermore by restricting $B$ we obtain the strong inequality $A > 0$, or that $A$ is positive definite. This result turns out to be quite important in econometrics.

Theorem 211 If $B$ is an $m \times n$ matrix and $A = B^TB$ then $A \ge 0$ or $A$ is positive semi-definite.

Theorem 212 If $B$ is an $m \times n$ matrix with $\text{rank}[B] = n$ then $A > 0$ or $A = B^TB$ is positive definite.

Proof. If $A = B^TB$ then define $y$ as $y = Bx$. Now for all $x$:

$$x^TAx = x^TB^TBx = (Bx)^T Bx = y^Ty = \|y\|^2 \ge 0.$$

Now if $\text{rank}[B] = n$ then $y = Bx = 0 \implies x = 0$. Therefore for $x \ne 0$ it follows that $y \ne 0$ so that $x^TAx = \|y\|^2 > 0$ and hence $A$ is positive definite.

Example: Given:

$$B = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix}$$

you can verify that $\text{rank}[B] = 2$; that is, the two columns of $B$ are linearly independent. Thus by Theorem 212 the matrix $A = B^TB$ given by:

$$B^TB = \begin{bmatrix} 3 & -2 & 6 \\ 4 & 1 & 2 \end{bmatrix} \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 49 & 22 \\ 22 & 21 \end{bmatrix}$$

is positive definite.

3.10.4 Using Determinants to Check for Definiteness

We can use determinants to check whether a matrix is positive or negative definite. The key concept is the leading principal minor, defined as:

Definition 213 Leading Principal Minors: The $i$th leading principal minor of the $n \times n$ matrix $A$ is given by $M_i = \det[A_{ii}]$, where $A_{ii}$ is the $i \times i$ matrix obtained from the first $i$ rows and columns of $A$.

Example: If $A$ is given by:

$$A = \begin{bmatrix} 3 & 1 & 2 \\ 1 & 6 & 3 \\ 2 & 3 & 8 \end{bmatrix}$$

we have 3 leading principal minors:

$$M_1 = \det[3] = 3$$

$$M_2 = \det \begin{bmatrix} 3 & 1 \\ 1 & 6 \end{bmatrix} = 17$$

$$M_3 = \det \begin{bmatrix} 3 & 1 & 2 \\ 1 & 6 & 3 \\ 2 & 3 & 8 \end{bmatrix} = 97.$$

We have:

Theorem 214 The matrix $A$ is positive definite if and only if all the leading principal minors are strictly positive; that is: $M_1 > 0,\ M_2 > 0,\ \ldots,\ M_n > 0$.

Theorem 215 The matrix $A$ is negative definite if and only if the leading principal minors alternate in sign with the first being negative, or: $M_1 < 0,\ M_2 > 0,\ M_3 < 0,\ \ldots$

Example 1: For a general $2 \times 2$ matrix

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix}$$

to be positive definite we require that $M_1 = a_{11} > 0$ and $M_2 = \det[A] = a_{11}a_{22} - a_{12}^2 > 0$. The last result implies that:

$$|a_{12}| < \sqrt{a_{11}a_{22}}.$$

Thus in addition to the diagonal elements being positive, we require that the off-diagonal elements not be too large in absolute value relative to the diagonal elements. Thus the matrices:

$$\begin{bmatrix} 1 & 4 \\ 4 & 2 \end{bmatrix}, \quad \begin{bmatrix} 1 & -4 \\ -4 & 2 \end{bmatrix}$$

are both not positive definite since the off-diagonal element, here either $4$ or $-4$, is too large relative to the diagonal elements: $4 > \sqrt{1 \times 2}$. For both matrices $M_1 = 1 > 0$ but $M_2 = 1 \times 2 - 4^2 = -14 < 0$, and so neither matrix is positive definite.

Example 2: The matrix we considered above:

$$A = \begin{bmatrix} 3 & 1 & 2 \\ 1 & 6 & 3 \\ 2 & 3 & 8 \end{bmatrix}$$

is positive definite since $M_1 = 3 > 0$, $M_2 = 17 > 0$ and $M_3 = 97 > 0$.

Example 3: If

$$A = \begin{bmatrix} -3 & -1 & -2 \\ -1 & -6 & -3 \\ -2 & -3 & -8 \end{bmatrix}$$

then:

$$M_1 = \det[-3] = -3 < 0$$

$$M_2 = \det \begin{bmatrix} -3 & -1 \\ -1 & -6 \end{bmatrix} = 17 > 0$$

$$M_3 = \det \begin{bmatrix} -3 & -1 & -2 \\ -1 & -6 & -3 \\ -2 & -3 & -8 \end{bmatrix} = -97 < 0.$$

Since the leading principal minors alternate in sign with $M_1 < 0$ it follows that $A$ is negative definite.

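Example 4: Given:

$$B = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix}$$

you can verify that $\text{rank}[B] = 2$; that is, the two columns of $B$ are linearly independent. Thus Theorem 212 predicts that $B^TB$ is positive definite. To verify this note that:

$$B^TB = \begin{bmatrix} 3 & -2 & 6 \\ 4 & 1 & 2 \end{bmatrix} \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 49 & 22 \\ 22 & 21 \end{bmatrix}$$

which is positive definite since from the leading principal minors: $M_1 = 49 > 0$ and $M_2 = 49 \times 21 - 22^2 = 545 > 0$.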
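The determinant tests of Theorems 214 and 215 are mechanical enough to sketch in a few lines of Python. The function names (`det`, `leading_principal_minors`, `classify`) are illustrative choices rather than anything from the text, the determinant uses plain cofactor expansion (adequate only for small matrices), and the classification assumes the input matrix is symmetric.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def leading_principal_minors(A):
    """Return [M_1, ..., M_n], where M_i is the determinant of the top-left i x i block."""
    return [det([row[:i] for row in A[:i]]) for i in range(1, len(A) + 1)]


def classify(A):
    """Apply the leading-principal-minor tests to a symmetric matrix A."""
    minors = leading_principal_minors(A)
    if all(m > 0 for m in minors):
        return "positive definite"
    # Negative definite: M_1 < 0, M_2 > 0, M_3 < 0, ...
    if all((m < 0) if i % 2 == 1 else (m > 0) for i, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither positive definite nor negative definite"


A = [[3, 1, 2], [1, 6, 3], [2, 3, 8]]
neg_A = [[-a for a in row] for row in A]
B = [[3, 4], [-2, 1], [6, 2]]
# B^T B: entry (i, j) is the inner product of columns i and j of B.
BtB = [[sum(row[i] * row[j] for row in B) for j in range(2)] for i in range(2)]

for name, M in [("A", A), ("-A", neg_A), ("B^T B", BtB), ("[[1,4],[4,2]]", [[1, 4], [4, 2]])]:
    print(name, leading_principal_minors(M), classify(M))
```

The last matrix falls into the "neither" branch, which only says that it fails both strict tests; as Remark 2 below explains, these minors cannot by themselves settle semi-definiteness.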

Remark 1: At first it is tricky to remember the condition for $A$ to be negative definite, since intuitively one would think that all leading principal minors must be negative. It may help you to remember the rule if you consider a diagonal matrix with negative elements, which we know is negative definite. For example:

$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -3 \end{bmatrix}.$$

Here $M_1 = -1 < 0$ but $M_2 = (-1) \times (-2) = 2 > 0$ because the product of two negative numbers is positive. Finally $M_3 = (-1) \times (-2) \times (-3) = -6 < 0$. This is why $M_i$ must alternate in sign for negative definite matrices.

Remark 2: We cannot easily extend these criteria to positive semi-definite and negative semi-definite matrices. For example it does not follow that if $M_1 \ge 0,\ M_2 \ge 0,\ \ldots$ then $A$ is positive semi-definite. For example the matrix:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$

has $M_1 = 1 \ge 0$, $M_2 = 0 \ge 0$ and $M_3 = 0 \ge 0$, but this matrix is not positive semi-definite since it has a negative diagonal element.

3.10.5 Using Eigenvalues to Check for Definiteness

An alternative method for testing for definiteness is to use eigenvalues. We have:

Theorem 216 If $A$ is a symmetric $n \times n$ matrix with eigenvalues $\lambda_i$, $i = 1, 2, \ldots, n$, then:

1. $A$ is positive definite ($A > 0$) if and only if $\lambda_i > 0$ for $i = 1, 2, \ldots, n$.

2. $A$ is positive semi-definite ($A \ge 0$) if and only if $\lambda_i \ge 0$ for $i = 1, 2, \ldots, n$.

3. $A$ is negative definite ($A < 0$) if and only if $\lambda_i < 0$ for $i = 1, 2, \ldots, n$.

4. $A$ is negative semi-definite ($A \le 0$) if and only if $\lambda_i \le 0$ for $i = 1, 2, \ldots, n$.

Proof. We prove only 1. and 2.; 3. and 4. follow by the same reasoning. Given that $A$ is symmetric we have $A = C\Lambda C^T$, where $\Lambda$ is a diagonal matrix with the eigenvalues of $A$ along its diagonal and $C$ is an orthogonal matrix so that $C^T = C^{-1}$. We then have:

$$x^TAx = x^TC\Lambda C^Tx = y^T\Lambda y = y_1^2\lambda_1 + y_2^2\lambda_2 + \cdots + y_n^2\lambda_n$$

where $y = C^Tx$ is an $n \times 1$ vector. Since $\Lambda$ is a diagonal matrix it follows that $x^TAx = y^T\Lambda y \ge 0$ for all $x$ if and only if $\lambda_i \ge 0$ for all $i$, and so 2. follows. Since $C^T$ is non-singular, it follows that $x = 0$ if and only if $y = 0$, so that $x^TAx = y^T\Lambda y > 0$ for $x \ne 0$ if and only if $\lambda_i > 0$ for all $i$, and so 1. follows.

Example 1: If $A$ is given by:

$$A = \begin{bmatrix} 2 & -5 \\ -5 & 13 \end{bmatrix}$$

we find that the eigenvalues are

$$\lambda_1 = \frac{15}{2} + \frac{1}{2}\sqrt{221} = 14.933 > 0$$

$$\lambda_2 = \frac{15}{2} - \frac{1}{2}\sqrt{221} = 0.066966 > 0.$$

It follows that $A$ is positive definite.

Example 2: If $A$ is given by:

$$A = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}$$

we find that the eigenvalues are

$$\lambda_1 = -4 < 0, \quad \lambda_2 = 0.$$

Since all the eigenvalues satisfy $\lambda_i \le 0$ it follows that $A$ is negative semi-definite. However since $\lambda_2 = 0$ it follows that $A$ is not negative definite.

Example 3: If $A$ is given by:

$$A = \begin{bmatrix} 1 & 4 \\ 4 & 2 \end{bmatrix}$$

we find that the eigenvalues are

$$\lambda_1 = \frac{3}{2} + \frac{1}{2}\sqrt{65} = 5.531 > 0$$

$$\lambda_2 = \frac{3}{2} - \frac{1}{2}\sqrt{65} = -2.531 < 0.$$

Since the eigenvalues have opposite signs, the matrix $A$ is neither positive definite, positive semi-definite, negative definite nor negative semi-definite.
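For symmetric $2 \times 2$ matrices the eigenvalues can be written down from the trace and determinant, which gives a quick way to reproduce the three examples. The sketch below assumes a symmetric $2 \times 2$ input; the function names and the tolerance `tol` used to treat tiny floating-point values as zero are illustrative assumptions.

```python
import math


def symmetric_2x2_eigenvalues(A):
    """Eigenvalues of a symmetric 2x2 matrix, from lambda^2 - trace*lambda + det = 0."""
    (a, b), (_, d) = A
    trace, det = a + d, a * d - b * b
    # The discriminant equals (a - d)^2 + 4 b^2, so it is never negative.
    root = math.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def classify_by_eigenvalues(A, tol=1e-12):
    """Classify a symmetric 2x2 matrix by the signs of its eigenvalues."""
    lam = symmetric_2x2_eigenvalues(A)
    if all(l > tol for l in lam):
        return "positive definite"
    if all(l >= -tol for l in lam):
        return "positive semi-definite"
    if all(l < -tol for l in lam):
        return "negative definite"
    if all(l <= tol for l in lam):
        return "negative semi-definite"
    return "indefinite"


examples = [
    [[2, -5], [-5, 13]],
    [[-2, 2], [2, -2]],
    [[1, 4], [4, 2]],
]
for A in examples:
    print(A, symmetric_2x2_eigenvalues(A), classify_by_eigenvalues(A))
```

The branches are checked from strongest to weakest, so a positive definite matrix is reported as such rather than as merely positive semi-definite.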

3.10.6 Maximizing and Minimizing Quadratics

Consider the problem of maximizing (minimizing) the quadratic:

$$y = x^TAx + b^Tx + c$$

where $x$ is an $n \times 1$ vector, $A$ is a symmetric $n \times n$ matrix, $b$ is an $n \times 1$ vector and $c$ is a scalar. We also assume that $A$ is negative (positive) definite, which implies that $A^{-1}$ exists. In the next chapter we will see how to solve this problem using multivariate calculus.

It is possible to find the optimal value $x^*$ without calculus using a technique called completing the square. We have:

Theorem 217 The value of $x$ which maximizes (minimizes)

$$y = x^TAx + b^Tx + c$$

where $A$ is negative (positive) definite is:

$$x^* = -\frac{1}{2}A^{-1}b.$$

Proof. Completing the square amounts to showing that:

$$x^TAx + b^Tx + c = (x - x^*)^T A (x - x^*) + c - \frac{b^TA^{-1}b}{4}.$$

You can verify this as follows, using $x^* = -\frac{1}{2}A^{-1}b$ and the symmetry of $A$:

$$(x - x^*)^T A (x - x^*) = x^TAx - x^{*T}Ax - x^TAx^* + x^{*T}Ax^*$$

$$= x^TAx + \frac{1}{2}b^TA^{-1}Ax + \frac{1}{2}x^TAA^{-1}b + \frac{1}{4}b^TA^{-1}AA^{-1}b$$

$$= x^TAx + \frac{1}{2}b^Tx + \frac{1}{2}x^Tb + \frac{1}{4}b^TA^{-1}b$$

$$= x^TAx + b^Tx + \frac{1}{4}b^TA^{-1}b$$

since $b^Tx = x^Tb$. Thus:

$$x^TAx + b^Tx = (x - x^*)^T A (x - x^*) - \frac{1}{4}b^TA^{-1}b$$

from which it follows that:

$$y = (x - x^*)^T A (x - x^*) + c - \frac{b^TA^{-1}b}{4}.$$

If $A$ is negative definite then it follows that:

$$(x - x^*)^T A (x - x^*) < 0 \text{ for } (x - x^*) \ne 0$$

$$(x - x^*)^T A (x - x^*) = 0 \text{ for } (x - x^*) = 0$$

and hence:

$$y \le c - \frac{b^TA^{-1}b}{4}$$

with equality only when $x = x^*$. If $A$ is positive definite then replace $<$ with $>$ and $\le$ with $\ge$ above and the result follows. It follows then that $x^*$ is a global maximum (minimum).

Example 1: If $n = 2$ and

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}, \quad b = \begin{bmatrix} 4 \\ 5 \end{bmatrix}, \quad c = 10$$

then

$$y = x^TAx + b^Tx + c \implies y = 2x_1^2 - 2x_1x_2 + 3x_2^2 + 4x_1 + 5x_2 + 10.$$

You can check that the matrix $A$ is positive definite since $M_1 = 2 > 0$ and $M_2 = 5 > 0$, so we will look for a minimum of the quadratic plotted below:

[Figure: surface plot of $y = 2x_1^2 - 2x_1x_2 + 3x_2^2 + 4x_1 + 5x_2 + 10$, with axes $x_1$, $x_2$ and $y$.]

We have:

$$x^* = -\frac{1}{2}A^{-1}b = -\frac{1}{2}\begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 4 \\ 5 \end{bmatrix} = \begin{bmatrix} -\frac{17}{10} \\ -\frac{7}{5} \end{bmatrix}$$

so that the global minimum occurs at $x_1^* = -\frac{17}{10}$ and $x_2^* = -\frac{7}{5}$.
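The calculation in this example can be reproduced in exact arithmetic with Python's `fractions` module. This is a minimal sketch for the $2 \times 2$ case only: the helpers `solve_2x2` (Cramer's rule) and `quadratic` are hypothetical names introduced here, not part of the text.

```python
from fractions import Fraction


def solve_2x2(A, b):
    """Solve A z = b for a 2x2 matrix A by Cramer's rule, in exact arithmetic."""
    (a11, a12), (a21, a22) = A
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("A is singular, so A^{-1} b is not defined")
    return [(b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a21 * b[0]) / det]


def quadratic(A, b, c, x):
    """Evaluate y = x^T A x + b^T x + c."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + sum(b[i] * x[i] for i in range(n)) + c


A = [[2, -1], [-1, 3]]
b = [4, 5]
c = 10

# x* = -(1/2) A^{-1} b
x_star = [-z / 2 for z in solve_2x2(A, b)]
y_star = quadratic(A, b, c, x_star)

print(x_star, y_star)
```

Evaluating `quadratic` at nearby points such as `[x_star[0] + 1, x_star[1]]` gives larger values, as expected at the minimum of a positive definite quadratic.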

Example 2: The linear regression model is:

$$Y = X\beta + e$$

where $Y$ is an $n \times 1$ vector, $X$ is an $n \times p$ matrix of rank $p$, and $e$ is an $n \times 1$ vector of random errors. The least squares estimator $\hat{\beta}$ is the value of $\beta$ which minimizes the sum of squares function:

$$S(\beta) = (Y - X\beta)^T (Y - X\beta) = \beta^TX^TX\beta - 2Y^TX\beta + Y^TY.$$

Although the notation may cover this up, in fact $S(\beta)$ is a quadratic. Here $x$ is $\beta$, $x^*$ is $\hat{\beta}$, $A = X^TX$, $b = -2X^TY$ and $c = Y^TY$. If $\text{rank}[X] = p$ then $A$ is positive definite by Theorem 212, and so making the translation in notation from

$$x^* = -\frac{1}{2}A^{-1}b$$

we find that:

$$\hat{\beta} = -\frac{1}{2}A^{-1}b = -\frac{1}{2}\Big(\underbrace{X^TX}_{=A}\Big)^{-1}\underbrace{\left(-2X^TY\right)}_{=b} = \left(X^TX\right)^{-1}X^TY.$$

The formula $\hat{\beta} = \left(X^TX\right)^{-1}X^TY$ is one of the central results in econometrics.

Example 3: Suppose one has data on 10 families where $Y_i$ is the consumption of family $i$, $X_{i1}$ is the income of family $i$ and $X_{i2}$ is the wealth of family $i$, and suppose that:

$$Y_i = X_{i1}\beta_1 + X_{i2}\beta_2 + e_i \text{ for } i = 1, 2, \ldots, 10.$$

The parameter $\beta_1$ is the marginal propensity to consume out of income, $\beta_2$ is the marginal propensity to consume out of wealth, and $e_i$ is a random error.

Suppose that the actual data takes the form:

$$Y = \begin{bmatrix} 22.1 \\ 88.2 \\ 7.8 \\ 29.2 \\ 8.8 \\ 217.4 \\ 35.1 \\ 61.9 \\ 11.4 \\ 52.7 \end{bmatrix}, \quad X = \begin{bmatrix} 10 & 100 \\ 25 & 355 \\ 7 & 10 \\ 41 & 50 \\ 10 & 3 \\ 75 & 860 \\ 21 & 62 \\ 77 & 107 \\ 21 & 10 \\ 71 & 45 \end{bmatrix}$$

so that for example family 1's consumption is 22.1, their income is 10 and their wealth is 100. From this data we wish to estimate $\beta_1$ and $\beta_2$ using the least squares estimator $\hat{\beta} = \left(X^TX\right)^{-1}X^TY$.

We have:

$$X^TX = \begin{bmatrix} 10 & 25 & 7 & 41 & 10 & 75 & 21 & 77 & 21 & 71 \\ 100 & 355 & 10 & 50 & 3 & 860 & 62 & 107 & 10 & 45 \end{bmatrix} X = \begin{bmatrix} 20032 & 89471 \\ 89471 & 895652 \end{bmatrix}$$

and:

$$X^TY = \begin{bmatrix} 10 & 25 & 7 & 41 & 10 & 75 & 21 & 77 & 21 & 71 \\ 100 & 355 & 10 & 50 & 3 & 860 & 62 & 107 & 10 & 45 \end{bmatrix} Y = \begin{bmatrix} 29555.3 \\ 233334.4 \end{bmatrix}.$$

Thus the least squares estimator $\hat{\beta}$ is given by:

$$\hat{\beta} = \left(X^TX\right)^{-1}X^TY = \begin{bmatrix} 20032 & 89471 \\ 89471 & 895652 \end{bmatrix}^{-1} \begin{bmatrix} 29555.3 \\ 233334.4 \end{bmatrix} = \begin{bmatrix} 0.56 \\ 0.20 \end{bmatrix}.$$

The estimated marginal propensity to consume out of income is then $\hat{\beta}_1 = 0.56$, while the estimated marginal propensity to consume out of wealth is $\hat{\beta}_2 = 0.20$.
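The arithmetic in this example is easy to reproduce on a computer. The sketch below uses plain Python lists and the explicit inverse of a $2 \times 2$ matrix; the variable names are illustrative, and a numerical library would normally be used for larger problems.

```python
# Consumption (Y) and [income, wealth] (rows of X) for the 10 families in the example.
Y = [22.1, 88.2, 7.8, 29.2, 8.8, 217.4, 35.1, 61.9, 11.4, 52.7]
X = [
    [10, 100], [25, 355], [7, 10], [41, 50], [10, 3],
    [75, 860], [21, 62], [77, 107], [21, 10], [71, 45],
]

# X^T X (2 x 2) and X^T Y (2 x 1), built from sums over the observations.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(2)]

# beta_hat = (X^T X)^{-1} X^T Y, using the closed-form inverse of a 2 x 2 matrix.
det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
beta_hat = [
    (XtX[1][1] * XtY[0] - XtX[0][1] * XtY[1]) / det,
    (XtX[0][0] * XtY[1] - XtX[1][0] * XtY[0]) / det,
]

print(XtX)
print([round(v, 1) for v in XtY])
print([round(b, 2) for b in beta_hat])
```

Since $\det[X^TX] > 0$ and the first diagonal element is positive, $X^TX$ passes the leading principal minor test for positive definiteness, so the least squares solution is the unique minimizer of the sum of squares.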

3.11 Idempotent Matrices

There are only two scalars that have the property $a^2 = a$: $0^2 = 0$ and $1^2 = 1$. An idempotent matrix is like 0 or 1 in that it also has this property. Thus:

Definition 218 Idempotent Matrix: An $n \times n$ matrix $P$ is said to be idempotent if

$$PP = P.$$

Example 1: Recall from the linear regression model $Y = X\beta + e$ that $\hat{\beta} = \left(X^TX\right)^{-1}X^TY$. The vector of fitted values, or the values of $Y$ predicted by the estimated model, is given by:

$$\hat{Y} = X\hat{\beta} = X\left(X^TX\right)^{-1}X^TY = PY$$

where $P$ is given by:

$$P = X\left(X^TX\right)^{-1}X^T.$$

$P$ is idempotent since:

$$PP = X\underbrace{\left(X^TX\right)^{-1}X^TX}_{=I}\left(X^TX\right)^{-1}X^T = XI\left(X^TX\right)^{-1}X^T = X\left(X^TX\right)^{-1}X^T = P.$$

The least squares residual, or the part of $Y$ that the model cannot explain, is given by:

$$\hat{e} \equiv Y - \hat{Y} = Y - PY = (I - P)Y.$$

The matrix $I - P$ is also idempotent since:

$$(I - P)(I - P) = I - IP - PI + PP = I - P - P + P = I - P.$$

Since $PP = P$ it can further be shown that

$$P(I - P) = P - PP = P - P = 0.$$

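Example 2: If

$$X = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

then in econometrics this would correspond to the regression $Y_i = \mu + e_i$ with 3 observations and a constant term.

To calculate $P$ for this $X$ note that:

$$X^TX = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = 3.$$

Thus:

$$P = X\left(X^TX\right)^{-1}X^T = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} 3^{-1} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{bmatrix}.$$

We know $P$ is idempotent but you might want to check this by multiplying $P$ by itself.

The idempotent matrix $I - P$ is then:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{bmatrix} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix}.$$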
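The suggested check that $P$ and $I - P$ are idempotent can be carried out exactly with rational arithmetic. The sketch below handles only this special case, where $X$ is a single column so that $X^TX$ is a scalar; `matmul` is a hypothetical helper written for the purpose.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


n = 3
X = [[1] for _ in range(n)]                 # the 3 x 1 column of ones
XtX = sum(row[0] * row[0] for row in X)     # X^T X is the scalar 3 here

# P = X (X^T X)^{-1} X^T, so every entry is 1/3.
P = [[Fraction(X[i][0] * X[j][0], XtX) for j in range(n)] for i in range(n)]
I = [[int(i == j) for j in range(n)] for i in range(n)]
M = [[I[i][j] - P[i][j] for j in range(n)] for i in range(n)]   # M = I - P
zero = [[0] * n for _ in range(n)]

print(matmul(P, P) == P)       # P is idempotent
print(matmul(M, M) == M)       # I - P is idempotent
print(matmul(P, M) == zero)    # P (I - P) = 0
```

Using `Fraction` avoids the rounding error that floating-point thirds would introduce, so the equality tests are exact.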

3.11.1 Important Properties of Idempotent Matrices

Idempotent matrices have the following properties:

Theorem 219 If $P$ is idempotent then:

1. The eigenvalues of $P$ are all either 0 or 1.

2. If $P$ is symmetric then it is positive semi-definite.

3. $\text{tr}[P] = \text{rank}[P]$.

4. If $x$ and $y$ are two $n \times 1$ vectors and if $P$ is symmetric then $w = Px$ and $z = (I - P)y$ are orthogonal; that is, $w^Tz = 0$.

Proof. If $P$ is idempotent then an eigenvector $x \ne 0$ and eigenvalue $\lambda$ satisfy:

$$Px = \lambda x.$$

Multiplying both sides by $P$ we find that:

$$PPx = \lambda Px \implies \lambda x = \lambda^2 x \implies \lambda(1 - \lambda)x = 0$$

since $PPx = Px = \lambda x$ and $\lambda Px = \lambda^2 x$. Now since $x \ne 0$ it follows that $\lambda(1 - \lambda) = 0$ so that $\lambda = 0$ or $\lambda = 1$. Since $\lambda = 0$ or $\lambda = 1$ it follows that $\lambda \ge 0$, and so if $P$ is symmetric it is positive semi-definite by Theorem 216. The trace $\text{tr}[P]$ is the sum of the eigenvalues, which equals the number of eigenvalues equal to 1, which is the rank of $P$.

If $P$ is symmetric, or $P = P^T$, and $w = Px$ and $z = (I - P)y$, then:

$$w^Tz = (Px)^T(I - P)y = x^TP^T(I - P)y = x^TP(I - P)y = x^T(P - PP)y = x^T(P - P)y = x^T 0 y = 0$$

and so $w$ and $z$ are orthogonal.

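Example 1: Consider the idempotent matrices $P$ and $I - P$ from the previous example where:

$$P = \begin{bmatrix} \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{bmatrix}, \quad I - P = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix}.$$

The eigenvalues of $P$ are determined from the characteristic polynomial:

$$f(\lambda) = \det[P - \lambda I] = \det \begin{bmatrix} \frac{1}{3} - \lambda & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} - \lambda & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} - \lambda \end{bmatrix} = \lambda^2 - \lambda^3 = \lambda^2(1 - \lambda) = 0$$

and so the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 0$ and $\lambda_3 = 0$.

Since the eigenvalues satisfy $\lambda \ge 0$ it follows that $P$ is positive semi-definite. Since two of the eigenvalues are 0, however, $P$ is not positive definite and hence $P^{-1}$ does not exist.

The trace of $P$ is given by:

$$\text{tr}[P] = \frac{1}{3} + \frac{1}{3} + \frac{1}{3} = 1$$

which is the rank of $P$ (i.e., $P$ only has one linearly independent column). The trace of $I - P$ is:

$$\text{tr}[I - P] = \frac{2}{3} + \frac{2}{3} + \frac{2}{3} = 2$$

which is also equal to $\text{rank}[I - P]$.

Let us take any $3 \times 1$ vector $x$, say:

$$x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$

and multiply $x$ by $P$ and $I - P$ to obtain:

$$w = Px = \begin{bmatrix} \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix}$$

$$z = (I - P)x = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}.$$

Note that $w$ and $z$ are orthogonal since:

$$w^Tz = \begin{bmatrix} 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = 2 \times (-1) + 2 \times 0 + 2 \times 1 = 0.$$

Example 2: The vector of fitted values $\hat{Y}$ and the vector of least squares residuals $\hat{e}$ are given by:

$$\hat{Y} = PY, \quad \hat{e} = (I - P)Y$$

where $P = X\left(X^TX\right)^{-1}X^T$. Earlier we showed that $\text{tr}[P] = p$ and so the rank of the $n \times n$ matrix $P$ is $p$. It follows from the above theorem that the two vectors are orthogonal, or:

$$\hat{Y}^T\hat{e} = \hat{e}^T\hat{Y} = Y^T(I - P)^TPY = Y^T(I - P)PY = Y^T\left(P - P^2\right)Y = Y^T 0 Y = 0.$$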

Since $Y = \hat{Y} + \hat{e}$ where $\hat{Y}$ and $\hat{e}$ are orthogonal, it follows from Theorem 155 that a Pythagorean relationship exists:

$$Y^TY = \hat{Y}^T\hat{Y} + \hat{e}^T\hat{e}.$$

Dividing both sides by $Y^TY$ we obtain:

$$1 = \frac{\hat{Y}^T\hat{Y}}{Y^TY} + \frac{\hat{e}^T\hat{e}}{Y^TY}.$$

The first term on the right is the uncentered $R^2$ defined by:

$$R^2 = \frac{\hat{Y}^T\hat{Y}}{Y^TY} = \frac{\|\hat{Y}\|^2}{\|Y\|^2}$$

which measures the proportion of the variation in $Y$ explained by the regression model. Alternatively, from Definition 152 the angle $\theta$ between $Y$ and $\hat{Y}$ satisfies:

$$\cos(\theta) = \frac{\hat{Y}^TY}{\|Y\|\,\|\hat{Y}\|} = \frac{\hat{Y}^T\left(\hat{Y} + \hat{e}\right)}{\|Y\|\,\|\hat{Y}\|} = \frac{\|\hat{Y}\|^2}{\|Y\|\,\|\hat{Y}\|} = \frac{\|\hat{Y}\|}{\|Y\|} = \sqrt{R^2}.$$

Basically the closer $R^2$ is to 1, the smaller the angle between $Y$ and $\hat{Y}$ and the more the model explains $Y$.

Now since:

$$R^2 = \frac{\|\hat{Y}\|^2}{\|Y\|^2} = 1 - \frac{\|\hat{e}\|^2}{\|Y\|^2}$$

it follows that:

$$0 \le R^2 \le 1.$$

You might want to try to show that $R^2 = 0$ if and only if $\hat{\beta} = 0$ (the model explains nothing) and $R^2 = 1$ if and only if $\hat{e} = 0$ (the model is a perfect fit).
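To see the Pythagorean relationship and the uncentered $R^2$ with actual numbers, the sketch below takes the simplest regression, where $X$ is a column of ones so that $PY$ replaces every observation by the sample mean. The data vector `Y = [1, 2, 3]` is an assumption chosen for illustration, and `dot` is a hypothetical helper.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


Y = [1, 2, 3]                      # illustrative data, not from the text
n = len(Y)

# With X a column of ones, P = (1/n) * (matrix of ones), so P Y is the sample mean repeated.
mean = Fraction(sum(Y), n)
Y_hat = [mean] * n                                 # fitted values, P Y
e_hat = [y - f for y, f in zip(Y, Y_hat)]          # residuals, (I - P) Y

orthogonal = dot(Y_hat, e_hat) == 0
pythagoras = dot(Y, Y) == dot(Y_hat, Y_hat) + dot(e_hat, e_hat)
r_squared = dot(Y_hat, Y_hat) / dot(Y, Y)          # uncentered R^2

print(orthogonal, pythagoras, r_squared)
```

Here $Y^TY = 14$ splits into $\hat{Y}^T\hat{Y} = 12$ and $\hat{e}^T\hat{e} = 2$, so the uncentered $R^2$ is $12/14 = 6/7$, which lies between 0 and 1 as required.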

3.11.2 The Spectral Representation

Closely related to the representation $A = C\Lambda C^{-1}$ is the spectral representation. We have:

Theorem 220 The Spectral Representation: Given an $n \times n$ matrix $A$ written as $A = C\Lambda C^{-1}$ then

$$A = \lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_nP_n$$

where the $n \times n$ matrices $P_i$ are given by:

$$P_i = \frac{x_iy_i^T}{x_i^Ty_i}$$

and where $x_i$ and $y_i$ are the right and left-hand eigenvectors corresponding to the eigenvalue $\lambda_i$ of $A$. The matrices $P_i$ are idempotent and orthogonal to each other in that $P_i \times P_i = P_i$ and $P_iP_j = 0$ for $i \ne j$.

Remark: That the matrices $P_i$ are idempotent follows from the fact that:

$$P_i \times P_i = \frac{x_iy_i^T}{x_i^Ty_i} \times \frac{x_iy_i^T}{x_i^Ty_i} = \frac{x_i\overbrace{\left(y_i^Tx_i\right)}^{\text{a scalar}}y_i^T}{\left(x_i^Ty_i\right)^2} = \frac{\overbrace{\left(y_i^Tx_i\right)}^{=x_i^Ty_i}x_iy_i^T}{\left(x_i^Ty_i\right)^2} = \frac{x_iy_i^T}{x_i^Ty_i} = P_i.$$

That they are orthogonal follows from Theorem 185, that left and right-hand eigenvectors from different eigenvalues are orthogonal, or $x_i^Ty_j = 0$ for $i \ne j$. Thus for $i \ne j$:

$$P_i \times P_j = \frac{x_iy_i^T}{x_i^Ty_i} \times \frac{x_jy_j^T}{x_j^Ty_j} = \frac{x_i\overbrace{\left(y_i^Tx_j\right)}^{=0}y_j^T}{\left(x_i^Ty_i\right)\left(x_j^Ty_j\right)} = 0.$$

An implication of the spectral representation then is that:

Theorem 221 Given the spectral representation for the $n \times n$ matrix $A$:

$$A = \lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_nP_n$$

then the $m$th power of $A$ is given by:

$$A^m = \lambda_1^mP_1 + \lambda_2^mP_2 + \cdots + \lambda_n^mP_n.$$

Proof. We will prove this by induction. Note that it is obviously true for $m = 1$. Now suppose it is true for $m - 1$. We then have:

$$A^m = A^{m-1} \times A = \left(\lambda_1^{m-1}P_1 + \lambda_2^{m-1}P_2 + \cdots + \lambda_n^{m-1}P_n\right) \times \left(\lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_nP_n\right).$$

Since $P_iP_j = 0$ for $i \ne j$, any cross-product terms drop out and hence:

$$A^m = \left(\lambda_1^{m-1} \times \lambda_1\right)P_1 \times P_1 + \left(\lambda_2^{m-1} \times \lambda_2\right)P_2 \times P_2 + \cdots + \left(\lambda_n^{m-1} \times \lambda_n\right)P_n \times P_n$$

$$= \lambda_1^mP_1 + \lambda_2^mP_2 + \cdots + \lambda_n^mP_n$$

since $P_i \times P_i = P_i$ and $\lambda_i^{m-1} \times \lambda_i = \lambda_i^m$.

When a matrix is symmetric its left and right-hand eigenvectors are identical and so the spectral representation takes the form:

Theorem 222 If the $n \times n$ matrix $A$ is symmetric then

$$A = \lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_nP_n$$

where the $n \times n$ matrices $P_i$ are given by:

$$P_i = \frac{x_ix_i^T}{x_i^Tx_i}$$

and where $x_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i$ of $A$. The matrices $P_i$ are idempotent and orthogonal to each other in that $P_i \times P_i = P_i$ and $P_iP_j = 0$ for $i \ne j$.

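Example: For $A$ below the spectral representation is:

$$A = \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix} = \overbrace{3}^{=\lambda_1}\overbrace{\begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}}^{=P_1} + \overbrace{7}^{=\lambda_2}\overbrace{\begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix}}^{=P_2}.$$

Note that $P_1$ and $P_2$ are idempotent; that:

$$P_1 \times P_2 = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

and that:

$$A^2 = \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix}^2 = \begin{bmatrix} 29 & -20 \\ -20 & 29 \end{bmatrix} = 3^2\begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} + 7^2\begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix}$$

which you can verify. Also:

$$A^{-1} = \begin{bmatrix} 5 & -2 \\ -2 & 5 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{5}{21} & \frac{2}{21} \\ \frac{2}{21} & \frac{5}{21} \end{bmatrix} = 3^{-1}\begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} + 7^{-1}\begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix}$$

which you can verify by multiplying $A$ and $A^{-1}$.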
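The verifications suggested in this example can be done exactly in Python. The sketch below hard-codes the eigenvalues $3$ and $7$ and the projection matrices $P_1$ and $P_2$ from the example; `matmul` and `spectral_power` are hypothetical helper names.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


half = Fraction(1, 2)
P1 = [[half, half], [half, half]]
P2 = [[half, -half], [-half, half]]
lam1, lam2 = Fraction(3), Fraction(7)


def spectral_power(m):
    """Return lam1^m * P1 + lam2^m * P2 (m may be negative since both eigenvalues are non-zero)."""
    return [[lam1 ** m * P1[i][j] + lam2 ** m * P2[i][j] for j in range(2)] for i in range(2)]


A = spectral_power(1)
A_squared = spectral_power(2)
A_inverse = spectral_power(-1)

print(A)
print(A_squared == matmul(A, A))
print(matmul(A, A_inverse))
```

The same function with $m = -1$ gives the inverse, which illustrates that the power formula extends to negative powers when no eigenvalue is zero.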

3.12 Positive Matrices

3.12.1 The Perron-Frobenius Theorem

In many economic applications it often happens that the elements of a matrix are all positive, in which case we have:

Definition 223 Positive Matrix: We say a matrix $A = [a_{ij}]$ is positive if $a_{ij} > 0$ for all $i, j$.

Example: The matrix:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

is a positive matrix. Note that a positive matrix is not the same as a positive definite matrix.

Positive matrices occur in many economic applications: for example with input-output matrices, which describe the technological interdependency of the different industries in an economy, and Markov chains, which describe how probabilities vary over time.

In these applications the eigenvectors and eigenvalues turn out to be quite important. For example with input-output matrices an eigenvector determines equilibrium prices or the balanced growth vector, and the associated eigenvalue determines both the rate of profit and the growth rate of the economy. We can then show that all prices will be positive, that the equilibrium price vector is unique, and that the growth rate of the economy is maximized by appealing to the Perron-Frobenius theorem:

Theorem 224 The Perron-Frobenius Theorem I If $A = [a_{ij}]$ is an $n \times n$ positive square matrix then:

1. $A$ has a unique dominant eigenvalue $\hat{\lambda}$, which is real and positive: $\hat{\lambda} > 0$.

2. If $\lambda_i$ is any other eigenvalue of $A$ then $|\lambda_i| < \hat{\lambda}$.

3. Associated with $\hat{\lambda}$ is a positive $n \times 1$ right-hand eigenvector $x = [x_i]$ (i.e., with $x_i > 0$) and a positive left-hand eigenvector $y = [y_i]$ (i.e., with $y_i > 0$) which satisfy:

$$Ax = \hat{\lambda}x, \quad y^TA = \hat{\lambda}y^T.$$

These positive eigenvectors are unique up to a scalar multiple.

4. No other eigenvectors exist which have all positive elements.

Remark: Note that we have assumed for $A = [a_{ij}]$ that $a_{ij} > 0$ and so we have not allowed any $a_{ij} = 0$. This assumption can be relaxed considerably as long as $A$ remains indecomposable, which requires that $A^k$ have all positive elements for some power $k$.

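Example: Consider the $2 \times 2$ matrix $A$ given by:

$$A = \begin{bmatrix} 0.3 & 0.5 \\ 0.2 & 0.7 \end{bmatrix}$$

which has all positive elements. You can verify that the eigenvalues of $A$ are

$$\hat{\lambda} = \lambda_1 = 0.87417, \quad \lambda_2 = 0.12583;$$

that $|\lambda_2| < \hat{\lambda}$; that the associated right-hand eigenvectors are:

$$x = x_1 = \begin{bmatrix} 0.70753 \\ 0.81248 \end{bmatrix}, \quad x_2 = \begin{bmatrix} -0.94435 \\ 0.32895 \end{bmatrix}$$

and the associated left-hand eigenvectors are:

$$y^T = y_1^T = \begin{bmatrix} 0.3544 & 1.0174 \end{bmatrix}, \quad y_2^T = \begin{bmatrix} -0.754 & 0.65672 \end{bmatrix}$$

and that $x$ and $y$ have all positive elements, and the other eigenvectors do not have all positive elements.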
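One way to check the dominant eigenvalue and its positive eigenvector numerically is power iteration, a standard technique that is not used in the text itself and is named here as a substitute for solving the characteristic equation. The sketch below is a minimal version under that assumption; `power_iteration` and the iteration count are illustrative choices.

```python
import math


def power_iteration(A, iterations=200):
    """Estimate the dominant eigenvalue and a unit-length right-hand eigenvector of A.

    Repeatedly multiplies a positive starting vector by A and rescales it.
    For a positive matrix this converges to the dominant (Perron) eigenpair.
    """
    n = len(A)
    x = [1.0 / math.sqrt(n)] * n      # positive starting vector with unit length
    lam = 0.0
    for _ in range(iterations):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(v * v for v in y))   # ||A x|| with ||x|| = 1
        x = [v / lam for v in y]
    return lam, x


A = [[0.3, 0.5], [0.2, 0.7]]
lam_hat, x_hat = power_iteration(A)

# Left-hand eigenvector: apply the same iteration to the transpose of A.
A_T = [[A[j][i] for j in range(2)] for i in range(2)]
_, y_hat = power_iteration(A_T)

print(round(lam_hat, 5), x_hat, y_hat)
```

The eigenvectors returned here are scaled to unit length, so they differ from those quoted in the example by a positive scalar multiple, which is all that the theorem's uniqueness statement allows.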

3.12.2 Markov Chains

Suppose that workers can be in 1 of 2 states: either employed (state 1) or unemployed (state 2). Suppose further that the probability of a worker being employed next year depends only on whether he is employed or unemployed this year.

First suppose the worker is employed this year. Let the probability of employment next year given that he is employed this year be $p_{11}$ where $0 < p_{11} < 1$, and let the probability of unemployment next year given that he is employed this year be $p_{12}$. Since he will be either employed or unemployed next year: $p_{12} = 1 - p_{11}$ with $0 < p_{12} < 1$.

Now suppose he is unemployed this year. Let the probability of unemployment next year given that he is unemployed this year be $p_{22}$ where $0 < p_{22} < 1$, and let the probability of employment next year given that he is unemployed this year be $p_{21}$. Since he will be either employed or unemployed next year: $p_{21} = 1 - p_{22}$ with $0 < p_{21} < 1$.

We can put all these probabilities in a $2 \times 2$ matrix $P$ called a transition matrix as:

$$P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}.$$

Note that the rows of $P$ sum to 1 since probabilities sum to 1, and all the elements of $P$ are positive, so later on we can apply the Perron-Frobenius theorem.

Now suppose you want to know the probability that a worker employed today will be unemployed 2 years from now, or for that matter $n$ years from now. We have the following result:

Theorem 225 Let $p_{ij}(n)$ be the probability of being in state $j$ in $n$ periods given that the worker is in state $i$ today. Then:

$$\begin{bmatrix} p_{11}(n) & p_{12}(n) \\ p_{21}(n) & p_{22}(n) \end{bmatrix} = P^n = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}^n.$$

Example: Consider the transition matrix

$$P = \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix}$$

so that someone employed today has a 95% chance of being employed next year, and someone unemployed today has a 60% chance of being unemployed next year.

To calculate the corresponding probabilities for two years from now we calculate:

$$P^2 = \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix} \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix} = \begin{bmatrix} 0.9225 & 0.0775 \\ 0.62 & 0.38 \end{bmatrix}.$$

Thus someone employed today has a 92% probability of being employed in two years, and someone unemployed today has a 38% probability of being unemployed in 2 years.

Now consider $n = 10$ years in the future. Using the computer we have:

$$P^{10} = \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix}^{10} = \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix} \times \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix} \times \cdots \times \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix} = \begin{bmatrix} 0.88917 & 0.11083 \\ 0.88664 & 0.11336 \end{bmatrix}.$$

Thus someone employed today has an 89% probability of being employed in 10 years, and someone unemployed today has an 11% probability of being unemployed in 10 years. These probabilities are now almost independent of whether you are employed or unemployed this year!

If we let $n$ get even larger, say $n = 50$, then this pattern becomes even more striking. Using the computer we have:

$$P^{50} = \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix}^{50} = \begin{bmatrix} 0.889 & 0.111 \\ 0.889 & 0.111 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0.89 & 0.11 \end{bmatrix}.$$

The probabilities 0.89 and 0.11 are the long-run probabilities (or equilibrium probabilities) of being employed and unemployed. Thus the long-run rate of unemployment for the work force would be 11%.

It turns out that the vector of long-run probabilities:

$$y = \begin{bmatrix} 0.89 & 0.11 \end{bmatrix}$$

is the left-hand eigenvector of $P$ associated with the eigenvalue 1; that is, $yP = \lambda y$ with $\lambda = 1$, or (to two decimal places):

$$\begin{bmatrix} 0.89 & 0.11 \end{bmatrix} \begin{bmatrix} 0.95 & 0.05 \\ 0.4 & 0.6 \end{bmatrix} = 1 \times \begin{bmatrix} 0.89 & 0.11 \end{bmatrix}.$$
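The phrase "using the computer" can be made concrete. The following is a minimal sketch, not the author's own program: it uses plain Python lists rather than a matrix library, and the helper names `matmul` and `matpow` are hypothetical. It raises the transition matrix of the example to the powers 2, 10 and 50 and then checks that a row of $P^{50}$ is (numerically) unchanged when multiplied by $P$.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(P, n):
    """P multiplied by itself n times (n >= 1), by repeated multiplication."""
    result = P
    for _ in range(n - 1):
        result = matmul(result, P)
    return result

P = [[0.95, 0.05],
     [0.40, 0.60]]

for n in (2, 10, 50):
    print(n, [[round(v, 5) for v in row] for row in matpow(P, n)])

# A row of P^50 is (approximately) the long-run vector y; check that y P = y.
y = matpow(P, 50)[0]
yP = matmul([y], P)[0]
print(y, yP)
```

Repeated multiplication is the simplest approach and is perfectly adequate for small $n$; for very large powers one would normally use repeated squaring or the eigenvalue decomposition instead.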

This is part of a very general result. We have:

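Theorem 226 If

$$P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}$$

is a transition matrix with positive elements and rows which sum to 1, then:

$$\lim_{n \to \infty} P^n = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} p & 1-p \end{bmatrix}$$

where $p$ and $1-p$ are the long-run probabilities of being in state 1 and state 2, and where $0 < p < 1$.

Proof. Since the rows of $P$ sum to 1 it follows that if

$$\iota = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

then:

$$P\iota = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} p_{11} + p_{12} \\ p_{21} + p_{22} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \iota$$

so that $x = \iota$ is the unique positive right-hand eigenvector of $P$, corresponding to the eigenvalue $\hat{\lambda} = \lambda_1 = 1$. By the Perron-Frobenius theorem we know that the other eigenvalue is less than 1 in absolute value, so that $|\lambda_2| < \hat{\lambda} = 1$. We also know that there exists a corresponding left-hand eigenvector

$$y = \begin{bmatrix} y_1 & y_2 \end{bmatrix}$$

with $y_1 > 0$ and $y_2 > 0$. We can normalize $y$ so the elements sum to 1 by dividing by $y_1 + y_2$ and setting $p = \frac{y_1}{y_1 + y_2}$, and so we can write $y$ as:

$$y = \begin{bmatrix} p & 1-p \end{bmatrix}.$$

Now from the spectral representation for $P$ we have:

$$P = \hat{\lambda} x y + \lambda_2 x_2 y_2 = \iota y + \lambda_2 x_2 y_2$$

and:

$$P^n = \iota y + \lambda_2^n x_2 y_2.$$

Since $|\lambda_2| < \hat{\lambda} = 1$ it follows that as $n \to \infty$, $\lambda_2^n \to 0$ so that:

$$P^n \to \iota y = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} p & 1-p \end{bmatrix} = \begin{bmatrix} p & 1-p \\ p & 1-p \end{bmatrix}.$$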
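For the two-state case the long-run probability $p$ can also be written out explicitly. The short derivation below is a worked sketch that is not part of the original proof; it uses only the eigenvector condition $yP = y$ with $y = \begin{bmatrix} p & 1-p \end{bmatrix}$ and the fact that the rows of $P$ sum to 1.

```latex
\begin{aligned}
yP = y \;&\Longrightarrow\; p\,p_{11} + (1-p)\,p_{21} = p \\
&\Longrightarrow\; (1-p)\,p_{21} = p\,(1 - p_{11}) = p\,p_{12} \\
&\Longrightarrow\; p = \frac{p_{21}}{p_{12} + p_{21}}, \qquad
1 - p = \frac{p_{12}}{p_{12} + p_{21}}.
\end{aligned}
```

With $p_{12} = 0.05$ and $p_{21} = 0.4$ from the employment example this gives $p = 0.4/0.45 \approx 0.889$ and $1 - p \approx 0.111$, which agrees with the rows of $P^{50}$.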

Remark 1: There is actually no reason to limit ourselves to 2 states. For example, a worker might conceivably be in, say, four states: 1 full-time employment, 2 part-time employment, 3 unemployment and 4 not being in the labour force. In this case the transition matrix is a $4 \times 4$ matrix with positive entries and rows which sum to 1. For example:

$$P = \begin{bmatrix} 0.9 & 0.05 & 0.04 & 0.01 \\ 0.4 & 0.5 & 0.06 & 0.04 \\ 0.2 & 0.3 & 0.4 & 0.1 \\ 0.05 & 0.05 & 0.2 & 0.7 \end{bmatrix}$$

so that someone unemployed today has a probability $p_{32} = 0.3$ of having part-time work next year.


Remark 2: Problems can arise if some of the elements of $P$ are 0. For example if:

$$P = \begin{bmatrix} 0.6 & 0.4 \\ 0 & 1 \end{bmatrix}$$

then, if state 2 is unemployment, an unemployed worker never finds a job. In this case unemployment is an absorbing state and employment is a transitory state; all workers eventually become permanently unemployed.
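The absorbing-state behaviour in Remark 2 is easy to see numerically. The following sketch is an illustration only (the helper name `step` and the choice of starting everyone in employment are assumptions made here): it follows the distribution of workers across the two states year by year.

```python
def step(dist, P):
    """One period of the chain: row vector dist times transition matrix P."""
    return [sum(dist[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]

P = [[0.6, 0.4],
     [0.0, 1.0]]

dist = [1.0, 0.0]          # assumption: everyone starts out employed (state 1)
history = [dist]
for _ in range(20):
    dist = step(dist, P)
    history.append(dist)

for n in (1, 5, 20):
    print(n, history[n])
```

The share still employed after $n$ years is $0.6^n$, which shrinks toward zero, so the long-run distribution puts all of its weight on the absorbing state.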

3.12.3 General Equilibrium and Matrix Algebra

One of the first things you learn in economics is the supply and demand model. This is known as partial equilibrium analysis since it abstracts from the way different markets interact with each other. In general equilibrium analysis, on the other hand, we explicitly treat the way different markets interact. General equilibrium analysis generally requires quite advanced mathematical techniques. For example it was only with the development in the last 60 years in mathematics of what are called fixed-point theorems that economists have been able to prove that a set of prices exists which will equate demand and supply in all markets in the economy.

Here we will give you a taste of general equilibrium analysis for an economy with a Leontief technology and where technology determines prices independent of tastes. Thus consider an economy where there are $n$ goods, $i = 1, 2, \ldots, n$, that are produced. Let $a_{ij}$ be the amount of good $j$ needed to produce 1 unit of good $i$. We can put the $a_{ij}$'s into an $n \times n$ matrix $A$ as:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}.$$

The matrix $A$, referred to as an input-output matrix, captures the Leontief technology of this economy.

Let $p_j$ be the price of good $j$. The cost of producing one unit of good $i$ is given by:

$$c_i = a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n$$

or, if we define the $n \times 1$ vector of costs as $c = [c_i]$ and the $n \times 1$ vector of prices as $p = [p_j]$, then in matrix form:

$$c = Ap.$$

The revenue from producing 1 unit of good $i$ is just the price $p_i$, so that profits are given by:

$$p_i - (a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n)$$


and the rate of profit in industry $i$ is given by profits divided by costs, so that:

$$\pi_i = \frac{p_i - (a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n)}{a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n}.$$

Now in equilibrium the rate of profit must be the same in each industry, otherwise no production would take place in those industries with a lower rate of profit. Thus we require that $\pi_i = \pi$ for all $i$, so that:

$$\pi = \frac{p_i - (a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n)}{a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n}$$

or:

$$p_i = (a_{i1} p_1 + a_{i2} p_2 + \cdots + a_{in} p_n)(1 + \pi)$$

or in matrix notation:

$$p = Ap\,(1 + \pi)$$

or, written slightly differently:

$$Ap = \frac{1}{1 + \pi}\, p.$$

Note this takes the same form as $Ax = \lambda x$ where $x$ is an eigenvector and $\lambda$ is an eigenvalue. From this it follows that $x = p$ is an eigenvector of the matrix $A$ and that $\lambda = \frac{1}{1+\pi}$ is an eigenvalue of $A$.

In general there will be $n$ eigenvalues and eigenvectors, so the question is which one is the appropriate one. Since $p$ is a vector of prices and prices are all positive, we cannot accept eigenvectors with negative elements.

From the Perron-Frobenius theorem we know that there is only one eigenvector $p = x$ with all positive elements (up to a scalar multiple), and this corresponds to the eigenvalue $\hat{\lambda} > 0$ which determines the rate of profit of the economy. We therefore have:

Theorem 227 There exists a unique (up to a scalar multiple) positive price vector which is the positive right-hand eigenvector of $A$ associated with the eigenvalue $\hat{\lambda}$.

Theorem 228 The equilibrium rate of profit is given by:

$$\pi = \frac{1}{\hat{\lambda}} - 1.$$

Remark: If $p = x$ is an equilibrium vector of prices then so too is $\alpha p$ where $\alpha$ is any positive scalar. This non-uniqueness is a general feature of general equilibrium models and corresponds to the fact that agents only care about relative prices. Thus if $\alpha = 2$ and we double all prices in the economy, this will have no effect on rational economic decision making or equilibrium in the economy. Thus while $p$ and $2p$ correspond to different nominal price vectors, real or relative prices are the same.

Note that $p = x$ is a right-hand eigenvector. It turns out that the left-hand eigenvector $y$ also has an interesting economic interpretation. By the Perron-Frobenius theorem we know that corresponding to $\hat{\lambda}$ there exists a unique (up to a scalar multiple) positive eigenvector $y$ which satisfies $y^T A = \hat{\lambda} y^T$. It turns out that:

Theorem 229 $y$ determines the balanced growth path for the economy and $\hat{\lambda}$ determines the growth rate of the economy.

Proof. Let $y_i$ be the amount of good $i$ produced. Then the input requirement of good $j$ will be:

$$r_j = y_1 a_{1j} + y_2 a_{2j} + \cdots + y_n a_{nj}$$

or, if we define the $1 \times n$ vector of production levels as $y = [y_i]$ and the $1 \times n$ vector of input requirements as $r = [r_j]$, then in matrix notation:

$$r = yA.$$

If there is balanced growth, so that there is no unemployment or shortages in the economy, then:

$$y = (1 + \rho)\, r$$

where $\rho$ is the growth rate of the economy. In matrix notation we then have:

$$yA = \frac{1}{1 + \rho}\, y.$$

Since we require that all elements of $y$ be positive, it follows that $y$ is the positive left-hand eigenvector of $A$ and $\hat{\lambda} = \frac{1}{1+\rho}$.

Thus $\hat{\lambda}$, in addition to determining the profit rate, also determines the growth rate of the economy along the balanced growth path. Therefore:

Theorem 230 With balanced growth the rate of growth of the economy and the rate of profit are identical and are given by:

$$\pi = \rho = \frac{1}{\hat{\lambda}} - 1.$$

Example: Suppose the economy has $n = 2$ sectors and

$$A = \begin{bmatrix} 0.3 & 0.65 \\ 0.2 & 0.72 \end{bmatrix}.$$

The two eigenvalues of $A$ are then $\lambda_1 = \hat{\lambda} = 0.92725$ and $\lambda_2 = 0.0927$. It follows that the rate of profit $\pi$ and the growth rate of the economy $\rho$ are identical:

$$\hat{\lambda} = 0.92725 = \frac{1}{1 + \pi} = \frac{1}{1 + \rho}$$

so that:

$$\pi = \rho = \frac{1}{0.92725} - 1 = 0.0785$$

and the profit and growth rates are both 7.85%.

The positive right-hand eigenvector associated with $\hat{\lambda}$ determines prices and is:

$$p = x = \begin{bmatrix} 0.81754 \\ 0.78893 \end{bmatrix}.$$

Thus the relative price of goods 1 and 2 will be

$$\frac{p_1}{p_2} = \frac{0.81754}{0.78893} = 1.036$$

so if $p_2 = 1$ (the second good is the numeraire) then the real price of good 1 is 1.036 units of good 2.

The positive left-hand eigenvector of $A$ determines the balanced growth path and is:

$$y = \begin{bmatrix} 0.34513 & 1.0824 \end{bmatrix}$$

and so:

$$\frac{y_1}{y_2} = \frac{0.34513}{1.0824} = 0.319$$

so that if no resources are to be unemployed and there are to be no shortages, then along the balanced growth path 0.319 units of good 1 will be produced for every unit of good 2 produced.
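The whole example can be reproduced in a few lines. The sketch below is an illustration under stated assumptions rather than a general-purpose routine: it relies on the matrix being $2 \times 2$ so that the largest eigenvalue comes from the quadratic formula, and the variable names (`lam_hat`, `profit_rate`, `relative_price`, `output_ratio`) are introduced here for convenience.

```python
import math

# Input-output matrix from the example: A[i][j] is the amount of good j
# needed to produce one unit of good i.
a11, a12 = 0.3, 0.65
a21, a22 = 0.2, 0.72

# Largest root of lam^2 - trace*lam + det = 0 (the Perron-Frobenius eigenvalue).
trace = a11 + a22
det = a11 * a22 - a12 * a21
lam_hat = (trace + math.sqrt(trace ** 2 - 4.0 * det)) / 2.0

profit_rate = 1.0 / lam_hat - 1.0          # pi = rho = 1/lam_hat - 1

# Right eigenvector (prices): (a11 - lam) p1 + a12 p2 = 0  =>  p1/p2 = a12/(lam - a11)
relative_price = a12 / (lam_hat - a11)

# Left eigenvector (output levels): y1 a11 + y2 a21 = lam y1  =>  y1/y2 = a21/(lam - a11)
output_ratio = a21 / (lam_hat - a11)

print(round(lam_hat, 5), round(profit_rate, 4))
print(round(relative_price, 3), round(output_ratio, 3))
```

Only the ratios $p_1/p_2$ and $y_1/y_2$ are computed, since the eigenvectors themselves are determined only up to a positive scalar multiple.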


Chapter 4

Multivariate Calculus

4.1 Functions of Many Variables

To treat variables as constants is the characteristic vice of the unmathematical economist. -Francis Edgeworth

Functions of only one variable, $y = f(x)$, can only take us so far. Usually when we write such functions we have in mind that there are other variables in the background that are kept constant. For example although we might write a demand function as:

$$Q = Q(P)$$

we know this is wrong, that the quantity demanded depends not only on the own price $P$, but also on the prices of other goods $P_1, P_2, \ldots, P_n$ (substitutes and complements) and on income $Y$. We should instead write a demand function as:

$$Q = Q(P, P_1, P_2, \ldots, P_n, Y).$$

The same argument would apply equally well to almost anything else we consider in economics for the simple reason that economic variables are generally influenced by many other variables and not just one.

Thus we now change our focus to functions of the form:

$$y = f(x_1, x_2, \ldots, x_n)$$

and the calculus tools we need to work with these functions.

Multivariate functions are generally hard to visualize since in order to graph them we need $n + 1$ dimensions: $n$ dimensions for the $x_i$'s and one dimension for the $y$. A function with $n = 2$ independent variables, $y = f(x_1, x_2)$, requires a three dimensional graph, something which can be represented (with difficulty)


on a two-dimensional page. For functions with $n \geq 3$ however we really cannot directly visualize a function. Following what we have learned in linear algebra, we can nevertheless know a lot about these functions analytically. For example we will be able to tell which functions are mountains, which are valleys and where the tops of these mountains and the bottoms of these valleys are.

It is often tedious to explicitly write out all $n$ of the $x_i$'s. Instead we can put all of them in an $n \times 1$ column vector $x$ and write $y = f(x_1, x_2, \ldots, x_n)$ more compactly as $y = f(x)$. Note that this looks exactly like a function in univariate calculus but where $x$ is now interpreted as an $n \times 1$ vector.

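Example: Consider a multivariate function with $n = 2$:

$$y = f(x) = f(x_1, x_2) = e^{-\frac{1}{2}\left(x_1^2 + x_2^2 - x_1 x_2\right)}$$

where $x = [x_1, x_2]^T$ is a $2 \times 1$ vector. This is a three-dimensional mountain as depicted below:

[Figure: surface plot of $f(x_1, x_2) = e^{-\frac{1}{2}(x_1^2 + x_2^2 - x_1 x_2)}$ with axes $x_1$, $x_2$ and $y$.]

where the vertical axis is $y$ and the two dimensional plane has $x_1$ on one axis and $x_2$ on the other.

If $x_1 = 2$ and $x_2 = 1$ we have:

$$y = f(2, 1) = e^{-\frac{1}{2}\left(2^2 + 1^2 - 2 \times 1\right)} = 0.22313$$

while if $x_1 = -\frac{1}{2}$ and $x_2 = \frac{2}{3}$ then:

$$y = f\left(-\tfrac{1}{2}, \tfrac{2}{3}\right) = e^{-\frac{1}{2}\left(\left(-\frac{1}{2}\right)^2 + \left(\frac{2}{3}\right)^2 - \left(-\frac{1}{2}\right) \times \left(\frac{2}{3}\right)\right)} = 0.598.$$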
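Evaluating a multivariate function at a point is just a matter of plugging in each argument. As a small illustration (the Python function name `f` simply mirrors the notation of the text), the two values above can be reproduced as follows:

```python
import math

def f(x1, x2):
    """The 'mountain' f(x1, x2) = exp(-(x1^2 + x2^2 - x1*x2) / 2)."""
    return math.exp(-0.5 * (x1 ** 2 + x2 ** 2 - x1 * x2))

print(f(2, 1))
print(f(-1 / 2, 2 / 3))
print(f(0, 0))   # the exponent is zero at the origin, so f = 1 there
```

Since $x_1^2 + x_2^2 - x_1 x_2$ is never negative, the exponent is never positive, so the value at the origin is the top of the mountain.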

4.2 Partial Derivatives

Mathematics is a language. -Josiah Willard Gibbs


The cornerstone of multivariate calculus is the partial derivative. Given a function of $n$ variables, $y = f(x_1, x_2, \ldots, x_n)$, there will be $n$ partial derivatives, one for each of the $x_i$'s. Calculating partial derivatives is really no more difficult than calculating an ordinary derivative in univariate calculus. We have:

Definition 231 Partial Derivative: Given the function $y = f(x_1, x_2, \ldots, x_n)$ the partial derivative with respect to $x_i$, denoted by:

$$\frac{\partial y}{\partial x_i} \equiv \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}$$

is the ordinary derivative of $f(x_1, x_2, \ldots, x_n)$ with respect to $x_i$ obtained by treating all other $x_j$'s (for $j \neq i$) as constants.

Remark 1: For ordinary derivatives we use $d$ as in $\frac{dy}{dx}$, while for partial derivatives we use the curly d, '$\partial$', as in $\frac{\partial y}{\partial x_i}$.

Remark 2: Another useful notation for a partial derivative is to write either $x_i$ or $i$ as a subscript:

$$\frac{\partial y}{\partial x_i} \equiv f_i(x_1, x_2, \ldots, x_n) \equiv f_{x_i}(x_1, x_2, \ldots, x_n).$$

Unlike $\frac{\partial y}{\partial x_i}$, this notation emphasizes the fact that, like $f(x_1, x_2, \ldots, x_n)$, the partial derivative is also a multivariate function of $x_1, x_2, \ldots, x_n$.

Remark 3: A very bad notation often used by students is to write a partial derivative as:

$$f'(x_1, x_2, \ldots, x_n).$$

The problem here is that a partial derivative is always with respect to a particular $x_i$ but this notation does not tell you which $x_i$ you are differentiating with respect to. Thus if you write $f'(x_1, x_2)$ there is no way of knowing if you mean $\frac{\partial f(x_1, x_2)}{\partial x_1}$ or $\frac{\partial f(x_1, x_2)}{\partial x_2}$. Therefore: Do not use the notation $f'(\,)$ for partial derivatives.

Example 1: To calculate the partial derivative of

$$y = f(x_1, x_2) = x_1^5 x_2^7$$

with respect to $x_1$ we treat $x_2$ as a constant and differentiate with respect to $x_1$ to obtain:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} \equiv \frac{\partial y}{\partial x_1} \equiv \frac{\partial}{\partial x_1}\left(x_1^5 x_2^7\right) = 5 x_1^4 x_2^7.$$

Note that $5 x_1^4 x_2^7$ is a function of both $x_1$ and $x_2$; that is, although in the calculation we treated $x_2$ as a constant, after we have calculated $\frac{\partial f(x_1, x_2)}{\partial x_1}$, $x_2$ reverts to its former status as a variable, just like $x_1$. That is why we write $\frac{\partial f(x_1, x_2)}{\partial x_1}$ and not $\frac{\partial f(x_1)}{\partial x_1}$.

To calculate the derivative of $f(x_1, x_2)$ with respect to $x_2$, treat $x_1$ as a constant and differentiate with respect to $x_2$ to obtain:

$$\frac{\partial f(x_1, x_2)}{\partial x_2} \equiv \frac{\partial y}{\partial x_2} \equiv \frac{\partial}{\partial x_2}\left(x_1^5 x_2^7\right) = 7 x_1^5 x_2^6.$$

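Example 2: Given:

$$y = f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2$$

there will be two partial derivatives: $\frac{\partial f(x_1, x_2)}{\partial x_1}$ and $\frac{\partial f(x_1, x_2)}{\partial x_2}$. To calculate $\frac{\partial f(x_1, x_2)}{\partial x_1}$ we treat $x_2$ as a constant and differentiate with respect to $x_1$ as:

$$\begin{aligned}
\frac{\partial f(x_1, x_2)}{\partial x_1} &= \frac{\partial}{\partial x_1}\left(\frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2\right) \\
&= \frac{1}{3}\underbrace{\frac{\partial}{\partial x_1}\ln(x_1)}_{=\frac{1}{x_1}} + \frac{2}{3}\underbrace{\frac{\partial}{\partial x_1}\ln(x_2)}_{=0 \text{ since } x_2 \text{ is a constant}} - x_2 \underbrace{\frac{\partial}{\partial x_1} x_1}_{=1} \\
&= \frac{1}{3}\frac{1}{x_1} - x_2.
\end{aligned}$$

To calculate $\frac{\partial f(x_1, x_2)}{\partial x_2}$ we treat $x_1$ as a constant and differentiate with respect to $x_2$ as:

$$\frac{\partial f(x_1, x_2)}{\partial x_2} = \frac{\partial}{\partial x_2}\left(\frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2\right) = \frac{2}{3}\frac{1}{x_2} - x_1.$$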

Example 3: Given:

$$f(x_1, x_2, x_3) = x_3^2 x_1^3 + 2\ln(x_2)\, x_1^2$$

we have:

$$\begin{aligned}
\frac{\partial f(x_1, x_2, x_3)}{\partial x_1} &= \frac{\partial}{\partial x_1}\left(x_3^2 x_1^3 + 2\ln(x_2)\, x_1^2\right) = 3 x_3^2 x_1^2 + 4\ln(x_2)\, x_1, \\
\frac{\partial f(x_1, x_2, x_3)}{\partial x_2} &= \frac{\partial}{\partial x_2}\left(x_3^2 x_1^3 + 2\ln(x_2)\, x_1^2\right) = \frac{2 x_1^2}{x_2}, \\
\frac{\partial f(x_1, x_2, x_3)}{\partial x_3} &= \frac{\partial}{\partial x_3}\left(x_3^2 x_1^3 + 2\ln(x_2)\, x_1^2\right) = 2 x_1^3 x_3.
\end{aligned}$$
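A useful habit when calculating partial derivatives by hand is to check them numerically. The sketch below is one way to do this and is not taken from the text: the helper `partial` is a hypothetical name, the test point is arbitrary (it only needs $x_2 > 0$ so that the logarithm is defined), and the derivative is estimated by a central difference in which only one argument is moved while the others are held constant, exactly as in the definition.

```python
import math

def f(x1, x2, x3):
    return x3 ** 2 * x1 ** 3 + 2.0 * math.log(x2) * x1 ** 2

def partial(func, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative with respect to
    argument i: only coordinate i is moved, all others are held constant."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

point = (1.5, 2.0, 0.5)   # arbitrary test point with x2 > 0
x1, x2, x3 = point
analytic = (
    3 * x3 ** 2 * x1 ** 2 + 4 * math.log(x2) * x1,   # df/dx1
    2 * x1 ** 2 / x2,                                 # df/dx2
    2 * x1 ** 3 * x3,                                 # df/dx3
)
numeric = tuple(partial(f, point, i) for i in range(3))
print(analytic)
print(numeric)
```

If the two printed tuples agree to several decimal places, the hand calculation is very likely correct; a large discrepancy usually points to a slip in the algebra.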

4.2.1 The Gradient

It is often tedious to write down all of the $n$ partial derivatives of $f(x_1, x_2, \ldots, x_n)$. Just as we can write $f(x_1, x_2, \ldots, x_n)$ as $f(x)$ by letting $x$ be an $n \times 1$ vector, we can use matrix algebra to obtain a more compact notation by putting each of the $n$ partial derivatives into an $n \times 1$ vector, called the gradient. We have:

Definition 232 Gradient: Given the function $y = f(x)$ where $x$ is an $n \times 1$ vector, the gradient is an $n \times 1$ vector of partial derivatives denoted by $\nabla f(x)$ or $\frac{\partial f(x)}{\partial x}$:

$$\frac{\partial f(x)}{\partial x} \equiv \nabla f(x) \equiv \begin{bmatrix} \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_1} \\ \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_2} \\ \vdots \\ \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_n} \end{bmatrix}.$$

Example 1: Given:

$$f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2$$

the gradient is a $2 \times 1$ vector given by:

$$\nabla f(x_1, x_2) = \begin{bmatrix} \frac{\partial f(x_1, x_2)}{\partial x_1} \\ \frac{\partial f(x_1, x_2)}{\partial x_2} \end{bmatrix} = \begin{bmatrix} \frac{1}{3 x_1} - x_2 \\ \frac{2}{3 x_2} - x_1 \end{bmatrix}.$$

Example 2: Given:

$$f(x_1, x_2, x_3) = x_3^2 x_1^3 + 2\ln(x_2)\, x_1^2$$

the gradient is a $3 \times 1$ vector given by:

$$\nabla f(x_1, x_2, x_3) = \begin{bmatrix} 3 x_3^2 x_1^2 + 4\ln(x_2)\, x_1 \\ \frac{2 x_1^2}{x_2} \\ 2 x_3 x_1^3 \end{bmatrix}.$$

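Imagine you are standing on a three-dimensional mountain $y = f(x_1, x_2)$. Looking at the slope, you turn around until you are looking in the direction where the mountain is steepest, or in the direction where the climbing would be the hardest. You are then looking in the same direction as the gradient $\nabla f(x_1, x_2)$. In general we have:

Theorem 233 The gradient $\nabla f(x)$ points in the direction where the function $f(x)$ is steepest, that is, the direction in which $f(x)$ increases most rapidly.

Example: In Example 1 above we calculated the gradient for the function $\frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2$. For $x_1 = \frac{1}{3}$, $x_2 = \frac{2}{3}$ the gradient is:

$$\nabla f\left(\tfrac{1}{3}, \tfrac{2}{3}\right) = \left.\begin{bmatrix} \frac{1}{3 x_1} - x_2 \\ \frac{2}{3 x_2} - x_1 \end{bmatrix}\right|_{x_1 = \frac{1}{3},\, x_2 = \frac{2}{3}} = \begin{bmatrix} \frac{1}{3} \\ \frac{2}{3} \end{bmatrix}$$

and so the function is steepest in the direction that the vector depicted below points:

[Figure: the gradient vector plotted in the $(x_1, x_2)$ plane, with $x_1$ on the horizontal axis and $x_2$ on the vertical axis.]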
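The claim that the gradient points in the steepest uphill direction can be checked by brute force. The sketch below is an illustration only, with assumptions chosen here for convenience: it takes a small step of fixed length (`step`) in each of 3600 equally spaced directions around the point $\left(\frac{1}{3}, \frac{2}{3}\right)$, finds the direction in which $f$ ends up highest, and compares its angle with the angle of the gradient vector.

```python
import math

def f(x1, x2):
    return math.log(x1) / 3 + 2 * math.log(x2) / 3 - x1 * x2

def gradient(x1, x2):
    """Analytic gradient from Example 1."""
    return (1 / (3 * x1) - x2, 2 / (3 * x2) - x1)

x1, x2 = 1 / 3, 2 / 3
g = gradient(x1, x2)
print(g)

# Scan unit directions and record where a small step raises f the most.
step = 1e-4
best_angle = max(
    (2 * math.pi * k / 3600 for k in range(3600)),
    key=lambda t: f(x1 + step * math.cos(t), x2 + step * math.sin(t)),
)
print(best_angle, math.atan2(g[1], g[0]))
```

The two printed angles (in radians) should agree up to the resolution of the scan, which is a tenth of a degree here.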


4.2.2 Interpreting Partial Derivatives

A partial derivative is much like the result of a controlled experiment. Suppose for example you want to know how vitamin C affects the life expectancy of rats. In a proper experiment you would try to hold all variables constant except vitamin C, vary the consumption of vitamin C and observe what happens to the rats' life expectancy. If you see the rats with more vitamin C live longer (shorter), you then can conclude that there exists a positive (negative) relationship between vitamin C and the life expectancy of rats.

Now instead of real rats suppose that we have a multivariate function $y = f(x_1, x_2, \ldots, x_n)$ where $y$ is life expectancy and $x_i$ is vitamin C consumption. Just as with real rats we want to know how $x_i$ affects $y$. Instead of an experiment we calculate the partial derivative $\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}$. Just as with the experiment we hold all other variables constant except vitamin C when calculating a partial derivative. The sign of this partial derivative then tells us the nature of the relationship between $x_i$ and $y$. In particular:

Theorem 234 Given $y = f(x_1, x_2, \ldots, x_n)$ if:

$$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i} > 0$$

then $y$ is an increasing function of $x_i$; that is, increasing (decreasing) $x_i$ holding all other $x_j$'s fixed will increase (decrease) $y$.

Theorem 235 Given $y = f(x_1, x_2, \ldots, x_n)$ if:

$$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i} < 0$$

then $y$ is a decreasing function of $x_i$; that is, increasing (decreasing) $x_i$ holding all other $x_j$'s fixed will decrease (increase) $y$.

Remark: As with univariate calculus, these properties can hold either locally or globally. If $\frac{\partial f(x)}{\partial x_i} > 0$ for all $x$ in the domain we would say that $y$ is a globally increasing function of $x_i$. If $\frac{\partial f(x)}{\partial x_i} > 0$ only at a point then $y$ would be a locally increasing function of $x_i$.

The partial derivative also gives us quantitative information about the relationship between $x_i$ and $y$; in particular it gives us the $x_i$ multiplier.

Theorem 236 Given $y = f(x_1, x_2, \ldots, x_n)$ if $x_i$ is changed by $\Delta x_i$ with all other $x_j$'s kept constant then the change in $y$ is approximately given by:

$$\Delta y \approx \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}\, \Delta x_i$$

where the approximation gets better the smaller is $\Delta x_i$.


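Example 1: Consider:

$$y = f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2$$

where:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = \frac{1}{3 x_1} - x_2.$$

At $x_1 = \frac{1}{12}$ and $x_2 = \frac{1}{2}$ we have

$$\frac{\partial f\left(\frac{1}{12}, \frac{1}{2}\right)}{\partial x_1} = \frac{1}{3\left(\frac{1}{12}\right)} - \frac{1}{2} = \frac{7}{2} > 0$$

so that a positive relationship exists between $x_1$ and $y$ at $\left(\frac{1}{12}, \frac{1}{2}\right)$, or $y$ is a locally increasing function of $x_1$.

For example if we increase $x_1$ a small amount from $x_1 = \frac{1}{12}$ to say $x_1 = \frac{1}{10}$ so that

$$\Delta x_1 = \frac{1}{10} - \frac{1}{12} = 0.017,$$

and if we keep $x_2$ constant at $\frac{1}{2}$, then $y$ will increase from $y = f\left(\frac{1}{12}, \frac{1}{2}\right) = -1.3321$ to $y = f\left(\frac{1}{10}, \frac{1}{2}\right) = -1.2796$, or by:

$$\begin{aligned}
\Delta y &= -1.2796 - (-1.3321) = 0.0525 \\
&\approx \frac{\partial f\left(\frac{1}{12}, \frac{1}{2}\right)}{\partial x_1}\, \Delta x_1 = \frac{7}{2} \times 0.017 = 0.0595.
\end{aligned}$$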
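Theorem 236 says the approximation improves as $\Delta x_1$ shrinks, and this is easy to see numerically. The following sketch is an illustration only: it uses the exact step $\Delta x_1 = \frac{1}{60}$ rather than the rounded 0.017 used in the text (so its first approximation is slightly smaller than 0.0595), and the two smaller steps are arbitrary choices made here.

```python
import math

def f(x1, x2):
    return math.log(x1) / 3 + 2 * math.log(x2) / 3 - x1 * x2

def df_dx1(x1, x2):
    return 1 / (3 * x1) - x2

x1, x2 = 1 / 12, 1 / 2
rel_errors = []
for dx in (1 / 60, 1 / 600, 1 / 6000):
    actual = f(x1 + dx, x2) - f(x1, x2)      # exact change in y
    approx = df_dx1(x1, x2) * dx             # multiplier times the change in x1
    rel_errors.append(abs(approx - actual) / actual)
    print(dx, actual, approx, rel_errors[-1])
```

Each time the step is made ten times smaller, the relative error of the approximation falls by roughly the same factor.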

Now focusing on the relationship between $x_2$ and $y$ we have:

$$\frac{\partial f(x_1, x_2)}{\partial x_2} = \frac{2}{3 x_2} - x_1$$

so that:

$$\frac{\partial f\left(\frac{1}{12}, \frac{1}{2}\right)}{\partial x_2} = \frac{2}{3}\frac{1}{\left(\frac{1}{2}\right)} - \frac{1}{12} = \frac{5}{4} > 0$$

so that $y$ is a locally increasing function of $x_2$. Thus if we increase $x_2$ a small amount from $x_2 = \frac{1}{2}$, keeping $x_1$ constant at $\frac{1}{12}$, then $y$ will increase.

On the other hand at $x_1 = 2$ when $x_2 = \frac{1}{2}$ we have:

$$\frac{\partial f\left(2, \frac{1}{2}\right)}{\partial x_1} = \frac{1}{3(2)} - \frac{1}{2} = -\frac{1}{3} < 0$$

$$\frac{\partial f\left(2, \frac{1}{2}\right)}{\partial x_2} = \frac{2}{3\left(\frac{1}{2}\right)} - 2 = -\frac{2}{3} < 0$$

and so it follows that $f(x_1, x_2)$ is a locally decreasing function of both $x_1$ and $x_2$ at $\left(2, \frac{1}{2}\right)$.

Example 2: Consider the function:

$$y = f(x_1, x_2) = e^{2 x_1 - 3 x_2}.$$

We have:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = 2 e^{2 x_1 - 3 x_2} > 0$$

for all $x_1, x_2$ and so $y$ is a globally increasing function of $x_1$. Similarly:

$$\frac{\partial f(x_1, x_2)}{\partial x_2} = -3 e^{2 x_1 - 3 x_2} < 0$$

for all $x_1, x_2$ and so $y$ is a globally decreasing function of $x_2$.

4.2.3 The Economic Language of Partial Derivatives

Consider a demand curve:

$$Q = Q(P, P_1, P_2, \ldots, P_n, Y)$$

where $P$ is the own price, the prices of other related goods are $P_1, P_2, \ldots, P_n$, and income is $Y$. Now suppose we want to say that the demand curve is downward sloping. What does that mean? In introductory economics we would say that a demand curve is downward sloping if increasing $P$ while holding $P_1, P_2, \ldots, P_n$ and $Y$ constant results in $Q$ going down.

Notice how long it takes to say this! We can be much more concise if we use mathematics and say simply, a demand curve slopes downward if:

$$\frac{\partial Q}{\partial P} < 0.$$

One of the reasons this is more concise is that instead of saying "all other variables are held constant" we merely write $\partial$ instead of $d$ to express this idea.

Similarly if we want to say that the good is normal, we simply write $\frac{\partial Q}{\partial Y} > 0$. This then replaces the introductory definition which states that a good is normal if increasing $Y$ holding $P, P_1, P_2, \ldots, P_n$ constant causes $Q$ to increase. If we want to say that a good is inferior we write $\frac{\partial Q}{\partial Y} < 0$; if we want to say that good $i$ is a substitute we write $\frac{\partial Q}{\partial P_i} > 0$; if we want to say that good $j$ is a complement we write $\frac{\partial Q}{\partial P_j} < 0$.

These ideas apply equally well to supply curves, cost functions or just about anything else one considers in economics. Thus much of the informal language one uses in introductory economics is reformulated in terms of partial derivatives in more advanced economics, and this allows us to state ideas much more concisely.


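Example: Consider a demand curve for coffee:

$$Q = P^{-2} P_1^{3} P_2^{-3} Y^{2}$$

where $P$ is the price of coffee, $P_1$ is the price of tea, $P_2$ is the price of sugar and $Y$ is income. Then:

$$\begin{aligned}
\frac{\partial Q}{\partial P} &= -2 P^{-3} P_1^{3} P_2^{-3} Y^{2} < 0 &&\Longrightarrow \text{the coffee demand curve slopes downward} \\
\frac{\partial Q}{\partial P_1} &= 3 P^{-2} P_1^{2} P_2^{-3} Y^{2} > 0 &&\Longrightarrow \text{coffee and tea are substitutes} \\
\frac{\partial Q}{\partial P_2} &= -3 P^{-2} P_1^{3} P_2^{-4} Y^{2} < 0 &&\Longrightarrow \text{coffee and sugar are complements} \\
\frac{\partial Q}{\partial Y} &= 2 P^{-2} P_1^{3} P_2^{-3} Y > 0 &&\Longrightarrow \text{coffee is a normal good.}
\end{aligned}$$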

4.2.4 The Use of the Word Marginal

In introductory economics the marginal product of labour is defined as the contribution of the last worker hired to output. In univariate calculus we defined the marginal product of labour as the derivative of the short-run production function $Q = f(L)$ with respect to $L$. The short-run production function is obtained from the production function $Q = F(L, K)$ by holding $K$ constant. Since $K$ was held fixed this ordinary derivative was actually a partial derivative. Thus the precise definition of the marginal product of labour is as the partial derivative:

$$MP_L(L, K) \equiv \frac{\partial F(L, K)}{\partial L}.$$

In general then when economists use the word 'marginal' as in marginal utility, marginal product of labour or the marginal product of capital they are referring to a partial derivative where all other variables are held constant. Thus:

Definition 237 When in economics we refer to a 'marginal' concept we mean a partial derivative.

Example 1: Consider a production function:

$$Q = F(L, K)$$

where $Q$ is output, $L$ is labour and $K$ is capital. The marginal product of labour is the partial derivative:

$$MP_L(L, K) \equiv \frac{\partial F(L, K)}{\partial L}$$

while the marginal product of capital is:

$$MP_K(L, K) \equiv \frac{\partial F(L, K)}{\partial K}.$$

Thus for the Cobb-Douglas production function:

$$Q = L^{\frac{1}{2}} K^{\frac{1}{4}}$$

the marginal products of labour and capital are given by:

$$MP_L(L, K) = \frac{1}{2} L^{-\frac{1}{2}} K^{\frac{1}{4}} > 0, \qquad MP_K(L, K) = \frac{1}{4} L^{\frac{1}{2}} K^{-\frac{3}{4}} > 0.$$

The fact that the marginal products are positive means that $Q$ is a globally increasing function of $L$ and a globally increasing function of $K$; that is, labour and capital are productive.

Example 2: Consider a household which gets utility from two goods $Q_1$ and $Q_2$ as:

$$U = U(Q_1, Q_2).$$

The marginal utility of good 1 is:

$$MU_1(Q_1, Q_2) \equiv \frac{\partial U(Q_1, Q_2)}{\partial Q_1}$$

while the marginal utility of good 2 is:

$$MU_2(Q_1, Q_2) \equiv \frac{\partial U(Q_1, Q_2)}{\partial Q_2}.$$

For the Cobb-Douglas utility function:

$$U(Q_1, Q_2) = Q_1^{\frac{1}{3}} Q_2^{\frac{2}{3}}$$

the marginal utilities of $Q_1$ and $Q_2$ are given by:

$$MU_1(Q_1, Q_2) = \frac{1}{3} Q_1^{-\frac{2}{3}} Q_2^{\frac{2}{3}} > 0, \qquad MU_2(Q_1, Q_2) = \frac{2}{3} Q_1^{\frac{1}{3}} Q_2^{-\frac{1}{3}} > 0.$$

The fact that the marginal utilities are positive means that utility is a globally increasing function of both $Q_1$ and $Q_2$; in other words both $Q_1$ and $Q_2$ are 'goods' and not 'bads'.


4.2.5 Elasticities

Instead of partial derivatives we often prefer to talk about elasticities since these are free of units of measurement. Again elasticities are defined under the assumption that all other variables but one are held fixed. Thus for multivariate functions we define an elasticity as:

Definition 238 Elasticity: Given $y = f(x_1, x_2, \ldots, x_n)$ the elasticity with respect to $x_i$ is:

$$\eta_i(x_1, x_2, \ldots, x_n) \equiv \frac{\partial y}{\partial x_i}\frac{x_i}{y} \equiv \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}\frac{x_i}{f(x_1, x_2, \ldots, x_n)}.$$

In general elasticities change as $x_1, x_2, \ldots, x_n$ change. The functional form which has the property that the elasticities do not depend on $x_1, x_2, \ldots, x_n$ is the multivariate generalization of $y = A x^b$ given below:

Theorem 239 If:

$$y = f(x_1, x_2, \ldots, x_n) = A x_1^{b_1} x_2^{b_2} \times \cdots \times x_n^{b_n}$$

then all elasticities are independent of $x_1, x_2, \ldots, x_n$ and:

$$\eta_i = b_i.$$

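Example: Consider again the demand curve for coffee:

$$Q^d = P^{-2} P_1^{3} P_2^{-3} Y^{2}.$$

Note that the demand function has the functional form $A x_1^{b_1} x_2^{b_2} \times \cdots$ and so the elasticities are simply the exponents on each variable. We therefore have:

$$\begin{aligned}
\eta_P &= \frac{\partial Q^d}{\partial P}\frac{P}{Q^d} = \frac{-2 P^{-3} P_1^{3} P_2^{-3} Y^{2} \times P}{P^{-2} P_1^{3} P_2^{-3} Y^{2}} = -2 \\
\eta_{P_1} &= \frac{\partial Q^d}{\partial P_1}\frac{P_1}{Q^d} = \frac{3 P^{-2} P_1^{2} P_2^{-3} Y^{2} \times P_1}{P^{-2} P_1^{3} P_2^{-3} Y^{2}} = 3 \\
\eta_{P_2} &= \frac{\partial Q^d}{\partial P_2}\frac{P_2}{Q^d} = \frac{-3 P^{-2} P_1^{3} P_2^{-4} Y^{2} \times P_2}{P^{-2} P_1^{3} P_2^{-3} Y^{2}} = -3 \\
\eta_Y &= \frac{\partial Q^d}{\partial Y}\frac{Y}{Q^d} = \frac{2 P^{-2} P_1^{3} P_2^{-3} Y \times Y}{P^{-2} P_1^{3} P_2^{-3} Y^{2}} = 2.
\end{aligned}$$

Thus a 1% increase in $P$ leads to a 2% fall in $Q$ (demand is elastic), a 1% increase in $P_1$ leads to a 3% increase in $Q$ (coffee and tea are substitutes), a 1% increase in $P_2$ leads to a 3% decrease in $Q$ (coffee and sugar are complements), and a 1% increase in $Y$ leads to a 2% increase in $Q$ (coffee is a normal good).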
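Theorem 239 says these elasticities are the same no matter where on the demand curve we evaluate them. The sketch below checks this numerically and is only an illustration: the helper `elasticity` is a hypothetical name, the derivative is estimated by a central difference, and the two sets of prices and incomes are made-up values chosen to be far apart.

```python
def demand(P, P1, P2, Y):
    """Coffee demand Q = P^(-2) * P1^3 * P2^(-3) * Y^2 from the example."""
    return P ** (-2) * P1 ** 3 * P2 ** (-3) * Y ** 2

def elasticity(func, point, i, h=1e-6):
    """(dQ/dx_i) * (x_i / Q), with the partial derivative estimated by a
    central difference that moves only argument i."""
    up, down = list(point), list(point)
    up[i] *= 1 + h
    down[i] *= 1 - h
    dq_dx = (func(*up) - func(*down)) / (2 * h * point[i])
    return dq_dx * point[i] / func(*point)

names = ("P", "P1", "P2", "Y")
points = [(2.0, 3.0, 1.5, 10.0), (5.0, 1.0, 4.0, 50.0)]   # hypothetical prices and incomes
for point in points:
    print({n: round(elasticity(demand, point, i), 4) for i, n in enumerate(names)})
```

Both lines of output should show the same four numbers, the exponents of the demand function, even though the quantity demanded differs greatly between the two points.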


4.2.6 The Chain Rule

We need the chain rule whenever we are working with functions of functions. Consider then the following situation. We have an outside function:

$$y = f(x_1, x_2, \ldots, x_n)$$

and $n$ inside functions $g_1(w), g_2(w), \ldots, g_n(w)$ where $w$ is a scalar. We replace each $x_i$ with the inside function $g_i(w)$ to obtain:

$$h(w) = f(g_1(w), g_2(w), \ldots, g_n(w)).$$

The multivariate chain rule then tells us how to calculate $h'(w)$:

Theorem 240 Multivariate Chain Rule: If

$$h(w) = f(g_1(w), g_2(w), \ldots, g_n(w))$$

then:

$$\begin{aligned}
h'(w) = {} & \frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_1}\, g_1'(w) + \frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_2}\, g_2'(w) \\
& + \cdots + \frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_n}\, g_n'(w).
\end{aligned}$$

Remark 1: Think of a multi-national oil company that has $n$ subsidiaries in $n$ different countries. Suppose that $w$ is the price of oil, that the inside function $g_i(w)$ is the before-tax profits in country $i$ in the local currency and the outside function $f(x_1, x_2, \ldots, x_n)$ converts the profits in each country's local currency into, say, US dollars. Thus $h(w)$ gives total profits in US dollars as a function of the price of oil and $h'(w)$ indicates how profits change as the price of oil changes. The terms in the chain rule take the form:

$$\frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_i}\, g_i'(w).$$

Here $g_i'(w)$ tells you how profits in country $i$ change as the price of oil changes while the multiplier:

$$\frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_i}$$

indicates how a change in local currency profits affects aggregate US dollar profits. The total effect of a change in the price of oil $h'(w)$ is then the sum of these effects of the $n$ subsidiaries, as indicated by the chain rule.


Remark 2: Although the multivariate chain rule might look complicated, it is exactly the same idea as in univariate calculus where one starts by taking the derivative of the outside function, replacing the $x$ with the inside function and then multiplying by the derivative of the inside function. The only difference is that now there are $n$ $x_i$'s and $n$ inside functions: the $g_i(w)$'s. One thus has to take the partial derivative of the outside function with respect to each $x_i$, replace each $x_i$ with the corresponding inside function, multiply this by $g_i'(w)$ and add up all $n$ terms. A recipe for this goes as follows:

A Recipe for the Multivariate Chain Rule

Starting with $x_1$:

1. Take the partial derivative of the outside function $f(x_1, x_2, \ldots, x_n)$ with respect to $x_i$: $\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}$.

2. Replace every occurrence of each $x_j$ in $\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}$ with the corresponding inside function $g_j(w)$ to obtain: $\frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_i}$.

3. Multiply the result in 2 by $g_i'(w)$ to obtain: $\frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_i}\, g_i'(w)$.

4. Repeat steps 1 to 3 for all $x_i$.

5. Add up the results of steps 1 to 4 to obtain $h'(w)$.

Example 1: Consider a function with two $x$'s:

$$f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 - x_2$$

and let $g_1(w) = w^2$ and $g_2(w) = e^w$ be the two inside functions. If we replace every occurrence of $x_1$ with $w^2$ and every occurrence of $x_2$ with $e^w$ we obtain:

$$\begin{aligned}
h(w) &= f(g_1(w), g_2(w)) = f(w^2, e^w) \\
&= \frac{1}{3}\ln(w^2) + \frac{2}{3}\ln(e^w) - w^2 - e^w \\
&= \frac{2}{3}\ln(w) + \frac{2}{3}w - w^2 - e^w.
\end{aligned}$$

We can then calculate $h'(w)$ directly as:

$$h'(w) = \frac{2}{3w} + \frac{2}{3} - 2w - e^w.$$

Let us now use the recipe for the multivariate chain rule to calculate $h'(w)$. Following the recipe we have:


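1. The partial derivative of the outside function with respect to $x_1$ is:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = \frac{1}{3x_1} - 1.$$

2. Replacing $x_1$ with $w^2$ and $x_2$ with $e^w$ in 1 results in:

$$\frac{\partial f(w^2, e^w)}{\partial x_1} = \frac{1}{3w^2} - 1.$$

3. Since $g_1(w) = w^2$ we have $g_1'(w) = 2w$, and so multiplying 2 by $g_1'(w)$ yields:

$$\frac{\partial f(w^2, e^w)}{\partial x_1}\, g_1'(w) = \left(\frac{1}{3w^2} - 1\right) \times 2w.$$

4. Repeat steps 1 to 3 with respect to $x_2$. Thus with $x_2$ we have:

$$\frac{\partial f(x_1, x_2)}{\partial x_2} = \frac{2}{3x_2} - 1$$

and so replacing $x_1$ and $x_2$ by $w^2$ and $e^w$:

$$\frac{\partial f(w^2, e^w)}{\partial x_2} = \frac{2}{3e^w} - 1$$

and multiplying by $g_2'(w) = e^w$ we get:

$$\left(\frac{2}{3e^w} - 1\right) \times e^w.$$

5. Adding up the results from 1 to 4 yields:

$$\begin{aligned}
h'(w) &= \left(\frac{1}{3w^2} - 1\right) \times 2w + \left(\frac{2}{3e^w} - 1\right) \times e^w \\
&= \frac{2}{3w} + \frac{2}{3} - 2w - e^w
\end{aligned}$$

and so we get the same answer as the direct calculation.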
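The agreement between the recipe and the direct calculation can also be checked numerically. The following is a minimal sketch in Python; the function names (`df_dx1`, `h_prime_chain_rule`, and so on) are illustrative choices and not part of the text.

```python
import math


def df_dx1(x1, x2):
    """Partial derivative of f with respect to x1: 1/(3*x1) - 1."""
    return 1.0 / (3.0 * x1) - 1.0


def df_dx2(x1, x2):
    """Partial derivative of f with respect to x2: 2/(3*x2) - 1."""
    return 2.0 / (3.0 * x2) - 1.0


def h_prime_chain_rule(w):
    """h'(w) built term by term, following the recipe."""
    # Step 2: replace x1 and x2 with the inside functions g1(w), g2(w).
    x1, x2 = w ** 2, math.exp(w)
    # Step 3: derivatives of the inside functions.
    g1_prime, g2_prime = 2.0 * w, math.exp(w)
    # Step 5: add up the n = 2 terms.
    return df_dx1(x1, x2) * g1_prime + df_dx2(x1, x2) * g2_prime


def h_prime_direct(w):
    """h'(w) obtained by differentiating h(w) directly."""
    return 2.0 / (3.0 * w) + 2.0 / 3.0 - 2.0 * w - math.exp(w)


if __name__ == "__main__":
    for w in (0.5, 1.0, 2.0):
        print(w, h_prime_chain_rule(w), h_prime_direct(w))
```

For any $w > 0$ the two functions should return the same number up to floating-point rounding.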

Example 2: Proving the Product Rule Using the Chain Rule. Suppose the outside function is

$$p(x_1, x_2) = x_1 x_2$$

so that $p(x_1, x_2)$ simply multiplies $x_1$ and $x_2$ together. The two inside functions are any two univariate functions $f(x)$ and $g(x)$, where $x$ here is a scalar, so that:

$$h(x) = p(f(x), g(x)) = f(x)\, g(x)$$

and so $h(x)$ is just the product of two functions $f(x)$ and $g(x)$. To find $h'(x)$ we use the multivariate chain rule. We have:

$$\frac{\partial p(x_1, x_2)}{\partial x_1} = x_2 \quad \text{and} \quad \frac{\partial p(x_1, x_2)}{\partial x_2} = x_1$$

so that:

$$\begin{aligned}
h'(x) &= \frac{\partial p(f(x), g(x))}{\partial x_1}\, f'(x) + \frac{\partial p(f(x), g(x))}{\partial x_2}\, g'(x) \\
&= g(x)\, f'(x) + f(x)\, g'(x)
\end{aligned}$$

which is the product rule in univariate calculus.

4.2.7 A More General Multivariate Chain Rule

The chain rule can be generalized to the case where the inside functions are multivariate functions. Although this is more general, if you understand the previous chain rule there are really no new ideas involved, except that some things which were ordinary derivatives become partial derivatives.

Theorem 241 Multivariate Chain Rule: If $w$ in $g_i(w)$ is an $m \times 1$ vector, so that $g_i(w) = g_i(w_1, w_2, \ldots, w_m)$, and if

$$h(w_1, w_2, \ldots, w_m) = f(g_1(w), g_2(w), \ldots, g_n(w))$$

then:

$$\begin{aligned}
\frac{\partial h(w_1, w_2, \ldots, w_m)}{\partial w_j} &= \frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_1} \frac{\partial g_1(w_1, w_2, \ldots, w_m)}{\partial w_j} \\
&\quad + \frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_2} \frac{\partial g_2(w_1, w_2, \ldots, w_m)}{\partial w_j} \\
&\quad + \cdots + \frac{\partial f(g_1(w), g_2(w), \ldots, g_n(w))}{\partial x_n} \frac{\partial g_n(w_1, w_2, \ldots, w_m)}{\partial w_j}.
\end{aligned}$$

4.2.8 Homogeneous Functions

In agriculture, the state of the art being given, doubling the labour does not double the produce. -John Stuart Mill

In economics one encounters many functions which are homogeneous. Demand and supply curves are always homogeneous of degree 0; the marginal utility of income is always homogeneous of degree $-1$ and cost functions are always homogeneous of degree 1. The homogeneity of a production function determines whether there are decreasing (small is beautiful), constant or increasing returns to scale (bigger is better).

Homogeneity is defined as follows:

Definition 242 A function $f(x_1, x_2, \ldots, x_n)$ is said to be homogeneous of degree $k$ if and only if for any $\lambda > 0$:

$$f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n) = \lambda^k f(x_1, x_2, \ldots, x_n).$$

Remark: To prove that a given function is homogeneous of some degree, one begins with:

$$f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n)$$

and through a series of derivations one tries to obtain:

$$\lambda^k f(x_1, x_2, \ldots, x_n).$$

The exponent on $\lambda$ then gives the degree of homogeneity.

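Example 1: The Cobb-Douglas production function

$$Q = F(L, K) = A L^{\alpha} K^{\beta}$$

is homogeneous of degree $k = \alpha + \beta$, the sum of the exponents on labour and capital, since:

$$\begin{aligned}
F(\lambda L, \lambda K) &= A (\lambda L)^{\alpha} (\lambda K)^{\beta} \\
&= A \lambda^{\alpha} L^{\alpha} \lambda^{\beta} K^{\beta} \\
&= \lambda^{\alpha + \beta} A L^{\alpha} K^{\beta} \\
&= \lambda^{\alpha + \beta} F(L, K).
\end{aligned}$$

Thus $Q = L^{1/2} K^{1/4}$ is homogeneous of degree $k = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$.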
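The definition can be tested numerically by scaling both inputs by $\lambda$ and solving $F(\lambda L, \lambda K) = \lambda^k F(L, K)$ for $k$. The sketch below is only an illustration; the helper name `implied_degree` and the default parameter values are assumptions made for this example.

```python
import math


def cobb_douglas(L, K, A=1.0, alpha=0.5, beta=0.25):
    """Cobb-Douglas production function Q = A * L**alpha * K**beta."""
    return A * L ** alpha * K ** beta


def implied_degree(F, L, K, lam):
    """Solve F(lam*L, lam*K) = lam**k * F(L, K) for the exponent k."""
    ratio = F(lam * L, lam * K) / F(L, K)
    return math.log(ratio) / math.log(lam)


if __name__ == "__main__":
    # Different input bundles and scale factors should give the same k.
    for L, K, lam in ((3.0, 7.0, 2.0), (10.0, 1.5, 5.0)):
        print(implied_degree(cobb_douglas, L, K, lam))
```

If the function is homogeneous, the implied exponent does not depend on the point $(L, K)$ or on $\lambda$; for a non-homogeneous function it would vary.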

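Example 2: The Constant Elasticity of Substitution or CES production function:

$$Q = F(L, K) = \left(\alpha L^{\rho} + (1 - \alpha) K^{\rho}\right)^{\gamma / \rho}$$

is homogeneous of degree $\gamma$ since:

$$\begin{aligned}
F(\lambda L, \lambda K) &= \left(\alpha (\lambda L)^{\rho} + (1 - \alpha)(\lambda K)^{\rho}\right)^{\gamma / \rho} \\
&= \left(\lambda^{\rho} \left(\alpha L^{\rho} + (1 - \alpha) K^{\rho}\right)\right)^{\gamma / \rho} \\
&= (\lambda^{\rho})^{\gamma / \rho} \left(\alpha L^{\rho} + (1 - \alpha) K^{\rho}\right)^{\gamma / \rho} \\
&= \lambda^{\gamma} \left(\alpha L^{\rho} + (1 - \alpha) K^{\rho}\right)^{\gamma / \rho} \\
&= \lambda^{\gamma} F(L, K).
\end{aligned}$$

For example if $\alpha = \frac{1}{2}$, $\rho = -1$ and $\gamma = 1$ then the CES production function $Q = \left(\frac{1}{2} L^{-1} + \frac{1}{2} K^{-1}\right)^{-1}$ is homogeneous of degree 1.

An important calculus property of homogeneous functions is Euler's theorem: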


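Theorem 243 Euler's Theorem. If $f(x_1, x_2, \ldots, x_n)$ is homogeneous of degree $k$ then:

$$k f(x_1, x_2, \ldots, x_n) = \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_1} x_1 + \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_2} x_2 + \cdots + \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_n} x_n.$$

Proof. Let:

$$h(\lambda) = f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n) = \lambda^k f(x_1, x_2, \ldots, x_n).$$

Using the multivariate chain rule on $f(\lambda x)$ and the fact that $\frac{d \lambda^k}{d \lambda} = k \lambda^{k-1}$ we find that:

$$\begin{aligned}
h'(\lambda) &= \frac{\partial f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n)}{\partial x_1} x_1 + \cdots + \frac{\partial f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n)}{\partial x_n} x_n \\
&= k \lambda^{k-1} f(x_1, x_2, \ldots, x_n).
\end{aligned}$$

Now set $\lambda = 1$ and the result follows.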

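Example 1: We have seen that:

$$F(L, K) = L^{1/2} K^{1/4}$$

is homogeneous of degree $k = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$. To verify Euler's theorem note that:

$$\begin{aligned}
\frac{\partial F(L, K)}{\partial L} \times L + \frac{\partial F(L, K)}{\partial K} \times K
&= \left(\frac{\partial}{\partial L} L^{1/2} K^{1/4}\right) \times L + \left(\frac{\partial}{\partial K} L^{1/2} K^{1/4}\right) \times K \\
&= \frac{1}{2} L^{\frac{1}{2} - 1} K^{1/4} \times L + \frac{1}{4} L^{1/2} K^{\frac{1}{4} - 1} \times K \\
&= \frac{1}{2} L^{1/2} K^{1/4} + \frac{1}{4} L^{1/2} K^{1/4} \\
&= \frac{3}{4} L^{1/2} K^{1/4} \\
&= \frac{3}{4} F(L, K).
\end{aligned}$$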
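Euler's theorem can also be checked numerically at a particular point. The sketch below approximates the partial derivatives with central differences; the helper `partial` and the step size `h` are assumptions chosen for this illustration rather than anything prescribed by the text.

```python
def F(L, K):
    """Cobb-Douglas production function L**(1/2) * K**(1/4)."""
    return L ** 0.5 * K ** 0.25


def partial(func, args, i, h=1e-6):
    """Central-difference approximation to the i-th partial derivative."""
    up, down = list(args), list(args)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2.0 * h)


def euler_sides(L, K, k=0.75):
    """Return (k*F, F_L*L + F_K*K), the two sides of Euler's theorem."""
    lhs = k * F(L, K)
    rhs = partial(F, (L, K), 0) * L + partial(F, (L, K), 1) * K
    return lhs, rhs


if __name__ == "__main__":
    print(euler_sides(4.0, 16.0))
```

The two numbers returned should agree up to the (small) error of the finite-difference approximation.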

Example 2: Euler's theorem allows us to make predictions about a competitive firm's profits. Suppose $Q = F(L, K)$ is homogeneous of degree $k$. Then by Euler's theorem:

$$kQ = \frac{\partial F(L, K)}{\partial L} L + \frac{\partial F(L, K)}{\partial K} K.$$

A perfectly competitive firm maximizes profits by setting $\frac{\partial F(L, K)}{\partial L} = \frac{W}{P}$ and $\frac{\partial F(L, K)}{\partial K} = \frac{R}{P}$ so that:

$$kQ = \frac{W}{P} L + \frac{R}{P} K \implies kPQ = WL + RK$$

and so profits $\pi$ are given by:

$$\pi = PQ - (WL + RK) = PQ - kPQ = (1 - k) PQ.$$

Thus if $0 < k < 1$ (there are decreasing returns to scale) then $\pi > 0$, while if $k = 1$ (constant returns to scale) then $\pi = 0$. If $k > 1$ then profits must be negative, which is indicative of the fact that increasing returns to scale are not consistent with perfect competition.

Another useful calculus result for homogeneous functions is:

Theorem 244 If $f(x_1, x_2, \ldots, x_n)$ is homogeneous of degree $k$ then

$$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}$$

is homogeneous of degree $k - 1$.

Example: While $Q = L^{1/2} K^{1/4}$ is homogeneous of degree $k = \frac{3}{4}$, the marginal product of labour:

$$\frac{\partial}{\partial L}\left(L^{1/2} K^{1/4}\right) = \frac{1}{2} L^{-1/2} K^{1/4}$$

is homogeneous of degree $k - 1 = \frac{3}{4} - 1 = -\frac{1}{4}$.

4.2.9 Homogeneity and the Absence of Money Illusion

Consider a demand function:

$$Q = Q(P, P_1, P_2, \ldots, P_n, Y)$$

where $P$ is the own price, $P_1, P_2, \ldots, P_n$ are the prices of related goods and $Y$ is nominal income. Now suppose there is a general inflation so that all prices and incomes increase by the same proportion $\lambda$. For example suppose $\lambda = 2$ so that all prices and incomes double. This means that real income and all real prices have stayed the same, and so a rational household, that is one that does not suffer from money illusion, will not change any of its real behavior and so $Q$ remains the same.

Mathematically this means that:

$$\begin{aligned}
Q(\lambda P, \lambda P_1, \lambda P_2, \ldots, \lambda P_n, \lambda Y) &= Q(P, P_1, P_2, \ldots, P_n, Y) \\
&= \lambda^0 Q(P, P_1, P_2, \ldots, P_n, Y).
\end{aligned}$$

Thus the absence of money illusion is equivalent to the demand function being homogeneous of degree $k = 0$. This logic also applies to supply curves as well as many other functions in economics.

4.2.10 Homogeneity and the Nature of Technology

In economics the nature of technology is often critical. In particular, what matters is what happens as the scale of production is increased: is bigger better or is small beautiful? This can be captured by the degree of homogeneity of the production function.

Suppose that a production function

$$Q = F(L, K)$$

is homogeneous of degree $k$. If we double the size of operation, so that $\lambda = 2$, then:

$$F(2L, 2K) = 2^k F(L, K).$$

This says that doubling the scale of operation causes output to increase by a factor of $2^k$. Now:

1. If $k > 1$ (e.g. $F(L, K) = L^{1/2} K^{3/4}$ and $k = \frac{5}{4}$) then doubling the scale leads to more than twice the output since then $2^k > 2$. We then say that the technology exhibits increasing returns to scale. Bigger is better.

2. If $k = 1$ (e.g. $F(L, K) = L^{1/2} K^{1/2}$ and $k = 1$) then doubling the scale leads to exactly twice the output since then $2^1 = 2$. We then say that the technology exhibits constant returns to scale.

3. If $k < 1$ (e.g. $F(L, K) = L^{1/2} K^{1/4}$ and $k = \frac{3}{4}$) then doubling the scale leads to less than twice the output since then $2^k < 2$. We then say that the technology exhibits decreasing returns to scale. Small is beautiful.
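The three cases can be summarized in a few lines of code. This is only an illustrative sketch; the function names and the tolerance `tol` used to decide when $k$ counts as equal to 1 are assumptions.

```python
def output_factor(k, scale=2.0):
    """Factor by which output rises when all inputs are multiplied by `scale`."""
    return scale ** k


def returns_to_scale(k, tol=1e-12):
    """Classify a production function that is homogeneous of degree k."""
    if k > 1.0 + tol:
        return "increasing"
    if k < 1.0 - tol:
        return "decreasing"
    return "constant"


if __name__ == "__main__":
    # Degrees of homogeneity of the three Cobb-Douglas examples in the text.
    for k in (0.5 + 0.75, 0.5 + 0.5, 0.5 + 0.25):
        print(k, output_factor(k), returns_to_scale(k))
```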

4.3 Second-Order Partial Derivatives

We are going to be interested in second-order partial derivatives when we discuss the concavity, convexity and second-order conditions of multivariate functions. We have:

Definition 245 Given $y = f(x_1, x_2, \ldots, x_n)$ the second-order partial derivative with respect to $x_i$ and $x_j$ is:

$$\frac{\partial^2 f(x_1, x_2, \ldots, x_n)}{\partial x_j \partial x_i} = \frac{\partial}{\partial x_j}\left(\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}\right).$$


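Remark 1: If there are $n$ $x_i$'s then there are $n$ first-order partial derivatives and $n^2$ second-order partial derivatives. For example the function $y = f(x_1, x_2)$ has 2 first-order partial derivatives but $2^2 = 4$ second-order partial derivatives:

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_1}, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2}, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1}, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_2}.$$

Remark 2: The notation is usually a little different when we differentiate twice with respect to the same $x_i$, in which case we typically (but not always) write:

$$\frac{\partial^2 f(x_1, x_2, \ldots, x_n)}{\partial x_i^2} \quad \text{and not} \quad \frac{\partial^2 f(x_1, x_2, \ldots, x_n)}{\partial x_i \partial x_i},$$

that is, instead of $\partial x_i \partial x_i$ in the denominator one writes $\partial x_i^2$.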

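Example: Consider:

$$y = f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2.$$

We have 4 second-order partial derivatives:

$$\begin{aligned}
\frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} &= \frac{\partial}{\partial x_1}\left(\frac{\partial f(x_1, x_2)}{\partial x_1}\right) = \frac{\partial}{\partial x_1}\left(\frac{1}{3x_1} - x_2\right) = -\frac{1}{3x_1^2} \\
\frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1} &= \frac{\partial}{\partial x_2}\left(\frac{\partial f(x_1, x_2)}{\partial x_1}\right) = \frac{\partial}{\partial x_2}\left(\frac{1}{3x_1} - x_2\right) = -1 \\
\frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} &= \frac{\partial}{\partial x_1}\left(\frac{\partial f(x_1, x_2)}{\partial x_2}\right) = \frac{\partial}{\partial x_1}\left(\frac{2}{3x_2} - x_1\right) = -1 \\
\frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} &= \frac{\partial}{\partial x_2}\left(\frac{\partial f(x_1, x_2)}{\partial x_2}\right) = \frac{\partial}{\partial x_2}\left(\frac{2}{3x_2} - x_1\right) = -\frac{2}{3x_2^2}.
\end{aligned}$$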

Note that in this example

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} = \frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1},$$

that is, we get the same result if we first differentiate with respect to $x_1$ and then with respect to $x_2$ as when we first differentiate with respect to $x_2$ and then with respect to $x_1$. Alternatively, we get the same result if we first apply $\frac{\partial}{\partial x_1}$ and then $\frac{\partial}{\partial x_2}$, or if we first apply $\frac{\partial}{\partial x_2}$ and then $\frac{\partial}{\partial x_1}$.

This turns out not to be a coincidence but is true for all functions whose second-order partial derivatives are continuous. This very useful result is known as Young's theorem.

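Theorem 246 Young's Theorem: Given $y = f(x_1, x_2, \ldots, x_n)$, differentiating first with respect to $x_i$ and then with respect to $x_j$ gives the same result as differentiating first with respect to $x_j$ and then with respect to $x_i$ so that:

$$\frac{\partial}{\partial x_i}\left(\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_j}\right) = \frac{\partial}{\partial x_j}\left(\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}\right)$$

or:

$$\frac{\partial^2 f(x_1, x_2, \ldots, x_n)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x_1, x_2, \ldots, x_n)}{\partial x_j \partial x_i}.$$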

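Example: Given:

$$y = f(x_1, x_2) = e^{-\frac{1}{2}(x_1^2 + x_2^2)}$$

if we first differentiate with respect to $x_1$ we have:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = -x_1 e^{-\frac{1}{2}(x_1^2 + x_2^2)} \implies \frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1} = \frac{\partial}{\partial x_2}\left(-x_1 e^{-\frac{1}{2}(x_1^2 + x_2^2)}\right) = x_1 x_2 e^{-\frac{1}{2}(x_1^2 + x_2^2)}$$

while if we first differentiate with respect to $x_2$ we have:

$$\frac{\partial f(x_1, x_2)}{\partial x_2} = -x_2 e^{-\frac{1}{2}(x_1^2 + x_2^2)} \implies \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} = \frac{\partial}{\partial x_1}\left(-x_2 e^{-\frac{1}{2}(x_1^2 + x_2^2)}\right) = x_1 x_2 e^{-\frac{1}{2}(x_1^2 + x_2^2)}.$$

Both yield the same result, as required by Young's theorem.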
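The symmetry of the cross partial derivatives can be seen numerically as well. The following sketch differentiates in both orders using nested central differences; the helper names and the step size `h` are assumptions made for the illustration, and the finite-difference values are approximations only.

```python
import math


def f(x1, x2):
    """f(x1, x2) = exp(-(x1**2 + x2**2) / 2)."""
    return math.exp(-0.5 * (x1 ** 2 + x2 ** 2))


def d1(func, x1, x2, h=1e-4):
    """Central-difference derivative with respect to the first argument."""
    return (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)


def d2(func, x1, x2, h=1e-4):
    """Central-difference derivative with respect to the second argument."""
    return (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)


def cross_x1_then_x2(x1, x2):
    """Differentiate first with respect to x1, then with respect to x2."""
    return d2(lambda a, b: d1(f, a, b), x1, x2)


def cross_x2_then_x1(x1, x2):
    """Differentiate first with respect to x2, then with respect to x1."""
    return d1(lambda a, b: d2(f, a, b), x1, x2)


def cross_analytic(x1, x2):
    """The closed form derived in the example: x1 * x2 * f(x1, x2)."""
    return x1 * x2 * f(x1, x2)


if __name__ == "__main__":
    point = (0.7, -0.3)
    print(cross_x1_then_x2(*point), cross_x2_then_x1(*point), cross_analytic(*point))
```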

4.3.1 The Hessian

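A multivariate function $y = f(x_1, x_2, \ldots, x_n)$ has a large number of second-order partial derivatives. The best way to organize these $n^2$ second derivatives is to put them into an $n \times n$ matrix called the Hessian, as defined below:

Definition 247 Hessian: Given $y = f(x_1, x_2, \ldots, x_n) = f(x)$ where $x$ is an $n \times 1$ vector the Hessian is:

$$H(x_1, x_2, \ldots, x_n) = \begin{bmatrix}
\frac{\partial^2 f(x)}{\partial x_1^2} & \frac{\partial^2 f(x)}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f(x)}{\partial x_2 \partial x_1} & \frac{\partial^2 f(x)}{\partial x_2^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \frac{\partial^2 f(x)}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2}
\end{bmatrix}.$$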


Note that by Young's theorem

$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i}$$

and so the elements above the diagonal of the Hessian are equal to the corresponding elements below the diagonal and so:

Theorem 248 Matrix Version of Young's Theorem: The Hessian $H(x_1, x_2, \ldots, x_n)$ is a symmetric matrix or:

$$H(x_1, x_2, \ldots, x_n) = H(x_1, x_2, \ldots, x_n)^T.$$

Remark: Young's theorem reduces the number of different second-order partial derivatives that we need to calculate from the $n^2$ elements in the Hessian to the $\frac{n(n+1)}{2}$ elements on and above the diagonal. For example if $n = 4$, rather than calculating $4^2 = 16$ second derivatives we need only calculate $\frac{4(4+1)}{2} = 10$ different second derivatives.

Calculating the Elements of the Hessian

If you have trouble remembering how to construct the Hessian, write down a blank square matrix and along the top and left side of the matrix make a list of the $x_i$'s as follows:

$$\begin{array}{c|cccc}
 & x_1 & x_2 & \cdots & x_n \\ \hline
x_1 & & & \cdots & \\
x_2 & & & \cdots & \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_n & & & \cdots &
\end{array}$$

To fill in the $i,j$th entry read the corresponding $x_i$ to the left and $x_j$ above and differentiate with respect to these two variables.

Example 1: For the function:

$$f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2$$

the second derivatives are:

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} = -\frac{1}{3x_1^2}, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1} = \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} = -1, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} = -\frac{2}{3x_2^2}.$$

To calculate the Hessian $H(x_1, x_2)$ one writes:

$$\begin{array}{c|cc}
 & x_1 & x_2 \\ \hline
x_1 & & ? \\
x_2 & &
\end{array}$$

Thus for the 1,2 element where the ? is placed, we differentiate first with respect to $x_1$ (off the left side of the 1,2 element), and then with respect to $x_2$ (directly above the 1,2 element). This then is

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} = -1.$$

By Young's theorem the 2,1 and 1,2 elements are identical and so we obtain:

$$\begin{array}{c|cc}
 & x_1 & x_2 \\ \hline
x_1 & ? & -1 \\
x_2 & -1 &
\end{array}$$

To obtain the 1,1 element where the ? is now placed, we differentiate first with respect to $x_1$, reading left, and then again with respect to $x_1$, reading above, which is:

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} = -\frac{1}{3x_1^2}$$

and so we now have:

$$\begin{array}{c|cc}
 & x_1 & x_2 \\ \hline
x_1 & -\frac{1}{3x_1^2} & -1 \\
x_2 & -1 & ?
\end{array}$$

To finish our calculation we calculate the 2,2 element where the ? is placed by differentiating twice with respect to $x_2$ as:

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} = -\frac{2}{3x_2^2}$$

and so the Hessian is given by:

$$H(x_1, x_2) = \begin{bmatrix} -\frac{1}{3x_1^2} & -1 \\ -1 & -\frac{2}{3x_2^2} \end{bmatrix}.$$

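Example 2: For the Cobb-Douglas production function:

$$Q = F(L, K) = L^{1/2} K^{1/4}$$

we calculate the Hessian by first listing $L$ and $K$ along the top and left side of a $2 \times 2$ matrix as:

$$\begin{array}{c|cc}
 & L & K \\ \hline
L & & ? \\
K & &
\end{array}$$

Thus for the 1,2 element where the ? is placed we differentiate once with respect to $L$ and once with respect to $K$ yielding:

$$\frac{\partial F(L, K)}{\partial L} = \frac{1}{2} L^{-1/2} K^{1/4} \implies \frac{\partial^2 F(L, K)}{\partial L \partial K} = \frac{\partial}{\partial K}\left(\frac{1}{2} L^{-1/2} K^{1/4}\right) = \frac{1}{8} L^{-1/2} K^{-3/4}.$$

By Young's theorem the 1,2 and 2,1 elements are identical and so we obtain:

$$\begin{array}{c|cc}
 & L & K \\ \hline
L & ? & \frac{1}{8} L^{-1/2} K^{-3/4} \\
K & \frac{1}{8} L^{-1/2} K^{-3/4} &
\end{array}$$

To find the 1,1 element where the ? is placed we differentiate twice with respect to $L$ so that:

$$\frac{\partial F(L, K)}{\partial L} = \frac{1}{2} L^{-1/2} K^{1/4} \implies \frac{\partial^2 F(L, K)}{\partial L^2} = \frac{\partial}{\partial L}\left(\frac{1}{2} L^{-1/2} K^{1/4}\right) = -\frac{1}{4} L^{-3/2} K^{1/4}$$

and so we obtain:

$$\begin{array}{c|cc}
 & L & K \\ \hline
L & -\frac{1}{4} L^{-3/2} K^{1/4} & \frac{1}{8} L^{-1/2} K^{-3/4} \\
K & \frac{1}{8} L^{-1/2} K^{-3/4} & ?
\end{array}$$

Finally to find the 2,2 element where the ? is placed we differentiate twice with respect to $K$ so that:

$$\frac{\partial F(L, K)}{\partial K} = \frac{1}{4} L^{1/2} K^{-3/4} \implies \frac{\partial^2 F(L, K)}{\partial K^2} = \frac{\partial}{\partial K}\left(\frac{1}{4} L^{1/2} K^{-3/4}\right) = -\frac{3}{16} L^{1/2} K^{-7/4}$$

so that the Hessian is given by:

$$H(L, K) = \begin{bmatrix} -\frac{1}{4} L^{-3/2} K^{1/4} & \frac{1}{8} L^{-1/2} K^{-3/4} \\ \frac{1}{8} L^{-1/2} K^{-3/4} & -\frac{3}{16} L^{1/2} K^{-7/4} \end{bmatrix}.$$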
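A Hessian worked out by hand can be cross-checked against a numerical approximation. The sketch below is one possible way to do this with central differences; `numerical_hessian`, `analytic_hessian` and the step size `h` are names and values assumed for the illustration.

```python
def F(L, K):
    """Cobb-Douglas production function L**(1/2) * K**(1/4)."""
    return L ** 0.5 * K ** 0.25


def numerical_hessian(func, point, h=1e-4):
    """Approximate the matrix of second partials by central differences."""
    n = len(point)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate func with x_i moved by di and x_j moved by dj.
                p = list(point)
                p[i] += di
                p[j] += dj
                return func(*p)
            hessian[i][j] = (shifted(h, h) - shifted(h, -h)
                             - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)
    return hessian


def analytic_hessian(L, K):
    """The Hessian of F derived in the example."""
    f_ll = -0.25 * L ** -1.5 * K ** 0.25
    f_lk = 0.125 * L ** -0.5 * K ** -0.75
    f_kk = -(3.0 / 16.0) * L ** 0.5 * K ** -1.75
    return [[f_ll, f_lk], [f_lk, f_kk]]


if __name__ == "__main__":
    print(numerical_hessian(F, (4.0, 16.0)))
    print(analytic_hessian(4.0, 16.0))
```

The two matrices should agree to several decimal places; the numerical one is only an approximation, so small discrepancies are expected.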

4.3.2 Concavity and Convexity

In univariate calculus a function $y = f(x)$ is concave (a mountain) if $f''(x) < 0$ and convex (a valley) if $f''(x) > 0$, where these mountains and valleys are in the two dimensional space $\mathbb{R}^2$.

In multivariate calculus with $n$ $x_i$'s, $y = f(x_1, x_2, \ldots, x_n)$, instead of $f''(x)$ we look at the $n \times n$ Hessian $H(x_1, x_2, \ldots, x_n)$ to determine if a function is concave (a mountain) or convex (a valley) in the $n+1$ dimensional space $\mathbb{R}^{n+1}$.

Let us start with some easy examples. Consider the multivariate function

$$f(x_1, x_2) = -\frac{1}{2}\left(x_1^2 + x_2^2\right)$$

plotted below:

[Figure: surface plot of $f(x_1, x_2) = -\frac{1}{2}(x_1^2 + x_2^2)$, with axes $x_1$, $x_2$ and $y$.]

This function is concave (a mountain) in 3 dimensions. You can verify that the Hessian for this function is:

$$H(x_1, x_2) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$

which is a negative definite matrix (since it is a diagonal matrix with negative elements along the diagonal). Note the parallel: in univariate calculus a function is concave if $f''(x)$ is negative, in multivariate calculus a function is concave if its Hessian $H(x)$ is negative definite.

The function:

$$f(x_1, x_2) = \frac{1}{2}\left(x_1^2 + x_2^2\right)$$

which is plotted below:

[Figure: surface plot of $f(x_1, x_2) = \frac{1}{2}(x_1^2 + x_2^2)$, with axes $x_1$, $x_2$ and $y$.]

is convex or a valley in 3 dimensions. You can verify that the Hessian for this function is:

$$H(x_1, x_2) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

which is a positive definite matrix (since it is diagonal with positive elements along the diagonal). Again note the parallel: in univariate calculus a function is convex if $f''(x)$ is positive, in multivariate calculus a function is convex if its Hessian $H(x)$ is positive definite.

We have:

Definition 249 Concavity: The function $y = f(x_1, x_2, \ldots, x_n)$ is concave if the Hessian $H(x_1, x_2, \ldots, x_n)$ is a negative definite matrix.

Definition 250 Convexity: The function $y = f(x_1, x_2, \ldots, x_n)$ is convex if the Hessian $H(x_1, x_2, \ldots, x_n)$ is a positive definite matrix.

Remark: As before we can distinguish between local concavity and convexity and global concavity and convexity. Thus:

Definition 251 Local Concavity: The function $y = f(x_1, x_2, \ldots, x_n)$ is locally concave at a point $x_1^0, x_2^0, \ldots, x_n^0$ if the Hessian evaluated at $x_1^0, x_2^0, \ldots, x_n^0$, or $H(x_1^0, x_2^0, \ldots, x_n^0)$, is a negative definite matrix.

Definition 252 Local Convexity: The function $y = f(x_1, x_2, \ldots, x_n)$ is locally convex at a point $x_1^0, x_2^0, \ldots, x_n^0$ if the Hessian evaluated at $x_1^0, x_2^0, \ldots, x_n^0$, or $H(x_1^0, x_2^0, \ldots, x_n^0)$, is a positive definite matrix.

Definition 253 Global Concavity: The function $y = f(x_1, x_2, \ldots, x_n)$ is globally concave if the Hessian $H(x_1, x_2, \ldots, x_n)$ is a negative definite matrix for all $x_1, x_2, \ldots, x_n$ in the domain of $f(x_1, x_2, \ldots, x_n)$.

Definition 254 Global Convexity: The function $y = f(x_1, x_2, \ldots, x_n)$ is globally convex if the Hessian $H(x_1, x_2, \ldots, x_n)$ is a positive definite matrix for all $x_1, x_2, \ldots, x_n$ in the domain of $f(x_1, x_2, \ldots, x_n)$.

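Example 1: We have seen from a previous example that the function

$$y = f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 x_2$$

has a Hessian

$$H(x_1, x_2) = \begin{bmatrix} -\frac{1}{3x_1^2} & -1 \\ -1 & -\frac{2}{3x_2^2} \end{bmatrix}.$$

The function is locally concave at $x_1^0 = \frac{1}{2}$ and $x_2^0 = \frac{1}{3}$ since:

$$H\left(\frac{1}{2}, \frac{1}{3}\right) = \begin{bmatrix} -\frac{1}{3\left(\frac{1}{2}\right)^2} & -1 \\ -1 & -\frac{2}{3\left(\frac{1}{3}\right)^2} \end{bmatrix} = \begin{bmatrix} -\frac{4}{3} & -1 \\ -1 & -6 \end{bmatrix}$$

is a negative definite matrix, as can be shown from the leading principal minors where $M_1 = -\frac{4}{3} < 0$ and $M_2 = 7 > 0$, or from the eigenvalues which are $\lambda_1 = -1.13 < 0$ and $\lambda_2 = -6.21 < 0$.

The function is not globally concave since at another point where $x_1^0 = x_2^0 = 1$:

$$H(1, 1) = \begin{bmatrix} -\frac{1}{3} & -1 \\ -1 & -\frac{2}{3} \end{bmatrix}$$

which is not negative definite, since this requires that both eigenvalues be negative but:

$$\begin{aligned}
\lambda_1 &= -\frac{1}{2} + \frac{1}{6}\sqrt{37} = 0.514 > 0 \\
\lambda_2 &= -\frac{1}{2} - \frac{1}{6}\sqrt{37} = -1.514 < 0.
\end{aligned}$$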
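Both tests used in this example, the leading principal minors and the eigenvalues, are easy to reproduce for a $2 \times 2$ Hessian. The following is a minimal sketch; the function names are illustrative assumptions, and exact fractions are used for the minors so that no rounding is involved.

```python
import math
from fractions import Fraction


def hessian(x1, x2):
    """Hessian of f(x1, x2) = ln(x1)/3 + 2*ln(x2)/3 - x1*x2."""
    return [[Fraction(-1) / (3 * x1 ** 2), Fraction(-1)],
            [Fraction(-1), Fraction(-2) / (3 * x2 ** 2)]]


def leading_principal_minors(H):
    """Return (M1, M2) for a 2x2 matrix."""
    m1 = H[0][0]
    m2 = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return m1, m2


def is_negative_definite(H):
    """A symmetric 2x2 matrix is negative definite iff M1 < 0 and M2 > 0."""
    m1, m2 = leading_principal_minors(H)
    return m1 < 0 and m2 > 0


def eigenvalues_symmetric_2x2(H):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, largest first."""
    a, b, d = float(H[0][0]), float(H[0][1]), float(H[1][1])
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return mean + radius, mean - radius


if __name__ == "__main__":
    for point in ((Fraction(1, 2), Fraction(1, 3)), (Fraction(1), Fraction(1))):
        H = hessian(*point)
        print(point, leading_principal_minors(H),
              is_negative_definite(H), eigenvalues_symmetric_2x2(H))
```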

Example 2: Consider the Cobb-Douglas production function:

$$Q = L^{1/2} K^{1/4}$$

plotted below:

[Figure: surface plot of $Q = L^{1/2} K^{1/4}$, with axes $L$ (0 to 5), $K$ (0 to 5) and $Q$ (0 to 3).]

From the graph the function appears mountain-like or concave. To verify this let us look at the Hessian calculated in a previous example:

$$H(L, K) = \begin{bmatrix} -\frac{1}{4} L^{-3/2} K^{1/4} & \frac{1}{8} L^{-1/2} K^{-3/4} \\ \frac{1}{8} L^{-1/2} K^{-3/4} & -\frac{3}{16} L^{1/2} K^{-7/4} \end{bmatrix}.$$

The first leading principal minor $M_1$ is negative for all $L$ and $K$ since:

$$M_1 = -\frac{1}{4} L^{-3/2} K^{1/4} < 0.$$

The second leading principal minor $M_2$ is positive since:

$$\begin{aligned}
M_2 &= \det\begin{bmatrix} -\frac{1}{4} L^{-3/2} K^{1/4} & \frac{1}{8} L^{-1/2} K^{-3/4} \\ \frac{1}{8} L^{-1/2} K^{-3/4} & -\frac{3}{16} L^{1/2} K^{-7/4} \end{bmatrix} \\
&= \frac{3}{64} L^{-1} K^{-3/2} - \frac{1}{64} L^{-1} K^{-3/2} \\
&= \frac{1}{32} L^{-1} K^{-3/2} > 0.
\end{aligned}$$

It follows that $H(L, K)$ is a negative definite matrix for all $L$ and $K$ and hence that $Q = L^{1/2} K^{1/4}$ is globally concave.

You may want to attempt to prove the following results:

Theorem 255 The Cobb-Douglas production function:

$$f(L, K) = L^{\alpha} K^{\beta}$$

with $\alpha > 0$ and $\beta > 0$ is globally concave if and only if $\alpha + \beta < 1$.

Theorem 256 $f(x_1, x_2, \ldots, x_n)$ is globally concave (convex) if and only if

$$g(x_1, x_2, \ldots, x_n) = -1 \times f(x_1, x_2, \ldots, x_n)$$

is globally convex (concave).

Theorem 257 If $f(x_1, x_2, \ldots, x_n)$ is globally concave (convex) and $g(x_1, x_2, \ldots, x_n)$ is globally concave (convex) then

$$h(x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) + g(x_1, x_2, \ldots, x_n)$$

is globally concave (convex).

4.3.3 First and Second-Order Taylor Series

We saw in univariate calculus that Taylor series can be used to approximate an arbitrary function by a linear function or a quadratic. Similar results apply for multivariate functions. In particular:

Theorem 258 If $x$ is an $n \times 1$ vector and $f(x)$ is a multivariate function, a first-order Taylor series of $f(x)$ around the point $x^0$ is given by:

$$f(x^0) + \nabla f(x^0)^T (x - x^0)$$

while a second-order Taylor series around $x^0$ is given by:

$$f(x^0) + \nabla f(x^0)^T (x - x^0) + \frac{1}{2} (x - x^0)^T H(x^0) (x - x^0)$$

where $\nabla f(x)$ is the gradient and $H(x)$ the Hessian of $f(x)$.

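Example: Given:

$$f(x_1, x_2) = x_1^5 x_2^3$$

suppose we wish to calculate a Taylor series approximation around $x_1^0 = 1$ and $x_2^0 = 2$. We then have $f(1, 2) = (1)^5 (2)^3 = 8$ and

$$\begin{aligned}
\frac{\partial f(x_1, x_2)}{\partial x_1} &= 5 x_1^4 x_2^3 \implies \frac{\partial f(1, 2)}{\partial x_1} = 5 (1)^4 (2)^3 = 40 \\
\frac{\partial f(x_1, x_2)}{\partial x_2} &= 3 x_1^5 x_2^2 \implies \frac{\partial f(1, 2)}{\partial x_2} = 3 (1)^5 (2)^2 = 12
\end{aligned}$$

so that the gradient at $(1, 2)$ is

$$\nabla f(x^0) \equiv \nabla f(1, 2) = \begin{bmatrix} 40 \\ 12 \end{bmatrix}$$

and a first-order Taylor series around $x_1^0 = 1$ and $x_2^0 = 2$ would be

$$\begin{aligned}
f(x^0) + \nabla f(x^0)^T (x - x^0) &= 8 + \begin{bmatrix} 40 & 12 \end{bmatrix} \left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) \\
&= 8 + 40 \times (x_1 - 1) + 12 \times (x_2 - 2).
\end{aligned}$$

To calculate a second-order Taylor series we need the second derivatives:

$$\begin{aligned}
\frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} &= 20 x_1^3 x_2^3 \implies \frac{\partial^2 f(1, 2)}{\partial x_1^2} = 20 (1)^3 (2)^3 = 160 \\
\frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} &= 15 x_1^4 x_2^2 \implies \frac{\partial^2 f(1, 2)}{\partial x_1 \partial x_2} = 15 (1)^4 (2)^2 = 60 \\
\frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} &= 6 x_1^5 x_2 \implies \frac{\partial^2 f(1, 2)}{\partial x_2^2} = 6 (1)^5 (2)^1 = 12
\end{aligned}$$

so that the Hessian at $(1, 2)$ is:

$$H(1, 2) = \begin{bmatrix} 160 & 60 \\ 60 & 12 \end{bmatrix}.$$

Thus the second-order Taylor series is:

$$\begin{aligned}
& f(x^0) + \nabla f(x^0)^T (x - x^0) + \frac{1}{2} (x - x^0)^T H(x^0) (x - x^0) \\
&= 8 + 40 (x_1 - 1) + 12 (x_2 - 2) + \frac{1}{2} \begin{bmatrix} x_1 - 1 & x_2 - 2 \end{bmatrix} \begin{bmatrix} 160 & 60 \\ 60 & 12 \end{bmatrix} \begin{bmatrix} x_1 - 1 \\ x_2 - 2 \end{bmatrix} \\
&= 8 + 40 \times (x_1 - 1) + 12 \times (x_2 - 2) \\
&\quad + 80 (x_1 - 1)^2 + 6 (x_2 - 2)^2 + 60 (x_1 - 1)(x_2 - 2).
\end{aligned}$$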
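To get a feel for how much the quadratic term improves on the linear approximation, one can evaluate both Taylor series at a point near $x^0$. The sketch below builds the gradient and Hessian numerically from $f(x_1, x_2) = x_1^5 x_2^3$ by central differences, so it does not rely on any hand-computed derivative; the helper names, step sizes and the evaluation point are assumptions made for the illustration.

```python
def f(x1, x2):
    """The function from the example: x1**5 * x2**3."""
    return x1 ** 5 * x2 ** 3


def gradient(func, x0, h=1e-5):
    """Central-difference gradient of func at the point x0."""
    grad = []
    for i in range(len(x0)):
        up, down = list(x0), list(x0)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2.0 * h))
    return grad


def hessian(func, x0, h=1e-4):
    """Central-difference Hessian of func at the point x0."""
    n = len(x0)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(x0)
                p[i] += di
                p[j] += dj
                return func(*p)
            H[i][j] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)
    return H


def taylor(func, x0, x, order):
    """First- or second-order Taylor approximation of func at x around x0."""
    dx = [xi - x0i for xi, x0i in zip(x, x0)]
    grad = gradient(func, x0)
    value = func(*x0) + sum(g * d for g, d in zip(grad, dx))
    if order == 2:
        H = hessian(func, x0)
        n = len(dx)
        value += 0.5 * sum(dx[i] * H[i][j] * dx[j]
                           for i in range(n) for j in range(n))
    return value


if __name__ == "__main__":
    x0, x = (1.0, 2.0), (1.05, 2.1)
    print(f(*x), taylor(f, x0, x, 1), taylor(f, x0, x, 2))
```

Close to $x^0$ the second-order approximation should be noticeably nearer to the true value than the first-order one.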

4.4 Unconstrained Optimization

4.4.1 First-Order Conditions

The first-order conditions for a maximum or minimum of a function of $n$ variables:

$$y = f(x_1, x_2, \ldots, x_n)$$

are:

Theorem 259 First-Order Conditions: If $x_1^*, x_2^*, \ldots, x_n^*$ maximizes or minimizes the function $y = f(x_1, x_2, \ldots, x_n)$ then:

$$\frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_1} = 0, \quad \frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_2} = 0, \quad \cdots, \quad \frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_n} = 0.$$

Proof. (by contradiction): Suppose $x_1^*, x_2^*, \ldots, x_n^*$ maximizes (minimizes) $y$ and suppose it were the case that $\frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} > 0$. It follows then at $x_1^*, x_2^*, \ldots, x_n^*$ that $y$ is a locally increasing function of $x_i$, so that increasing (decreasing) $x_i$ and keeping all other variables fixed would increase (decrease) $y$. This however contradicts $x_1^*, x_2^*, \ldots, x_n^*$ being a maximum (minimum) and so $\frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} > 0$ is not possible. Similarly if $\frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} < 0$ then $y$ is a locally decreasing function of $x_i$, and so if we decreased (increased) $x_i$ keeping all other variables fixed then $y$ would increase (decrease). Again this contradicts $x_1^*, x_2^*, \ldots, x_n^*$ being a maximum (minimum) and so $\frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} < 0$ is not possible. It follows then that:

$$\frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} = 0.$$

We can also express the first-order conditions more compactly using the gradient $\nabla f(x) \equiv \frac{\partial f(x)}{\partial x}$ evaluated at $x^*$ so that:

Theorem 260 If the $n \times 1$ vector $x^*$ maximizes or minimizes $y = f(x)$ then:

$$\nabla f(x^*) \equiv \frac{\partial f(x^*)}{\partial x} = 0$$

where $0$ is an $n \times 1$ vector of zeros.

Remark: The first-order conditions for a maximum or minimum involve $n$ equations in $n$ unknowns $x_1^*, x_2^*, \ldots, x_n^*$. If the problem is "nice" then it is sometimes possible to solve these $n$ equations for the $n$ unknowns $x_1^*, x_2^*, \ldots, x_n^*$. Even when we cannot explicitly solve these equations we can often learn a lot about the nature of the solution by examining the first-order conditions.

Although finding the first-order conditions is generally straightforward, there are a few pitfalls that students can avoid by using the following recipe:

Deriving the First-Order Conditions

1. Calculate the $n$ first-order partial derivatives $\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}$ for $i = 1, 2, \ldots, n$.

2. Put $*$'s on all the $x_i$'s in 1 and set each partial derivative equal to zero.

3. If possible solve for $x_1^*, x_2^*, \ldots, x_n^*$, or if not, examine the first-order conditions for anything you can learn about the optimal values.

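Example 1: Consider the function:

$$f(x_1, x_2) = 3x_1^2 - 6x_1 x_2 + 5x_2^2 - 4x_1 - 2x_2 + 8.$$

1. Calculating the first derivatives we find:

$$\begin{aligned}
\frac{\partial f(x_1, x_2)}{\partial x_1} &= 6x_1 - 6x_2 - 4 \\
\frac{\partial f(x_1, x_2)}{\partial x_2} &= -6x_1 + 10x_2 - 2.
\end{aligned}$$

2. Putting $*$'s on the $x_i$'s in 1 and setting these derivatives equal to 0 results in:

$$\begin{aligned}
\frac{\partial f(x_1^*, x_2^*)}{\partial x_1} &= 6x_1^* - 6x_2^* - 4 = 0 \\
\frac{\partial f(x_1^*, x_2^*)}{\partial x_2} &= -6x_1^* + 10x_2^* - 2 = 0.
\end{aligned}$$

3. We can solve the first-order conditions since in matrix notation we obtain:

$$\begin{bmatrix} 6 & -6 \\ -6 & 10 \end{bmatrix} \begin{bmatrix} x_1^* \\ x_2^* \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$$

so that using Cramer's rule we find that:

$$x_1^* = \frac{\det\begin{bmatrix} 4 & -6 \\ 2 & 10 \end{bmatrix}}{\det\begin{bmatrix} 6 & -6 \\ -6 & 10 \end{bmatrix}} = \frac{13}{6}, \quad x_2^* = \frac{\det\begin{bmatrix} 6 & 4 \\ -6 & 2 \end{bmatrix}}{\det\begin{bmatrix} 6 & -6 \\ -6 & 10 \end{bmatrix}} = \frac{3}{2}.$$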
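Cramer's rule for a $2 \times 2$ system is short enough to code directly. The sketch below uses exact fractions so that the answers can be compared with the ones above; `det2` and `cramer_2x2` are illustrative names and not part of the text.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def cramer_2x2(A, b):
    """Solve A x = b for a 2x2 system using Cramer's rule."""
    d = det2(A)
    if d == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    # Replace the first (second) column of A with b to get x1 (x2).
    x1 = Fraction(det2([[b[0], A[0][1]], [b[1], A[1][1]]]), d)
    x2 = Fraction(det2([[A[0][0], b[0]], [A[1][0], b[1]]]), d)
    return x1, x2


def gradient(x1, x2):
    """First derivatives of f = 3*x1**2 - 6*x1*x2 + 5*x2**2 - 4*x1 - 2*x2 + 8."""
    return 6 * x1 - 6 * x2 - 4, -6 * x1 + 10 * x2 - 2


if __name__ == "__main__":
    x1_star, x2_star = cramer_2x2([[6, -6], [-6, 10]], [4, 2])
    print(x1_star, x2_star, gradient(x1_star, x2_star))
```

Substituting the solution back into the two first derivatives is a quick way to confirm that both are zero.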

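Example 2: Consider the function:

$$y = f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 - x_2.$$

Following the recipe we have:

1. The first derivatives are:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = \frac{1}{3x_1} - 1, \quad \frac{\partial f(x_1, x_2)}{\partial x_2} = \frac{2}{3x_2} - 1.$$

2. Putting $*$'s on the $x_i$'s in 1 and setting these derivatives equal to 0:

$$\frac{\partial f(x_1^*, x_2^*)}{\partial x_1} = \frac{1}{3x_1^*} - 1 = 0, \quad \frac{\partial f(x_1^*, x_2^*)}{\partial x_2} = \frac{2}{3x_2^*} - 1 = 0.$$

3. Solving we find that $x_1^* = \frac{1}{3}$ and $x_2^* = \frac{2}{3}$.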

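Example 3: Consider the function:

$$f(x_1, x_2) = x_1^{1/4} x_2^{1/2} - 3x_1 - 2x_2,$$

where $x_1 > 0$ and $x_2 > 0$. Following the recipe we have:

1. The first derivatives are:

$$\begin{aligned}
\frac{\partial f(x_1, x_2)}{\partial x_1} &= \frac{1}{4} x_1^{-3/4} x_2^{1/2} - 3 \\
\frac{\partial f(x_1, x_2)}{\partial x_2} &= \frac{1}{2} x_1^{1/4} x_2^{-1/2} - 2.
\end{aligned}$$

2. Putting $*$'s on the $x_i$'s in 1 and setting these derivatives equal to 0:

$$\begin{aligned}
\frac{\partial f(x_1^*, x_2^*)}{\partial x_1} &= \frac{1}{4} (x_1^*)^{-3/4} (x_2^*)^{1/2} - 3 = 0 \\
\frac{\partial f(x_1^*, x_2^*)}{\partial x_2} &= \frac{1}{2} (x_1^*)^{1/4} (x_2^*)^{-1/2} - 2 = 0.
\end{aligned}$$

3. We now attempt to solve the equations in 2 using the $\ln(\cdot)$ function to convert them into two linear equations as:

$$\begin{aligned}
(x_1^*)^{-3/4} (x_2^*)^{1/2} = 12 &\implies -\frac{3}{4}\ln(x_1^*) + \frac{1}{2}\ln(x_2^*) = \ln(12) \\
(x_1^*)^{1/4} (x_2^*)^{-1/2} = 4 &\implies \frac{1}{4}\ln(x_1^*) - \frac{1}{2}\ln(x_2^*) = \ln(4)
\end{aligned}$$

and hence:

$$\begin{aligned}
-\frac{3}{4} y_1^* + \frac{1}{2} y_2^* &= \ln(12) \\
\frac{1}{4} y_1^* - \frac{1}{2} y_2^* &= \ln(4)
\end{aligned}$$

where $y_1^* = \ln(x_1^*)$ and $y_2^* = \ln(x_2^*)$. Writing this in matrix notation we obtain:

$$\begin{bmatrix} -\frac{3}{4} & \frac{1}{2} \\ \frac{1}{4} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} y_1^* \\ y_2^* \end{bmatrix} = \begin{bmatrix} \ln(12) \\ \ln(4) \end{bmatrix}.$$

Using Cramer's rule we find that:

$$\begin{aligned}
\ln(x_1^*) = y_1^* &= \frac{\det\begin{bmatrix} \ln(12) & \frac{1}{2} \\ \ln(4) & -\frac{1}{2} \end{bmatrix}}{\det\begin{bmatrix} -\frac{3}{4} & \frac{1}{2} \\ \frac{1}{4} & -\frac{1}{2} \end{bmatrix}} = -2\ln(12) - 2\ln(4), \\
\ln(x_2^*) = y_2^* &= \frac{\det\begin{bmatrix} -\frac{3}{4} & \ln(12) \\ \frac{1}{4} & \ln(4) \end{bmatrix}}{\det\begin{bmatrix} -\frac{3}{4} & \frac{1}{2} \\ \frac{1}{4} & -\frac{1}{2} \end{bmatrix}} = -3\ln(4) - \ln(12).
\end{aligned}$$

Thus

$$\begin{aligned}
x_1^* &= e^{y_1^*} = e^{-2\ln(12) - 2\ln(4)} = \frac{1}{2304} \\
x_2^* &= e^{y_2^*} = e^{-3\ln(4) - \ln(12)} = \frac{1}{768}.
\end{aligned}$$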
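The log transformation used here can be followed step by step in code: take logs, solve the resulting linear system, and exponentiate. The following is a minimal sketch with illustrative function names; it works in floating point, so the results match the fractions above only up to rounding.

```python
import math


def solve_2x2(a, b):
    """Solve a 2x2 linear system a y = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    y1 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    y2 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return y1, y2


def stationary_point():
    """Solve the first-order conditions of Example 3 in logs, then exponentiate."""
    coeffs = [[-0.75, 0.5], [0.25, -0.5]]          # exponents on x1*, x2*
    rhs = [math.log(12.0), math.log(4.0)]          # logs of the right-hand sides
    y1, y2 = solve_2x2(coeffs, rhs)
    return math.exp(y1), math.exp(y2)


def first_order_conditions(x1, x2):
    """Both first derivatives of f = x1**(1/4) * x2**(1/2) - 3*x1 - 2*x2."""
    return (0.25 * x1 ** -0.75 * x2 ** 0.5 - 3.0,
            0.5 * x1 ** 0.25 * x2 ** -0.5 - 2.0)


if __name__ == "__main__":
    x1_star, x2_star = stationary_point()
    print(x1_star, x2_star, first_order_conditions(x1_star, x2_star))
```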


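Example 4: Suppose a perfectly competitive firm has a technology $Q = F(L, K)$ which is globally concave. Profits expressed as a function of $L$ and $K$ are given by:

$$\pi(L, K) = P F(L, K) - WL - RK.$$

Following the recipe we have:

1. The first derivatives are:

$$\begin{aligned}
\frac{\partial \pi(L, K)}{\partial L} &= P \frac{\partial F(L, K)}{\partial L} - W \\
\frac{\partial \pi(L, K)}{\partial K} &= P \frac{\partial F(L, K)}{\partial K} - R.
\end{aligned}$$

2. Putting $*$'s on $L$ and $K$ in 1 and setting these derivatives equal to 0 results in:

$$\begin{aligned}
\frac{\partial \pi(L^*, K^*)}{\partial L} &= P \frac{\partial F(L^*, K^*)}{\partial L} - W = 0 \\
\frac{\partial \pi(L^*, K^*)}{\partial K} &= P \frac{\partial F(L^*, K^*)}{\partial K} - R = 0.
\end{aligned}$$

3. Given the level of generality there is no hope of explicitly solving for $L^*$ and $K^*$ here. We can nevertheless learn something about how a competitive firm chooses $L^*$ and $K^*$ since:

$$\begin{aligned}
P \frac{\partial F(L^*, K^*)}{\partial L} - W = 0 &\implies MP_L(L^*, K^*) = \frac{W}{P} \equiv w \\
P \frac{\partial F(L^*, K^*)}{\partial K} - R = 0 &\implies MP_K(L^*, K^*) = \frac{R}{P} \equiv r
\end{aligned}$$

where $w$ and $r$ are the real wage rate and real rental cost of capital. Thus the competitive firm chooses $L^*$ and $K^*$ to equate the marginal products of labour and capital with the real wage $w$ and the real rental cost of capital $r$.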

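Example 5: Consider profit maximization in the long run with a Cobb-Douglas production function:

$$Q = F(L, K) = L^{1/2} K^{1/4}$$

with $P = 8$, $W = 5$ and $R = 4$. The profit function is then:

$$\pi(L, K) = P F(L, K) - WL - RK$$

or:

$$\pi(L, K) = 8 L^{1/2} K^{1/4} - 5L - 4K.$$

Following the recipe we have:

1. The first derivatives are:

$$\begin{aligned}
\frac{\partial \pi(L, K)}{\partial L} &= 4 L^{-1/2} K^{1/4} - 5 \\
\frac{\partial \pi(L, K)}{\partial K} &= 2 L^{1/2} K^{-3/4} - 4.
\end{aligned}$$

2. Putting $*$'s on $L$ and $K$ and setting these derivatives equal to 0 results in:

$$\begin{aligned}
\frac{\partial \pi(L^*, K^*)}{\partial L} &= 4 (L^*)^{-1/2} (K^*)^{1/4} - 5 = 0 \implies (L^*)^{-1/2} (K^*)^{1/4} = \frac{5}{4} \\
\frac{\partial \pi(L^*, K^*)}{\partial K} &= 2 (L^*)^{1/2} (K^*)^{-3/4} - 4 = 0 \implies (L^*)^{1/2} (K^*)^{-3/4} = 2.
\end{aligned}$$

3. We now attempt to solve using the $\ln(\cdot)$ function to convert these equations into two linear equations as:

$$\begin{aligned}
(L^*)^{-1/2} (K^*)^{1/4} = \frac{5}{4} &\implies -\frac{1}{2}\ln(L^*) + \frac{1}{4}\ln(K^*) = \ln\left(\frac{5}{4}\right) \\
(L^*)^{1/2} (K^*)^{-3/4} = 2 &\implies \frac{1}{2}\ln(L^*) - \frac{3}{4}\ln(K^*) = \ln(2)
\end{aligned}$$

or in matrix notation as

$$\begin{bmatrix} -\frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & -\frac{3}{4} \end{bmatrix} \begin{bmatrix} l^* \\ k^* \end{bmatrix} = \begin{bmatrix} \ln\left(\frac{5}{4}\right) \\ \ln(2) \end{bmatrix}$$

where $l^* = \ln(L^*)$ and $k^* = \ln(K^*)$. Solving these two equations by matrix inversion (or by using Cramer's rule) we obtain:

$$\begin{bmatrix} l^* \\ k^* \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & -\frac{3}{4} \end{bmatrix}^{-1} \begin{bmatrix} \ln\left(\frac{5}{4}\right) \\ \ln(2) \end{bmatrix} = \begin{bmatrix} -3\ln\left(\frac{5}{4}\right) - \ln(2) \\ -2\ln\left(\frac{5}{4}\right) - 2\ln(2) \end{bmatrix}.$$

Thus

$$\begin{aligned}
L^* &= e^{-3\ln\left(\frac{5}{4}\right) - \ln(2)} = \frac{32}{125} \\
K^* &= e^{-2\ln\left(\frac{5}{4}\right) - 2\ln(2)} = \frac{4}{25}.
\end{aligned}$$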
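A quick numerical check confirms that both first-order conditions hold at this solution, and it also illustrates the earlier result that profits equal $(1 - k)PQ$ when the production function is homogeneous of degree $k$ (here $k = \frac{3}{4}$). The names in the sketch below are illustrative assumptions.

```python
P, W, R = 8.0, 5.0, 4.0            # output price, wage and rental rate from the example
L_STAR, K_STAR = 32.0 / 125.0, 4.0 / 25.0


def F(L, K):
    """Cobb-Douglas production function L**(1/2) * K**(1/4)."""
    return L ** 0.5 * K ** 0.25


def profit(L, K):
    """Profits: revenue minus the cost of labour and capital."""
    return P * F(L, K) - W * L - R * K


def first_order_conditions(L, K):
    """The two first derivatives of the profit function."""
    return (4.0 * L ** -0.5 * K ** 0.25 - 5.0,
            2.0 * L ** 0.5 * K ** -0.75 - 4.0)


if __name__ == "__main__":
    print(first_order_conditions(L_STAR, K_STAR))
    # Profits and the Euler-theorem prediction (1 - k) * P * Q with k = 3/4.
    print(profit(L_STAR, K_STAR), (1.0 - 0.75) * P * F(L_STAR, K_STAR))
```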

4.4.2 Second-Order Conditions

A solution x¤1; x¤2; : : : x¤n to the …rst-order conditions can be either a maximum ora minimum. Clearly we want to be able to know if x¤1; x¤2; : : : x¤n is a maximumor a minimum. For example if we are interested in pro…t maximization we donot want to be at a point which minimizes pro…ts.


As with univariate calculus, the second-order conditions rely on the fact that a maximum occurs at the top of a mountain (concavity) while a minimum occurs at the bottom of a valley (convexity). We therefore use the matrix of second derivatives or the Hessian $H(x_1, x_2, \ldots, x_n)$ to determine if $x_1^*, x_2^*, \ldots, x_n^*$ is a maximum or a minimum. As before the weaker condition of local concavity (convexity) yields the weaker result of a local maximum (minimum) while the stronger condition of global concavity (convexity) yields the stronger result of a global maximum (minimum). We have:

Theorem 261 Local Maximum: If $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the first-order conditions and $f(x_1, x_2, \ldots, x_n)$ is locally concave at $x_1^*, x_2^*, \ldots, x_n^*$ so that $H(x_1^*, x_2^*, \ldots, x_n^*)$ is negative definite, then $x_1^*, x_2^*, \ldots, x_n^*$ is a local maximum of $f(x_1, x_2, \ldots, x_n)$.

Theorem 262 Local Minimum: If $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the first-order conditions and $f(x_1, x_2, \ldots, x_n)$ is locally convex at $x_1^*, x_2^*, \ldots, x_n^*$ so that $H(x_1^*, x_2^*, \ldots, x_n^*)$ is positive definite, then $x_1^*, x_2^*, \ldots, x_n^*$ is a local minimum of $f(x_1, x_2, \ldots, x_n)$.

Theorem 263 If $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the first-order conditions and $f(x_1, x_2, \ldots, x_n)$ is globally concave so that $H(x_1, x_2, \ldots, x_n)$ is a negative definite matrix for all $x_1, x_2, \ldots, x_n$, then $x_1^*, x_2^*, \ldots, x_n^*$ is the unique global maximum of $f(x_1, x_2, \ldots, x_n)$.

Theorem 264 If $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the first-order conditions and $f(x_1, x_2, \ldots, x_n)$ is globally convex so that $H(x_1, x_2, \ldots, x_n)$ is a positive definite matrix for all $x_1, x_2, \ldots, x_n$, then $x_1^*, x_2^*, \ldots, x_n^*$ is the unique global minimum of $f(x_1, x_2, \ldots, x_n)$.

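Example 1 (continued): For the function:

$$f(x_1, x_2) = 3x_1^2 - 6x_1x_2 + 5x_2^2 - 4x_1 - 2x_2 + 8$$

we showed that $x_1^* = \frac{13}{6}$, $x_2^* = \frac{3}{2}$ is a solution to the first-order conditions. To determine if this is a maximum or a minimum we need:

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} = \frac{\partial}{\partial x_1}(6x_1 - 6x_2 - 4) = 6$$

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1} = \frac{\partial}{\partial x_2}(6x_1 - 6x_2 - 4) = -6$$

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} = \frac{\partial}{\partial x_2}(-6x_1 + 10x_2 - 2) = 10$$

so that the Hessian of $f(x_1, x_2)$ is:

$$H(x_1, x_2) = \begin{bmatrix} 6 & -6 \\ -6 & 10 \end{bmatrix}.$$

Note for this particular problem the Hessian does not depend on $x_1$ and $x_2$.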


Now from the leading principal minors we have:

$$M_1 = \det[6] = 6 > 0$$

$$M_2 = \det\begin{bmatrix} 6 & -6 \\ -6 & 10 \end{bmatrix} = 24 > 0.$$

It follows then that $H(x_1, x_2)$ is positive definite for all $x_1, x_2$ and hence $f(x_1, x_2)$ is globally convex. Therefore $x_1^* = \frac{13}{6}$, $x_2^* = \frac{3}{2}$ is a global minimum.
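The leading-principal-minor test used in this example is mechanical enough to sketch in code. The functions below (`det`, `leading_principal_minors`, `classify`) are illustrative names, not part of the original text, and the Laplace expansion is only meant for the small matrices that appear in this chapter.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_principal_minors(h):
    """Return [M_1, ..., M_n], the determinants of the top-left k x k blocks."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]

def classify(h):
    minors = leading_principal_minors(h)
    if all(m > 0 for m in minors):
        return "positive definite"
    # negative definite: signs alternate, starting with M_1 < 0
    if all((m < 0) if k % 2 == 1 else (m > 0)
           for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither"

H = [[6, -6], [-6, 10]]
print(leading_principal_minors(H))  # [6, 24]
print(classify(H))                  # positive definite
```

Because this Hessian is constant, a single call settles global convexity; for a Hessian that depends on $x_1, x_2$ the same test would have to hold at every point.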

Example 2 (continued): For the function:

$$y = f(x_1, x_2) = \frac{1}{3}\ln(x_1) + \frac{2}{3}\ln(x_2) - x_1 - x_2$$

we showed that the solution to the first-order conditions is $x_1^* = \frac{1}{3}$ and $x_2^* = \frac{2}{3}$. The Hessian is given by:

$$H(x_1, x_2) = \begin{bmatrix} -\frac{1}{3x_1^2} & 0 \\ 0 & -\frac{2}{3x_2^2} \end{bmatrix}.$$

At $x_1^* = \frac{1}{3}$ and $x_2^* = \frac{2}{3}$ we have:

$$H\left(\frac{1}{3}, \frac{2}{3}\right) = \begin{bmatrix} -\frac{1}{3\left(\frac{1}{3}\right)^2} & 0 \\ 0 & -\frac{2}{3\left(\frac{2}{3}\right)^2} \end{bmatrix} = \begin{bmatrix} -3 & 0 \\ 0 & -\frac{3}{2} \end{bmatrix}$$

which is a negative definite matrix (since $M_1 = -3 < 0$ and $M_2 = \frac{9}{2} > 0$) so that it follows that $x_1^* = \frac{1}{3}$ and $x_2^* = \frac{2}{3}$ is a local maximum.

We can in fact make the stronger statement that $x_1^* = \frac{1}{3}$ and $x_2^* = \frac{2}{3}$ is a global maximum, since the Hessian is a negative definite matrix for all $x_1, x_2$: it is a diagonal matrix with negative elements along the diagonal.

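Example 3 (continued): For the function:

$$f(x_1, x_2) = x_1^{1/4}x_2^{1/2} - 3x_1 - 2x_2,$$

we showed that the solution to the first-order conditions is $x_1^* = \frac{1}{2304}$, $x_2^* = \frac{1}{768}$. We have:

$$\frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} = -\frac{3}{16}x_1^{-7/4}x_2^{1/2}, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} = -\frac{1}{4}x_1^{1/4}x_2^{-3/2}, \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} = \frac{1}{8}x_1^{-3/4}x_2^{-1/2}$$

so that the Hessian is given by:

$$H(x_1, x_2) = \begin{bmatrix} -\frac{3}{16}x_1^{-7/4}x_2^{1/2} & \frac{1}{8}x_1^{-3/4}x_2^{-1/2} \\ \frac{1}{8}x_1^{-3/4}x_2^{-1/2} & -\frac{1}{4}x_1^{1/4}x_2^{-3/2} \end{bmatrix}.$$

Substituting $x_1^* = \frac{1}{2304}$ and $x_2^* = \frac{1}{768}$ into $H(x_1, x_2)$ we find that:

$$H\left(\frac{1}{2304}, \frac{1}{768}\right) = \begin{bmatrix} -\frac{3}{16}\left(\frac{1}{2304}\right)^{-7/4}\left(\frac{1}{768}\right)^{1/2} & \frac{1}{8}\left(\frac{1}{2304}\right)^{-3/4}\left(\frac{1}{768}\right)^{-1/2} \\ \frac{1}{8}\left(\frac{1}{2304}\right)^{-3/4}\left(\frac{1}{768}\right)^{-1/2} & -\frac{1}{4}\left(\frac{1}{2304}\right)^{1/4}\left(\frac{1}{768}\right)^{-3/2} \end{bmatrix} = \begin{bmatrix} -5184 & 1152 \\ 1152 & -768 \end{bmatrix}.$$

This matrix is negative definite from the leading principal minors since:

$$M_1 = -5184 < 0$$

$$M_2 = \det\begin{bmatrix} -5184 & 1152 \\ 1152 & -768 \end{bmatrix} = 2654208 > 0.$$

It follows then that $x_1^* = \frac{1}{2304}$, $x_2^* = \frac{1}{768}$ is a local maximum.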

We can in fact prove more, that $x_1^* = \frac{1}{2304}$ and $x_2^* = \frac{1}{768}$ is a global maximum, by showing that $f(x_1, x_2)$ is globally concave. To do this we need to show that $H(x_1, x_2)$ is negative definite for all $x_1, x_2 > 0$. Using leading principal minors we have:

$$M_1 = -\frac{3}{16}x_1^{-7/4}x_2^{1/2} < 0$$

$$M_2 = \det\begin{bmatrix} -\frac{3}{16}x_1^{-7/4}x_2^{1/2} & \frac{1}{8}x_1^{-3/4}x_2^{-1/2} \\ \frac{1}{8}x_1^{-3/4}x_2^{-1/2} & -\frac{1}{4}x_1^{1/4}x_2^{-3/2} \end{bmatrix} = \frac{3}{64}x_1^{-3/2}x_2^{-1} - \frac{1}{64}x_1^{-3/2}x_2^{-1} = \frac{2}{64}x_1^{-3/2}x_2^{-1} > 0$$

so that $H(x_1, x_2)$ is a negative definite matrix for all $x_1$ and $x_2$. Thus $f(x_1, x_2)$ is globally concave and $x_1^* = \frac{1}{2304}$, $x_2^* = \frac{1}{768}$ is the unique global maximum.
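As a numerical cross-check of this example, the sketch below differentiates $f(x_1, x_2) = x_1^{1/4}x_2^{1/2} - 3x_1 - 2x_2$ twice by hand, evaluates the Hessian at the candidate point, and spot-checks the sign pattern of the minors on a small grid. The grid values are arbitrary illustrative choices, and a grid check is only suggestive, not a proof of global concavity.

```python
def hessian(x1, x2):
    """Hessian of f(x1, x2) = x1**(1/4) * x2**(1/2) - 3*x1 - 2*x2 for x1, x2 > 0."""
    f11 = -3 / 16 * x1 ** (-7 / 4) * x2 ** (1 / 2)
    f12 = 1 / 8 * x1 ** (-3 / 4) * x2 ** (-1 / 2)
    f22 = -1 / 4 * x1 ** (1 / 4) * x2 ** (-3 / 2)
    return [[f11, f12], [f12, f22]]

def minors(h):
    """Leading principal minors (M1, M2) of a 2 x 2 matrix."""
    return h[0][0], h[0][0] * h[1][1] - h[0][1] * h[1][0]

H_star = hessian(1 / 2304, 1 / 768)
M1, M2 = minors(H_star)
print(H_star)
print(M1, M2)

# Spot-check negative definiteness (M1 < 0, M2 > 0) on a grid of positive points
grid = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
all_negative_definite = all(
    minors(hessian(a, b))[0] < 0 and minors(hessian(a, b))[1] > 0
    for a in grid for b in grid
)
print(all_negative_definite)
```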

Example 4 (continued): Given the problem of maximizing profits:

$$\pi(L,K) = PF(L,K) - WL - RK$$

where the production function $Q = F(L,K)$ is globally concave so that its Hessian:

$$H_F(L,K) = \begin{bmatrix} \frac{\partial^2 F(L,K)}{\partial L^2} & \frac{\partial^2 F(L,K)}{\partial L \partial K} \\ \frac{\partial^2 F(L,K)}{\partial L \partial K} & \frac{\partial^2 F(L,K)}{\partial K^2} \end{bmatrix}$$

is negative definite for all $L$ and $K$. The Hessian of the profit function is then:

$$H_\pi(L,K) = \begin{bmatrix} \frac{\partial^2 \pi(L,K)}{\partial L^2} & \frac{\partial^2 \pi(L,K)}{\partial L \partial K} \\ \frac{\partial^2 \pi(L,K)}{\partial L \partial K} & \frac{\partial^2 \pi(L,K)}{\partial K^2} \end{bmatrix} = \begin{bmatrix} P\frac{\partial^2 F(L,K)}{\partial L^2} & P\frac{\partial^2 F(L,K)}{\partial L \partial K} \\ P\frac{\partial^2 F(L,K)}{\partial L \partial K} & P\frac{\partial^2 F(L,K)}{\partial K^2} \end{bmatrix} = P\,H_F(L,K).$$

Since $H_F(L,K)$ is negative definite for all $(L,K)$ (since $F(L,K)$ is concave), and since $P > 0$, it follows that $H_\pi(L,K)$ is also negative definite for all $(L,K)$. Thus $\pi(L,K)$ is globally concave and hence the $L^*, K^*$ which solves the first-order conditions is the unique global maximum.

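Example 5: Consider the long-run profit maximization problem with:

$$\pi(L,K) = 8L^{\frac{1}{2}}K^{\frac{1}{4}} - 5L - 4K.$$

We showed that the solution to the first-order conditions is $L^* = \frac{32}{125}$, $K^* = \frac{4}{25}$. We would like to show that this is in fact a global profit maximum. The Hessian of $\pi(L,K)$ is:

$$H(L,K) = \begin{bmatrix} -2L^{-\frac{3}{2}}K^{\frac{1}{4}} & L^{-\frac{1}{2}}K^{-\frac{3}{4}} \\ L^{-\frac{1}{2}}K^{-\frac{3}{4}} & -\frac{3}{2}L^{\frac{1}{2}}K^{-\frac{7}{4}} \end{bmatrix}.$$

Using leading principal minors we have:

$$M_1 = -2L^{-\frac{3}{2}}K^{\frac{1}{4}} < 0$$

$$M_2 = 3L^{-1}K^{-\frac{3}{2}} - \left(L^{-\frac{1}{2}}K^{-\frac{3}{4}}\right)^2 = 2L^{-1}K^{-\frac{3}{2}} > 0.$$

Thus $H(L,K)$ is negative definite for all $L$ and $K$ so that $\pi(L,K)$ is globally concave and hence $L^* = \frac{32}{125}$ and $K^* = \frac{4}{25}$ is a global maximum.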
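One informal way to see the global-maximum claim is to compare profit at $(L^*, K^*)$ with profit on a coarse grid. The sketch below does this; the grid bounds and step size are arbitrary assumptions chosen for illustration, and the comparison is a sanity check rather than a substitute for the Hessian argument.

```python
def profit(L, K):
    """Profit from Example 5: 8 L^(1/2) K^(1/4) - 5 L - 4 K."""
    return 8 * L ** 0.5 * K ** 0.25 - 5 * L - 4 * K

L_star, K_star = 32 / 125, 4 / 25
pi_star = profit(L_star, K_star)

# Illustrative grid over (0, 2] x (0, 2]; bounds and step are arbitrary choices
step = 0.01
best_on_grid = max(
    profit(i * step, j * step)
    for i in range(1, 201)
    for j in range(1, 201)
)

print(pi_star)                   # profit at the first-order solution
print(best_on_grid <= pi_star)   # no grid point should do better
```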

4.5 Quasi-Concavity and Quasi-Convexity

4.5.1 Ordinal and Cardinal Properties

Just as with univariate functions, multivariate functions have both ordinal and cardinal properties defined in exactly the same manner:

Definition 265 Ordinal Property: An ordinal property of a function $f(x_1, x_2, \ldots, x_n)$ is one which remains unchanged when any monotonic transformation $g(x)$ is applied; that is both $f(x_1, x_2, \ldots, x_n)$ and $g(f(x_1, x_2, \ldots, x_n))$ share the property for any $g(x)$ with $g'(x) > 0$.

Definition 266 Cardinal Property: A cardinal property of a function $f(x_1, x_2, \ldots, x_n)$ is one which does change when a monotonic transformation is applied.

Just as with univariate functions, the global maximum or minimum is an ordinal property while concavity or convexity is a cardinal property.

Theorem 267 A Global Maximum or Minimum $x_1^*, x_2^*, \ldots, x_n^*$ is an Ordinal Property; that is $x_1^*, x_2^*, \ldots, x_n^*$ is a global maximum or minimum of $f(x_1, x_2, \ldots, x_n)$ if and only if it is also a global maximum or minimum of $g(f(x_1, x_2, \ldots, x_n))$ with $g'(x) > 0$.


Theorem 268 Concavity and Convexity are Cardinal Properties: If $f(x_1, x_2, \ldots, x_n)$ is globally concave or convex it does not follow that $g(f(x_1, x_2, \ldots, x_n))$ (with $g'(x) > 0$) is globally concave or convex.

Again this leads us to define a weaker notion of concavity or convexity which is an ordinal property:

Definition 269 Quasi-Concavity: A function $f(x_1, x_2, \ldots, x_n)$ is quasi-concave if and only if it can be written as:

$$f(x_1, x_2, \ldots, x_n) = g(h(x_1, x_2, \ldots, x_n))$$

with $g'(x) > 0$ and where $h(x_1, x_2, \ldots, x_n)$ is globally concave.

Definition 270 Quasi-Convexity: A function $f(x_1, x_2, \ldots, x_n)$ is quasi-convex if and only if it can be written as:

$$f(x_1, x_2, \ldots, x_n) = g(h(x_1, x_2, \ldots, x_n))$$

with $g'(x) > 0$ and where $h(x_1, x_2, \ldots, x_n)$ is globally convex.

We have:

Theorem 271 Quasi-Concavity and Quasi-Convexity are Ordinal Properties.

Just as with univariate functions, one can show that a given function is quasi-concave (quasi-convex) by finding a monotonic transformation $g(x)$ which makes the function concave (convex). Thus:

Theorem 272 A function $f(x_1, x_2, \ldots, x_n)$ is quasi-concave (quasi-convex) if and only if there exists a monotonic transformation $g(x)$ such that

$$h(x_1, x_2, \ldots, x_n) = g(f(x_1, x_2, \ldots, x_n))$$

is globally concave (globally convex).

Example: Consider the function:

$$f(x_1, x_2) = x_1^2x_2^4$$

for $x_1 > 0$ and $x_2 > 0$. You can verify that the Hessian $H_f(x_1, x_2)$ of $f(x_1, x_2)$ is given by:

$$H_f(x_1, x_2) = \begin{bmatrix} 2x_2^4 & 8x_1x_2^3 \\ 8x_1x_2^3 & 12x_1^2x_2^2 \end{bmatrix}.$$


The function $x_1^2x_2^4$ is not concave since the diagonal elements of $H_f(x_1, x_2)$ are both positive. The function $x_1^2x_2^4$ is also not convex since $H_f(x_1, x_2)$ is not positive definite since:

$$M_2 = \det[H_f(x_1, x_2)] = \det\begin{bmatrix} 2x_2^4 & 8x_1x_2^3 \\ 8x_1x_2^3 & 12x_1^2x_2^2 \end{bmatrix} = -40x_2^6x_1^2 < 0.$$

We can however show that $x_1^2x_2^4$ is quasi-concave. We will do this two different ways. First we show that $f(x_1, x_2)$ is a monotonic function of a concave function since:

$$f(x_1, x_2) = x_1^2x_2^4 = e^{2\ln(x_1) + 4\ln(x_2)}$$

so that the monotonic transformation is $g(x) = e^x$ (with $g'(x) = e^x > 0$) and the concave function is:

$$h(x_1, x_2) = 2\ln(x_1) + 4\ln(x_2)$$

since the Hessian of $h(x_1, x_2)$ is:

$$H_h(x_1, x_2) = \begin{bmatrix} -\frac{2}{x_1^2} & 0 \\ 0 & -\frac{4}{x_2^2} \end{bmatrix}$$

which is negative definite for all $(x_1, x_2)$ (since it is a diagonal matrix with negative diagonal elements). We conclude then that $x_1^2x_2^4$ is quasi-concave.

Now we show that $f(x_1, x_2)$ is quasi-concave by finding a monotonic transformation $g(x)$ which transforms $f(x_1, x_2)$ into a concave function. To this end let:

$$g(x) = \ln(x) \quad \text{with} \quad g'(x) = \frac{1}{x} > 0$$

so that:

$$h(x_1, x_2) = g(f(x_1, x_2)) = \ln\left(x_1^2x_2^4\right) = 2\ln(x_1) + 4\ln(x_2).$$

We have already shown that $h(x_1, x_2)$ is globally concave and so the quasi-concavity of $x_1^2x_2^4$ follows.
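The two Hessians in this example can be compared numerically. The sketch below evaluates them at a few arbitrarily chosen positive points (the points are illustrative assumptions) and confirms that $\det H_f = -40x_1^2x_2^6$ while $H_h$ passes the negative-definiteness test.

```python
def hessian_f(x1, x2):
    """Hessian of f(x1, x2) = x1**2 * x2**4."""
    return [[2 * x2 ** 4, 8 * x1 * x2 ** 3],
            [8 * x1 * x2 ** 3, 12 * x1 ** 2 * x2 ** 2]]

def hessian_h(x1, x2):
    """Hessian of h(x1, x2) = 2 ln(x1) + 4 ln(x2)."""
    return [[-2 / x1 ** 2, 0.0],
            [0.0, -4 / x2 ** 2]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

points = [(0.5, 0.5), (1.0, 2.0), (3.0, 1.5)]  # arbitrary positive test points
for x1, x2 in points:
    Hf, Hh = hessian_f(x1, x2), hessian_h(x1, x2)
    h_negative_definite = Hh[0][0] < 0 and det2(Hh) > 0
    # columns: point, det of H_f, closed form -40 x1^2 x2^6, is H_h negative definite
    print((x1, x2), det2(Hf), -40 * x1 ** 2 * x2 ** 6, h_negative_definite)
```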

4.5.2 Sufficient Conditions for a Global Maximum or Minimum

To ensure that a solution to the first-order conditions $x_1^*, x_2^*, \ldots, x_n^*$ is a global maximum (minimum) we do not necessarily need concavity (convexity), we only need the weaker conditions of quasi-concavity (quasi-convexity). In particular:

Theorem 273 Suppose that $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the first-order conditions and $f(x_1, x_2, \ldots, x_n)$ is quasi-concave, then $x_1^*, x_2^*, \ldots, x_n^*$ is the unique global maximum of $f(x_1, x_2, \ldots, x_n)$.


Theorem 274 Suppose that $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the first-order conditions and $f(x_1, x_2, \ldots, x_n)$ is quasi-convex, then $x_1^*, x_2^*, \ldots, x_n^*$ is the unique global minimum of $f(x_1, x_2, \ldots, x_n)$.

Example: Consider a scaled version of the bivariate standard normal distribution:

$$f(x_1, x_2) = e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)}.$$

To find the first-order conditions we first calculate:

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = -x_1 \times e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)}, \quad \frac{\partial f(x_1, x_2)}{\partial x_2} = -x_2 \times e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)}.$$

Now we put $*$'s on the $x_i$'s and solve for $x_1^*$ and $x_2^*$ as:

$$\frac{\partial f(x_1^*, x_2^*)}{\partial x_1} = 0 = -x_1^* \times e^{-\frac{1}{2}\left(x_1^{*2} + x_2^{*2}\right)} \implies x_1^* = 0$$

$$\frac{\partial f(x_1^*, x_2^*)}{\partial x_2} = 0 = -x_2^* \times e^{-\frac{1}{2}\left(x_1^{*2} + x_2^{*2}\right)} \implies x_2^* = 0.$$

We would like to show that $x_1^* = 0$, $x_2^* = 0$ is a global maximum. The Hessian of $f(x_1, x_2)$ is:

$$H(x_1, x_2) = e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)}\begin{bmatrix} x_1^2 - 1 & x_1x_2 \\ x_1x_2 & x_2^2 - 1 \end{bmatrix}$$

and for $x_1^* = 0$, $x_2^* = 0$ we have:

$$H(0, 0) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$

which is negative definite and so we conclude that $x_1^* = 0$, $x_2^* = 0$ is a local maximum.

We cannot use concavity to show that $x_1^* = 0$, $x_2^* = 0$ is a global maximum since at $x_1 = 2$, $x_2 = 2$ we have:

$$H(2, 2) = e^{-4}\begin{bmatrix} 3 & 4 \\ 4 & 3 \end{bmatrix}$$

which is not negative definite (since it has positive elements on the diagonal). It follows that $f(x_1, x_2)$ is not concave.

We can show that $f(x_1, x_2)$ is quasi-concave. Note that:

$$f(x_1, x_2) = e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)} = g(h(x_1, x_2))$$

with monotonic function $g(x) = e^x$ and concave function:

$$h(x_1, x_2) = -\frac{1}{2}\left(x_1^2 + x_2^2\right).$$

The function $h(x_1, x_2)$ is concave since its Hessian is given by:

$$\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$

which is a diagonal matrix with negative elements along the diagonal and hence is negative definite for all $x_1, x_2$. It follows then that $f(x_1, x_2)$ is quasi-concave and hence that $x_1^* = 0$, $x_2^* = 0$ is a global maximum for both $f(x_1, x_2)$ and $h(x_1, x_2)$.
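The contrast between the Hessian at the origin and at $(2, 2)$ can be reproduced with a few lines of code. This is a minimal sketch under the assumption that the function is $f(x_1, x_2) = e^{-\frac{1}{2}(x_1^2 + x_2^2)}$ as stated in the example; the sampling grid at the end is an arbitrary illustrative choice.

```python
import math

def f(x1, x2):
    return math.exp(-0.5 * (x1 ** 2 + x2 ** 2))

def hessian(x1, x2):
    """Hessian of f: f(x1, x2) times [[x1^2 - 1, x1 x2], [x1 x2, x2^2 - 1]]."""
    scale = f(x1, x2)
    return [[scale * (x1 ** 2 - 1), scale * x1 * x2],
            [scale * x1 * x2, scale * (x2 ** 2 - 1)]]

def is_negative_definite(h):
    return h[0][0] < 0 and h[0][0] * h[1][1] - h[0][1] * h[1][0] > 0

print(is_negative_definite(hessian(0.0, 0.0)))  # True
print(is_negative_definite(hessian(2.0, 2.0)))  # False

# f = exp(h) with h = -(x1^2 + x2^2)/2, so f should never exceed f(0, 0) = 1
samples = [(-3 + 0.25 * i, -3 + 0.25 * j) for i in range(25) for j in range(25)]
print(max(f(a, b) for a, b in samples) == f(0.0, 0.0))
```

The failed test at $(2, 2)$ is what rules out global concavity, while the sampled values stay below $f(0, 0)$, which is consistent with the quasi-concavity argument.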

4.5.3 Indifference Curves and Quasi-Concavity

Suppose a household has a utility function:

$$U(Q_1, Q_2)$$

where the marginal utilities of goods 1 and 2 are:

$$\frac{\partial U(Q_1, Q_2)}{\partial Q_1} \equiv U_1 > 0, \quad \frac{\partial U(Q_1, Q_2)}{\partial Q_2} \equiv U_2 > 0$$

and the second derivatives are:

$$U_{11} = \frac{\partial^2 U(Q_1, Q_2)}{\partial Q_1^2}, \quad U_{22} = \frac{\partial^2 U(Q_1, Q_2)}{\partial Q_2^2}, \quad U_{12} = \frac{\partial^2 U(Q_1, Q_2)}{\partial Q_1 \partial Q_2}.$$

We define an indifference curve from the utility function as follows:

Definition 275 Indifference Curve: An indifference curve corresponding to utility level $c$, written as $Q_2 = f(Q_1)$, is all combinations of $Q_1, Q_2$ which yield $c$ units of utility or:

$$U(Q_1, f(Q_1)) = c.$$

We define the slope of the indifference curve as:

Definition 276 Marginal Rate of Substitution: The marginal rate of substitution is:

$$MRS(Q_1, Q_2) = f'(Q_1).$$

Definition 277 We say that the indifference curve has a diminishing marginal rate of substitution if $f''(Q_1) > 0$.


Example: Given the Cobb-Douglas utility function:

$$U(Q_1, Q_2) = Q_1^2Q_2^4$$

then an indifference curve which yields 5 units of utility is defined as:

$$U(Q_1, Q_2) = Q_1^2Q_2^4 = 5 \implies Q_2 = 5^{\frac{1}{4}}Q_1^{-\frac{1}{2}}$$

so that the indifference curve is $Q_2 = 5^{\frac{1}{4}}Q_1^{-\frac{1}{2}}$. This indifference curve is plotted below:

[Figure: Indifference Curve, with axes $Q_1$ and $Q_2$]

Note that this indifference curve has the correct shape: it is downward sloping and convex or bent towards the origin. It therefore exhibits a diminishing marginal rate of substitution.

You can verify that this is true for all indifference curves where the indifference curve corresponding to $c$ units of utility is:

$$Q_2 = f(Q_1) = c^{\frac{1}{4}}Q_1^{-\frac{1}{2}}.$$

We can show that all indifference curves slope downwards and show that the slope or marginal rate of substitution is equal to the negative of the ratio of the marginal utilities:

Theorem 278 Given a utility function $U(Q_1, Q_2)$ the marginal rate of substitution is:

$$f'(Q_1) = -\frac{\partial U(Q_1, Q_2) / \partial Q_1}{\partial U(Q_1, Q_2) / \partial Q_2} = -\frac{U_1}{U_2} < 0.$$

Proof. An indifference curve is defined as:

$$U(Q_1, f(Q_1)) = c.$$


Let $U(Q_1, Q_2)$ be the outside function and let $g_1(Q_1) = Q_1$ and $g_2(Q_1) = f(Q_1)$ be the two inside functions. Differentiating both sides of $U(Q_1, f(Q_1)) = c$ with respect to $Q_1$ and using the chain rule we find that:

$$\frac{\partial U(g_1(Q_1), g_2(Q_1))}{\partial Q_1}g_1'(Q_1) + \frac{\partial U(g_1(Q_1), g_2(Q_1))}{\partial Q_2}g_2'(Q_1) = 0.$$

Since $g_1'(Q_1) = 1$ and $g_2'(Q_1) = f'(Q_1)$ we have:

$$\frac{\partial U(Q_1, f(Q_1))}{\partial Q_1} + \frac{\partial U(Q_1, f(Q_1))}{\partial Q_2}f'(Q_1) = 0$$

and since $Q_2 = f(Q_1)$ we can write this as:

$$\frac{\partial U(Q_1, Q_2)}{\partial Q_1} + \frac{\partial U(Q_1, Q_2)}{\partial Q_2}f'(Q_1) = 0$$

so that solving for $f'(Q_1)$ and using $U_1 > 0$ and $U_2 > 0$ the result follows.

Now suppose we take a monotonic transformation of $U(Q_1, Q_2)$ as:

$$\tilde{U}(Q_1, Q_2) = g(U(Q_1, Q_2))$$

where $g'(x) > 0$. We might then ask what kind of utility function is $\tilde{U}(Q_1, Q_2)$? The quite surprising answer is that for all practical purposes $U(Q_1, Q_2)$ and $\tilde{U}(Q_1, Q_2)$ are the same utility function! Actual economic behavior depends on the indifference curves, and the indifference curves of $U(Q_1, Q_2)$ and $\tilde{U}(Q_1, Q_2)$ are identical; that is:

$$U(Q_1, Q_2) = c \iff \tilde{U}(Q_1, Q_2) = g(c)$$

or all combinations $Q_1, Q_2$ which yield $c$ units of utility given $U(Q_1, Q_2)$ also yield $g(c)$ units of utility given $\tilde{U}(Q_1, Q_2)$. The only difference is that with one utility function the indifference curve has the utility number $c$ associated with it while the other has the utility number $g(c)$ associated with it. Actual economic behavior though does not depend on what utility numbers we attach to each indifference curve and so $U(Q_1, Q_2)$ and $\tilde{U}(Q_1, Q_2)$ both represent the same preferences and hence economic behavior.

We thus have:

Theorem 279 An indifference curve of a utility function $U(Q_1, Q_2)$ is an ordinal property of $U(Q_1, Q_2)$.

Example: Given the Cobb-Douglas utility function:

$$U(Q_1, Q_2) = Q_1^2Q_2^4$$

if we transform it with $g(x) = \ln(x)$ (with $g'(x) = \frac{1}{x} > 0$) then

$$\tilde{U}(Q_1, Q_2) = \ln\left(Q_1^2Q_2^4\right) = 2\ln(Q_1) + 4\ln(Q_2)$$

is an equivalent utility function and has the same indifference curves. Thus if we calculate the indifference curve for $\tilde{U}(Q_1, Q_2)$ which yields $\ln(5)$ units of utility we obtain:

$$\tilde{U}(Q_1, Q_2) = 2\ln(Q_1) + 4\ln(Q_2) = \ln(5) \implies Q_1^2Q_2^4 = 5 \implies Q_2 = 5^{\frac{1}{4}}Q_1^{-\frac{1}{2}}$$

which is the identical indifference curve for 5 units of utility for $U(Q_1, Q_2) = Q_1^2Q_2^4$ that we calculated above.

In fact all of the utility functions below lead to the same indifference curves:

$$Q_1^2Q_2^4, \quad Q_1^{\frac{1}{3}}Q_2^{\frac{2}{3}}, \quad e^{Q_1^{\frac{1}{5}}Q_2^{\frac{2}{5}}}, \quad \frac{1}{3}\ln(Q_1) + \frac{2}{3}\ln(Q_2), \quad 2\ln(Q_1) + 4\ln(Q_2).$$
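A quick numerical illustration of this invariance: the sketch below estimates the slope $-U_1/U_2$ of the indifference curve by central differences for each of the five utility functions listed above, at one arbitrarily chosen bundle. The bundle $(1.5, 2)$ and the step size are illustrative assumptions. For these functions Theorem 278 gives $-U_1/U_2 = -\frac{Q_2}{2Q_1}$, so all five slopes should agree.

```python
import math

# The five utility functions listed above; each is a monotonic transform of Q1^2 Q2^4
utilities = {
    "Q1^2 Q2^4": lambda q1, q2: q1 ** 2 * q2 ** 4,
    "Q1^(1/3) Q2^(2/3)": lambda q1, q2: q1 ** (1 / 3) * q2 ** (2 / 3),
    "exp(Q1^(1/5) Q2^(2/5))": lambda q1, q2: math.exp(q1 ** 0.2 * q2 ** 0.4),
    "1/3 ln Q1 + 2/3 ln Q2": lambda q1, q2: math.log(q1) / 3 + 2 * math.log(q2) / 3,
    "2 ln Q1 + 4 ln Q2": lambda q1, q2: 2 * math.log(q1) + 4 * math.log(q2),
}

def mrs(u, q1, q2, eps=1e-6):
    """Slope of the indifference curve, -U1/U2, using central differences."""
    u1 = (u(q1 + eps, q2) - u(q1 - eps, q2)) / (2 * eps)
    u2 = (u(q1, q2 + eps) - u(q1, q2 - eps)) / (2 * eps)
    return -u1 / u2

q1, q2 = 1.5, 2.0  # arbitrary bundle
slopes = {name: mrs(u, q1, q2) for name, u in utilities.items()}
for name, slope in slopes.items():
    print(name, slope)  # each should be close to -q2 / (2 * q1) = -2/3
```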

The question now is under what circumstances do indifference curves exhibit a diminishing marginal rate of substitution? A good first guess would be the concavity of the utility function. Although this is sufficient it cannot be necessary. For example we have seen that $U(Q_1, Q_2) = Q_1^2Q_2^4$ exhibits a diminishing marginal rate of substitution even though $U(Q_1, Q_2) = Q_1^2Q_2^4$ is not concave since concavity requires that the diagonal elements of the Hessian be negative while:

$$U_{11} = 2Q_2^4 > 0, \quad U_{22} = 12Q_1^2Q_2^2 > 0.$$

A necessary and sufficient condition is in fact quasi-concavity. We have:

Theorem 280 The utility function $U(Q_1, Q_2)$ exhibits a diminishing marginal rate of substitution if and only if it is quasi-concave.

Example: Despite not being concave, the utility function $U(Q_1, Q_2) = Q_1^2Q_2^4$ is quasi-concave since:

$$\tilde{U}(Q_1, Q_2) = \ln\left(Q_1^2Q_2^4\right) = 2\ln(Q_1) + 4\ln(Q_2)$$

and $\tilde{U}(Q_1, Q_2)$ is globally concave.

We can obtain the following calculus test for the quasi-concavity of the utility function:

Theorem 281 A utility function $U(Q_1, Q_2)$ with $U_1 > 0$ and $U_2 > 0$ is quasi-concave if and only if $\det[H] > 0$ where

$$\det[H] = \det\begin{bmatrix} 0 & U_1 & U_2 \\ U_1 & U_{11} & U_{12} \\ U_2 & U_{12} & U_{22} \end{bmatrix} = -U_2^2U_{11} + 2U_{12}U_1U_2 - U_1^2U_{22}.$$


Proof. Using the multivariate chain rule on

$$\frac{\partial U(Q_1, f(Q_1))}{\partial Q_1} + \frac{\partial U(Q_1, f(Q_1))}{\partial Q_2}f'(Q_1) = 0$$

we find that:

$$U_{11} + 2U_{12}f'(Q_1) + U_{22}f'(Q_1)^2 + U_2f''(Q_1) = 0.$$

Substituting $f'(Q_1) = -\frac{U_1}{U_2}$ we obtain:

$$U_{11} - 2U_{12}\frac{U_1}{U_2} + U_{22}\left(\frac{U_1}{U_2}\right)^2 + U_2f''(Q_1) = 0$$

from which it follows that:

$$f''(Q_1) = -\frac{1}{U_2}\left(U_{11} - 2U_{12}\frac{U_1}{U_2} + U_{22}\left(\frac{U_1}{U_2}\right)^2\right) = \frac{1}{U_2^3}\left(-U_2^2U_{11} + 2U_{12}U_1U_2 - U_1^2U_{22}\right) = \frac{1}{U_2^3}\det[H].$$

Since $U_2 > 0$ it follows that $f''(Q_1) > 0$ if and only if $\det[H] > 0$.

Remark 1: This matrix is sometimes referred to as the bordered Hessian. It contains the ordinary Hessian of $U(Q_1, Q_2)$ in the lower right-hand corner and is bordered by the first derivatives on either side with a 0 in the upper left-hand corner.

Example: For

$$U(Q_1, Q_2) = Q_1^2Q_2^4$$

the bordered Hessian is:

$$H = \begin{bmatrix} 0 & 2Q_1Q_2^4 & 4Q_1^2Q_2^3 \\ 2Q_1Q_2^4 & 2Q_2^4 & 8Q_1Q_2^3 \\ 4Q_1^2Q_2^3 & 8Q_1Q_2^3 & 12Q_1^2Q_2^2 \end{bmatrix}$$

and (with some work) you can show that:

$$\det[H] = 48Q_1^4Q_2^{10} > 0$$

so that, as we already knew, $U(Q_1, Q_2)$ is quasi-concave.
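The "some work" can be delegated to a short script. The sketch below builds the bordered Hessian for $U(Q_1, Q_2) = Q_1^2Q_2^4$ at a few integer bundles (arbitrary illustrative choices) and compares its determinant with $48Q_1^4Q_2^{10}$. The helper names `det3` and `bordered_hessian` are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def bordered_hessian(q1, q2):
    """Bordered Hessian of U(Q1, Q2) = Q1**2 * Q2**4."""
    u1, u2 = 2 * q1 * q2 ** 4, 4 * q1 ** 2 * q2 ** 3
    u11, u12, u22 = 2 * q2 ** 4, 8 * q1 * q2 ** 3, 12 * q1 ** 2 * q2 ** 2
    return [[0, u1, u2],
            [u1, u11, u12],
            [u2, u12, u22]]

for q1, q2 in [(1, 1), (2, 3), (5, 2)]:
    # the two numbers on each line should coincide
    print((q1, q2), det3(bordered_hessian(q1, q2)), 48 * q1 ** 4 * q2 ** 10)
```

Integer inputs keep the arithmetic exact, so the two columns can be compared with strict equality.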


4.6 Constrained Optimization

Economics whether normative or positive, has not simply been the study of the allocation of scarce resources, it has been the study of the rational allocation of scarce resources. -Herbert A. Simon

Typically in economics when rational agents attempt to maximize profits or utility, or to minimize costs or expenditure, they are not free to choose any one of the variables they control. Instead they face some constraint that restricts the choices they can make. This is because economics is about scarcity and scarcity imposes constraints on economies and agents. For example a household maximizing utility cannot choose any bundle of goods it might want, but can only choose from amongst those bundles that it can afford; that is, those which satisfy the household's budget constraint.

This leads to a new kind of optimization problem from what we have considered so far where instead of working directly with the objective function $f(x_1, x_2, \ldots, x_n)$ we construct a new function, the Lagrangian:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n)$$

and work with this function instead.

Economists work with Lagrangians all the time. In a way it is the most important mathematical technique for you to learn if you want to go on in economics.

4.6.1 The Lagrangian

Suppose we wish to maximize or minimize a multivariate function

$$f(x_1, x_2, \ldots, x_n)$$

subject to a constraint. The constraint is written as:

$$g(x_1, x_2, \ldots, x_n) = 0.$$

This means that in maximizing or minimizing $f(x_1, x_2, \ldots, x_n)$ we can only choose those $x_1, x_2, \ldots, x_n$ which make $g(x_1, x_2, \ldots, x_n)$ equal to zero.

To do this we construct the Lagrangian, which is a function of $n + 1$ variables: the Lagrange multiplier $\lambda$, which is a scalar, and the $n$ components $x_1, x_2, \ldots, x_n$. We have:

Definition 282 Corresponding to the problem of maximizing or minimizing the objective function:

$$f(x_1, x_2, \ldots, x_n)$$

subject to the constraint

$$g(x_1, x_2, \ldots, x_n) = 0$$

is the Lagrangian given by:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) + \lambda g(x_1, x_2, \ldots, x_n).$$

At the beginning students often make mistakes in constructing the Lagrangian. To avoid these errors consider the following step-by-step recipe:

A Recipe for Constructing the Lagrangian

1. Identify the objective function, the function to be maximized or minimized:

$$f(x_1, x_2, \ldots, x_n).$$

2. Identify the constraint and, if necessary, rewrite the constraint in the form of

$$\underline{\qquad\qquad} = 0.$$

3. Write the Lagrangian function using $\mathcal{L}$ with the first argument the Lagrange multiplier $\lambda$ followed by the $x_i$'s. We thus write:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) =$$

4. After the equality sign in 3. write the objective function $f(x_1, x_2, \ldots, x_n)$ followed by $+\lambda$ and then brackets as:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) + \lambda(\qquad).$$

5. Inside the brackets in 4. put the expression on the left-hand side of the constraint written as $\underline{\qquad} = 0$ as:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) + \lambda\Big(\underbrace{\underline{\qquad\qquad}}_{\text{left-side of } \underline{\quad} = 0 \text{ from 2.}}\Big)$$

which then gives the Lagrangian.

Example 1: Consider the problem of minimizing

$$x_1^2 + \frac{1}{2}x_2^2$$

subject to the constraint that $x_1$ and $x_2$ sum to 1 or that:

$$x_1 + x_2 = 1.$$


1. We first identify what is the constraint and what is to be minimized. Here a typical error would be to confuse $x_1 + x_2$, which forms part of the constraint, with $x_1^2 + \frac{1}{2}x_2^2$ which is the objective function. The objective function is what we wish to minimize:

$$f(x_1, x_2) = x_1^2 + \frac{1}{2}x_2^2.$$

2. The constraint is that $x_1 + x_2 = 1$. We need to rewrite this as $\underline{\qquad} = 0$. This is easily done by putting $x_1 + x_2$ on the other side of the equals sign as:

$$x_1 + x_2 = 1 \implies 1 - x_1 - x_2 = 0$$

so that $g(x_1, x_2)$ is given by:

$$g(x_1, x_2) = 1 - x_1 - x_2 = 0.$$

3. Here the Lagrangian is a function of $\lambda$ and $x_1$ and $x_2$ so we write:

$$\mathcal{L}(\lambda, x_1, x_2) =$$

4. After the equals sign in 3. we write the objective function from 1. followed by $+\lambda(\quad)$ as:

$$\mathcal{L}(\lambda, x_1, x_2) = \underbrace{x_1^2 + \frac{1}{2}x_2^2}_{\text{objective function}} + \lambda(\underline{\qquad}).$$

5. Inside the brackets we place the left-hand side of the constraint written as $\underline{\qquad} = 0$. Thus from 2. we have:

$$\mathcal{L}(\lambda, x_1, x_2) = x_1^2 + \frac{1}{2}x_2^2 + \lambda\Big(\underbrace{1 - x_1 - x_2}_{\text{from 2}}\Big).$$

Example 2: Suppose a household wishes to maximize utility:

$$U(Q_1, Q_2)$$

where $Q_1$ and $Q_2$ are the amounts of good 1 and good 2 that the household consumes. The household has income $Y$, the price of $Q_1$ is $P_1$ and the price of $Q_2$ is $P_2$ so that the budget constraint is:

$$Y = P_1Q_1 + P_2Q_2.$$

We need to rewrite this as $g(Q_1, Q_2) = 0$. This can be done in a number of ways. Here we will use:

$$Y = P_1Q_1 + P_2Q_2 \implies Y - P_1Q_1 - P_2Q_2 = 0$$

so that the constraint is:

$$g(Q_1, Q_2) = Y - P_1Q_1 - P_2Q_2 = 0.$$

The Lagrangian is therefore:

$$\mathcal{L}(\lambda, Q_1, Q_2) = \underbrace{U(Q_1, Q_2)}_{\text{objective function}} + \lambda\Big(\underbrace{Y - P_1Q_1 - P_2Q_2}_{\text{constraint}}\Big).$$

Example 3: Suppose now that we have the particular utility function:

$$U(Q_1, Q_2) = 0.3\ln(Q_1) + 0.7\ln(Q_2)$$

and as above the budget constraint is:

$$Y = P_1Q_1 + P_2Q_2$$

or:

$$g(Q_1, Q_2) = Y - P_1Q_1 - P_2Q_2 = 0.$$

The Lagrangian is therefore:

$$\mathcal{L}(\lambda, Q_1, Q_2) = \underbrace{0.3\ln(Q_1) + 0.7\ln(Q_2)}_{= U(Q_1, Q_2)} + \lambda\Big(\underbrace{Y - P_1Q_1 - P_2Q_2}_{= g(Q_1, Q_2)}\Big).$$
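The recipe translates directly into a small higher-order function. The sketch below is an illustration only: `make_lagrangian` is a hypothetical helper name, and the income and prices are assumed values, not taken from the text. It builds the Lagrangian of Example 3 from its objective and its constraint written in the form `___ = 0`.

```python
import math

def make_lagrangian(objective, constraint):
    """Return L(lam, *x) = objective(*x) + lam * constraint(*x),
    where the constraint is written so that constraint(*x) = 0."""
    def lagrangian(lam, *x):
        return objective(*x) + lam * constraint(*x)
    return lagrangian

# Illustrative income and prices (assumed values, not from the text)
Y, P1, P2 = 100.0, 2.0, 5.0

def utility(q1, q2):
    return 0.3 * math.log(q1) + 0.7 * math.log(q2)

def budget(q1, q2):
    # budget constraint rewritten in the form ___ = 0
    return Y - P1 * q1 - P2 * q2

lagrangian = make_lagrangian(utility, budget)

# On the budget line the constraint term vanishes, so L equals U for any lambda
q1, q2 = 10.0, 16.0   # 2 * 10 + 5 * 16 = 100
print(lagrangian(0.5, q1, q2), utility(q1, q2))
```

Keeping $\lambda$ as the first argument mirrors the ordering $\mathcal{L}(\lambda, Q_1, Q_2)$ used in the text.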

Example 4: Suppose a firm has a Cobb-Douglas production function:

$$Q = F(L,K) = L^{\frac{1}{2}}K^{\frac{3}{4}}.$$

The firm's objective is to minimize the cost of producing $Q$ units. Thus the objective function is costs:

$$WL + RK$$

where $W$ is the wage and $R$ the rental cost of capital.

The constraint is that $L$ and $K$ must produce $Q$ units of output (otherwise $L = K = 0$ minimizes costs!) so that $Q = F(L,K)$ is the constraint. Using:

$$Q = F(L,K) \implies Q - L^{\frac{1}{2}}K^{\frac{3}{4}} = 0$$

we can rewrite the constraint as $g(L,K) = 0$ where:

$$g(L,K) = Q - L^{\frac{1}{2}}K^{\frac{3}{4}} = 0.$$

We therefore have the Lagrangian for cost minimization as:

$$\mathcal{L}(\lambda, L, K) = \underbrace{WL + RK}_{\text{objective}} + \lambda\Big(\underbrace{Q - L^{\frac{1}{2}}K^{\frac{3}{4}}}_{\text{constraint}}\Big).$$

Example 5: Now consider the more general problem of a firm with a production function

$$Q = F(L,K)$$

that wishes to minimize the cost of producing $Q$ units. The Lagrangian is then:

$$\mathcal{L}(\lambda, L, K) = \underbrace{WL + RK}_{\text{objective}} + \lambda\Big(\underbrace{Q - F(L,K)}_{\text{constraint}}\Big).$$

4.6.2 First-Order Conditions

For constrained optimization we have, just as before, first-order conditions. Now however the relevant first-order conditions are not with respect to $f(x_1, x_2, \ldots, x_n)$ but with respect to the Lagrangian:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) + \lambda g(x_1, x_2, \ldots, x_n).$$

We have:

Theorem 283 Suppose $x_1^*, x_2^*, \ldots, x_n^*$ either maximizes or minimizes the objective function $f(x_1, x_2, \ldots, x_n)$ subject to the constraint $g(x_1, x_2, \ldots, x_n) = 0$. Then there is a $\lambda^*$ such that:

$$\frac{\partial \mathcal{L}(\lambda^*, x_1^*, x_2^*, \ldots, x_n^*)}{\partial \lambda} = g(x_1^*, x_2^*, \ldots, x_n^*) = 0$$

$$\frac{\partial \mathcal{L}(\lambda^*, x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} = \frac{\partial f(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} + \lambda^*\frac{\partial g(x_1^*, x_2^*, \ldots, x_n^*)}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n.$$

Remark 1: There are $n + 1$ first-order conditions leading to $n + 1$ equations in $n + 1$ unknowns: $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$. Since there are as many equations as unknowns, it should be possible to solve them for $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$.

Remark 2: It is important that the Lagrange multiplier $\lambda$ also have a $*$. To solve the first-order conditions you must solve for $\lambda^*$. Thus in essence there is no difference between the treatment of the $x_i$'s and $\lambda$.


Remark 3: The first of the first-order conditions:

$$\frac{\partial \mathcal{L}(\lambda^*, x_1^*, x_2^*, \ldots, x_n^*)}{\partial \lambda} = g(x_1^*, x_2^*, \ldots, x_n^*) = 0$$

ensures that $x_1^*, x_2^*, \ldots, x_n^*$ satisfies the constraint.

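Example 1 (continued): From the Lagrangian:

$$\mathcal{L}(\lambda, x_1, x_2) = x_1^2 + \frac{1}{2}x_2^2 + \lambda(1 - x_1 - x_2)$$

we need to calculate three partial derivatives: $\frac{\partial \mathcal{L}}{\partial \lambda}$, $\frac{\partial \mathcal{L}}{\partial x_1}$ and $\frac{\partial \mathcal{L}}{\partial x_2}$. We thus have:

$$\frac{\partial \mathcal{L}}{\partial \lambda} = \frac{\partial}{\partial \lambda}\left(x_1^2 + \frac{1}{2}x_2^2 + \lambda(1 - x_1 - x_2)\right) = 1 - x_1 - x_2$$

$$\frac{\partial \mathcal{L}}{\partial x_1} = \frac{\partial}{\partial x_1}\left(x_1^2 + \frac{1}{2}x_2^2 + \lambda(1 - x_1 - x_2)\right) = 2x_1 - \lambda$$

$$\frac{\partial \mathcal{L}}{\partial x_2} = \frac{\partial}{\partial x_2}\left(x_1^2 + \frac{1}{2}x_2^2 + \lambda(1 - x_1 - x_2)\right) = x_2 - \lambda.$$

Putting a $*$ on $\lambda$, $x_1$, $x_2$ and setting the derivatives equal to zero we obtain:

$$1 - x_1^* - x_2^* = 0 \implies x_1^* + x_2^* = 1$$

$$2x_1^* - \lambda^* = 0 \implies x_1^* = \frac{1}{2}\lambda^*$$

$$x_2^* - \lambda^* = 0 \implies x_2^* = \lambda^*.$$

Using the first result and adding up the second and third results we have:

$$1 = x_1^* + x_2^* = \frac{1}{2}\lambda^* + \lambda^* = \frac{3}{2}\lambda^* \implies \lambda^* = \frac{2}{3}.$$

Now that we have $\lambda^*$ we can solve for $x_1^*$ and $x_2^*$ as:

$$x_1^* = \frac{1}{2}\lambda^* = \frac{1}{2} \times \frac{2}{3} = \frac{1}{3}$$

$$x_2^* = \lambda^* = \frac{2}{3}.$$

Thus the solution is $\lambda^* = \frac{2}{3}$, $x_1^* = \frac{1}{3}$ and $x_2^* = \frac{2}{3}$.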
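Because the three first-order conditions of this example are linear in $(\lambda^*, x_1^*, x_2^*)$, they can also be solved as a $3 \times 3$ linear system. The sketch below does so with Cramer's rule and exact fractions; `det3` and `cramer3` are hypothetical helper names introduced for illustration.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(A, rhs):
    """Solve a 3 x 3 linear system A z = rhs by Cramer's rule."""
    d = det3(A)
    if d == 0:
        raise ValueError("singular system")
    solution = []
    for col in range(3):
        # replace column `col` of A with the right-hand side
        Ai = [[rhs[r] if c == col else A[r][c] for c in range(3)] for r in range(3)]
        solution.append(Fraction(det3(Ai)) / d)
    return solution

# Unknowns ordered as (lambda, x1, x2)
A = [[0, 1, 1],     #        x1 + x2 = 1
     [-1, 2, 0],    # -lam + 2 x1    = 0
     [-1, 0, 1]]    # -lam      + x2 = 0
rhs = [1, 0, 0]

lam, x1, x2 = cramer3(A, rhs)
print(lam, x1, x2)  # 2/3 1/3 2/3
```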


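Example 2 (continued): For the utility maximization problem we obtained the Lagrangian:

$$\mathcal{L}(\lambda, Q_1, Q_2) = U(Q_1, Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2).$$

We need to calculate three partial derivatives: $\frac{\partial \mathcal{L}}{\partial \lambda}$, $\frac{\partial \mathcal{L}}{\partial Q_1}$ and $\frac{\partial \mathcal{L}}{\partial Q_2}$ as:

$$\frac{\partial \mathcal{L}}{\partial \lambda} = \frac{\partial}{\partial \lambda}\left(U(Q_1, Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)\right) = Y - P_1Q_1 - P_2Q_2$$

$$\frac{\partial \mathcal{L}}{\partial Q_1} = \frac{\partial}{\partial Q_1}\left(U(Q_1, Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)\right) = \frac{\partial U(Q_1, Q_2)}{\partial Q_1} - \lambda P_1$$

$$\frac{\partial \mathcal{L}}{\partial Q_2} = \frac{\partial}{\partial Q_2}\left(U(Q_1, Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)\right) = \frac{\partial U(Q_1, Q_2)}{\partial Q_2} - \lambda P_2.$$

Setting $\lambda = \lambda^*$, $Q_1 = Q_1^*$, $Q_2 = Q_2^*$ and setting the partial derivatives equal to zero we obtain three first-order conditions:

$$Y - P_1Q_1^* - P_2Q_2^* = 0$$

$$\frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} - \lambda^*P_1 = 0$$

$$\frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_2} - \lambda^*P_2 = 0.$$

Note that the first condition ensures that:

$$P_1Q_1^* + P_2Q_2^* = Y$$

so that $Q_1^*$ and $Q_2^*$ satisfy the budget constraint.

Since we have made no assumptions about $U(Q_1, Q_2)$ we cannot hope to solve these three equations directly for $\lambda^*, Q_1^*, Q_2^*$. We can however use these equations to learn something about the nature of the optimal decision rule for the household. From the second and third of the first-order conditions we have:

$$\frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} - \lambda^*P_1 = 0 \implies MU_1(Q_1^*, Q_2^*) = \lambda^*P_1 \implies \frac{MU_1(Q_1^*, Q_2^*)}{P_1} = \lambda^*$$

and

$$\frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_2} - \lambda^*P_2 = 0 \implies MU_2(Q_1^*, Q_2^*) = \lambda^*P_2 \implies \frac{MU_2(Q_1^*, Q_2^*)}{P_2} = \lambda^*.$$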


From these two results we conclude that:

$$\frac{MU_1(Q_1^*, Q_2^*)}{P_1} = \frac{MU_2(Q_1^*, Q_2^*)}{P_2} = \lambda^*.$$

This says that a rational household will allocate its income $Y$ between $Q_1$ and $Q_2$ so as to equate the ratio of each good's marginal utility to its price. This is the familiar condition from introductory economics. In introductory economics however one does not answer the question: what are $\frac{MU_1}{P_1}$ and $\frac{MU_2}{P_2}$ equal to? The answer is $\lambda^*$, the Lagrange multiplier. Later you will learn that $\lambda^*$ is in fact the marginal utility of income.

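Example 3 (continued): For the Lagrangian:

$$\mathcal{L}(\lambda, Q_1, Q_2) = 0.3\ln(Q_1) + 0.7\ln(Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)$$

we need to calculate three partial derivatives: $\frac{\partial \mathcal{L}}{\partial \lambda}$, $\frac{\partial \mathcal{L}}{\partial Q_1}$ and $\frac{\partial \mathcal{L}}{\partial Q_2}$ as:

$$\frac{\partial \mathcal{L}}{\partial \lambda} = \frac{\partial}{\partial \lambda}\left(0.3\ln(Q_1) + 0.7\ln(Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)\right) = Y - P_1Q_1 - P_2Q_2$$

$$\frac{\partial \mathcal{L}}{\partial Q_1} = \frac{\partial}{\partial Q_1}\left(0.3\ln(Q_1) + 0.7\ln(Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)\right) = \frac{0.3}{Q_1} - \lambda P_1$$

$$\frac{\partial \mathcal{L}}{\partial Q_2} = \frac{\partial}{\partial Q_2}\left(0.3\ln(Q_1) + 0.7\ln(Q_2) + \lambda(Y - P_1Q_1 - P_2Q_2)\right) = \frac{0.7}{Q_2} - \lambda P_2.$$

Setting $\lambda = \lambda^*$, $Q_1 = Q_1^*$, $Q_2 = Q_2^*$ and setting the partial derivatives equal to zero we obtain three first-order conditions:

$$Y - P_1Q_1^* - P_2Q_2^* = 0$$

$$\frac{0.3}{Q_1^*} - \lambda^*P_1 = 0$$

$$\frac{0.7}{Q_2^*} - \lambda^*P_2 = 0.$$

We have three equations with three unknowns. To solve them take the second and third equations to obtain:

$$\frac{0.3}{Q_1^*} - \lambda^*P_1 = 0 \implies Q_1^* = \frac{0.3}{\lambda^*P_1}$$

$$\frac{0.7}{Q_2^*} - \lambda^*P_2 = 0 \implies Q_2^* = \frac{0.7}{\lambda^*P_2}.$$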


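We are not yet at the solution because we still do not know $\lambda^*$. Substituting these two into the budget constraint we obtain:

$$Y = P_1Q_1^* + P_2Q_2^* = P_1\left(\frac{0.3}{\lambda^* P_1}\right) + P_2\left(\frac{0.7}{\lambda^* P_2}\right) = \frac{0.3}{\lambda^*} + \frac{0.7}{\lambda^*} = \frac{1}{\lambda^*}.$$

From this it follows that $\lambda^* = \frac{1}{Y}$. This says that the marginal utility of income decreases with income, or that richer people get less utility out of an extra dollar than poorer people. We then have:

$$Q_1^* = \frac{0.3}{\lambda^* P_1} = \frac{0.3}{\frac{1}{Y}P_1} = \frac{0.3Y}{P_1} \quad\text{and}\quad Q_2^* = \frac{0.7}{\lambda^* P_2} = \frac{0.7}{\frac{1}{Y}P_2} = \frac{0.7Y}{P_2}$$

so that $Q_1^* = \frac{0.3Y}{P_1}$ is the demand curve for good 1 and $Q_2^* = \frac{0.7Y}{P_2}$ is the demand curve for good 2.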
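The solution of Example 3 can be checked numerically. The following is a minimal sketch in Python; the function name and the values of income and prices are hypothetical and chosen only for illustration. It evaluates the demand curves and confirms that the three first-order conditions hold at the computed point.

```python
def cobb_douglas_demands(Y, P1, P2):
    """Closed-form solution of Example 3: maximize 0.3 ln(Q1) + 0.7 ln(Q2)
    subject to the budget constraint Y - P1*Q1 - P2*Q2 = 0."""
    lam = 1.0 / Y          # Lagrange multiplier, the marginal utility of income
    Q1 = 0.3 * Y / P1      # demand curve for good 1
    Q2 = 0.7 * Y / P2      # demand curve for good 2
    return lam, Q1, Q2


# Hypothetical income and prices, not taken from the text.
Y, P1, P2 = 100.0, 2.0, 5.0
lam, Q1, Q2 = cobb_douglas_demands(Y, P1, P2)

# Residuals of the three first-order conditions; each should be (numerically) zero.
budget_gap = Y - P1 * Q1 - P2 * Q2
foc_1 = 0.3 / Q1 - lam * P1
foc_2 = 0.7 / Q2 - lam * P2

print(lam, Q1, Q2)
print(budget_gap, foc_1, foc_2)
```

Note that the household spends 30% of income on good 1 and 70% on good 2 whatever the prices are, which is why each demand depends only on its own price.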

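Example 4 (continued): From the Lagrangian:

$$\mathcal{L}(\lambda, L, K) = WL + RK + \lambda\left(Q - L^{\frac12}K^{\frac34}\right)$$

we obtain the first-order conditions:

$$\begin{aligned}
\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*)}{\partial\lambda} &= Q - (L^*)^{\frac12}(K^*)^{\frac34} = 0\\
\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*)}{\partial L} &= W - \frac12\lambda^*(L^*)^{-\frac12}(K^*)^{\frac34} = 0\\
\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*)}{\partial K} &= R - \frac34\lambda^*(L^*)^{\frac12}(K^*)^{-\frac14} = 0.
\end{aligned}$$

Using the $\ln(\cdot)$ function we can convert these into a system of 3 linear equations in 3 unknowns as:

$$\begin{aligned}
Q - (L^*)^{\frac12}(K^*)^{\frac34} = 0 &\implies \frac12\ln(L^*) + \frac34\ln(K^*) = \ln(Q)\\
W - \frac12\lambda^*(L^*)^{-\frac12}(K^*)^{\frac34} = 0 &\implies \ln(\lambda^*) - \frac12\ln(L^*) + \frac34\ln(K^*) = \ln(2W)\\
R - \frac34\lambda^*(L^*)^{\frac12}(K^*)^{-\frac14} = 0 &\implies \ln(\lambda^*) + \frac12\ln(L^*) - \frac14\ln(K^*) = \ln\left(\frac43R\right)
\end{aligned}$$

which can be written in matrix notation as:

$$\begin{bmatrix} 0 & \frac12 & \frac34\\ 1 & -\frac12 & \frac34\\ 1 & \frac12 & -\frac14 \end{bmatrix}
\begin{bmatrix} \ln(\lambda^*)\\ \ln(L^*)\\ \ln(K^*) \end{bmatrix} =
\begin{bmatrix} \ln(Q)\\ \ln(2W)\\ \ln\left(\frac43R\right) \end{bmatrix}.$$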


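You can verify that

$$\det\begin{bmatrix} 0 & \frac12 & \frac34\\ 1 & -\frac12 & \frac34\\ 1 & \frac12 & -\frac14 \end{bmatrix} = \frac54.$$

Using Cramer's rule we then find that:

$$\ln(\lambda^*) = \frac{\det\begin{bmatrix} \ln(Q) & \frac12 & \frac34\\ \ln(2W) & -\frac12 & \frac34\\ \ln\left(\frac43R\right) & \frac12 & -\frac14 \end{bmatrix}}{\frac54} = -\frac15\ln(Q) + \frac25\ln(2W) + \frac35\ln\left(\frac43R\right)$$

$$\ln(L^*) = \frac{\det\begin{bmatrix} 0 & \ln(Q) & \frac34\\ 1 & \ln(2W) & \frac34\\ 1 & \ln\left(\frac43R\right) & -\frac14 \end{bmatrix}}{\frac54} = \frac45\ln(Q) - \frac35\ln(2W) + \frac35\ln\left(\frac43R\right)$$

$$\ln(K^*) = \frac{\det\begin{bmatrix} 0 & \frac12 & \ln(Q)\\ 1 & -\frac12 & \ln(2W)\\ 1 & \frac12 & \ln\left(\frac43R\right) \end{bmatrix}}{\frac54} = \frac45\ln(Q) + \frac25\ln(2W) - \frac25\ln\left(\frac43R\right)$$

from which it follows that:

$$\begin{aligned}
\lambda^* &= Q^{-\frac15}(2W)^{\frac25}\left(\frac43R\right)^{\frac35}\\
L^* &= Q^{\frac45}(2W)^{-\frac35}\left(\frac43R\right)^{\frac35}\\
K^* &= Q^{\frac45}(2W)^{\frac25}\left(\frac43R\right)^{-\frac25}.
\end{aligned}$$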
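The Cramer's rule calculation above can be reproduced with exact rational arithmetic. The sketch below is only illustrative: the helper names `det3` and `replace_column` are not from the text. Because the right-hand side is linear in $\ln(Q)$, $\ln(2W)$ and $\ln\left(\frac43R\right)$, applying Cramer's rule to each unit vector in turn gives the coefficient on each of these terms.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def replace_column(m, j, col):
    """Return a copy of m with column j replaced by the vector col."""
    return [[col[r] if k == j else m[r][k] for k in range(3)] for r in range(3)]


# Coefficient matrix of the log-linear first-order conditions.
A = [[Fraction(0), Fraction(1, 2), Fraction(3, 4)],
     [Fraction(1), Fraction(-1, 2), Fraction(3, 4)],
     [Fraction(1), Fraction(1, 2), Fraction(-1, 4)]]

det_A = det3(A)

# Unit right-hand sides stand for ln(Q), ln(2W) and ln(4R/3) respectively.
unit_vectors = [[Fraction(1), Fraction(0), Fraction(0)],
                [Fraction(0), Fraction(1), Fraction(0)],
                [Fraction(0), Fraction(0), Fraction(1)]]

# coefficients[j] holds the weights on ln(Q), ln(2W), ln(4R/3) for unknown j,
# where j = 0, 1, 2 corresponds to ln(lambda*), ln(L*), ln(K*).
coefficients = [[det3(replace_column(A, j, rhs)) / det_A for rhs in unit_vectors]
                for j in range(3)]

print(det_A)
for row in coefficients:
    print([str(x) for x in row])
```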

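From these we can work out the firm's cost function as:

$$C^*(Q, W, R) = WL^* + RK^* = \left(\left(\frac23\right)^{\frac35} + \left(\frac32\right)^{\frac25}\right)Q^{\frac45}W^{\frac25}R^{\frac35}.$$

Let us now note some patterns that are generally true. Note that the Lagrange multiplier $\lambda^*$ turns out to be marginal cost; that is:

$$\frac{\partial C^*(Q, W, R)}{\partial Q} = \lambda^* = Q^{-\frac15}(2W)^{\frac25}\left(\frac43R\right)^{\frac35}.$$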

The fact that marginal cost falls with $Q$ reflects the increasing returns to scale of this technology. $L^*$ and $K^*$ are the conditional factor demands for $L$ and $K$; that is, conditional on the firm producing an output level $Q$, these are the optimal (cost-minimizing) amounts of labour and capital that the firm would demand. Note that $L^*$ and $K^*$ here are not the same as the ordinary demand curves for labour and capital, which are based on profit maximization and which have arguments $P$, $W$ and $R$ and not $Q$, $W$ and $R$ as here. It is also the case that:

$$\begin{aligned}
\frac{\partial C^*(Q, W, R)}{\partial W} &= L^* = Q^{\frac45}(2W)^{-\frac35}\left(\frac43R\right)^{\frac35}\\
\frac{\partial C^*(Q, W, R)}{\partial R} &= K^* = Q^{\frac45}(2W)^{\frac25}\left(\frac43R\right)^{-\frac25}.
\end{aligned}$$

These two results are examples of Shephard's lemma.
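The three derivative results for the cost function (marginal cost equals $\lambda^*$, and Shephard's lemma for $W$ and $R$) can be checked by finite differences. This is a rough numerical sketch rather than a proof; the function names, the step size and the values of $Q$, $W$ and $R$ are assumptions made for illustration.

```python
def cost_minimum(Q, W, R):
    """Closed-form lambda*, L*, K* from Example 4 (technology Q = L^(1/2) K^(3/4))."""
    lam = Q ** (-1 / 5) * (2 * W) ** (2 / 5) * (4 * R / 3) ** (3 / 5)
    L = Q ** (4 / 5) * (2 * W) ** (-3 / 5) * (4 * R / 3) ** (3 / 5)
    K = Q ** (4 / 5) * (2 * W) ** (2 / 5) * (4 * R / 3) ** (-2 / 5)
    return lam, L, K


def cost(Q, W, R):
    """Minimized cost C*(Q, W, R) = W L* + R K*."""
    _, L, K = cost_minimum(Q, W, R)
    return W * L + R * K


def central_difference(f, x, h=1e-6):
    """Two-sided finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Hypothetical output level and factor prices.
Q, W, R = 10.0, 4.0, 3.0
lam, L, K = cost_minimum(Q, W, R)

dC_dQ = central_difference(lambda q: cost(q, W, R), Q)   # should be close to lambda*
dC_dW = central_difference(lambda w: cost(Q, w, R), W)   # should be close to L*
dC_dR = central_difference(lambda r: cost(Q, W, r), R)   # should be close to K*

# The closed-form cost function from the text, for comparison with W L* + R K*.
closed_form_cost = ((2 / 3) ** (3 / 5) + (3 / 2) ** (2 / 5)) * Q ** (4 / 5) * W ** (2 / 5) * R ** (3 / 5)

print(dC_dQ, lam)
print(dC_dW, L)
print(dC_dR, K)
```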

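Example 5 (continued): Given the Lagrangian from the cost minimization problem:

$$\mathcal{L}(\lambda, L, K) = WL + RK + \lambda\,(Q - F(L, K))$$

we have the first-order conditions for cost minimization:

$$\begin{aligned}
\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*)}{\partial\lambda} &= Q - F(L^*, K^*) = 0\\
\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*)}{\partial L} &= W - \lambda^*\frac{\partial F(L^*, K^*)}{\partial L} = 0\\
\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*)}{\partial K} &= R - \lambda^*\frac{\partial F(L^*, K^*)}{\partial K} = 0.
\end{aligned}$$

The first condition ensures that $Q = F(L^*, K^*)$, so that $L^*$ and $K^*$ produce $Q$ units of output. Recalling that

$$\frac{\partial F(L^*, K^*)}{\partial L} = MP_L(L^*, K^*), \qquad \frac{\partial F(L^*, K^*)}{\partial K} = MP_K(L^*, K^*)$$

it follows from the second and third first-order conditions that:

$$\frac{1}{\lambda^*} = \frac{MP_L(L^*, K^*)}{W} = \frac{MP_K(L^*, K^*)}{R}.$$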
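As an illustration of this condition, the sketch below takes the technology of Example 4 as one particular $F(L, K)$, approximates the marginal products numerically, and compares the marginal product per dollar of each factor with $1/\lambda^*$. The function names and the values of $Q$, $W$ and $R$ are hypothetical.

```python
def F(L, K):
    """Production function of Example 4, used here as one instance of F(L, K)."""
    return L ** 0.5 * K ** 0.75


def example4_solution(Q, W, R):
    """Closed-form lambda*, L*, K* derived for this technology in Example 4."""
    lam = Q ** (-1 / 5) * (2 * W) ** (2 / 5) * (4 * R / 3) ** (3 / 5)
    L = Q ** (4 / 5) * (2 * W) ** (-3 / 5) * (4 * R / 3) ** (3 / 5)
    K = Q ** (4 / 5) * (2 * W) ** (2 / 5) * (4 * R / 3) ** (-2 / 5)
    return lam, L, K


# Hypothetical output level and factor prices.
Q, W, R = 10.0, 4.0, 3.0
lam, L, K = example4_solution(Q, W, R)

h = 1e-6
mp_l = (F(L + h, K) - F(L - h, K)) / (2 * h)   # numerical marginal product of labour
mp_k = (F(L, K + h) - F(L, K - h)) / (2 * h)   # numerical marginal product of capital

per_dollar_labour = mp_l / W
per_dollar_capital = mp_k / R

print(per_dollar_labour, per_dollar_capital, 1 / lam)
```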

4.6.3 Second-Order Conditions

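As with unconstrained optimization, any solution to the first-order conditions can be either a maximum or a minimum. We can determine if $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ is a local maximum or minimum by examining the Hessian of the Lagrangian given by:

$$H(\lambda, x_1, x_2, \ldots, x_n) =
\begin{bmatrix}
\frac{\partial^2\mathcal{L}}{\partial\lambda^2} & \frac{\partial^2\mathcal{L}}{\partial\lambda\,\partial x_1} & \frac{\partial^2\mathcal{L}}{\partial\lambda\,\partial x_2} & \cdots & \frac{\partial^2\mathcal{L}}{\partial\lambda\,\partial x_n}\\
\frac{\partial^2\mathcal{L}}{\partial\lambda\,\partial x_1} & \frac{\partial^2\mathcal{L}}{\partial x_1^2} & \frac{\partial^2\mathcal{L}}{\partial x_1\,\partial x_2} & \cdots & \frac{\partial^2\mathcal{L}}{\partial x_1\,\partial x_n}\\
\frac{\partial^2\mathcal{L}}{\partial\lambda\,\partial x_2} & \frac{\partial^2\mathcal{L}}{\partial x_1\,\partial x_2} & \frac{\partial^2\mathcal{L}}{\partial x_2^2} & \cdots & \frac{\partial^2\mathcal{L}}{\partial x_2\,\partial x_n}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
\frac{\partial^2\mathcal{L}}{\partial\lambda\,\partial x_n} & \frac{\partial^2\mathcal{L}}{\partial x_1\,\partial x_n} & \frac{\partial^2\mathcal{L}}{\partial x_2\,\partial x_n} & \cdots & \frac{\partial^2\mathcal{L}}{\partial x_n^2}
\end{bmatrix}$$

$$=
\begin{bmatrix}
0 & \frac{\partial g(x)}{\partial x_1} & \frac{\partial g(x)}{\partial x_2} & \cdots & \frac{\partial g(x)}{\partial x_n}\\
\frac{\partial g(x)}{\partial x_1} & \frac{\partial^2 f(x)}{\partial x_1^2} + \lambda\frac{\partial^2 g(x)}{\partial x_1^2} & \frac{\partial^2 f(x)}{\partial x_1\,\partial x_2} + \lambda\frac{\partial^2 g(x)}{\partial x_1\,\partial x_2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1\,\partial x_n} + \lambda\frac{\partial^2 g(x)}{\partial x_1\,\partial x_n}\\
\frac{\partial g(x)}{\partial x_2} & \frac{\partial^2 f(x)}{\partial x_1\,\partial x_2} + \lambda\frac{\partial^2 g(x)}{\partial x_1\,\partial x_2} & \frac{\partial^2 f(x)}{\partial x_2^2} + \lambda\frac{\partial^2 g(x)}{\partial x_2^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_2\,\partial x_n} + \lambda\frac{\partial^2 g(x)}{\partial x_2\,\partial x_n}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
\frac{\partial g(x)}{\partial x_n} & \frac{\partial^2 f(x)}{\partial x_1\,\partial x_n} + \lambda\frac{\partial^2 g(x)}{\partial x_1\,\partial x_n} & \frac{\partial^2 f(x)}{\partial x_2\,\partial x_n} + \lambda\frac{\partial^2 g(x)}{\partial x_2\,\partial x_n} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2} + \lambda\frac{\partial^2 g(x)}{\partial x_n^2}
\end{bmatrix}.$$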

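Remark 1: Note the zero in the upper left-hand corner of $H(\lambda, x_1, x_2, \ldots, x_n)$. This occurs because the Lagrangian is a linear function of $\lambda$ so that:

$$\begin{aligned}
\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) &= f(x_1, x_2, \ldots, x_n) + \lambda\,g(x_1, x_2, \ldots, x_n)\\
\implies \frac{\partial\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n)}{\partial\lambda} &= g(x_1, x_2, \ldots, x_n)\\
\implies \frac{\partial^2\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n)}{\partial\lambda^2} &= \frac{\partial}{\partial\lambda}\,g(x_1, x_2, \ldots, x_n) = 0.
\end{aligned}$$

Remark 2: Note that the partial derivatives $\frac{\partial g(x)}{\partial x_i}$, $i = 1, 2, \ldots, n$, of the constraint function appear along the border of the Hessian. For this reason $H(\lambda, x)$ is sometimes referred to as the bordered Hessian.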

Remark 3: Since neither a positive definite nor a negative definite matrix can have a 0 along the diagonal, it follows that $\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n)$ is neither concave nor convex. It follows that the second-order conditions cannot be the same as with unconstrained optimization. Another way of seeing this point is that the first leading principal minor is

$$M_1 = 0$$

always. Thus $M_1$ tells us nothing about whether we have a maximum or a minimum. This point is reinforced by the fact that the second leading principal minor is

$$M_2 = -\left(\frac{\partial g(x_1, x_2, \ldots, x_n)}{\partial x_1}\right)^2 < 0$$

and so this also tells us nothing about whether $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ corresponds to a maximum or a minimum.

It is only at the third leading principal minor $M_3$ that the Hessian begins to tell us something about whether we have a maximum or a minimum. In particular let

$$M_3,\ M_4,\ M_5,\ \ldots$$

be the leading principal minors of $H(\lambda^*, x_1^*, x_2^*, \ldots, x_n^*)$. We then have:

Theorem 284 Suppose that $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ satisfy the first-order conditions from the Lagrangian and that the leading principal minors of the Hessian $H(\lambda^*, x_1^*, x_2^*, \ldots, x_n^*)$ satisfy:

$$M_3 > 0,\quad M_4 < 0,\quad M_5 > 0,\ \ldots$$

then $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ corresponds to a constrained local maximum.

Theorem 285 Suppose that $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ satisfy the first-order conditions from the Lagrangian and that the leading principal minors of the Hessian $H(\lambda^*, x_1^*, x_2^*, \ldots, x_n^*)$ satisfy:

$$M_3 < 0,\quad M_4 < 0,\quad M_5 < 0,\ \ldots$$

then $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ corresponds to a constrained local minimum.

Remark: Evaluating the leading principal minors of bordered Hessians is often a tedious business. Most if not all of the examples that we will consider involve the case where there are $n = 2$ independent variables or $x_i$'s so that the Hessian is a $3 \times 3$ matrix. In this case one need only calculate the determinant of the Hessian itself and check that it is positive for a maximum or negative for a minimum. In particular we have:

Theorem 286 If $n = 2$ then the solution to the first-order conditions $\lambda^*, x_1^*, x_2^*$ represents a local constrained maximum if

$$M_3 > 0$$

and a local constrained minimum if:

$$M_3 < 0.$$

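Example 1 (continued): From the Lagrangian:

$$\mathcal{L}(\lambda, x_1, x_2) = x_1^2 + \frac12x_2^2 + \lambda\,(1 - x_1 - x_2)$$

the Hessian is calculated from the second derivatives of the Lagrangian as:

$$\begin{aligned}
\frac{\partial^2\mathcal{L}(\lambda, x_1, x_2)}{\partial\lambda^2} &= \frac{\partial}{\partial\lambda}(1 - x_1 - x_2) = 0\\
\frac{\partial^2\mathcal{L}(\lambda, x_1, x_2)}{\partial\lambda\,\partial x_1} &= \frac{\partial}{\partial x_1}(1 - x_1 - x_2) = -1\\
\frac{\partial^2\mathcal{L}(\lambda, x_1, x_2)}{\partial\lambda\,\partial x_2} &= \frac{\partial}{\partial x_2}(1 - x_1 - x_2) = -1\\
\frac{\partial^2\mathcal{L}(\lambda, x_1, x_2)}{\partial x_1^2} &= \frac{\partial}{\partial x_1}(2x_1 - \lambda) = 2\\
\frac{\partial^2\mathcal{L}(\lambda, x_1, x_2)}{\partial x_1\,\partial x_2} &= \frac{\partial}{\partial x_2}(2x_1 - \lambda) = 0\\
\frac{\partial^2\mathcal{L}(\lambda, x_1, x_2)}{\partial x_2^2} &= \frac{\partial}{\partial x_2}(x_2 - \lambda) = 1
\end{aligned}$$

so that the Hessian is given by:

$$H(\lambda^*, x_1^*, x_2^*) = \begin{bmatrix} 0 & -1 & -1\\ -1 & 2 & 0\\ -1 & 0 & 1 \end{bmatrix}.$$

Note that the Hessian for this problem does not depend on $\lambda$, $x_1$ and $x_2$.

The second-order conditions for a minimum are then satisfied since:

$$M_3 = \det\left[H(\lambda, x_1, x_2)\right] = \det\begin{bmatrix} 0 & -1 & -1\\ -1 & 2 & 0\\ -1 & 0 & 1 \end{bmatrix} = -3 < 0.$$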
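The $n = 2$ test of Theorem 286 reduces to the sign of one $3 \times 3$ determinant, which can be sketched as follows. The helper names `det3` and `classify_two_variable` are hypothetical; the matrix is the bordered Hessian of Example 1.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def classify_two_variable(bordered_hessian):
    """Apply the n = 2 second-order test: the sign of M3 = det(H) decides the case."""
    m3 = det3(bordered_hessian)
    if m3 > 0:
        return m3, "local constrained maximum"
    if m3 < 0:
        return m3, "local constrained minimum"
    return m3, "inconclusive"


# Bordered Hessian of Example 1.
H = [[0, -1, -1],
     [-1, 2, 0],
     [-1, 0, 1]]

m3, verdict = classify_two_variable(H)
print(m3, verdict)
```

The "inconclusive" branch is an addition for the case $M_3 = 0$, which the theorem does not cover.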

Example 2 (continued): For the utility maximization problem with the Lagrangian:

$$\mathcal{L}(\lambda, Q_1, Q_2) = U(Q_1, Q_2) + \lambda\,(Y - P_1Q_1 - P_2Q_2)$$

the Hessian is given by:

$$H(\lambda^*, Q_1^*, Q_2^*) = \begin{bmatrix} 0 & -P_1 & -P_2\\ -P_1 & U_{11} & U_{12}\\ -P_2 & U_{12} & U_{22} \end{bmatrix}$$

where:

$$U_{11} \equiv \frac{\partial^2U(Q_1^*, Q_2^*)}{\partial Q_1^2},\quad U_{12} \equiv \frac{\partial^2U(Q_1^*, Q_2^*)}{\partial Q_1\,\partial Q_2},\quad U_{22} \equiv \frac{\partial^2U(Q_1^*, Q_2^*)}{\partial Q_2^2}.$$

In order for $\lambda^*, Q_1^*, Q_2^*$ to be a utility maximum (and not a minimum!) we require:

$$M_3 = \det\left[H(\lambda^*, Q_1^*, Q_2^*)\right] = -P_1^2U_{22} + 2P_1P_2U_{12} - P_2^2U_{11} > 0.$$

This condition requires that the household's indifference curve be convex at $\lambda^*, Q_1^*, Q_2^*$ so that there is a local diminishing marginal rate of substitution.

Example 3 (continued): The Hessian of the Lagrangian:

$$\mathcal{L}(\lambda, Q_1, Q_2) = 0.3\ln(Q_1) + 0.7\ln(Q_2) + \lambda\,(Y - P_1Q_1 - P_2Q_2)$$

at $\lambda^*, Q_1^*, Q_2^*$ is given by:

$$H(\lambda^*, Q_1^*, Q_2^*) = \begin{bmatrix} 0 & -P_1 & -P_2\\ -P_1 & -\frac{0.3}{(Q_1^*)^2} & 0\\ -P_2 & 0 & -\frac{0.7}{(Q_2^*)^2} \end{bmatrix}.$$

Since the Hessian is a $3 \times 3$ matrix, we only have to calculate the determinant of $H(\lambda^*, Q_1^*, Q_2^*)$ and verify that it is positive to show that $\lambda^*, Q_1^*, Q_2^*$ correspond to a local maximum. Thus:

$$M_3 = \det\left[H(\lambda^*, Q_1^*, Q_2^*)\right] = \frac{0.7P_1^2}{(Q_2^*)^2} + \frac{0.3P_2^2}{(Q_1^*)^2} > 0$$

and the second-order conditions for a local maximum are satisfied.

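Example 4 (continued): From the Lagrangian:

$$\mathcal{L}(\lambda, L, K) = WL + RK + \lambda\left(Q - L^{\frac12}K^{\frac34}\right)$$

the Hessian is given by:

$$H(\lambda, L, K) = \begin{bmatrix}
0 & -\frac12L^{-\frac12}K^{\frac34} & -\frac34L^{\frac12}K^{-\frac14}\\
-\frac12L^{-\frac12}K^{\frac34} & \frac14\lambda L^{-\frac32}K^{\frac34} & -\frac38\lambda L^{-\frac12}K^{-\frac14}\\
-\frac34L^{\frac12}K^{-\frac14} & -\frac38\lambda L^{-\frac12}K^{-\frac14} & \frac{3}{16}\lambda L^{\frac12}K^{-\frac54}
\end{bmatrix}.$$

With some straightforward work it can be shown that:

$$M_3 = \det\left[H(\lambda^*, L^*, K^*)\right] = -\frac{15}{32}\lambda^*(L^*)^{-\frac12}(K^*)^{\frac14} < 0$$

(recall from the solution to the first-order conditions that $\lambda^* > 0$) and so $\lambda^*, L^*, K^*$ corresponds to a local minimum as required.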
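The "straightforward work" behind this determinant can be spot-checked numerically. The sketch below builds the bordered Hessian of Example 4 at an arbitrary positive point and compares its determinant with the closed form $-\frac{15}{32}\lambda L^{-\frac12}K^{\frac14}$. The function names and the test point are assumptions for illustration; the point is not the optimum for any particular $Q$, $W$, $R$, since the identity holds at every positive $\lambda$, $L$, $K$.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered_hessian(lam, L, K):
    """Bordered Hessian of WL + RK + lam * (Q - L^(1/2) K^(3/4))."""
    g_l = -0.5 * L ** -0.5 * K ** 0.75             # d2/(d lam dL)
    g_k = -0.75 * L ** 0.5 * K ** -0.25            # d2/(d lam dK)
    h_ll = 0.25 * lam * L ** -1.5 * K ** 0.75      # d2/dL2
    h_lk = -0.375 * lam * L ** -0.5 * K ** -0.25   # d2/(dL dK)
    h_kk = 0.1875 * lam * L ** 0.5 * K ** -1.25    # d2/dK2
    return [[0.0, g_l, g_k],
            [g_l, h_ll, h_lk],
            [g_k, h_lk, h_kk]]


# Arbitrary positive test point, chosen so the powers are easy to evaluate.
lam, L, K = 2.0, 4.0, 16.0

m3 = det3(bordered_hessian(lam, L, K))
closed_form = -(15 / 32) * lam * L ** -0.5 * K ** 0.25

print(m3, closed_form)
```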

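Example 5 (continued): Given the Lagrangian

$$\mathcal{L}(\lambda, L, K) = WL + RK + \lambda\,(Q - F(L, K))$$

the Hessian is given by:

$$H(\lambda, L, K) = \begin{bmatrix}
0 & -\frac{\partial F(L, K)}{\partial L} & -\frac{\partial F(L, K)}{\partial K}\\
-\frac{\partial F(L, K)}{\partial L} & -\lambda\frac{\partial^2F(L, K)}{\partial L^2} & -\lambda\frac{\partial^2F(L, K)}{\partial L\,\partial K}\\
-\frac{\partial F(L, K)}{\partial K} & -\lambda\frac{\partial^2F(L, K)}{\partial L\,\partial K} & -\lambda\frac{\partial^2F(L, K)}{\partial K^2}
\end{bmatrix}.$$

To make the notation more compact define

$$F_L \equiv \frac{\partial F(L^*, K^*)}{\partial L},\quad F_K \equiv \frac{\partial F(L^*, K^*)}{\partial K},$$

$$F_{LL} \equiv \frac{\partial^2F(L^*, K^*)}{\partial L^2},\quad F_{KK} \equiv \frac{\partial^2F(L^*, K^*)}{\partial K^2}\quad\text{and}\quad F_{LK} \equiv \frac{\partial^2F(L^*, K^*)}{\partial L\,\partial K}.$$

The second-order conditions then require that:

$$\det\left[H(\lambda^*, L^*, K^*)\right] = \det\begin{bmatrix} 0 & -F_L & -F_K\\ -F_L & -\lambda^*F_{LL} & -\lambda^*F_{LK}\\ -F_K & -\lambda^*F_{LK} & -\lambda^*F_{KK} \end{bmatrix} < 0$$

or

$$\lambda^*\left(F_L^2F_{KK} - 2F_LF_KF_{LK} + F_K^2F_{LL}\right) < 0.$$

As with utility maximization, this condition requires that the isoquant be bent towards the origin.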

4.6.4 Sufficient Conditions for a Global Maximum or Minimum

The second-order conditions we have examined only guarantee a local constrained maximum or minimum; they do not guarantee that $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ will correspond to a global maximum or minimum. As with unconstrained optimization, we can ensure that $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ is a global maximum or minimum by appealing to quasi-concavity or quasi-convexity, but now we need to examine the properties of both the objective function $f(x_1, x_2, \ldots, x_n)$ and the constraint function $g(x_1, x_2, \ldots, x_n)$, as well as the sign of the Lagrange multiplier $\lambda^*$.

Almost all of the problems in economics one encounters at the intermediate level involve either a linear objective function $f(x_1, x_2, \ldots, x_n)$, as in cost minimization, or a linear constraint function $g(x_1, x_2, \ldots, x_n)$, as in utility maximization. For these cases we can use the following results:

Theorem 287 If 1) $f(x_1, x_2, \ldots, x_n)$ is quasi-concave, 2) the constraint is linear, that is it can be written as:

$$g(x_1, x_2, \ldots, x_n) = a - b_1x_1 - b_2x_2 - \cdots - b_nx_n$$

and if 3) $\lambda^* > 0$, then $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ corresponds to a constrained global maximum.

Theorem 288 If 1) $f(x_1, x_2, \ldots, x_n)$ is quasi-convex, 2) the constraint is linear, that is it can be written as:

$$g(x_1, x_2, \ldots, x_n) = a - b_1x_1 - b_2x_2 - \cdots - b_nx_n$$

and if 3) $\lambda^* > 0$, then $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ corresponds to a constrained global minimum.

Remark: In addition to requiring that the constraint be linear, note that we need to ensure that the Lagrange multiplier is positive, that is $\lambda^* > 0$. If you find that $\lambda^* < 0$ this might be because of the way that you wrote down the constraint. For example with utility maximization if you wrote the constraint as:

$$g(Q_1, Q_2) = P_1Q_1 + P_2Q_2 - Y = 0$$

you would obtain $\lambda^* < 0$ while if instead you used:

$$g(Q_1, Q_2) = Y - P_1Q_1 - P_2Q_2 = 0$$

you would obtain $\lambda^* > 0$. Thus if you find $\lambda^* < 0$ you may be able to fix this problem by rewriting the constraint.

Example 1 (continued): Consider the problem of minimizing:

$$f(x_1, x_2) = x_1^2 + \frac12x_2^2$$

subject to the constraint:

$$g(x_1, x_2) = 1 - x_1 - x_2 = 0.$$

We first show that $f(x_1, x_2)$ is convex and hence that it is quasi-convex. This follows since its Hessian is given by:

$$H(x_1, x_2) = \begin{bmatrix} 2 & 0\\ 0 & 1 \end{bmatrix}$$

which is positive definite for all $(x_1, x_2)$. The second condition, that the constraint is linear, is obviously satisfied. Finally we showed that:

$$\lambda^* = \frac23 > 0$$

so the third condition is also satisfied. It follows then that $x_1^* = \frac13$ and $x_2^* = \frac23$ is the global minimum for all $x_1$ and $x_2$ which satisfy:

$$g(x_1, x_2) = 1 - x_1 - x_2 = 0$$

or:

$$x_1 + x_2 = 1.$$
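A brute-force check of this global minimum is easy because every point on the constraint can be written as $(x_1, 1 - x_1)$. The sketch below scans a grid of $x_1$ values with exact fractions; the grid range and spacing are arbitrary choices made for illustration, and a grid search only supports the theorem's conclusion rather than proving it.

```python
from fractions import Fraction


def objective(x1, x2):
    """f(x1, x2) = x1^2 + (1/2) x2^2 from Example 1."""
    return x1 ** 2 + Fraction(1, 2) * x2 ** 2


# Points on the constraint x1 + x2 = 1, with x1 running from -5 to 5 in steps of 1/300.
grid = [Fraction(k, 300) for k in range(-1500, 1501)]

best_x1 = min(grid, key=lambda x1: objective(x1, 1 - x1))
best_x2 = 1 - best_x1
best_value = objective(best_x1, best_x2)

print(best_x1, best_x2, best_value)
```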

Example 2 (continued): Consider utility maximization where:

$$\mathcal{L}(\lambda, Q_1, Q_2) = U(Q_1, Q_2) + \lambda\,(Y - P_1Q_1 - P_2Q_2)$$

and where we assume that $U(Q_1, Q_2)$ is quasi-concave so that the indifference curves have the correct shape. Thus the first requirement for a global maximum is satisfied by assumption. It is also the case that the constraint is linear since:

$$g(Q_1, Q_2) = Y - P_1Q_1 - P_2Q_2$$

so the second requirement is also satisfied. Now from the first-order conditions we have:

$$MU_1(Q_1^*, Q_2^*) = \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} = \lambda^*P_1.$$

Since $P_1 > 0$ and $MU_1(Q_1^*, Q_2^*) > 0$ it follows that $\lambda^* > 0$ so that the third requirement for a global maximum is satisfied. We therefore conclude that $\lambda^*, Q_1^*, Q_2^*$ correspond to a global maximum.

Example 3 (continued): Consider the problem of maximizing:

$$U(Q_1, Q_2) = 0.3\ln(Q_1) + 0.7\ln(Q_2)$$

subject to the budget constraint:

$$g(Q_1, Q_2) = Y - P_1Q_1 - P_2Q_2 = 0.$$

We first show that $U(Q_1, Q_2)$ is concave and hence that it is quasi-concave. This follows since the Hessian of $U(Q_1, Q_2)$:

$$H(Q_1, Q_2) = \begin{bmatrix} -\frac{0.3}{(Q_1)^2} & 0\\ 0 & -\frac{0.7}{(Q_2)^2} \end{bmatrix}$$

is a diagonal matrix with negative diagonal elements and hence is negative definite for all $Q_1$ and $Q_2$. Obviously the budget constraint is linear so that the second condition for a global maximum is also satisfied. Finally we showed that

$$\lambda^* = \frac{1}{Y} > 0$$

so the third condition for a global maximum is satisfied. Thus:

$$Q_1^* = \frac{0.3Y}{P_1},\quad Q_2^* = \frac{0.7Y}{P_2}$$

corresponds to a global maximum.

The other class of problems one typically encounters is where the objective function is linear and the constraint is quasi-concave or quasi-convex. In this case we have:


Theorem 289 If 1) $f(x_1, x_2, \ldots, x_n)$ is linear so that it can be written as:

$$f(x_1, x_2, \ldots, x_n) = a + b_1x_1 + b_2x_2 + \cdots + b_nx_n,$$

2) the constraint function $g(x_1, x_2, \ldots, x_n)$ is quasi-convex (quasi-concave) and if 3) $\lambda^* > 0$, then $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ correspond to a constrained global minimum (maximum).

Example 4 (continued): Consider the problem of minimizing cost:

$$WL + RK$$

subject to the constraint:

$$g(L, K) = Q - L^{\frac12}K^{\frac34} = 0.$$

Obviously the objective function is a linear function of $L$ and $K$ and hence the first condition for a global minimum is satisfied. The constraint function $g(L, K)$ is not convex since its Hessian is given by:

$$H(L, K) = \begin{bmatrix} \frac14L^{-\frac32}K^{\frac34} & -\frac38L^{-\frac12}K^{-\frac14}\\ -\frac38L^{-\frac12}K^{-\frac14} & \frac{3}{16}L^{\frac12}K^{-\frac54} \end{bmatrix}$$

which is not positive definite since:

$$M_2 = \frac{3}{64}L^{-1}K^{-\frac12} - \frac{9}{64}L^{-1}K^{-\frac12} = -\frac{6}{64}L^{-1}K^{-\frac12} < 0.$$

We can show, however, that $g(L, K)$ is quasi-convex since:

$$g(L, K) = Q - L^{\frac12}K^{\frac34} = Q - \exp\left(-\left(-\frac12\ln(L) - \frac34\ln(K)\right)\right) = r(s(L, K))$$

where the monotonic function is:

$$r(x) = Q - \exp(-x),\quad r'(x) = \exp(-x) > 0$$

and the function:

$$s(L, K) = -\frac12\ln(L) - \frac34\ln(K)$$

is convex since it has a Hessian:

$$H_s(L, K) = \begin{bmatrix} \frac{1}{2L^2} & 0\\ 0 & \frac{3}{4K^2} \end{bmatrix}$$

which is positive definite for all $L$ and $K$.

Finally we showed that:

$$\lambda^* = Q^{-\frac15}(2W)^{\frac25}\left(\frac43R\right)^{\frac35} > 0$$

so that the third condition for a global minimum is satisfied. We therefore conclude that:

$$\begin{aligned}
\lambda^* &= Q^{-\frac15}(2W)^{\frac25}\left(\frac43R\right)^{\frac35}\\
L^* &= Q^{\frac45}(2W)^{-\frac35}\left(\frac43R\right)^{\frac35}\\
K^* &= Q^{\frac45}(2W)^{\frac25}\left(\frac43R\right)^{-\frac25}
\end{aligned}$$

correspond to a constrained global minimum.

Example 5 (continued): Consider the general cost minimization problem where the objective function is cost:

$$WL + RK$$

and the constraint is:

$$g(L, K) = Q - F(L, K) = 0$$

and where we assume that $F(L, K)$ is quasi-concave. (Assuming that $F(L, K)$ is quasi-concave is basically equivalent to assuming that the isoquants bend towards the origin.)

The objective function, which is cost, is obviously linear so that the first requirement for a global minimum is satisfied.

We now show that the constraint is quasi-convex. We have:

Proof. If $F(L, K)$ is quasi-concave then by definition it can be written as:

$$F(L, K) = r(s(L, K))$$

where $r'(x) > 0$ and $s(L, K)$ is concave. Now:

$$g(L, K) = Q - F(L, K) = a(b(L, K))$$

where the monotonic function is:

$$a(x) = Q - r(-x),\quad a'(x) = r'(-x) > 0$$

and the convex function is:

$$b(L, K) = -s(L, K)$$

since the negative of a concave function $s(L, K)$ is convex. It follows then that $g(L, K) = Q - F(L, K)$ is quasi-convex.

Finally we note from the first-order conditions for cost minimization that:

$$W = \lambda^*\frac{\partial F(L^*, K^*)}{\partial L}.$$

Since $W > 0$ and $\frac{\partial F(L^*, K^*)}{\partial L} > 0$, it follows that $\lambda^* > 0$ so the third requirement for a global minimum is satisfied. We conclude then that $\lambda^*, L^*, K^*$ correspond to a global minimum.

Sufficient Conditions when neither the objective function nor the constraint is linear

There are cases where neither the objective function $f(x_1, x_2, \ldots, x_n)$ nor the constraint $g(x_1, x_2, \ldots, x_n)$ is linear. In this case we have:

Theorem 290 If 1) both $f(x_1, x_2, \ldots, x_n)$ and $g(x_1, x_2, \ldots, x_n)$ in the Lagrangian:

$$\mathcal{L}(\lambda, x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) + \lambda\,g(x_1, x_2, \ldots, x_n)$$

are quasi-concave (quasi-convex), and 2) if $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ solve the first-order conditions from the Lagrangian with $\lambda^* > 0$, then $\lambda^*, x_1^*, x_2^*, \ldots, x_n^*$ correspond to a constrained global maximum (minimum).

Example: Consider a country that produces two goods $Q_1$ and $Q_2$ with utility function:

$$U(Q_1, Q_2) = Q_1Q_2$$

which it wishes to maximize. This is clearly non-linear. The production possibilities curve or constraint satisfies:

$$Q_1^2 + Q_2^2 = 1$$

which is plotted below.

[Figure: Production Possibilities Curve, with $Q_1$ on the horizontal axis and $Q_2$ on the vertical axis.]

The constraint can be written as:

$$g(Q_1, Q_2) = 1 - Q_1^2 - Q_2^2 = 0$$

and is also clearly non-linear. This leads to the Lagrangian:

$$\mathcal{L}(\lambda, Q_1, Q_2) = Q_1Q_2 + \lambda\left(1 - Q_1^2 - Q_2^2\right).$$

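The first-order conditions are:

$$\begin{aligned}
\frac{\partial\mathcal{L}(\lambda^*, Q_1^*, Q_2^*)}{\partial\lambda} &= 1 - (Q_1^*)^2 - (Q_2^*)^2 = 0 \implies (Q_1^*)^2 + (Q_2^*)^2 = 1,\\
\frac{\partial\mathcal{L}(\lambda^*, Q_1^*, Q_2^*)}{\partial Q_1} &= Q_2^* - 2\lambda^*Q_1^* = 0 \implies (Q_2^*)^2 = 4(\lambda^*)^2(Q_1^*)^2\\
\frac{\partial\mathcal{L}(\lambda^*, Q_1^*, Q_2^*)}{\partial Q_2} &= Q_1^* - 2\lambda^*Q_2^* = 0 \implies (Q_1^*)^2 = 4(\lambda^*)^2(Q_2^*)^2.
\end{aligned}$$

Adding the second and third results we obtain:

$$(Q_1^*)^2 + (Q_2^*)^2 = 4(\lambda^*)^2\left((Q_2^*)^2 + (Q_1^*)^2\right) \implies \lambda^* = \frac12$$

where we take the positive root since $Q_1^*$ and $Q_2^*$ are positive. Using $\lambda^* = \frac12$ in the second condition we obtain:

$$Q_2^* - 2\lambda^*Q_1^* = 0 \implies Q_2^* = Q_1^*$$

so that using $Q_2^* = Q_1^*$ in the constraint yields:

$$(Q_1^*)^2 + (Q_2^*)^2 = 1 \implies Q_1^* = Q_2^* = \frac{1}{\sqrt2}.$$

Thus the solution to the first-order conditions is: $\lambda^* = \frac12$, $Q_1^* = Q_2^* = \frac{1}{\sqrt2}$.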

We can show that the objective function $f(Q_1, Q_2) = Q_1Q_2$ is quasi-concave since

$$f(Q_1, Q_2) = e^{\ln(Q_1) + \ln(Q_2)}$$

where $e^x$ is a monotonic transformation and $\ln(Q_1) + \ln(Q_2)$ is concave since it has a Hessian:

$$H_f(Q_1, Q_2) = \begin{bmatrix} -\frac{1}{Q_1^2} & 0\\ 0 & -\frac{1}{Q_2^2} \end{bmatrix}$$

which is a diagonal matrix with all diagonal elements negative and hence is negative definite for all $Q_1, Q_2$.

The constraint $g(Q_1, Q_2)$ is concave, and hence quasi-concave, since it has a Hessian:

$$H_g(Q_1, Q_2) = \begin{bmatrix} -2 & 0\\ 0 & -2 \end{bmatrix}$$

which is negative definite for all $Q_1, Q_2$. Finally $\lambda^* = \frac12 > 0$. Thus $f(Q_1, Q_2)$ and $g(Q_1, Q_2)$ are quasi-concave and $\lambda^* > 0$ is satisfied so that $\lambda^* = \frac12$, $Q_1^* = Q_2^* = \frac{1}{\sqrt2}$ is a global maximum.

The constrained maximum $Q_1^* = Q_2^* = \frac{1}{\sqrt2}$ yields $U(Q_1^*, Q_2^*) = \frac12$ units of utility and occurs where the indifference curve $Q_2 = \frac{1}{2Q_1}$ is just tangent to the production possibilities curve as illustrated below:

[Figure: the indifference curve $Q_2 = \frac{1}{2Q_1}$ tangent to the production possibilities curve, with $Q_1$ on the horizontal axis and $Q_2$ on the vertical axis.]
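The global maximum in this example can be checked by walking along the production possibilities curve. A minimal sketch, assuming the parametrization $Q_1 = \cos t$, $Q_2 = \sin t$ for $0 \le t \le \frac{\pi}{2}$ (which is not used in the text) and an arbitrary grid of 2000 steps:

```python
import math

n = 2000   # number of grid steps along the quarter circle (arbitrary choice)
best_u, best_q1, best_q2 = -1.0, None, None

for k in range(n + 1):
    t = (math.pi / 2) * k / n
    q1, q2 = math.cos(t), math.sin(t)   # satisfies Q1^2 + Q2^2 = 1 by construction
    u = q1 * q2                         # utility U = Q1 * Q2
    if u > best_u:
        best_u, best_q1, best_q2 = u, q1, q2

# With lambda* = 1/2 the first-order condition Q2 - 2 * lambda * Q1 = 0 should hold.
lam = 0.5
foc_gap = best_q2 - 2 * lam * best_q1

print(best_q1, best_q2, best_u, foc_gap)
```

The best grid point sits at $t = \frac{\pi}{4}$, where both quantities equal $\frac{1}{\sqrt2}$ and utility equals $\frac12$.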

4.7 Econometrics

4.7.1 Linear Regression

Consider the simple linear regression model:

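$$Y_i = \alpha + \beta X_i + e_i,\quad i = 1, 2, \ldots, n.$$

The least squares estimators $\hat\alpha$ and $\hat\beta$ are the values of $\alpha$ and $\beta$ which minimize the sum of squares function:

$$S(\alpha, \beta) = \sum_{i=1}^n (Y_i - \alpha - \beta X_i)^2.$$

We have using the sum and chain rules that:

$$\begin{aligned}
\frac{\partial S(\alpha, \beta)}{\partial\alpha} &= -2\sum_{i=1}^n (Y_i - \alpha - \beta X_i)\\
\frac{\partial S(\alpha, \beta)}{\partial\beta} &= -2\sum_{i=1}^n X_i(Y_i - \alpha - \beta X_i)
\end{aligned}$$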


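so that the first-order conditions for a minimum are:

$$\begin{aligned}
\frac{\partial S(\hat\alpha, \hat\beta)}{\partial\alpha} = -2\sum_{i=1}^n \left(Y_i - \hat\alpha - \hat\beta X_i\right) = 0 &\implies n\hat\alpha + \left(\sum_{i=1}^n X_i\right)\hat\beta = \sum_{i=1}^n Y_i\\
\frac{\partial S(\hat\alpha, \hat\beta)}{\partial\beta} = -2\sum_{i=1}^n X_i\left(Y_i - \hat\alpha - \hat\beta X_i\right) = 0 &\implies \left(\sum_{i=1}^n X_i\right)\hat\alpha + \left(\sum_{i=1}^n X_i^2\right)\hat\beta = \sum_{i=1}^n X_iY_i
\end{aligned}$$

or in matrix notation:

$$\begin{bmatrix} n & \sum_{i=1}^n X_i\\ \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{bmatrix}
\begin{bmatrix} \hat\alpha\\ \hat\beta \end{bmatrix} =
\begin{bmatrix} \sum_{i=1}^n Y_i\\ \sum_{i=1}^n X_iY_i \end{bmatrix}.$$

From the first equation it is easy to show that:

$$\hat\alpha = \bar{Y} - \bar{X}\hat\beta$$

so the difficulty is in obtaining $\hat\beta$. Solving for $\hat\beta$ using Cramer's rule we find that:

$$\hat\beta = \frac{n\sum_{i=1}^n X_iY_i - \left(\sum_{i=1}^n X_i\right)\left(\sum_{i=1}^n Y_i\right)}{n\sum_{i=1}^n X_i^2 - \left(\sum_{i=1}^n X_i\right)^2}.$$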

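Example: Suppose one has data on the consumption of $n = 4$ families along with their income as:

$$\begin{array}{c|cccc}
i & 1 & 2 & 3 & 4\\ \hline
Y_i & 72 & 58 & 63 & 55\\
X_i & 98 & 80 & 91 & 73
\end{array}$$

where $Y_i$ is the consumption and $X_i$ is the income of family $i$. We wish to estimate a consumption function of the form:

$$Y_i = \alpha + \beta X_i + e_i$$

where $\beta$ is the marginal propensity to consume. The sum of squares is then:

$$S(\alpha, \beta) = (72 - \alpha - 98\beta)^2 + (58 - \alpha - 80\beta)^2 + (63 - \alpha - 91\beta)^2 + (55 - \alpha - 73\beta)^2.$$

To calculate $\hat\beta$ and $\hat\alpha$ we need:

$$\begin{aligned}
\sum_{i=1}^4 X_i &= 98 + 80 + 91 + 73 = 342 \implies \bar{X} = \frac{342}{4} = 85.5\\
\sum_{i=1}^4 X_i^2 &= 98^2 + 80^2 + 91^2 + 73^2 = 29614\\
\sum_{i=1}^4 X_iY_i &= 98 \times 72 + 80 \times 58 + 91 \times 63 + 73 \times 55 = 21444\\
\sum_{i=1}^4 Y_i &= 72 + 58 + 63 + 55 = 248 \implies \bar{Y} = \frac{248}{4} = 62.
\end{aligned}$$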


It follows then that:

$$\hat\beta = \frac{4 \times 21444 - 342 \times 248}{4 \times 29614 - 342^2} = 0.643$$

and:

$$\hat\alpha = \bar{Y} - \bar{X}\hat\beta = 62 - 0.643 \times 85.5 = 7.023.$$

Thus the estimated consumption function is:

$$\hat{Y}_i = 7.023 + 0.643X_i$$

and the estimated marginal propensity to consume is $0.643$.
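The same calculation can be sketched in a few lines of Python using the Cramer's rule formula for the slope. The function name `least_squares` is hypothetical. Note that the text computes the intercept from the slope rounded to $0.643$; carrying the unrounded slope through gives an intercept of roughly $6.99$ instead of $7.023$.

```python
def least_squares(xs, ys):
    """Simple linear regression of ys on xs using the Cramer's rule formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    beta = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha = sum_y / n - beta * sum_x / n   # alpha = Ybar - Xbar * beta
    return alpha, beta


income = [98, 80, 91, 73]        # X_i
consumption = [72, 58, 63, 55]   # Y_i

alpha_hat, beta_hat = least_squares(income, consumption)
print(round(beta_hat, 3), round(alpha_hat, 3))
```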

4.7.2 Maximum Likelihood

Maximum likelihood can also be applied to cases where $\theta$ is a vector of parameters so that:

$$\theta = (\theta_1, \theta_2, \ldots, \theta_p)$$

so that the likelihood

$$L(\theta) = L(\theta_1, \theta_2, \ldots, \theta_p)$$

is a multivariate function.

As before we estimate $\theta$ by maximizing $L(\theta)$ and denote the solution as $\hat\theta$, which solves the first-order conditions:

$$\frac{\partial L(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_p)}{\partial\theta_j} = 0 \quad\text{for } j = 1, 2, \ldots, p$$

or equivalently if we define the log-likelihood as $l(\theta) = \ln(L(\theta))$ then:

$$\frac{\partial l(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_p)}{\partial\theta_j} = 0 \quad\text{for } j = 1, 2, \ldots, p.$$

Once $\hat\theta$ is found from the first-order conditions, a 95% confidence interval for each $\theta_j$ can be found as follows. Let the $p \times p$ matrix:

$$H(\hat\theta)$$

be the Hessian of the log-likelihood evaluated at $\hat\theta$. This is referred to as the information matrix. Now calculate

$$\Delta = \left(-H(\hat\theta)\right)^{-1}$$

and let $\delta_j$ be the $j$th diagonal element of $\Delta$. Then a 95% confidence interval for the unknown $\theta_j$ is

$$\hat\theta_j \pm 1.96 \times \sqrt{\delta_j}.$$

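Example 1: Suppose that $Y_i \sim N[\mu, \sigma^2]$ so that $Y_i$ has a mean of $\mu$ and a standard deviation of $\sigma$. We wish to estimate $\theta_1 = \mu$ and $\theta_2 = \sigma$ using maximum likelihood from a sample $Y_1, Y_2, \ldots, Y_n$. The likelihood function is:

$$L(\mu, \sigma) = (2\pi)^{-\frac{n}{2}}\sigma^{-n}e^{-\frac{1}{2\sigma^2}\left((Y_1 - \mu)^2 + (Y_2 - \mu)^2 + \cdots + (Y_n - \mu)^2\right)}.$$

The log-likelihood is given by:

$$l(\mu, \sigma) = \ln(L(\mu, \sigma)) = -\frac{n}{2}\ln(2\pi) - n\ln(\sigma) - \frac{1}{2\sigma^2}\left((Y_1 - \mu)^2 + (Y_2 - \mu)^2 + \cdots + (Y_n - \mu)^2\right).$$

The maximum likelihood estimator of $\mu$ is $\hat\mu = \bar{Y}$, the sample mean, since:

$$\begin{aligned}
\frac{\partial l(\mu, \sigma)}{\partial\mu} &= \frac{(Y_1 - \mu) + (Y_2 - \mu) + \cdots + (Y_n - \mu)}{\sigma^2}\\
\implies \frac{\partial l(\hat\mu, \hat\sigma)}{\partial\mu} &= 0 = \frac{(Y_1 - \hat\mu) + (Y_2 - \hat\mu) + \cdots + (Y_n - \hat\mu)}{\hat\sigma^2}\\
\implies Y_1 + Y_2 + \cdots + Y_n &= n\hat\mu\\
\implies \hat\mu &= \frac{Y_1 + Y_2 + \cdots + Y_n}{n} = \bar{Y}.
\end{aligned}$$

The maximum likelihood estimator of $\sigma$ is the sample standard deviation (computed with divisor $n$) since:

$$\frac{\partial l(\mu, \sigma)}{\partial\sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\left((Y_1 - \mu)^2 + (Y_2 - \mu)^2 + \cdots + (Y_n - \mu)^2\right)$$

and:

$$\begin{aligned}
\frac{\partial l(\hat\mu, \hat\sigma)}{\partial\sigma} = 0 &\implies -\frac{n}{\hat\sigma} + \frac{1}{\hat\sigma^3}\left((Y_1 - \hat\mu)^2 + (Y_2 - \hat\mu)^2 + \cdots + (Y_n - \hat\mu)^2\right) = 0\\
&\implies -\frac{n}{\hat\sigma} + \frac{1}{\hat\sigma^3}\left((Y_1 - \bar{Y})^2 + (Y_2 - \bar{Y})^2 + \cdots + (Y_n - \bar{Y})^2\right) = 0\\
&\implies \hat\sigma^2 = \frac{1}{n}\left((Y_1 - \bar{Y})^2 + (Y_2 - \bar{Y})^2 + \cdots + (Y_n - \bar{Y})^2\right)\\
&\implies \hat\sigma = \sqrt{\frac{1}{n}\left((Y_1 - \bar{Y})^2 + (Y_2 - \bar{Y})^2 + \cdots + (Y_n - \bar{Y})^2\right)}.
\end{aligned}$$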


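Now to calculate confidence intervals for $\mu$ and $\sigma$ we need the Hessian of $l(\mu, \sigma)$ evaluated at $\hat\mu, \hat\sigma$. We have:

$$\frac{\partial^2l(\hat\mu, \hat\sigma)}{\partial\mu^2} = \frac{-1 + -1 + \cdots + -1}{\hat\sigma^2} = -\frac{n}{\hat\sigma^2}$$

$$\begin{aligned}
\frac{\partial^2l(\hat\mu, \hat\sigma)}{\partial\mu\,\partial\sigma} &= -2\,\frac{(Y_1 - \hat\mu) + (Y_2 - \hat\mu) + \cdots + (Y_n - \hat\mu)}{\hat\sigma^3}\\
&= -2\,\frac{(Y_1 - \bar{Y}) + (Y_2 - \bar{Y}) + \cdots + (Y_n - \bar{Y})}{\hat\sigma^3}\\
&= -2\,\frac{Y_1 + Y_2 + \cdots + Y_n - n\bar{Y}}{\hat\sigma^3}\\
&= 0
\end{aligned}$$

$$\frac{\partial^2l(\hat\mu, \hat\sigma)}{\partial\sigma^2} = \frac{n}{\hat\sigma^2} - 3\,\frac{1}{\hat\sigma^4}\overbrace{\left((Y_1 - \bar{Y})^2 + (Y_2 - \bar{Y})^2 + \cdots + (Y_n - \bar{Y})^2\right)}^{= n\hat\sigma^2} = -\frac{2n}{\hat\sigma^2}$$

and so the information matrix is:

$$H(\hat\mu, \hat\sigma) = \begin{bmatrix} -\frac{n}{\hat\sigma^2} & 0\\ 0 & -\frac{2n}{\hat\sigma^2} \end{bmatrix}$$

and hence:

$$\Delta = \left(-H(\hat\mu, \hat\sigma)\right)^{-1} = \begin{bmatrix} \frac{\hat\sigma^2}{n} & 0\\ 0 & \frac{\hat\sigma^2}{2n} \end{bmatrix}.$$

Thus a 95% confidence interval for the unknown $\mu$ takes the form:

$$\hat\mu \pm 1.96\sqrt{\frac{\hat\sigma^2}{n}}$$

while a 95% confidence interval for the unknown $\sigma$ takes the form:

$$\hat\sigma \pm 1.96\sqrt{\frac{\hat\sigma^2}{2n}}.$$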

Example 2: Suppose we are given $n = 5$ observations:

$$Y_1 = 5.5,\ Y_2 = 3.3,\ Y_3 = 7.1,\ Y_4 = 9.2,\ Y_5 = 4.1.$$

We are seeking the values of $\mu$ and $\sigma$ which maximize the log-likelihood plotted below:

[Figure: surface plot of the log-likelihood $l(\mu, \sigma)$, with $\mu$ and $\sigma$ on the horizontal axes and $\ln(L)$ on the vertical axis.]

We have

$$\hat\mu = \bar{Y} = \frac{5.5 + 3.3 + 7.1 + 9.2 + 4.1}{5} = 5.84$$

and

$$\hat\sigma = \sqrt{\frac{(5.5 - 5.84)^2 + (3.3 - 5.84)^2 + (7.1 - 5.84)^2 + (9.2 - 5.84)^2 + (4.1 - 5.84)^2}{5}} = 2.12.$$

A 95% confidence interval for the unknown $\mu$ is then:

$$5.84 \pm 1.96\sqrt{\frac{2.12^2}{5}}$$

or:

$$5.84 \pm 1.8583.$$

A 95% confidence interval for the unknown $\sigma$ is then:

$$2.12 \pm 1.96\sqrt{\frac{2.12^2}{2 \times 5}}$$

or:

$$2.12 \pm 1.314.$$
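The numbers in Example 2 can be reproduced with the closed-form estimators and interval formulas of Example 1. A minimal sketch; the function name `normal_mle` is hypothetical.

```python
import math


def normal_mle(ys):
    """Maximum likelihood estimates of mu and sigma for a normal sample, together
    with the half-widths of the 95% confidence intervals from the information matrix."""
    n = len(ys)
    mu_hat = sum(ys) / n
    sigma_hat = math.sqrt(sum((y - mu_hat) ** 2 for y in ys) / n)   # divisor n, not n - 1
    half_width_mu = 1.96 * math.sqrt(sigma_hat ** 2 / n)
    half_width_sigma = 1.96 * math.sqrt(sigma_hat ** 2 / (2 * n))
    return mu_hat, sigma_hat, half_width_mu, half_width_sigma


sample = [5.5, 3.3, 7.1, 9.2, 4.1]
mu_hat, sigma_hat, half_width_mu, half_width_sigma = normal_mle(sample)

print(mu_hat, sigma_hat)
print(half_width_mu, half_width_sigma)
```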