Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

44
Adapting de Finetti's proper scoring rules for Measuring Subjective Beliefs to Modern Decision Theories ofAmbiguity Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv Aim: See title. Modern theories: rank- dependent utility of Giboa ('87) & Schmeidler ('89) and prospect theory (Kahneman & Tversky '79, '92)

description

Adapting de Finetti's proper scoring rules for Measuring Subjective Beliefs to Modern Decision Theories ofAmbiguity. Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv. - PowerPoint PPT Presentation

Transcript of Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Page 1: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Adapting de Finetti's proper scoring rules for Measuring

Subjective Beliefs to Modern Decision Theories ofAmbiguity

Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans)

December 22, 2005Tel Aviv

Aim: See title. Modern theories: rank-dependent utility of Giboa ('87) & Schmeidler ('89) and prospect theory (Kahneman & Tversky '79, '92)

Page 2: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Part I (Didactical). de Finetti's Proper Scoring Rules

(under Subjective Expected Value)

2

For clarity, I give example of grading, which is not primary application.

Page 3: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Say, you grade a multiple choice exam in geography to test students' knowledge about Statement E: Capital of Zeeland = Assen.

3

Reward: if E true if not E true E not-E

1 0 0 1

Assume: Two kinds of students. 1. Those who know. They answer correctly.2. Those who don't know. Answer 50-50.

Problem: Correct answer does not completely identify student's knowledge. Some correct answers, and high grades, are due to luck. There is noise in the data.

Page 4: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

One way out: oral exam, or ask details.Too time consuming.

I now promise a perfect way out:Exactly identify state of knowledge of each student. Still multiple choice, taking no more time!

Best of all worlds!

4

Page 5: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

5

For those of you who do not know answer to E. What would you do?Best to say "don't know."System perfectly well discriminates between students!

Reward: if E true if not E true E

not-E

1 0

0 1 don't know 0.75 0.75

Page 6: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

New Assumption: Students have all kinds of degrees of knowledge. Some are almost sure E is true but not completely; others have a hunch; etc.

Above system does not discriminate perfectly well between such various states of knowledge.

One solution (too time consuming): Oral exams etc.Second solution is to let them make many binary choices:

6

(way too time consuming; given here only because binary choice widely used in decision theory).

Page 7: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

7

don't know 0.20 0.20 E Reward: if E true if not E true

1 0 choice 2

.

.

.

don't know 0.90 0.90 E Reward: if E true if not E true

1 0 choice 9

don't know 0.10 0.10 E Reward: if E true if not E true

1 0 choice 1

don't know 0.30 0.30 E Reward: if E true if not E true

1 0 choice 3

Page 8: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

8Etc. Can get approximation of true subjective probability p, i.e. degree of belief (!?). Binary-choice ("preference") solution is popular in decision theory. This method works under subjective expected value. Justifiable by de Finetti's (1931) famous book making argument.

Too time consuming for us. Rewards for students are noisy this way.Spinoff: This lecture will remind decision theorists that multiple choice can be more informative than binary choice.

Page 9: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

9

partly know E, to degree r r r

EReward: if E true if not E true 1 0

Just ask students for "indifference r," i.e. ask: "Which r makes lower row equally good as upper row?"

Problems: Why would student give true answer r = p? What at all is the reward (grade?) for the student? Does reward give an incentive to give true answer?BDM … complex and noisy payment …

Third solution [introspection]Student chooses number r ("reported probability").

Page 10: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

10

I now promise a perfect way out:de Finetti's dream-incentives.Exactly identifies state of knowledge of each student, no matter what it is.Still multiple choice, taking no more time.Need no BDM.Rewards students fairly, with little noise.

Best of all worlds.For the full variety of degrees of knowledge.Student can choose reported probability r for E from the [0,1] continuum, as follows.

Page 11: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

11

Reward: if E true if not E true

Claim: Under "subjective expected value," optimal reported probability r = true subjective probability p.

1 0 r=1

0 1 r=0

1 – (1–r)2 1–r2r

don't know 0.75 0.75 r=0.5:

: E

: not-E

degree ofbelief in E:

Page 12: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Proof. p true probability; r reported probability.EV = p(1 – (1–r)2) + (1–p)(1–r2).1st order optimality:2p(1–r) – 2r(1–p) = 0.r = p!Figure!?

One corollary (a test of EV): r is additive.Incentive compatible ... Many implications ...

12

Reward: if E true if not E true 1 – (1–r)2 1–r2 r: degree of

belief in E

To help memory:

Page 13: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Theoretical Example [Expected Value, EV],alternative to "Assen-Zeeland."An urn K ("known") contains 100 balls, 25 G(reen), 25 R(ed), 25 S(ilver), 25 Y(ellow). One ball is drawn randomly.E: ball is not red, so E = {G,S,Y}. p of E = 0.75.Under expected value,optimal rE is 0.75.rG + rS + rY = 0.25 + 0.25 + 0.25 = 0.75 = rE:r satisfies additivity; as probabilities should!

13

Continued later.

Page 14: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

0.25 0.50 0.75 10p

14Reported probability R(p) = rE as function of true probability p, under:

nonEU

0.69

R(p)

EU

0.61

rEV

EV

rnonEU

rnonEUA

rnonEUA refers to nonexpected utility for unknown probabilities ("Ambiguity").

(c) nonexpected utility for known probabilities, with U(x) = x0.5 and with w(p) as common;

(b) expected utility with U(x) = x0.5 (EU);

(a) expected value (EV);

reward: if E true if not E true EV rEV=0.75 0.94 0.44 0.8125 rEU=0.69 0.91 0.52 0.8094

rEU

rnonEU=0.61 0.85 0.63 0.7920 rnonEUA=0.52 0.77 0.73 0.7596

next p.

go to p. 19, Example EU

go to p. 24, Example nonEU

0

0.50

1

0.25

0.75

go to p. 31, Example nonEUA

Page 15: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Part II. Adapting de Finetti's Proper Scoring Rules to

Deviations from Expected Value and also from Expected

Utility

15

Page 16: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

However, EV (= risk neutrality) is empirically violated.We will, one by one, incorporate three deviations:1. Nonlinear utility, with expected utility instead of expected value;2. Nonexpected utility for given probabilities (Allais paradox etc.);3. No subjective probabilities for unknown probabilities (no probabilistic sophistication; Ellsberg paradox, ambiguity aversion, etc.).

16

Page 17: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

1st deviation of EV: nonlinear utility Bernoulli (1738) proposed expected utility (EU).

17

EU(r) = pU(1 – (1–r)2) + (1–p)U(1–r2).

Then r = p need no more be optimal.

Reward: if E true if not E true 1 – (1–r)2 1–r2degree of

belief in E r:

Page 18: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Theorem. Under expected utility with true probability p,

18

U´(1–r2)U´(1 – (1–r)2)

(1–p)p +

pr =

U´(1–r2)U´(1 – (1–r)2)

(1–r)r +

rp =

Explicit expression:

A corollary to distinguish from EV: r is nonadditive as soon as U is nonlinear.

Page 19: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Example continued [Expected Utility, EU]. (urn K, 25 G, 25 R, 25 S, 25 Y).E: {G,S,Y}; p = 0.75.EV: rEV = 0.75.Expected utility, U(x) = x: rEU = 0.69.rG + rS + rY = 0.31 + 0.31 + 0.31 = 0.93 > 0.69 = rE:additivity violated!Such data prove that expected value cannot hold.

19

Further continued later.

go to p. 14, with figure of R(p)

Page 20: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

2nd violation of EV: nonexpected utility Allais (1953), Machina (1982) proposed nonexpected utility (nonEU). Many such models: - original prospect theory (Kahneman & Tversky 1979), - rank-dependent utility (Quiggin 1982),- disappointment aversion (Gul 1991), - new prospect theory (Tversky & Kahneman 1992), etc.

20

For two-gain prospects, all theories are as follows:

Reward: if E true if not E true 1 – (1–r)2 1–r2degree of

belief in E r:

Page 21: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

21 Reward: if E true if not E true 1 – (1–r)2 1–r2 r: degree of

belief in E

For r 0.5, nonEU(r) = w(p)U(1 – (1–r)2) + (1–w(p))U(1–r2).

For r 0.5, nonEU(r) = w(1–p) U(1–r2) + (1–w(1–p))U(1 – (1–r)2).

Different treatment of highest and lowest outcome. "Rank-dependence."Quiggin (1982), Gilboa (1987), Schmeidler (1989; first version 1982).

Page 22: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

p

w(p)

1

1

0 1/3

Figure. The common weighting fuction.w(p) = exp(–(–ln(p))) for = 0.65.

w(1/3) 1/3;

22

1/3

w(2/3) .51

2/3

.51

Page 23: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Theorem.

23

U´(1–r2)U´(1 – (1–r)2)(1–w(p))w(p) +

w(p)r =

U´(1–r2)U´(1 – (1–r)2)

(1–r)r +

rp =

Explicit expression:

w –1(

)

Page 24: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Example continued [nonEU]. (urn K, 25 G, 25 R, 25 S, 25 Y).E: {G,S,Y}; p = 0.75.EV: rEV = 0.75.EU: rEU = 0.69.Nonexpected utility, U(x) = x, w(p) = exp(–(–ln(p))0.65).rnonEU = 0.61.rG + rS + rY = 0.39 + 0.39 + 0.39 = 1.17 > 0.61 = rE:additivity is strongly violated!

24

go to p. 14, with figure of R(p)

Page 25: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

3rd violation of EV: Ambiguity (unknown probabilities)

Preparation for continued example.Random draw from additional urn A.100 balls, ? Ga, ? Ra, ? Sa, ? Ya. Unknown composition ("Ambiguous").Ea: {Ga,Sa,Ya}; p = ?.How to deal with unknown probabilities?

25

Page 26: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Proposal: probabilistic sophistication (Machina & Schmeidler 1992): assign "subjective" probabilities to events. Then behave as if known probs (may be nonEU).

26

Page 27: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

With Ea = {Ga, Sa, Ya}, choose P(Ea).

Then

27

For r 0.5, nonEU(r) = w(P(Ea))U(1 – (1–r)2) + (1–w(P(Ea)))U(1–r2).

However, also this is often violated empirically:

Reward: if Ea true if not Ea true 1 – (1–r)2 1–r2degree of

belief in E r:

Page 28: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

For urn A (100 balls, ? Ga, ? Ra, ? Sa, ? Ya),By symmetry, P(Ga) = P(Ra) = P(Sa) = P(Ya). Must all be 0.25. Then P(Ea) = 0.75 must be. So,

28

must be evaluated the same as (E: known urn)Reward: if E true if not E true 1 – (1–r)2 1–r2degree of

belief in E r:

Reward: if Ea true if not Ea true 1 – (1–r)2 1–r2degree of

belief in Ea r:

Value of both isw(0.75)U(1 – (1–r)2) + (1–w(0.75))U(1–r2).rE = rEa must hold. Empirical finding: People prefer known probs.

>< Probabilistic sophistication!

Page 29: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

29

Instead of modeling beliefs through additive P(Ea), beliefs may be nonadditive, B(Ea), with B(Ea) B(Ga) + B(Sa) + B(Ya).(Dempster&Shafer, Tversky&Koehler, etc.)

Reward: if Ea true if not Ea true 1 – (1–r)2 1–r2degree of

belief in E r:

For r 0.5, nonEU(r) = w(B(Ea))U(1 – (1–r)2) + (1–w(B(Ea)))U(1–r2).Writing W(Ea) = w(B(Ea)):W(Ea)U(1 – (1–r)2) + (1–W(Ea))U(1–r2).P.s.: From this, can always write B(Ea) = w–1(W(Ea)).

Page 30: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Theorem.

30

U´(1–r2)U´(1 – (1–r)2)(1–w(B(E)))w(B(E)) +

w(B(E))rE =

U´(1–r2)U´(1 – (1–r)2)

(1–r)r +

rB(E) =

Explicit expression:

w –1(

)

Page 31: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Example continued [Ambiguity, nonEUA]. (urn A, 100 balls, ? G, ? R, ? S, ? Y).Ea: {Ga,Sa,Ya}; p = ?rEV = 0.75.rEU = 0.69.rnonEU = 0.61.Typically, w(B(Ea)) << w(B(E)) (E from known urn). rnonEUA is, say, 0.52.rG + rS + rY = 0.48 + 0.48 + 0.48 = 1.44 > 0.52 = rE:additivity is very extremely violated!r's are close to always saying fifty-fifty.Belief component B(E) = 0.62.

31

go to p. 14, with figure of R(p)

Page 32: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Many debates about whether or not B(E) can be interpreted as belief, or comprises other things about ambiguity attitude. We do not enter such debates.Before entering such debates, question is first how to measure B(E). This is what we do. We show how proper scoring rules can, after some corrections, measure B(E).

Earlier proposals for measuring B:

32

Page 33: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Proposal 1 (common in decision theory):Measure U,W, and w from behavior, and derive B(E) = w–1(W(E)) from it.Problem: Much and difficult work!!!

Proposal 2 (common in decision analysis of the 1960s, and in modern experimental economics): measure canonical probabilities, that is,for Ea, find event E with objective probability p such that (Ea:100) ~ (p:100). Then B(E) = p.Problem: measuring indifferences is difficult.

Proposal 3 (common in proper scoring rules): Calibration …Problem: Need many repeated observations.

33

Page 34: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

34Our proposal: Take the best of all worlds!

Get B(E) = w–1(W(E)) without measuring U,W, and w from behavior.

Get canonical probability without measuring indifferences, or BDM.

Calibrate without needing long repeated observations.

Do all that with no more than simple proper-scoring-rule questions.

Page 35: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

35

We reconsider explicit expressions:

U´(1–r2)U´(1 – (1–r)2)

(1–r)r +

rp = w –1(

)

U´(1–r2)U´(1 – (1–r)2)

(1–r)r +

rB(E) = w –1(

) Same right-hand sides. If p and B(E) belong to

the same r, then we can immediately substitute B(E) = p. Need not calculate the rhs!! We simply measure directly the R(p) curves, and use their inverses.

Page 36: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Part III. Experimental Test of Our Correction Method

36

Page 37: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Method

Participants. N = 93 students. Procedure. Computarized in lab. Groups of 15/16 each. 4 practice questions.

37

Page 38: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

38Stimuli 1. First we did proper scoring rule for unknown probabilities. 72 in total.

For each stock two small intervals, and, third, their union. Thus, we test for additivity.

Page 39: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

39Stimuli 2. Known probabilities: Two 10-sided dies thrown. Yield random nr. between 01 and 100.Event E: nr. 75 (etc.).Done for all probabilities j/20.Motivating subjects. Real incentives. Two treatments. 1. All-pay. Points paid for all questions.6 points = €1. Average earning €15.05.2. One-pay (random-lottery system).One question, randomly selected afterwards, played for real. 1 point = €20. Average earning: €15.30.

Page 40: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

40

Results

Preliminary …

Page 41: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

41

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0.000 0.100 0.200 0.300 0.400 0.500 0.600 0.700 0.800 0.900 1.000Reported probability

Corr

ecte

d pr

obab

ility

ONE (rho = 0.70) ALL (rho = 1.14) 45°

Average correction curves.

Page 42: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

42

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

-2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5ρ

F(ρ )

treatmentone

treatmentall

Individual corrections.

Page 43: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

43

0%10%20%30%40%50%60%70%80%90%

100%

Subjects

sum P[small]<P[large] sum P[small]=P[large] sum P[small]>P[large]

0%10%20%30%40%50%60%70%80%90%

100%

Subjects

sum P[small]<P[large] sum P[small]=P[large] sum P[small]>P[large]

0%10%20%30%40%50%60%70%80%90%

100%

Subjects

sum P[small]<P[large] sum P[small]=P[large] sum P[small]>P[large]

0%10%20%30%40%50%60%70%80%90%

100%

Subjects

sum P[small]<P[large] sum P[small]=P[large] sum P[small]>P[large]

Reported probability Corrected probability

Treat-ment one

Treat-ment all

Page 44: Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) December 22, 2005 Tel Aviv

Summary and Conclusion.Modern decision theories show that proper scoring rules are heavily biased.We present ways to correct for these biases (get right beliefs even if EV violated).Experiment shows that correction improves quality and reduces deviations from ("rational"?) Bayesian beliefs.Correction does not remove all deviations from Bayesian beliefs. Beliefs seem to be genuinly nonadditive/nonBayesian/sensitive-to-ambiguity.

44