Ranking and Rating Data in Joint RP/SP Estimation

by

JD Hunt, University of Calgary
M Zhong, University of Calgary

PROCESSUS Second International Colloquium

Toronto ON, Canada
June 2005

Overview
• Introduction
    • Context
    • Motivations
• Definitions
    • Revealed Preference Choice
    • Stated Preference Rankings
    • Revealed Preference Ratings
    • Stated Preference Ratings
• Estimation Testbed
    • Concept
    • Synthetic Data Generation
    • Results (so far)
• Conclusions (so far)

Introduction
• Context
    • Common task to estimate logit model utility function for non-existing mode alternatives
    • Joint RP/SP estimation available
        • Good for sensitivity coefficients
        • Problems with alternative specific constants (ASC)
• Motivation
    • Improve situation regarding ASC
    • Seeking to expand on joint RP/SP estimation
    • Add rating information
        • 0 to 10 scores
        • Direct utility
    • Increase understanding of issues regarding ASC generally

Definitions
• Revealed Preference Choice
• Stated Preference Ranking
• Revealed Preference Ratings
• Stated Preference Ratings
• Linear-in-parameters logit utility function

$U_m = \sum_k \alpha_{m,k} x_{m,k} + \beta_m$

where $\alpha_{m,k}$ is the sensitivity coefficient and $\beta_m$ is the ASC.

Revealed Preference Choice
• Actual behaviour
• Best alternative choice from existing
• Attribute values determined separately
• Indirect utility measure: observe outcome

$U_m^r = \lambda^r \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^r$

Revealed Preference Choice
• Disaggregate estimation provides

$U_m^r = \sum_k \alpha'^{r}_{m,k} x_{m,k} + \beta'^{r}_m$

with

$\alpha'^{r}_{m,k} = \lambda^r \alpha_{m,k}$

$\beta'^{r}_m = \lambda^r \beta_m + \beta_m^r$
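The disaggregate estimation above rests on the multinomial logit choice probability. The following is a minimal sketch of that probability, assuming the utilities have already been computed; the helper name `logit_probabilities` and the example utilities are hypothetical and are not taken from the software used in this work.

```python
import math

def logit_probabilities(utilities):
    """Return multinomial logit probabilities exp(U_m) / sum_m exp(U_m)."""
    # Subtracting the maximum utility avoids overflow and leaves the result unchanged.
    u_max = max(utilities)
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative utilities only (not values from the testbed).
probs = logit_probabilities([-1.0, 0.0, 0.5])
print([round(p, 3) for p in probs])
```

Because only differences in utility enter the probabilities, adding the same constant to every alternative changes nothing, which is why one ASC has to be fixed in estimation.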

Stated Preference Ranking
• Stated behaviour
• Ranking alternatives from presented set
• Attribute values indicated
• Indirect utility measure: observe outcome

$U_m^s = \lambda^s \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^s$

Stated Preference Ranking
• Disaggregate (exploded) estimation provides

$U_m^s = \sum_k \alpha'^{s}_{m,k} x_{m,k} + \beta'^{s}_m$

with

$\alpha'^{s}_{m,k} = \lambda^s \alpha_{m,k}$

$\beta'^{s}_m = \lambda^s \beta_m + \beta_m^s$
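Exploded estimation treats a full ranking as a sequence of choices: the top-ranked alternative is chosen from the whole set, the second from what remains, and so on. The sketch below shows that decomposition only; the function name `explode_ranking` is a hypothetical helper and the example ranking is illustrative.

```python
def explode_ranking(ranking):
    """Turn a full ranking (best first) into a list of (chosen, choice_set) pairs."""
    pseudo_choices = []
    remaining = list(ranking)
    # A single remaining alternative carries no choice information, so stop at two.
    while len(remaining) > 1:
        pseudo_choices.append((remaining[0], list(remaining)))
        remaining = remaining[1:]
    return pseudo_choices

# Illustrative ranking of four alternatives, best first.
for chosen, choice_set in explode_ranking([3, 5, 1, 7]):
    print(chosen, "chosen from", choice_set)
```

A ranking of 7 alternatives therefore yields 6 choice observations, each of which can be passed to a standard logit estimator.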

Revealed Preference Ratings
• Stated values for selected and perhaps also unselected alternatives
• Providing 0 to 10 score with associated descriptors
  10 = excellent; 5 = reasonable; 0 = terrible
• Attribute values determined separately
• Direct utility measure (scaled?)

$R_m^g = \theta^g \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^g$

Revealed Preference Ratings
• Regression estimation provides

$R_m^g = \sum_k \alpha'^{g}_{m,k} x_{m,k} + \beta'^{g}_m$

with

$\alpha'^{g}_{m,k} = \theta^g \alpha_{m,k}$

$\beta'^{g}_m = \theta^g \beta_m + \beta_m^g$
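As an illustration of the regression step, the sketch below fits the rating equation for one alternative by ordinary least squares using the normal equations. Ordinary least squares is an assumption made here for illustration; the helper names `solve_linear` and `fit_rating_regression` and the example data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rating_regression(xs, ratings):
    """Fit R = sum_k a_k x_k + b; return (list of a_k, b)."""
    rows = [list(x) + [1.0] for x in xs]  # trailing 1.0 carries the constant
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(p)]
    coef = solve_linear(xtx, xty)
    return coef[:-1], coef[-1]

# Noise-free illustrative data generated from R = 0.5 x1 - 0.2 x2 + 6.
xs = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 4.0)]
ratings = [0.5 * x1 - 0.2 * x2 + 6.0 for x1, x2 in xs]
slopes, constant = fit_rating_regression(xs, ratings)
print([round(s, 3) for s in slopes], round(constant, 3))
```

The fitted slopes play the role of the scaled sensitivities and the fitted constant the role of the scaled ASC in the expressions above.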

Stated Preference Ratings
• Stated values for each of a set of alternatives
• Providing 0 to 10 score with associated descriptors
  10 = excellent; 5 = reasonable; 0 = terrible
• Attribute values indicated
• Provides verification of rankings
• Direct utility measure (scaled?)

$R_m^h = \theta^h \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^h$

Stated Preference Ratings
• Regression estimation provides

$R_m^h = \sum_k \alpha'^{h}_{m,k} x_{m,k} + \beta'^{h}_m$

with

$\alpha'^{h}_{m,k} = \theta^h \alpha_{m,k}$

$\beta'^{h}_m = \theta^h \beta_m + \beta_m^h$

Estimation Testbed
• Specify true parameter values ($\alpha_{m,k}$ and $\beta_m$)
• Generate synthetic observations
    • Assume attribute values and error distributions
    • Sample to get specific error values
    • Calculate utility values using attribute values, true parameter values and error values
    • Develop RP choice observations and SP ranking observations using utility values
    • Develop RP ratings observations and SP ratings observations by scaling utility values to fit within 0 to 10 range
• Test estimation techniques in terms of returning to true parameter values

True Utility Function

$U_m = \sum_k \alpha_{m,k} x_{m,k} + \beta_m + e_m$

True Parameter Values

Alternative (m)   α_{m,1}   α_{m,2}   α_{m,3}   α_{m,4}   β_m
1                 -0.25     -0.50     -0.25     -0.50     0
2                 -0.80     -0.50     -0.15     -0.40     -0.75
3                 -0.65     -0.20     -0.55     -0.30     2.50
4                 -0.50     -1.20     -0.70     -0.20     0
5                 -0.40     -0.50     -0.15     -0.4      1.5
6                 -0.05     -0.30     -0.50     -0.55     -0.80
7                 -0.25     -0.50     -0.10     -0.80     1.50

Attribute Values

Sampled from $N(\mu_{m,k}, \sigma_{m,k})$ with:

Alternative (m)   μ_{m,1}   σ_{m,1}   μ_{m,2}   σ_{m,2}   μ_{m,3}   σ_{m,3}   μ_{m,4}   σ_{m,4}
1                 10.0      0.25      5.0       0.50      20.0      3.25      15.0      0.40
2                 5.0       0.50      5.0       0.50      25.0      7.00      20.0      4.10
3                 15.0      1.20      2.0       0.20      19.0      2.50      15.0      2.20
4                 10.0      2.20      10.0      4.00      20.0      5.00      12.0      1.80
5                 10.0      2.50      8.0       2.00      16.0      3.00      10.0      1.50
6                 15.0      2.30      7.0       1.30      15.0      2.00      15.0      2.70
7                 15.0      1.50      5.0       1.00      14.0      1.20      25.0      3.50

Error Values

Sampled from $N(\mu = 0, \sigma_m)$

$\sigma_m$ varies by observation type:
• RP Choice: $\sigma_m = \sigma_m^r = 2.4$
• SP Rankings: $\sigma_m = \sigma_m^s = 1.5$
• RP Ratings: $\sigma_m = \sigma_m^g = 2.1$
• SP Ratings: $\sigma_m = \sigma_m^h = 1.8$
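The generation of one synthetic observation can be sketched as follows, using the true parameter values, attribute distributions and error standard deviations listed above. This is a minimal illustration rather than the testbed code: the function name `synthesize_observation`, the data layout and the random seed are hypothetical, and the scaling of utilities to 0 to 10 ratings is left out because the scaling rule is not specified here.

```python
import random

# True sensitivities (four per alternative) and ASCs, from the True Parameter Values table.
ALPHA = {
    1: [-0.25, -0.50, -0.25, -0.50],
    2: [-0.80, -0.50, -0.15, -0.40],
    3: [-0.65, -0.20, -0.55, -0.30],
    4: [-0.50, -1.20, -0.70, -0.20],
    5: [-0.40, -0.50, -0.15, -0.40],
    6: [-0.05, -0.30, -0.50, -0.55],
    7: [-0.25, -0.50, -0.10, -0.80],
}
BETA = {1: 0.0, 2: -0.75, 3: 2.50, 4: 0.0, 5: 1.50, 6: -0.80, 7: 1.50}

# (mean, standard deviation) of each attribute, from the Attribute Values table.
ATTR = {
    1: [(10.0, 0.25), (5.0, 0.50), (20.0, 3.25), (15.0, 0.40)],
    2: [(5.0, 0.50), (5.0, 0.50), (25.0, 7.00), (20.0, 4.10)],
    3: [(15.0, 1.20), (2.0, 0.20), (19.0, 2.50), (15.0, 2.20)],
    4: [(10.0, 2.20), (10.0, 4.00), (20.0, 5.00), (12.0, 1.80)],
    5: [(10.0, 2.50), (8.0, 2.00), (16.0, 3.00), (10.0, 1.50)],
    6: [(15.0, 2.30), (7.0, 1.30), (15.0, 2.00), (15.0, 2.70)],
    7: [(15.0, 1.50), (5.0, 1.00), (14.0, 1.20), (25.0, 3.50)],
}

def synthesize_observation(sigma_error, rng):
    """Draw attributes and errors; return (attributes, utilities, ranking best first)."""
    attributes, utilities = {}, {}
    for m in ALPHA:
        x = [rng.gauss(mu, sd) for mu, sd in ATTR[m]]
        error = rng.gauss(0.0, sigma_error)
        attributes[m] = x
        utilities[m] = sum(a * v for a, v in zip(ALPHA[m], x)) + BETA[m] + error
    ranking = sorted(utilities, key=utilities.get, reverse=True)
    return attributes, utilities, ranking

rng = random.Random(2005)  # arbitrary seed, for repeatability only
x, u, ranking = synthesize_observation(sigma_error=2.4, rng=rng)  # RP choice error
print("chosen alternative:", ranking[0])
print("full ranking:", ranking)
```

The first element of the ranking serves as the RP choice; calling the function with `sigma_error=1.5` and keeping the whole ranking gives an SP ranking observation.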

Generated Synthetic Samples
• Each of 4 observation types
• 7 alternatives for each observation (m = 1 to 7)
• Set of 15,000 observations
• Sometimes considered subsets of alternatives with overlap across observation types, as indicated below

Testbed Estimations

• RP Choice
• SP Rankings
• Joint RP/SP Data
• Ratings
• Combined RP/SP Data and Ratings

RP Choice
• Used ALOGIT software
• Set $\beta'^{r}_{m=1} = 0$ to avoid over-specification
• Provides:
    • $\alpha'^{r}_{m,k} = \lambda^r \alpha_{m,k}$
    • $\beta'^{r}_m = \lambda^r \beta_m + \beta_m^r$
• Know that $\lambda^r = \pi / (\sqrt{6}\, \sigma_m^r) = 0.534$
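The scale value follows from the relation between the scale of a Gumbel (extreme value type I) error and its standard deviation, which is π / (λ √6). A short check of the arithmetic, with `logit_scale` as a hypothetical helper name:

```python
import math

def logit_scale(sigma):
    """Scale of a Gumbel error term whose standard deviation is sigma."""
    return math.pi / (math.sqrt(6.0) * sigma)

# Error standard deviations used for the RP choice and SP ranking observations.
print(round(logit_scale(2.4), 3))  # RP choice -> 0.534
print(round(logit_scale(1.5), 3))  # SP rankings -> 0.855
```

A larger error standard deviation gives a smaller scale, so the estimated coefficients shrink toward zero relative to the true values.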

[Figure: RP Choice Estimated vs True Values with 15,000 Observations; axes: observed, estimated; range -18 to 18]

[Figure: RP Choice Estimates vs True Values with 15,000 Observations; axes: observed, estimated; range -2 to 0]

[Figure: RP Choice Estimated vs True Values with 15,000 Observations; axes: observed, estimated; range -18 to 18; points labelled by alternative 1 to 7, with t-ratios in parentheses: (2.3), (1.6), (1.2), (0.6), (0.8), (1.1)]

$\rho^2_0 = 0.1834$

$\rho^2_c = 0.6982$

RP Choice: Selection frequencies and ASC estimates

Alternative (m)   Times Selected   Estimated ASC   t-ratio
1                 1,377            0               -
2                 835              -3.63           1.1
3                 53               13.94           2.3
4                 25               -3.49           0.6
5                 12,100           -3.79           1.2
6                 594              -2.57           0.8
7                 16               15.21           1.6

RP Choice 2: Selection frequencies and ASC estimates

Alternative (m)   Times Selected   Estimated ASC   t-ratio
1                 1,051            0               -
2                 894              -0.74           0.2
3                 3,046            0.7             0.2
4                 236              0.04            0.0
5                 6,331            1.61            0.5
6                 2,271            2.92            0.8
7                 1,171            1.04            0.3

[Figure: RP Choice 2 Estimated vs True Values with 15,000 Observations; axes: observed, estimated; range -4 to 4; points labelled by alternative 1 to 7, with t-ratios in parentheses: (0.2), (0.3), (0.5), (0.0), (0.8), (0.2)]

$\rho^2_0 = 0.3151$

$\rho^2_c = 0.1683$

RP Choice 3: Selection frequencies and ASC estimates

Alternative (m)   Times Selected   Estimated ASC   t-ratio
1                 1,682            0               -
2                 1,783            -0.05           0.0
3                 2,842            2.25            0.8
4                 3,044            0.01            0.0
5                 1,944            2.07            0.7
6                 1,940            -0.12           0.0
7                 1,765            2.6             0.9

[Figure: RP Choice 3 Estimated vs True Values with 15,000 Observations; axes: observed, estimated; range -4 to 4; points labelled by alternative 1 to 7, with t-ratios in parentheses: (0.8), (0.9), (0.7), (0.0), (0.0), (0.0)]

$\rho^2_0 = 0.1852$

$\rho^2_c = 0.1736$

SP Rankings


• Set $\beta'^{s}_{m=1} = 0$ to avoid over-specification
• Provides:
    • $\alpha'^{s}_{m,k} = \lambda^s \alpha_{m,k}$
    • $\beta'^{s}_m = \lambda^s \beta_m + \beta_m^s$
• Know that $\lambda^s = \pi / (\sqrt{6}\, \sigma_m^s) = 0.855$

[Figure: SP Ranking Estimates vs True Values with 15,000 observations; axes: observed (-5 to 5), estimated (-4 to 4)]

[Figure: 20 Runs for RP Choice Estimated vs True Values with 15,000 Observations; axes: observed, estimated; range -8 to 8]

SP Rankings
• More information with full ranking
• Also confirm against RP above
    • 'ranking version' available
    • estimate using full ranking

[Figure: RP Rankings Estimates vs True Values with 15,000 observations; axes: observed (-5 to 5), estimated (-4 to 4)]

SP vs RP Rankings

• ASC translated en bloc to some extent

SP Rankings: Role of $\sigma_{m,k}$
• Impact of changing $\sigma_{m,k}$ used when synthesizing attribute values
    • Sampling from $N(\mu_{m,k}, \sigma_{m,k})$
    • Different $\sigma_{m,k}$ means different spreads on attribute values
    • Impacts relative size of $\sigma_m^s$
• Implications for SP survey design

[Figure: Average absolute difference between estimated and true values, plotted for α and β on a log scale from 0.0001 to 10, against factoring on $\sigma_{m,k}$: 0.10·$\sigma_{m,k}$, 0.50·$\sigma_{m,k}$, 1.00·$\sigma_{m,k}$, 2.00·$\sigma_{m,k}$, 4.00·$\sigma_{m,k}$, 10.00·$\sigma_{m,k}$]

SP Rankings: Role of $\sigma_{m,k}$
• Increasing $\sigma_{m,k}$ improves estimators
    • Roughly proportional
    • Ratio of $\beta_m$ to $\alpha_{m,k}$ maintained
• Use 1.00·$\sigma_{m,k}$ in remaining work here
• Implications for SP survey design
    • More variation in attribute values is better

Joint RP/SP Data
• Two basic approaches for $\alpha_{m,k}$
    • Sequential (Hensher)
      First estimate $\alpha'^{s}_{m,k}$ using SP observations; then estimate $\alpha'^{r}_{m,k}$ using RP observations, also forcing ratios among $\alpha'^{r}_{m,k}$ to match those obtained first for $\alpha'^{s}_{m,k}$
    • Simultaneous (Ben-Akiva; Morikawa; Daly; Bradley)
      Estimate $\alpha'^{r}_{m,k}$ and $(\lambda^s/\lambda^r)$ all together using both RP and SP observations, where $(\lambda^s/\lambda^r)\, \alpha'^{r}_{m,k}$ is used in place of $\alpha'^{s}_{m,k}$
• Little consensus on approach for $\beta_m$
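The simultaneous approach can be sketched as a single log-likelihood in which the SP utilities reuse the RP sensitivities multiplied by the scale ratio. This is a minimal sketch under stated assumptions: each data set keeps its own ASCs (one possible treatment, given the lack of consensus noted above), and the function names and data layout are hypothetical.

```python
import math

def log_prob_chosen(utilities, chosen):
    """Log of the logit probability of the chosen alternative."""
    u_max = max(utilities.values())
    log_sum = u_max + math.log(sum(math.exp(u - u_max) for u in utilities.values()))
    return utilities[chosen] - log_sum

def joint_log_likelihood(alpha_r, asc_r, asc_s, scale_ratio, rp_obs, sp_obs):
    """Simultaneous RP/SP log-likelihood with shared sensitivities.

    alpha_r[m] holds the RP-scaled sensitivities of alternative m; the SP
    utilities reuse them multiplied by scale_ratio (lambda_s / lambda_r).
    Each observation is (attributes_by_alternative, chosen_alternative).
    """
    total = 0.0
    for obs, asc, factor in ((rp_obs, asc_r, 1.0), (sp_obs, asc_s, scale_ratio)):
        for attributes, chosen in obs:
            utilities = {
                m: factor * sum(a * x for a, x in zip(alpha_r[m], xs)) + asc[m]
                for m, xs in attributes.items()
            }
            total += log_prob_chosen(utilities, chosen)
    return total

# Illustrative two-alternative, one-attribute example (not testbed data).
alpha = {1: [-0.5], 2: [-0.5]}
asc = {1: 0.0, 2: 0.0}
obs = [({1: [0.0], 2: [2.0]}, 1)]
print(round(joint_log_likelihood(alpha, asc, asc, 2.0, obs, obs), 4))
```

Maximizing this function over the sensitivities, the ASCs and the scale ratio together is what distinguishes the simultaneous approach from the sequential one, where the SP ratios are fixed first.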

Joint RP/SP Data


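• Set $\beta'^{s}_{m=1} = 0$ and $\beta'^{r}_{m=1} = 0$ to avoid over-specification
• Provides:
    • $\alpha'^{s}_{m,k} = \lambda^s \alpha_{m,k}$ and $\alpha'^{r}_{m,k} = \lambda^r \alpha_{m,k}$
    • $\beta'^{s}_m = \lambda^s \beta_m + \beta_m^s$ and $\beta'^{r}_m = \lambda^r \beta_m + \beta_m^r$
    • $\lambda^r / \lambda^s$
• Know that $\lambda^r = 0.855$ and $\lambda^s = 1.166$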

[Figure: Joint RP/SP Ranking Estimation for Full Set of RP and SP, 15,000 Observations (7 Alternatives for each); axes: observed, estimated; range -3 to 3]

[Figure: Joint RP/SP Ranking Estimation with 15,000 RP Observations for Alternatives 1-4 and 15,000 SP Observations for Alternatives 4-7; axes: observed, estimated; range -3 to 3]

RP Ratings
• Two potential interpretations of ratings
    • Value provided is a (scaled?) direct utility
    • Value provided is 10x probability of selection
• Issue of reference
    • 'excellent' in terms of other people's travel
    • 'excellent' relative to other alternatives for respondent specifically
    • Related to interpretation above
• Here: Use direct utility interpretation and thus reference is in terms of other people's travel

RP Ratings
• Used MINITAB MLE
• Provides:
    • $\alpha'^{g}_{m,k} = \theta^g \alpha_{m,k}$
    • $\beta'^{g}_m = \theta^g \beta_m + \beta_m^g$

[Figure: RP Ratings Estimation with 15,000 Observations; axes: observed, estimated; range -12 to 12]

Estimation of Plotted RP Ratings Values
• $\theta^g$ is found by minimizing the squared error between the estimated sensitivities ($\theta^g \alpha_{m,k}$) and the true values $\alpha_{m,k}$
• The estimated values for $\beta_m$ are then found using $(\beta'^{g}_m - \beta^{g}_{m,\min}) / \theta^g$ with the above-determined value for $\theta^g$
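One way to carry out this rescaling is a one-parameter least-squares fit, which has a closed form. The sketch below assumes θ is chosen to minimize the sum of squared differences between the estimated sensitivities and θ times the true values; that reading, the function names `fit_theta` and `rescale_asc`, and the example numbers are assumptions made for illustration. The offset passed to `rescale_asc` stands for the term subtracted from the estimated constant on the slide and is supplied by the caller.

```python
def fit_theta(estimated_alpha, true_alpha):
    """Theta minimizing sum over coefficients of (alpha' - theta * alpha)^2."""
    numerator = sum(e * t for e, t in zip(estimated_alpha, true_alpha))
    denominator = sum(t * t for t in true_alpha)
    return numerator / denominator

def rescale_asc(estimated_asc, theta, offset):
    """Convert estimated constants to the true-utility scale: (beta' - offset) / theta."""
    return [(b - offset) / theta for b in estimated_asc]

# Illustrative values: estimated sensitivities exactly twice the true ones.
true_alpha = [-0.25, -0.50, -0.25, -0.50]
estimated_alpha = [-0.50, -1.00, -0.50, -1.00]
theta = fit_theta(estimated_alpha, true_alpha)
print(theta)
print(rescale_asc([3.0, 5.0], theta, offset=1.0))
```

With noisy estimates the fitted θ is a compromise across all coefficients, so the rescaled constants inherit any error in that single value.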

SP Ratings
• Used MINITAB
• Provides:
    • $\alpha'^{h}_{m,k} = \theta^h \alpha_{m,k}$
    • $\beta'^{h}_m = \theta^h \beta_m + \beta_m^h$

[Figure: SP Ratings Estimation with 15,000 Observations; axes: observed, estimated; range -12 to 12]

Estimation of Plotted SP Ratings Values
• $\theta^h$ is found by minimizing the squared error between the estimated sensitivities ($\theta^h \alpha_{m,k}$) and the true values $\alpha_{m,k}$
• The estimated values for $\beta_m$ are then found using $(\beta'^{h}_m - \beta^{h}_{m,\min}) / \theta^h$ with the above-determined value for $\theta^h$

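Combined RP/SP Data and Ratings
• Purpose-built software
• Log-Likelihood function:

$L = \sum_k \mathrm{Prob}(m^{s*}) + \sum_k \mathrm{Prob}(m^{r*}) - w^g \sum_k \sum_m \left( R^{g,\mathrm{obs}}_m - R^{g,\mathrm{mod}}_m \right)^2 - w^h \sum_k \sum_m \left( R^{h,\mathrm{obs}}_m - R^{h,\mathrm{mod}}_m \right)^2$

where:
$m^{r*}$ = selected alternative in RP observation
$m^{s*}$ = selected alternative in SP observation
$\mathrm{Prob}(m)$ = probability model assigns to alternative m

$\mathrm{Prob}(m^{s*}) = \exp\left( \left[ \sum_k \alpha'^{s}_{m^*,k} x_{m^*,k} \right] + \beta'^{s}_{m^*} \right) / \left( \sum_m \exp\left( \left[ \sum_k \alpha'^{s}_{m,k} x_{m,k} \right] + \beta'^{s}_m \right) \right)$

$\mathrm{Prob}(m^{r*}) = \exp\left( \left[ \sum_k \alpha'^{r}_{m^*,k} x_{m^*,k} \right] + \beta'^{r}_{m^*} \right) / \left( \sum_m \exp\left( \left[ \sum_k \alpha'^{r}_{m,k} x_{m,k} \right] + \beta'^{r}_m \right) \right)$

$R^{g,\mathrm{mod}}_m = \sum_k \alpha'^{g}_{m,k} x_{m,k} + \beta'^{g}_m$

$R^{h,\mathrm{mod}}_m = \sum_k \alpha'^{h}_{m,k} x_{m,k} + \beta'^{h}_m$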
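The structure of this objective can be sketched as below. It is an illustration, not the purpose-built software: the choice terms are entered as log-probabilities, as is usual for a log-likelihood, and the outer sums are taken to run over observations; both points are assumptions about the expression above. The function names and data layout are hypothetical.

```python
import math

def log_prob_chosen(utilities, chosen):
    """Log of the logit probability of the chosen alternative."""
    u_max = max(utilities.values())
    log_sum = u_max + math.log(sum(math.exp(u - u_max) for u in utilities.values()))
    return utilities[chosen] - log_sum

def combined_objective(sp_obs, rp_obs, rp_ratings, sp_ratings, w_g=1.0, w_h=1.0):
    """Choice log-likelihood minus weighted squared rating errors.

    sp_obs, rp_obs: lists of (model_utilities_by_alternative, chosen_alternative).
    rp_ratings, sp_ratings: lists of (observed_rating, modelled_rating).
    """
    value = sum(log_prob_chosen(u, c) for u, c in sp_obs)
    value += sum(log_prob_chosen(u, c) for u, c in rp_obs)
    value -= w_g * sum((obs - mod) ** 2 for obs, mod in rp_ratings)
    value -= w_h * sum((obs - mod) ** 2 for obs, mod in sp_ratings)
    return value

# Illustrative inputs only.
sp = [({1: 0.0, 2: 0.0}, 1)]
rp = [({1: 1.0, 2: 0.0}, 1)]
value = combined_objective(sp, rp, rp_ratings=[(7.0, 6.0)], sp_ratings=[(5.0, 5.5)])
print(round(value, 4))
```

The weights set how strongly the rating observations pull the fitted utilities relative to the choice and ranking observations.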


• Consider range of results for $\beta_m^r$ for different settings on variables
• Example planned settings
    • Set $\theta^h = 1$
    • Set $w^g$ and $w^h = 1$
    • This 'anchors' utilities to values provided in SP Ratings

Conclusions
• Work in progress
    • Not complete, but still discovering things
• $\beta_m$ estimators problematic generally
    • Even with existing alternatives
    • Not as efficient as those for $\alpha_{m,k}$
    • Influenced by variation in attribute values $\sigma_{m,k}$
    • Influenced by frequency of chosen alternatives?
• T-statistics not a useful guide?
• Ranking (exploded) helps
• Rating also expected to help