Stevens' Direct Scaling Methods and the Uniqueness Problem:

Empirical Evaluation of an Axiom Fundamental to Interval Scale Level

Overview

Hypothesis: The Interval Scale Axiom

Method
- Participants
- Stimuli
- Procedure

Results
- Testing the Interval Scale Axiom
- Violations of the Interval Scale Axiom

Discussion
- Commutativity
- Mathematical knowledge


Simple visual task:

Ratio Production Experiment: Subjects were required to adjust the area of variable squares to a prescribed ratio p.

Is a single subject able to produce adjustments on an interval scale level?


The Interval Scale Axiom

The hypothesis that Φ is a subscale of an interval scale is equivalent to the assumption that the following interval scale axiom is satisfied. Let p, q, r, r′ be real numbers and s, t, x, y stimuli such that:

$(t, q, x) \in E$, i.e. $x = q \times t$

$(s, p, t) \in E$, i.e. $t = p \times s$

$(s, r, x) \in E$, i.e. $x = r \times s$

$(y, r', x) \in E$, i.e. $x = r' \times y$

Then

$(y, p', t) \in E$, i.e. $t = p' \times y$

with

$$p' := \frac{r'(p - 1) + r - pq}{r - q}$$
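The value of p′ required by the axiom can be computed mechanically. The following is a minimal sketch, assuming the formula p′ := (r′(p − 1) + r − pq)/(r − q); the function name `p_prime` is hypothetical and not part of the original study. Exact fractions are used so that an integer-valued p′ can be recognized reliably.

```python
from fractions import Fraction

def p_prime(p, q, r, r_prime):
    """Ratio production factor p' implied by the interval scale axiom.

    Assumed formula: p' = (r'(p - 1) + r - p*q) / (r - q).
    Arguments are expected to be integers (or Fractions).
    """
    if r == q:
        raise ValueError("r must differ from q (division by zero)")
    return Fraction(r_prime * (p - 1) + r - p * q, r - q)

# Values of the single case reported in the results (q = 3, p = 2, r = 5):
print(p_prime(2, 3, 5, 6))  # 5/2
print(p_prime(2, 3, 5, 7))  # 3
```

Only r′ = 7 yields an integer p′ here, which is why that value is the one that can be tested with an integer ratio production factor.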


The Interval Scale Axiom can be put to an empirical test

Method

Participants

- 10 students and 7 graduates
- 9 female and 8 male
- Age: between 21 and 35 years (mean: 26.9 years)
- None had prior knowledge of the hypothesis
- Normal or corrected-to-normal vision

Stimuli

White squares on a black background were projected onto a white screen by means of a video projector.

Apparatus

- White screen
- Video projector (Sony VPL-X1000E)
- Keyboard (Cherry G83), PC
- Software Orange TA (Michael Kickmeier)

Procedure

- Welcome ("Thank you very much for participating ...")
- Collection of personal data: name, age, normal sight (y/n), student (y/n)
- Instruction
- Practice session (4 trials)
- Test trials (data collection)

Setting: darkened room (Heizhaus, SR 12.31); the participants' distance from the screen was 3 m.

Instruction

"... The task is to adjust the right square in such a way that its area appears to be n times as large as the area of the standard stimulus on the left. Use the '↑' and the '↓' keys to adjust the right square and indicate a satisfactory match by pressing the 'Return' key. If the projection screen appears to be too small to adjust the area to the prescribed ratio n, you can press the 'Esc' key and the next task will appear ..."

Session 1

2 standard squares (x)
- 80 pixel side length (13.3 cm on the screen)
- 120 pixel side length (20 cm on the screen)

11 ratio production factors
- q = 2, 3, 4, ..., 12

For each (x, q) combination: 10 adjustments

2 × 11 × 10 = 220 adjustments

Evaluation of Session 1

Calculating the mean areas of
- the 10 adjustments 80 × 2: $80_2$
- the 10 adjustments 80 × 3: $80_3$
- the 10 adjustments 120 × 2: $120_2$
- the 10 adjustments 120 × 3: $120_3$

(these will be the standard squares in Session 2)

Session 2

4 standard squares (different for each participant)
- $80_2$, $80_3$, $120_2$, $120_3$

5 ratio production factors
- p = 2, 3, 4, 5, 6

For each ($x_q$, p) combination: 10 adjustments

4 × 5 × 10 = 200 adjustments

Notation: the mean area of the 10 adjustments $80_2 \times 3$ is written $80_{2,3}$

In both Sessions:

Standard square and ratio production factors were randomly intermixed

The 2 Sessions were conducted on 2 different days
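The trial counts of both sessions follow directly from the factorial design. The sketch below builds randomly intermixed trial lists; it assumes a simple shuffle over all trials (the slides do not say whether randomization was blocked), and the function name `build_trials` and the string labels for the Session 2 standards are hypothetical.

```python
import itertools
import random

def build_trials(standards, factors, repetitions=10, seed=None):
    """Return a randomly intermixed list of (standard, factor) trials."""
    trials = [
        (standard, factor)
        for standard, factor in itertools.product(standards, factors)
        for _ in range(repetitions)
    ]
    random.Random(seed).shuffle(trials)
    return trials

# Session 1: 2 standards (side length in pixels) x 11 factors x 10 repetitions
session1 = build_trials([80, 120], range(2, 13), seed=1)

# Session 2: 4 participant-specific standards x 5 factors x 10 repetitions.
# The labels stand in for the mean areas obtained in Session 1.
session2 = build_trials(["80_2", "80_3", "120_2", "120_3"], range(2, 7), seed=2)

print(len(session1), len(session2))  # 220 200
```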

Results

A few descriptives

- One participant had to be excluded after Session 1
- The "Esc" key was not used: no ceiling effects
- Session 1 took 46 min on average
- Session 2 took 30 min on average

[Figure: Mean adjustments in Session 1, two panels (standard = 80 pixel side length; standard = 120 pixel side length). x-axis: ratio production factor (2 to 12); y-axis: mean area in pixel²; plotted series: adjusted vs. "correct".]

Testing the interval scale axiom

Analyzable quadruple: a quadruple (x, q, p, r) such that the q × p adjustment $x_{q,p}$ is statistically indistinguishable from the r adjustment $x_r$.

Potentially 4 analyzable quadruples for each participant:

(1) $80_2 \times p_1 = 80 \times r_1$

(2) $80_3 \times p_2 = 80 \times r_2$

(3) $120_2 \times p_3 = 120 \times r_3$

(4) $120_3 \times p_4 = 120 \times r_4$
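"Statistically indistinguishable" is reported in these slides through z(U) values, which suggests a Mann-Whitney U test with normal approximation. The sketch below is one way to compute such a statistic; the absence of tie and continuity corrections and the helper names are assumptions, and the criterion |z| > 1.96 is the one named in the results table caption.

```python
import math

def mann_whitney_z(sample_a, sample_b):
    """z statistic of the Mann-Whitney U test (normal approximation).

    Assumption: ties count 0.5; no tie correction of the variance and
    no continuity correction are applied.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # U counts the pairs in which the value from sample_a exceeds the one from sample_b.
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample_a
        for b in sample_b
    )
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (u - mean_u) / sd_u

def indistinguishable(sample_a, sample_b, critical_z=1.96):
    """True if the two sets of adjustments do not differ at the two-sided 5% level."""
    return abs(mann_whitney_z(sample_a, sample_b)) <= critical_z

# Invented example data (areas in pixel^2), for illustration only
areas_qp = [31000, 32500, 30800, 33100, 31900]
areas_r = [31500, 32000, 30500, 33500, 32200]
print(indistinguishable(areas_qp, areas_r))
```

A positive z means the first sample tends to be larger, so the sign carries the direction of a violation.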

Data Selection

Rank order

The interval scale axiom is based on the assumption that the adjustments preserve the mathematical order of the ratio production factors:

× n adjustments < × (n + 1) adjustments, e.g. × 2 < × 3, × 3 < × 4

For each participant, all such tuples were analyzed.

We evaluated only those participants with fewer than 2 rank order violations.

4 participants were excluded.

p(rank order violation) ≤ 1/36 < .03
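The rank order criterion can be sketched as a count over adjacent ratio production factors. The slides do not state how a violation was established, so the comparison of mean areas below is an assumption, as is the function name. One reading of the 36 tuples is 2 × 10 adjacent pairs in Session 1 plus 4 × 4 in Session 2; that reading is also an assumption.

```python
from statistics import mean

def count_rank_order_violations(adjustments_by_factor):
    """Count adjacent factor pairs whose mean adjustments are not increasing.

    adjustments_by_factor maps a ratio production factor n to the list of
    areas adjusted for it (one standard square, one participant).
    """
    factors = sorted(adjustments_by_factor)
    means = [mean(adjustments_by_factor[f]) for f in factors]
    return sum(1 for lower, higher in zip(means, means[1:]) if lower >= higher)

# Invented data: the x 4 adjustments are smaller than the x 3 adjustments
example = {2: [10, 12], 3: [20, 22], 4: [18, 19], 5: [30, 31]}
print(count_rank_order_violations(example))  # 1
```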


After Data Selection:

33 opportunities to check for the Interval Scale Axiom

A single case (participant M.A.)

Standard square: 80 pixel side length

q × p = r: 3 × 2 = 5 (z(U) = .23, p = .82, n.s.)

q = 3, p = 2, r = 5

Analyzable quadruple: (80, 3, 2, 5)

A natural number r′ > r was fixed in such a way that the ratio production factor p′ is an integer.

[Figure: Adjustments generated by a single participant (M.A.), starting from the standard square with 80 pixel side length.] Tripling the standard and doubling the outcome is statistically indistinguishable from the × 5 adjustment. Therefore, tripling the standard and then tripling the outcome should be statistically indistinguishable from the × 7 adjustment. This is not the case. The axiom is violated.

$$p' := \frac{r'(p - 1) + r - pq}{r - q}$$

For r′ = 6:

$$p' = \frac{6(2 - 1) + 5 - 2 \cdot 3}{5 - 3} = 2.5$$

For r′ = 7:

$$p' = \frac{7(2 - 1) + 5 - 2 \cdot 3}{5 - 3} = 3$$

If adjustments are on an interval scale level, then:

If (80 × 3) × 2 = 80 × 5

then it must also hold that

(80 × 3) × 3 = 80 × 7

Formally:

$80_{3,2} = 80_5 \;\Rightarrow\; 80_{3,3} = 80_7$

The area $80_{3,3}$ is different from the area $80_7$

(z(U) = 3.41, p < .01)

The axiom is violated.

Violations of the interval scale axiom in the whole sample

33 tests: 18 violations vs. 15 non-violations

Only 3 of the (remaining) 12 participants showed no violation of the axiom.

Transform the pattern of outcomes into an overall statistical statement: binomial distribution

The probability of a single violation by chance is .05.

The probability of observing 18 or more violations in 33 tests is:

$p = 1.91 \times 10^{-15}$

If the interval scale axiom held for the whole sample, the above pattern would be highly unlikely.
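The binomial tail probability can be recomputed in a few lines. This is a minimal sketch that treats the 33 tests as independent with a chance violation probability of .05 each, as the slide does; the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, prob):
    """P(X >= k) for X ~ Binomial(n, prob)."""
    return sum(comb(n, i) * prob**i * (1 - prob) ** (n - i) for i in range(k, n + 1))

p_value = binomial_upper_tail(18, 33, 0.05)
print(f"{p_value:.2e}")  # approximately 1.91e-15
```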

Conclusion

For area production of squares: a single subject is not able to produce adjustments on an interval scale level.


Discussion

The study generalizes earlier findings by Ellermeier and Faulhammer (2000) indicating that multiplicativity fails to hold.

Loudness Production: Data are not on a Ratio Scale Level

Commutativity

Doubling a standard and tripling the outcome converges on the same final outcome as first tripling the (same) standard and then doubling the outcome:

× 2 × 3 = × 3 × 2 (?)

Ellermeier & Faulhammer (2000) and Zimmer (in press) found commutativity to hold.

32 tests (16 participants, 2 standard squares): commutativity was violated in 15 cases.

Commutativity is not a general law that holds across all sensory modalities.

Mathematical knowledge

× 4 and × 9 ... play a special role

The physical area of a square can be quadrupled by doubling the side length

Coefficient of variation, computed separately for the 11 ratio production factors

Mathematical knowledge:

$$V = \frac{s}{\bar{x}}$$

[Figure: Mean coefficients of variation, calculated separately for each ratio production factor p.] The × 4 and × 9 adjustments show less variability than their respective neighbors.
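The coefficient of variation V = s / x̄ relates the standard deviation of a set of adjustments to their mean. A minimal sketch follows; whether s was computed with n or n − 1 in the denominator is not stated in the slides, so the sample standard deviation used here is an assumption, and the numbers are invented.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """V = s / x-bar, with s the sample standard deviation (n - 1 denominator)."""
    return stdev(values) / mean(values)

# Invented adjustments (areas in pixel^2) for one ratio production factor
print(coefficient_of_variation([90, 100, 110]))
```

Because V is dimensionless, it allows variability to be compared across ratio production factors whose mean areas differ widely.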

Participants applied mathematical knowledge

Teghtsoonian (1965): "... ability to make accurate area-judgements depends on (1) his ability accurately to estimate length, and (2) his knowledge that the area of a two-dimensional figure is proportional to the square of a linear dimension"

Stevens' psychophysical functions are based on mean values over the whole sample.

The aim of the present study was to investigate whether a single subject is able to produce adjustments on an interval scale level.

Area production of squares: no (too many violations of the interval scale axiom)

Consequences: Parametric statistics have to be replaced by nonparametric methods.

Further studies:

- Different sensory modalities (loudness, brightness, ...)
- Other direct scaling methods (ratio estimation)
- Are different scale types more appropriate?


Thank you very much

References

Anderson, N. H. (1970). Functional Measurement and Psychophysical Judgment. Psychological Review, 77, 153-170.

Anderson, N. H. (1976). Integration Theory, Functional Measurement and the Psychophysical Law. In H. G. Geissler & Y. M. Zabrodin (Eds.), Advances in Psychophysics. Berlin: VEB Deutscher Verlag der Wissenschaften.

Ellermeier, W., & Faulhammer, G. (2000). Empirical Evaluation of Axioms Fundamental to Stevens's Ratio-Scaling Approach: I. Loudness Production. Perception & Psychophysics, 62, 1505-1511.

Graham, C. H. (1958). Sensation and Perception in an Objective Psychology. Psychological Review, 65, 65-76.

Krider, R. E., Raghubir, P., & Krishna, A. (2001). Pizzas: π or Square? Psychophysical Biases in Area Comparisons. Marketing Science, 20(4), 405-425.

Luce, R. D. (2002). A Psychophysical Theory of Intensity Proportions, Joint Presentations, and Matches. Psychological Review, 109, 520-532.

Luce, R. D. (2004). Symmetric and Asymmetric Matching of Joint Presentations. Psychological Review, 111, 446-454.

McKenna, F. P. (1985). Another Look at the "New Psychophysics". British Journal of Psychology, 76, 97-109.

Narens, L. (1996). A Theory of Ratio Magnitude Estimation. Journal of Mathematical Psychology, 40, 109-129.

Narens, L. (2002). The Irony of Measurement by Subjective Estimations. Journal of Mathematical Psychology, 46, 769-788.

Orth, B. (1982). Zur Bestimmung der Skalenqualität bei "direkten" Skalierungsverfahren. Zeitschrift für experimentelle und angewandte Psychologie, 29(1), 160-178.

Peißner, M. (1999). Experimente zur direkten Skalierbarkeit von gesehenen Helligkeiten [Experiments on the direct scalability of perceived brightness]. Unpublished master's thesis, Universität Regensburg.

Shepard, R. N. (1978). On the Status of "Direct" Psychological Measurement. In C. W. Savage (Ed.), Minnesota Studies in the Philosophy of Science (Vol. 9, pp. 441-490). Minneapolis: University of Minnesota Press.

Shepard, R. N. (1981). Psychological Relations and Psychophysical Scales: On the Status of "Direct" Psychological Measurement. Journal of Mathematical Psychology, 24, 21-57.

Stevens, S. S. (1936). A Scale for the Measurement of a Psychological Magnitude: Loudness. Psychological Review, 43, 405-416.

Stevens, S. S., & Guirao, M. (1963). Subjective Scaling of Length and Area and the Matching of Length to Loudness and Brightness. Journal of Experimental Psychology, 66, 177-186.

Teghtsoonian, M. (1965). The Judgment of Size. American Journal of Psychology, 78, 392-402.

Zimmer, K. (in press). Examining the Validity of Numerical Ratios in Loudness Fractionation. Perception & Psychophysics.

General methodology

Huber, O. (2000). Das psychologische Experiment: Eine Einführung. Bern: Hans Huber Verlag.

Bortz, Lienert, & Boehnke (2000). Verteilungsfreie Methoden in der Biostatistik. Berlin: Springer Verlag.

Vorberg, D., & Blankenberger, S. (1999). Die Auswahl statistischer Tests und Maße. Psychologische Rundschau, 50, 157-164.

Results table. A |z(U)| value > 1.96 means that the two areas that should be equal according to the axiom differ in size. In these cases the axiom was violated (printed in bold). The sign indicates the direction of the violation. A negative sign means that the q × p′ adjustment is smaller than the r′ adjustment.

Instruction

This study is about estimating areas.

You will see an area on the wall. Above it there is a prompt, e.g. "2". This means that your task is to produce an area that is 2 times as large.

A second area appears, whose size you can change with the "arrow up" (↑) key or the "arrow down" (↓) key. So simply produce an area that appears to you to be, according to the prompt, e.g. 2 times, 3 times, ... as large.

When you think that the area you have produced is 2 times, 3 times, ... as large, as prompted, please press the Return key.

If you think the screen is too small for your estimate, simply press the "Esc" key and the next task will appear.

There is no time pressure. Still, you should not brood over any one task for too long. Today's session consists of about 200 estimates. After 20 min there is a short break. You can also take a breather yourself at any time, close your eyes tightly, ... . Please try to concentrate as well as you can.
