CHI'07: Biases in Human Estimation of Interruptibility


Avrahami, Fogarty, Hudson

Biases in Human Estimation of Interruptibility:

Effects and Implications for Practice

Daniel Avrahami, James Fogarty & Scott Hudson


Introduction

Estimating someone else’s interruptibility is something we do every day: at home, at work, at the store

…but it’s also something we’re not always good at

This becomes much harder when done over a distance

Let’s try this together:


Estimating Interruptibility

Scale: 1 (Highly Interruptible) 2 3 4 5 (Highly Non-Interruptible)

30 seconds


Estimating Interruptibility



Prompting for Self-Report



And the answer is…





3


Goals

A better understanding of the biases in estimating others’ interruptibility can inform the design of CMC and awareness systems

Provide insight into how people are likely to use different pieces of contextual information

• For example, the state of an office door was significantly correlated with errors in estimation

A lot of previous work on CMC and awareness systems [cf. Fish’90, Mantei’91, Dourish’92, Bly’93, Adler’94, Tang’01]


Research Questions

Which contextual cues (e.g., working on the computer) affect the error in human estimation of another person’s interruptibility?

What is the source of a contextual cue’s effect on the bias in estimation? e.g., overrating the strength of a cue


Talk Outline

Study Design

Measures and Analysis

Results

Conclusions and Future work

Details (f-measures, etc.)



Study Design


Study Design

Compare Self-Reports of Interruptibility (Reported) with Estimations of Interruptibility (Estimated)

Two groups of participants:

• Reporters (in their natural work environment)

• Estimators (viewing video and audio recordings)


Reporters

Four high-level staff members (3 females, 1 male)

Audio and video recordings in their offices during a month-long period

Experience-sampling method used to collect self-reports of interruptibility at random intervals (30-minute average)

672 self-reports and over 600 hours of data

[Used in Hudson’03, Fogarty’05 for the creation of predictive models]
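Random-interval prompting of the kind described above can be sketched as follows. This assumes exponentially distributed gaps with a 30-minute mean; the slides state only the average, so the distribution, the function name, and the workday length are all assumptions:

```python
import random


def prompt_times(day_minutes=480, mean_gap=30.0, seed=None):
    """Return prompt times (minutes from start of day) with random gaps.

    Gaps are drawn from an exponential distribution, which is an assumed
    choice; only the 30-minute average comes from the study description.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(1.0 / mean_gap)
    while t < day_minutes:
        times.append(t)
        t += rng.expovariate(1.0 / mean_gap)
    return times


# Hypothetical 8-hour day; the seed only makes the sketch repeatable.
schedule = prompt_times(seed=7)
print([round(t) for t in schedule])
```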


Estimators

40 subjects (Online recruitment, majority were students)

Watched 15- or 30-second clips of Reporters

Ensured that Estimators did not know the Reporters

Provided estimates of interruptibility for 60 clips each

Allowed to watch each clip as many times as they wanted

Task lasted about 1 hour

[Used in Fogarty’05 to compare performance of estimators vs. ML]


Estimations vs. Self-Reports

Tested the relationship between Estimations and Reports:

Estimated Interruptibility was significantly correlated with Reported Interruptibility (p<.001)

(This is good: it means that Estimators examined the situation presented to them.)

Estimated Interruptibility was significantly different from Reported Interruptibility (p<.001)

Estimators, on average, estimated Reporters to be more interruptible than reported

(Two outliers’ data excluded from the remaining analyses)



Measures and Analysis


Contextual Cues

Coded by six paid coders

For each 15-second segment, coded a large set of contextual cues that could be coded reliably:

• Reporter activities

• Guest activities

• Environmental cues

Inter-coder agreement was 93.4% (evaluated on 5% of the data)
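Percent agreement of the kind reported above can be computed by comparing two coders' labels segment by segment. A minimal sketch, assuming binary cue codes per 15-second segment; the function name and the toy labels are hypothetical and are not the study's data:

```python
def percent_agreement(codes_a, codes_b):
    """Share of segments on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same segments")
    if not codes_a:
        raise ValueError("need at least one segment")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes for one cue (1 = cue present) over ten segments.
coder_1 = [1, 1, 0, 0, 1, 0, 0, 1, 1, 0]
coder_2 = [1, 1, 0, 1, 1, 0, 0, 1, 1, 0]
agreement = percent_agreement(coder_1, coder_2)
print(f"{agreement:.1%}")
```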


Contextual Cues (cont.)

Phone

Social Engagement

Computer

Desk

Papers

File Cabinet

Food

Writing

Door is Closed

Drink

Standing

Present


“Estimation Error”

Estimation Error = Reported – Estimated

[Figure: 5×5 grid of Estimation Error values (-4 to 4) for each combination of Reported (vertical axis, 1 to 5) and Estimated (horizontal axis, 1 to 5). Cells on the diagonal are 0 (correct estimates); the off-diagonal regions are labeled Under-estimation and Over-estimation.]
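The error measure can be written out directly. A minimal sketch, assuming the 1 to 5 scale shown earlier (1 = Highly Interruptible, 5 = Highly Non-Interruptible). The function names are hypothetical, and reading a positive error as over-estimation (the Estimator judged the Reporter more interruptible than reported) is an assumption about the sign convention:

```python
SCALE = range(1, 6)  # 1 = Highly Interruptible ... 5 = Highly Non-Interruptible


def estimation_error(reported, estimated):
    """Estimation Error = Reported - Estimated, as defined on the slide."""
    return reported - estimated


def error_kind(error):
    """Label an error; the sign-to-label mapping is an assumption."""
    if error > 0:
        return "over-estimation"   # estimated as more interruptible than reported
    if error < 0:
        return "under-estimation"  # estimated as less interruptible than reported
    return "correct"


# Rebuild the Reported-by-Estimated grid of possible errors.
grid = {(r, e): estimation_error(r, e) for r in SCALE for e in SCALE}

for r in SCALE:
    print(r, [grid[(r, e)] for e in SCALE])
```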


Analysis Approach

Step 1: Find which cues have an effect on Estimation Error

• Effect on Under-Estimation errors?

• Effect on Over-Estimation errors?

Step 2: Investigate the cause of a cue’s effect on Estimation Error

• Effect on Reported Interruptibility?

• Effect on Estimated Interruptibility?



For example:

Found that the Reporter using the […] had a

• Significant effect on Over-Estimation (no significant effect on Under-Estimation)

• Significant effect on Estimated Interruptibility, but…no significant effect on Reported Interruptibility

“Considering a cue that is not significant”



A couple of notes on the analysis

A self-report determines the possible range of Estimation Errors

• Reported = 1, Error can be -4 … 0

• Reported = 5, Error can be 0 … 4

=> Need to include the Reported Interruptibility as a control measure in the analysis of error

All done using Mixed Model analysis
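The range constraint follows from the definition of the error: for a fixed self-report, the error is largest when Estimated = 1 and smallest when Estimated = 5. A small sketch of that arithmetic (the function name is hypothetical):

```python
def error_range(reported, low=1, high=5):
    """Possible (min, max) of Reported - Estimated for a fixed self-report."""
    if not low <= reported <= high:
        raise ValueError("reported must lie on the scale")
    return reported - high, reported - low


for reported in range(1, 6):
    smallest, largest = error_range(reported)
    print(f"Reported = {reported}: Error can be {smallest} ... {largest}")
```

Because the attainable errors shift with the self-report (an error of 2 cannot occur when Reported = 1, for instance), comparing raw errors across clips without controlling for Reported would confound the two.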


Results


Social Engagement

Under-estimated when the reporter was socially engaged

Over-estimated when the reporter wasn’t socially engaged

“Overrating the strength of a cue”


Phone

Over-estimated the reporter’s interruptibility when the reporter wasn’t using the phone



Reporter is Standing

Greater over-estimation error when the reporter was standing

Reporter standing was significantly correlated with the situation being more interruptible (both Reported and Estimated)

Link between physical transitions and interruptions [Ho’05]



Computer

Estimators more likely to interpret a situation as more interruptible than reported when the Reporter was using the computer

Link to issues of online presence and availability



Door is Closed

Under-estimated when the door was closed

Correlation between the state of the door and Reported interruptibility was not significant



Drink

Estimators assessed Reporters as more interruptible when they were drinking

Correlation between drinking and Reported interruptibility was not significant




Conclusions and Future Work


Conclusions

Presented results from an in-depth analysis of causes for biases in human estimation of interruptibility

Compared self-reports, collected in the field, and estimations based on audio and video recordings


Conclusions (cont.)

Findings suggest that providing too much information may not only be a concern for privacy, but may also lead to errors in estimation

• Sharing certain contextual cues will likely result in misinterpretations of a person’s interruptibility

A new system, informed by our results, could:

• Avoid exposing certain cues (or specific levels of cues)

• Enhance (or moderate) others


Future Work

Examine the effect of other clip-durations

Examine the effect of degree of familiarity between reporter and estimator on estimation errors

Observe reporters in other settings and jobs

Investigate the use of these findings for effective creation of computer-supported communication and awareness systems


Acknowledgements

Yaakov Kareev

Darren Gergle

Laura Dabbish

Joonhwan Lee



This work was funded in part by NSF Grants IIS-0121560, IIS-0325351, and by DARPA Contract No. NBCHD030010

Thank you

for more info visit: www.cs.cmu.edu/~nx6

or email: [email protected]


FAQ



Why separate over and under?

Shouldn’t just use absolute or squared error, because opposite shifts in over- and under-estimation can make it seem like there is no effect

Shouldn’t just pool all signed errors together, because over- and under-estimation errors will cancel each other out
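A toy illustration of both points, using made-up errors rather than study data. In this hypothetical example a cue moves errors from one direction to the other; treating positive errors as over-estimation is an assumption about the sign convention:

```python
from statistics import mean

# Made-up Estimation Errors (Reported - Estimated), for illustration only.
errors_without_cue = [-2, -2, 0, 0]
errors_with_cue = [0, 0, 2, 2]


def mean_abs(errors):
    """Average absolute error, ignoring direction."""
    return mean(abs(e) for e in errors)


def mean_over(errors):
    """Average size of the positive part of each error."""
    return mean(max(e, 0) for e in errors)


def mean_under(errors):
    """Average size of the negative part of each error."""
    return mean(max(-e, 0) for e in errors)


# Pooled signed errors cancel each other out.
pooled_signed = mean(errors_without_cue + errors_with_cue)

# Absolute error is identical with and without the cue...
abs_gap = mean_abs(errors_with_cue) - mean_abs(errors_without_cue)

# ...yet the cue shifts errors from one direction to the other.
over_gap = mean_over(errors_with_cue) - mean_over(errors_without_cue)
under_gap = mean_under(errors_with_cue) - mean_under(errors_without_cue)

print(pooled_signed, abs_gap, over_gap, under_gap)
```

Only the two direction-specific measures reveal that the cue matters in this example, which is the motivation for analyzing over- and under-estimation separately.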



Is the length of the clips reasonable?

The information available to Estimators in this study (15/30 second video+audio clips) was similar to information available to users of media space systems, and far richer than information available in most awareness systems



Why not use a 2-point scale?

With 2 levels, an estimation is either 100% correct or 100% incorrect

• With 5 levels, we get degrees of error

Doesn’t make sense to ask Reporters and Estimators to make a binary decision when they:

• Don’t know what the interruption is about

• Don’t know whom the interruption is from

Couldn’t analyze 5-point data as 2-point

• If Reported=1 and Estimated=4, should we count it as correct?
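The last point can be made concrete. A minimal sketch, assuming a hypothetical cutoff that treats only a rating of 5 as non-interruptible; the cutoff and the function names are illustrative and not part of the study's analysis:

```python
def to_two_point(rating, cutoff=5):
    """Collapse a 1-5 rating to binary; the cutoff is a hypothetical choice."""
    return "non-interruptible" if rating >= cutoff else "interruptible"


def binary_correct(reported, estimated, cutoff=5):
    """True when both ratings fall on the same side of the cutoff."""
    return to_two_point(reported, cutoff) == to_two_point(estimated, cutoff)


reported, estimated = 1, 4
print(binary_correct(reported, estimated))  # binary view of the pair
print(reported - estimated)                 # five-point Estimation Error
```

With this cutoff the pair counts as correct even though the five-point error is -3, and a different cutoff (for example 3) would flip the verdict.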
