Download - Making sense out of recorded user-system interaction Dr Willem-Paul Brinkman Lecturer Department of Information Systems and Computing Brunel University.

Making sense out of recorded user-system interaction

Dr Willem-Paul Brinkman
Lecturer

Department of Information Systems and Computing

Brunel University ([email protected])

Topics

- VIVID Research Centre

- Motivation: Component-Based Software Engineering

- Experiment 1: Searching for a component-specific measure

- Experiment 2: Validating a component-specific measure

- New and future research

VIVID Research Centre

Based in the Department of Information Systems and Computing, Brunel University (London)

Original focus on visualisation, but now also includes:

- Mobile technology

- Design for diverse user groups

- Novel input/output devices

11 academic members of staff, 13 PhD students

disc.brunel.ac.uk/research/vivid/index.htm

Motivation

Studying the usability of a system

Work conducted together with Reinder Haakma (Philips), Don Bouwhuis (Eindhoven University of Technology)

Motivation

External Comparison: relating differences in usability to differences in the systems

Internal Comparison: trying to link usability problems with parts of the systems

Component-Based Software Engineering

[Diagram: component-based development cycle with the activities Create, Support, Re-use and Manage. Product requirements and existing software go in; new components and components from the repository lead to products; feedback flows back. Create and Re-use are annotated with the multiple versions testing paradigm and the single version testing paradigm.]

Motivation

PROBLEM

1. Empirical analysis only of the overall system (task time, keystrokes, questionnaires, etc.): not powerful

2. Usability tests, heuristic evaluations and cognitive walkthroughs, where experts identify problems: unreliable

SOLUTION

• Component-specific usability measures: more powerful and reliable

Searching for a component-specific measure

Questions

1. What is a component?

2. What interaction data should be recorded?

3. How do we link interaction data with the usability of a component?

Layered Protocol Theory (Taylor, 1988)

Interaction layers

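[Diagram: interaction layers between User and Calculator for the input "15 + 23 =". The Editor layer (control equation) builds up the display 15, 15 +, 15 + 23. The Processor layer (control results) receives the operands 01111 and 10111 with the instruction Add and returns 100110, which is displayed as 38.]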

Experiment 1 – Fictitious Interface

User Task: Rotate the Trumpet

Experiment 1 - Architecture

[Diagram: layered architecture with the components Rotator, Map, Selector and Buttons. Messages shown: Bike, Aeroplane, Other symbols, Rotate, Change, X, Rotate(x).]

Measures, from high level to low level:

- Task time

- #Rotate(T0), #Rotate(T-1), #Rotate(T-2)

- #change, #rotate

- #bike, #aeroplane, #other

- #clicks

Experiment 1 - Training

Training groups   I    II   III  IV   V    VI   VII  VIII
Rotator           0    0    0    0    +    +    +    +
Map               0    0    +    +    0    0    +    +
Selector          0    +    0    +    0    +    0    +

+ useful training, 0 dummy training. Each group consists of ten subjects.
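The training table is a full 2 x 2 x 2 factorial design: every combination of useful (+) and dummy (0) training for the three components appears exactly once. As a minimal sketch, the eight groups could be enumerated as follows; the function and variable names are illustrative assumptions and do not come from the original study materials.

```python
from itertools import product

ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]
COMPONENTS = ("Rotator", "Map", "Selector")


def training_groups():
    """Return {group label: {component: '0' or '+'}} for the 2x2x2 design."""
    levels = ("0", "+")  # 0 = dummy training, + = useful training
    # product() varies the last component (Selector) fastest,
    # which reproduces the column order of the table.
    combos = product(levels, repeat=len(COMPONENTS))
    return {label: dict(zip(COMPONENTS, combo)) for label, combo in zip(ROMAN, combos)}


groups = training_groups()
for label, training in groups.items():
    print(label, training)
```

With ten subjects per group, this gives the 80 participants of Experiment 1.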

Experiment 1: Test Procedure

80 participants, all students of Eindhoven University of Technology

8 different trainings

After training, participants were asked to rotate a specific musical instrument as fast as possible

User interaction with the system was recorded in a log file

Once the task was completed, the recording stopped

Experiment 1 - Low-level: Effect of Selector training

[Bar chart: number of messages (0 to 20) for clicks on bike, aeroplane and others, with Selector training withheld versus provided.]

Experiment 1 - High-level: Effect of Rotator training

[Bar chart: number of #Rotate(X) messages (0 to 5) for target, target-1 and target-2, with Rotator training withheld versus provided.]

Experiment 1 – Control Loop

Reliability: how do we link interaction data with the usability of a component?

[Diagram: control loop between User and System with the elements Reference value, Evaluation, User message, Component and Feedback.]

Each message is a cycle of the control loop. The number of messages represents the user's effort to control the component.
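Counting the messages each component receives turns a log file into a per-component effort measure. The sketch below assumes a hypothetical log format of (receiving component, message) pairs; the entries are invented for illustration and are not data from the experiment.

```python
from collections import Counter

# Hypothetical log: one (receiving component, message) pair per recorded event.
log = [
    ("Buttons", "click"),
    ("Selector", "bike"),
    ("Buttons", "click"),
    ("Selector", "aeroplane"),
    ("Buttons", "click"),
    ("Map", "rotate"),
    ("Rotator", "Rotate(x)"),
]


def messages_per_component(entries):
    """Count how many messages each component received."""
    return Counter(component for component, _message in entries)


counts = messages_per_component(log)
for component, n in counts.most_common():
    print(component, n)
```

Each count corresponds to the number of control-loop cycles the user spent on that component.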

Experiment 1 - Conclusion

1. What is a component?

An interaction component is a unit within a device that directly or indirectly receives signals from the user. These signals enable the user to change the physical state of the interaction component

2. What interaction data should be recorded?

Message exchange between the interaction components

Experiment 2: Validation

- 80 users

- 8 mobile telephones

- 3 components were manipulated according to Cognitive Complexity Theory (Kieras & Polson, 1985)

1. Function Selector

2. Keypad

3. Short Text Messages

Architecture Mobile telephone

[Diagram: architecture with the components Send Text Message, Function Selector and Keypad.]

Experiment 2 – Function Selector

Versions:

Broad/shallow

Narrow/deep

Experiment 2 – Keypad

Versions

Repeated-Key Method

“L”

Modified-Modal-Position method

“J”

Experiment 2 – Send Text Message

Versions

Simple

Complex

Statistical Tests

p-value: probability of making a type I, or α, error, that is, wrongly rejecting the hypothesis that the underlying distributions are the same.

Results – Function Selector

Measure                        Mean Broad  Mean Deep  df Hyp.  df Er.  F       p       η2
Normal
Joint measure                  —           —          7        66      34.47   <0.001  0.80
Time in seconds                947         1394       1        72      29.56   <0.001  0.29
Number of keystrokes           461         686        1        72      37.72   <0.001  0.34
Number of messages received    67          265        1        72      155.34  <0.001  0.68
Ease of use mobile phone       5.5         4.8        1        72      11.86   0.001   0.14
Ease of use menu               5.6         4.5        1        72      22.33   <0.001  0.24
Satisfaction of mobile phone   4.4         3.8        1        72      4.25    0.043   0.06
Satisfaction of menu           4.6         3.5        1        72      15.96   <0.001  0.18
Corrected (a)
Joint measure                  —           —          2        71      60.96   <0.001  0.63
Number of keystrokes           437         602        1        72      20.27   <0.001  0.22
Number of messages received    52          190        1        72      75.36   <0.001  0.51

(a) Corrected for all a-priori differences between versions of the components.

Results of two multivariate analyses and related univariate analyses of variance with the version of the Function Selector as independent between-subjects variable.
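The η2 column can be related to the F and df columns. The sketch below assumes the reported effect size is partial eta squared, computed as F·df_hyp / (F·df_hyp + df_err); the slides do not state this, but the formula reproduces the univariate rows of the table. The function name is an illustrative assumption.

```python
def partial_eta_squared(f_value, df_hypothesis, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_hypothesis) / (f_value * df_hypothesis + df_error)


# Univariate rows from the Function Selector table (df = 1, 72).
time_eta = partial_eta_squared(29.56, 1, 72)
messages_eta = partial_eta_squared(155.34, 1, 72)
print(round(time_eta, 2), round(messages_eta, 2))  # prints: 0.29 0.68
```

Read this way, the number of messages received explains a larger share of the variance between the broad and deep versions than task time does.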

Results – Keypad

Results of multivariate and related univariate analyses of variance with the version of the Keypad as independent between-subjects variable.

Measure                        Mean RK  Mean MMP  df Hyp.  df Er.  F      p       η2
Normal
Joint measure                  —        —         7        66      4.05   0.001   0.30
Time in seconds                872      1083      1        72      9.44   0.003   0.12
Number of keystrokes           438      537       1        72      10.34  0.002   0.13
Number of messages received    233      271       1        72      13.92  <0.001  0.16
Ease of use mobile phone       5.3      5.0       1        72      1.07   0.305   0.02
Ease of use keyboard           5.6      4.9       1        72      11.13  0.001   0.13
Satisfaction of mobile phone   4.3      3.9       1        72      1.76   0.188   0.02
Satisfaction of keyboard       4.6      3.8       1        72      8.97   0.004   0.11

Results – Send Text Message

Results of two multivariate analyses and related univariate analyses of variance with the version of the STM component as independent between-subjects variable.

Measure                        Mean Simple  Mean Complex  df Hyp.  df Er.  F      p       η2
Normal
Joint measure                  —            —             7        66      18.16  <0.001  0.66
Time in seconds                523          672           1        72      8.15   0.006   0.10
Number of keystrokes           269          320           1        72      4.56   0.036   0.06
Number of messages received    12           49            1        72      74.18  <0.001  0.51
Ease of use mobile phone       5.0          5.3           1        72      1.15   0.288   0.02
Ease of use STM function       5.1          4.9           1        72      0.35   0.555   0.01
Satisfaction of mobile phone   3.9          4.2           1        72      0.93   0.339   0.01
Satisfaction of STM function   3.9          3.8           1        72      0.26   0.614   0.01
Corrected (a)
Joint measure                  —            —             2        71      20.85  <0.001  0.37
Number of keystrokes           249          289           1        72      2.30   0.134   0.03
Number of messages received    12           34            1        72      26.23  <0.001  0.27

(a) Corrected for all a-priori differences between versions of the components.

Power of number of messages as a usability measure

Statistical Power: 1 - β
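To make 1 - β concrete, the sketch below approximates the power of a two-sided, two-group comparison with a normal approximation. This is a textbook approximation chosen for illustration, not the procedure used in the study; the function name and the standardized effect size parameter are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference between the two groups.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # beta = probability that the test statistic stays inside the acceptance region
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta


for n in (10, 20, 40):
    print(n, round(two_sample_power(0.8, n), 2))
```

For a fixed effect size, power grows with the number of subjects, so a more sensitive measure reaches a given power level with fewer participants.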

Results

Average probability that a measure finds a significant (α = 0.05) effect for the usability difference between the two versions of FS, STM, or the Keypad components

[Line chart: power (0 to 1) against number of subjects (0 to 80) for nine measures:]

1. messages received

2. corrected messages received

3. task duration

4. keystrokes

5. corrected keystrokes

6. comp.-spec. ease-of-use

7. comp.-spec. satisfaction

8. overall ease-of-use

9. overall satisfaction

Component-Based Software Engineering

[Diagram, shown again: component-based development cycle with the activities Create, Support, Re-use and Manage, annotated with the multiple versions testing paradigm and the single version testing paradigm.]

Testing Different Components

Component-specific objective performance measure:

1. Messages received + weight factor: a common currency

2. Compare with ideal user: a common point of reference

Usability of individual components in a single device can be compared with each other and prioritized on potential improvements

Assigning weight factors to represent the user's effort in the case of ideal user

[Diagram: messages with their weight factors in braces.]

Right Mouse Button Menu receives:

- Click <right> {1}

- Click <left on Properties option> {1}

and sends Call <> {2} to Properties.

Properties receives:

- Click <left on Fill tab> {1}

- Click <left on colour red> {1}

- Click <left on Outline tab> {1}

- Click <left on No Line button> {1}

- Click <left on Ok button> {1}

- Call <> {2}

and sends Set <Fill colour red, no border> {7} upward.

Total effort value

$\text{Total effort} = \sum_i MR_i.W$

• $MR_i.W$: weight factor (W) of the i-th message received (MR)

[Diagram: Properties receives five clicks of weight {1} each (Fill tab, colour red, Outline tab, No Line button, Ok button) and one Call <> of weight {2}, which stands for the two clicks, {1} each, received by the Right Mouse Button Menu.]

Properties: 5 + 2 = 7
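The ideal-user example can be worked through in a few lines. The sketch assumes, as the diagram suggests, that each click has weight 1 and that a message sent upward carries the summed weight of the messages its component received; the variable and function names are illustrative.

```python
CLICK_WEIGHT = 1  # weight factor of a single click, as in the diagram


def total_effort(received_weights):
    """Total effort = sum of the weight factors of all messages received."""
    return sum(received_weights)


# Right Mouse Button Menu: Click <right>, Click <left on Properties option>
menu_received = [CLICK_WEIGHT, CLICK_WEIGHT]
call_weight = total_effort(menu_received)  # weight of Call <>

# Properties: five clicks plus the Call <> message from the menu
properties_received = [CLICK_WEIGHT] * 5 + [call_weight]
set_weight = total_effort(properties_received)  # weight of Set <...>

print(call_weight, set_weight)  # prints: 2 7
```

The weights therefore express every message in the same unit, the number of elementary user actions an ideal user needs.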

Assigning weight factors in case of real user

Correction for inefficiency of higher and lower components

[Diagram: the Visual Drawing Objects component above the Properties component.]


Assigning weight factors in case of real user

Assign weight factors as if lower components operate optimally

[Diagram: Properties component.]

Inefficiency of lower-level components: more messages are needed to pass a message upwards than ideally required

Assigning weight factors in case of real user

[Diagram: Properties component.]

Inefficiency of higher-level components: more messages are requested than ideally required

$UE = \frac{\#MSU_{ideal}}{\#MSU_{real}} \sum_i MR_i.W$

UE: user effort

$MR_i.W$: weight factor (W) of the i-th message received (MR)

$\#MSU_{real}$: number of messages sent upward by the real user

$\#MSU_{ideal}$: number of messages sent upward by the ideal user

Ideal User versus Real User

Extra User Effort = User Effort - Total effort

- Total effort: the total effort an ideal user would make

- User Effort: the total effort a real user made

- Extra User Effort: the extra effort a real user made

Calculate for each component, then prioritize
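A minimal sketch of this calculation follows. It assumes the user effort is the summed weight of the messages received, scaled by #MSUideal / #MSUreal to correct for higher-level inefficiency; that reading of the formula is an assumption. The two observations are invented for illustration and are not data from the experiments.

```python
def user_effort(received_weights, msu_real, msu_ideal):
    """Effort of a real user, corrected for higher-level inefficiency.

    Assumption: the correction factor is msu_ideal / msu_real.
    """
    if msu_real <= 0:
        raise ValueError("msu_real must be positive")
    return sum(received_weights) * msu_ideal / msu_real


def extra_user_effort(received_weights, msu_real, msu_ideal, total_effort):
    """Extra User Effort = User Effort - Total effort (ideal user)."""
    return user_effort(received_weights, msu_real, msu_ideal) - total_effort


# Hypothetical observations for two components.
observations = {
    "Properties": dict(received_weights=[1] * 9 + [2],
                       msu_real=1, msu_ideal=1, total_effort=7),
    "Right Mouse Button Menu": dict(received_weights=[1] * 6,
                                    msu_real=2, msu_ideal=1, total_effort=2),
}

extra = {name: extra_user_effort(**obs) for name, obs in observations.items()}
# Prioritize: components with the most extra effort come first.
ranking = sorted(extra, key=extra.get, reverse=True)
print(extra, ranking)
```

Because both values are expressed in the same weighted unit, the extra effort of different components in one device can be ranked directly.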

Experiment 2 - Single version

- 40 users

- 4 mobile telephones

- 2 components were manipulated (Keypad only Repeated-Key Method)

1. Function Selector

2. Short Text Messages

Results

[Bar chart: extra user effort (0 to 70) for the Function Selector and Send Text Message components on four mobile phones: 1 Broad & Simple, 2 Narrow & Simple, 3 Broad & Complex, 4 Narrow & Complex.]

Results

Measure                            Function Selector  Send Text Message
Objective
Extra keystrokes                   0.64**             0.44**
Task duration                      0.63**             0.39**
Perceived
Overall ease-of-use                -0.43**            -0.26*
Overall satisfaction               -0.25*             -0.22
Component-specific ease-of-use     -0.55**            -0.34**
Component-specific satisfaction    -0.41**            -0.37**

Partial correlation between extra user effort regarding the two components and other usability measures

*p < .05. **p < .01.

Topics

- VIVID Research Centre

- Motivation: Component-Based Software Engineering

- Experiment 1: Searching for a component-specific measure

- Experiment 2: Validating a component-specific measure

- New and future research: extending the analysis outside the lab, and extending the analysis beyond only usability issues

New Projects - Field usability

• CD player, which 10 users will use at home

• Record interaction: online assignment of weight factors, both optimal and real user, to messages

• Correlate interaction data with other data (questionnaire, diary, interview)

(Pui-Fong Man)

New Projects - PROSKIN

• Existing interfaces are designed for the average user. However, the average user does not exist.

• Developing skins for specific user groups could be a way forward

• Questions:

• How to identify user groups?

• What do user groups want?

Work conducted together with Nick Fine

User profiling for skinnable domestic technology

New Projects - PROSKIN

Possible solution

• Recording online interaction, identifying user groups, developing skins for these user groups

Questions

• How to establish user groups that are relevant for designers?

• This time, how to make sense of the interaction data beyond usability?

Work conducted together with Nick Fine

User profiling for skinnable domestic technology

New Projects - PROSKIN: Approach

[Diagram with the steps: Interaction data, User metrics, User groups based on interaction data, Design of Skins, Online Validation.]

Conclusions and Final Remarks

Interaction data can be used to study the usability of interaction components

- External Comparison between different versions: More Powerful

- Internal Comparison: prioritized on potential improvements

Future questions

- Usability analysis of everyday life interaction

- Establishing new paradigms to understand interaction data beyond usability issues

Questions

Thank you for your attention