Component-specific usability testing
Dr Willem-Paul Brinkman, Lecturer
Department of Information Systems and Computing
Brunel University ([email protected])
Topics
Introduction
Whether and how the usability of components can be tested empirically.
- Testing different versions of a component
- Testing different components
Whether and how the usability of components can be affected by other components.
- Consistency
- Memory load
Layered Protocol Theory
(Taylor, 1988)
(Figure: layered interaction between a user and a calculator for the task "15 + 23 =". At the higher layers the user controls the equation and the results; below these sit an Editor component and a Processor component. The message exchange builds the equation up step by step: 15, 15 +, 15 + 23, and finally the result 38, while the Processor performs the binary addition 01111 + 10111 = 100110, i.e. 15 + 23 = 38.)
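The layering can be made concrete in code. Below is a minimal sketch in Python (class names are hypothetical, not from the original slides) of the calculator example: an Editor layer assembles keystrokes into an equation and only passes a complete equation upward to a Processor layer.

class Processor:
    """Top layer: receives a complete equation and returns the result."""
    def receive(self, equation: str) -> int:
        left, right = equation.split("+")
        return int(left) + int(right)  # 15 + 23 -> 38

class Editor:
    """Lower layer: receives keystrokes and assembles the equation."""
    def __init__(self, upper: Processor):
        self.upper = upper
        self.buffer = ""
    def receive(self, key: str):
        if key == "=":                 # equation complete: pass it upward
            return self.upper.receive(self.buffer)
        self.buffer += key             # still editing: handle on this layer
        return None

editor = Editor(Processor())
for key in ["1", "5", "+", "2", "3"]:
    editor.receive(key)
print(editor.receive("="))             # -> 38

Each layer only understands its own message vocabulary, which is what makes it possible to count and evaluate messages per component.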
Usability Testing
Aim: to evaluate the usability of a component based on the message exchange between a user and that specific component
Two paradigms
(Figure: component-based development cycle with feedback loops: Create new components; Re-use components from the repository; Support products; Manage product requirements and existing software. The multiple versions testing paradigm and the single version testing paradigm are mapped onto these stages: Create, Re-use, Support, Manage.)
Test Procedure
- Normal procedures of a usability test
- User task which requires interaction with the components under investigation
- Users must complete the task successfully
Component-specific measures
- Objective performance: number of messages received, i.e. the effort users put into the interaction
- Perceived ease-of-use
- Perceived satisfaction
(Figure: a component as a control process with a control loop.)
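As an illustration of the "number of messages received" measure, here is a small sketch (component and message names are made up) that counts, per component, the messages it receives during a test session:

from collections import Counter

class MessageLog:
    """Counts the messages each interaction component receives."""
    def __init__(self):
        self.received = Counter()
    def record(self, component: str, message: str):
        self.received[component] += 1

log = MessageLog()
session = [("Keypad", "Click <5>"),
           ("Keypad", "Click <OK>"),
           ("Function Selector", "Select <Messages>"),
           ("Keypad", "Click <2>")]
for component, message in session:
    log.record(component, message)
print(log.received)  # Counter({'Keypad': 3, 'Function Selector': 1})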
Component-specific component measures
Increasing the statistical power
Objective performance
Perceived ease-of-use
Perceived satisfaction
y1 = xk + εk
y2 = xm + εm
εk = εk,component + εk,rest
εm = εm,component + εm,rest
Assumption: εk,rest ≈ εm,rest
(Figure: messages counted at the component versus keys counted overall.)
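A simulation sketch of this argument (illustrative effect and noise sizes, assuming scipy is available): the overall measure carries noise from the rest of the system, the component-specific measure does not, so the latter detects the same true difference more often.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, effect, runs = 20, 0.5, 2000
hits_overall = hits_component = 0
for _ in range(runs):
    comp_k = rng.normal(0.0, 1.0, n)     # component effort, version k
    comp_m = rng.normal(effect, 1.0, n)  # version m is truly worse
    rest_k = rng.normal(0.0, 2.0, n)     # effort on the rest of the system
    rest_m = rng.normal(0.0, 2.0, n)     # same distribution: eps_rest_k ~ eps_rest_m
    hits_overall += stats.ttest_rel(comp_k + rest_k, comp_m + rest_m).pvalue < 0.05
    hits_component += stats.ttest_rel(comp_k, comp_m).pvalue < 0.05
print(f"power, overall measure:            {hits_overall / runs:.2f}")
print(f"power, component-specific measure: {hits_component / runs:.2f}")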
Component-specific measures – Perceived ease-of-use and satisfaction
Component-specific questionnaires increase the statistical power because they help users to remember their control experience with a particular interaction component.
Component-specific measures – Perceived ease-of-use
Perceived Usefulness and Ease-of-use questionnaire (Davis, 1989), 6 questions, e.g.
- Learning to operate [name] would be easy for me.
- I would find it easy to get [name] to do what I want it to do.
(Scale anchors: Unlikely – Likely)
Component-specific measures – Perceived satisfaction
Post-Study System Usability Questionnaire (Lewis, 1995), e.g.
- The interface of [name] was pleasant.
- I like using the interface of [name].
(Scale anchors: Strongly disagree – Strongly agree)
Experimental validation
80 users, 8 mobile telephones, 3 components manipulated according to Cognitive Complexity Theory (Kieras & Polson, 1985):
1. Function Selector
2. Keypad
3. Short Text Messages
Architecture – Mobile telephone
(Figure: two versions of the layered architecture, each with a Send Text Message component on top of a Function Selector component on top of a Keypad component.)
Results
(Figure: power curves plotting power (0 to 1) against number of subjects (0 to 80). Shown is the average probability that a measure finds a significant (α = 0.05) effect for the usability difference between the two versions of the FS, STM, or Keypad components, for nine measures:
1. messages received
2. corrected messages received
3. task duration
4. keystrokes
5. corrected keystrokes
6. component-specific ease-of-use
7. component-specific satisfaction
8. overall ease-of-use
9. overall satisfaction)
Results
Wilcoxon matched-pairs signed-ranks tests between the numbers of correct classifications made by discriminant analyses on overall and component-specific measures:

                          Correctly classified
Dimension                 Overall   Component-specific   N    T    p
Observed performance      62%       78%                  37   3    <0.001
Perceived ease-of-use     61%       60%                  62   30   0.907
Perceived satisfaction    58%       61%                  61   27   0.308
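The comparison in this table can be reproduced in outline with scipy's Wilcoxon test; the data below are made up, only the procedure is illustrated: pair, per case, the classification accuracy obtained from overall measures with that from component-specific measures.

import numpy as np
from scipy.stats import wilcoxon

# made-up paired accuracies (one pair per case in the discriminant analysis)
overall = np.array([0.55, 0.60, 0.58, 0.70, 0.62, 0.59, 0.65, 0.61])
component_specific = np.array([0.75, 0.80, 0.72, 0.79, 0.81, 0.74, 0.77, 0.78])

T, p = wilcoxon(overall, component_specific)
print(f"T = {T}, p = {p:.4f}")  # small p: component-specific classifies better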
Topics
Introduction
Whether and how the usability of components can be tested empirically.
- Testing different versions of a component
- Testing different components
Whether and how the usability of components can be affected by other components.
- Consistency
- Memory load
Two paradigms
(Figure, repeated: the component-based development cycle, with the multiple versions testing paradigm and the single version testing paradigm mapped onto the Create, Re-use, Support, and Manage stages.)
Testing Different Components
Component-specific objective performance measure:
1. Messages received + weight factor: a common currency
2. Compare with an ideal user: a common point of reference
The usability of individual components in a single device can be compared with each other and prioritized for potential improvements.
(Figure: messages and weight factors in a drawing task. The Right MouseButton Menu component receives Click <right> {1} and Click <left on Properties option> {1} and sends Call <> {2} upward; the Properties component receives the Call <> {2} plus Click <left on Fill tab>, Click <left on colour red>, Click <left on Outline tab>, Click <left on No Line button>, and Click <left on Ok button>, each {1}, and sends Set <Fill colour red, no border> {7} upward.)
Assigning weight factors to represent the user's effort in the case of an ideal user
Total effort value:
Total effort = Σi (MRi · Wi)
• MRi · Wi : message received times its weight factor
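A worked sketch of the weight assignment from the drawing example (message names are taken from the figure; the code structure itself is an assumption): elementary clicks carry weight 1, and a higher-level message's weight is the summed weight an ideal user needs to produce it.

# Elementary messages carry weight 1.
menu_clicks = ["Click <right>", "Click <left on Properties option>"]
dialog_clicks = ["Click <left on Fill tab>", "Click <left on colour red>",
                 "Click <left on Outline tab>", "Click <left on No Line button>",
                 "Click <left on Ok button>"]

w_call = len(menu_clicks)             # Call <> carries weight 2
w_set = len(dialog_clicks) + w_call   # Set <...> carries weight 5 + 2 = 7

def total_effort(weighted_messages_received):
    """Total effort = sum over received messages of message x weight."""
    return sum(weight for _, weight in weighted_messages_received)

ideal = [("Set <Fill colour red, no border>", w_set)]
print(total_effort(ideal))            # -> 7 elementary user actions

Expressing every component's effort in the same currency of elementary actions is what makes components comparable.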
(Figure, repeated: the same drawing task, showing that the weight of a compound message is the sum of the weights of the lower-level messages needed to produce it: five dialog clicks {1} plus Call <> {2}, i.e. 5 + 2 = 7 for Set <Fill colour red, no border>.)
Assigning weight factors in the case of a real user
Correction for inefficiency of higher and lower components
(Figure: component stack of the drawing task: Visual Drawing Objects, Properties, Right MouseButton Menu.)
Assigning weight factors in the case of a real user
Assign weight factors as if lower components operate optimally.
(Figure: component stack: Visual Drawing Objects, Properties, Right MouseButton Menu.)
Inefficiency of lower-level components: more messages are needed to pass a message upwards than ideally required.
Assigning weight factors in the case of a real user
(Figure: component stack: Visual Drawing Objects, Properties, Right MouseButton Menu.)
Inefficiency of higher-level components: more messages are requested than ideally required.
UE = (Σi (MRi · Wi) / #MSUreal) · #MSUideal
• UE : user effort
• MRi · Wi : message received times its weight factor
• #MSUreal : number of messages sent upward by the real user
• #MSUideal : number of messages sent upward by the ideal user
Ideal User versus Real User
Extra user effort = user effort - total effort
• Total effort: the total effort an ideal user would make
• User effort: the total effort a real user made
• Extra user effort: the extra effort a real user made
Calculate for each component, then prioritize.
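Putting the pieces together, a sketch with illustrative numbers of how extra user effort per component could be computed and used to prioritize. The correction divides the weighted messages a component received by the number of messages it actually sent upward and multiplies by the ideal number, so inefficiency elsewhere is not charged to this component.

def user_effort(received_weights, msu_real, msu_ideal):
    """UE = (sum of weighted messages received / #MSU_real) * #MSU_ideal."""
    return sum(received_weights) / msu_real * msu_ideal

# name: (weights of messages received, #MSU_real, #MSU_ideal, ideal total effort)
components = {
    # dialog opened twice (higher-level inefficiency) and 3 extra clicks inside it:
    "Properties":             ([1] * 8 + [2] + [1] * 5 + [2], 2, 1, 7),
    "Right MouseButton Menu": ([1, 1], 1, 1, 2),
}

extra = {name: user_effort(w, real, ideal) - ideal_effort
         for name, (w, real, ideal, ideal_effort) in components.items()}

# components wasting the most user effort are the improvement candidates
for name, e in sorted(extra.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: extra user effort = {e}")

Here the Properties component is only charged for its own extra clicks (1.5 units), not for having been opened twice, since that inefficiency originates at the higher layer.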
Experimental validation
40 users, 40 mobile telephones, 2 components manipulated (the Keypad offered only the repeated-key method):
1. Function Selector
2. Short Text Messages
Results
(Figure: bar chart of extra user effort (0 to 70) for the Function Selector and Send Text Message components on four mobile phones: 1 Broad & Simple, 2 Narrow & Simple, 3 Broad & Complex, 4 Narrow & Complex.)
Results
Partial correlations between extra user effort regarding the two components and other usability measures:

Measure                            Function Selector   Send Text Message
Objective
  Extra keystrokes                 0.64**              0.44**
  Task duration                    0.63**              0.39**
Perceived
  Overall ease-of-use              -0.43**             -0.26*
  Overall satisfaction             -0.25*              -0.22
  Component-specific ease-of-use   -0.55**             -0.34**
  Component-specific satisfaction  -0.41**             -0.37**

*p < .05. **p < .01.
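A partial correlation of this kind can be computed by correlating regression residuals; the sketch below (made-up data, numpy only) partials the second component's extra effort out of both variables.

import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y with z regressed out of both."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(1)
effort_fs = rng.normal(size=40)    # extra user effort, Function Selector
effort_stm = rng.normal(size=40)   # extra user effort, Send Text Message
duration = 0.6 * effort_fs + 0.4 * effort_stm + rng.normal(size=40)

print(partial_corr(effort_fs, duration, effort_stm))  # positive, as in the table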
Comparison with other evaluation methods
Methods compared: overall measures; sequential data analysis; GOMS; thinking-aloud, cognitive walkthrough, and heuristic evaluation.
Overall measures
- Example: keystrokes, task duration, overall perceived usability
- Relatively easy to obtain
- Unsuitable for evaluating components
Comparison with other evaluation methods
Sequential data analysis
- Based only on lower-level events
- Pre-processing: selection, abstraction, and re-coding
- Relation between a higher-level component and a compound message is less direct
- Components' status is not recorded
Comparison with other evaluation methods
GOMS
- Helps to understand the problem
- Looks only at error-free task execution
- Considers the system only at the lowest-level layer
Comparison with other evaluation methods
Thinking-aloud, cognitive walkthrough, and heuristic evaluation
- Quicker
- Evaluator effect (reliability)
Topics
Introduction
Whether and how the usability of components can be tested empirically.
- Testing different versions of a component
- Testing different components
Whether and how the usability of components can be affected by other components.
- Consistency
- Memory load
Consistency experiments
48 users used 3 sets of applications:
1. 4 room thermostats
2. 4 web-enabled TV sets (2 TV sets × 2 web page layouts)
3. 4 applications (2 timers × 2 application domains)
Within one layer – Experimental Design
(Figure: 2 × 2 design crossing the daytime temperature component version (moving pointer vs moving scale) with the night-time temperature component version (moving pointer vs moving scale).)
Within one layer – Results
(Figure: interaction plot of the number of messages received by the night-time temperature component (12 to 24) against the daytime version (moving scale vs moving pointer), with separate lines for the night-time versions: moving scale and moving pointer.)
Between layers – Results
(Figure: interaction plot of the number of messages received by the browser (0 to 40) against the web pages version (matrix vs list), with separate lines for the TV set versions: linear-oriented and plane-oriented.)
Between Application domain – Experimental Design
ApplicationTim
er
Alarm radio MicrowaveM
ech
an
ical
ala
rmH
ot
dis
h
Application domain – Results
(Figure: interaction plot of the number of Mode messages received (9 to 27) against the application domain (microwave vs alarm radio), with separate lines for the timer versions: hot dish and mechanical alarm.)
Topics
Introduction
Whether and how the usability of components can be tested empirically.
- Testing different versions of a component
- Testing different components
Whether and how the usability of components can be affected by other components.
- Consistency
- Memory load
Mental Effort – Heart-rate variability
(Figure: log-transformed HRV in the 0.1 Hz band (-3.1 to -2.5) against equation difficulty (easy vs difficult), with separate lines for the large and small display.)
Mental Effort – Control of higher-level layer
(Figure: transformed number of store requests + 1 (0.2 to 1.2) against equation difficulty (easy vs difficult), with separate lines for the small and large display.)
Conclusions
Whether and how the usability of components can be tested empirically.
- Testing different versions of a component: more powerful
- Testing different components: prioritized for potential improvements
Whether and how the usability of components can be affected by other components.
- Consistency: components on the same or on higher-level layers can activate wrong mental models
- Memory load: lower-level interaction affects higher-level interaction strategy