Evaluation Guidelines and Methods
Transcript of Evaluation Guidelines and Methods
Human-Computer Systems
MM4HCI
2013
Lecture 3 – Evaluation methods and
guidelines
Professor Sarah Sharples
Department of Mechanical, Materials and
Manufacturing Engineering
AN EVALUATION
FRAMEWORK
Outline
1. Understand what evaluation is for
2. Preparing for an evaluation
3. The range of evaluation techniques and their uses
4. Understanding some of the practical issues of applying evaluation methods
Main reading - Sharp et al., Chapters 12, 14 and 15
What is evaluation?
• Involving users, and user
representatives, in the technology / ICT
design and development process in a
structured manner
• Capturing responses to a design or a
design artefact
• Can be carried out at any point in the
development process
Usability goals
• Efficient to use
• Effective to use
• Safe to use
• Have good utility
• Easy to learn
• Easy to remember how to use
User experience goals: fun, emotionally fulfilling, rewarding, supportive of creativity, aesthetically pleasing, motivating, helpful, entertaining, enjoyable, satisfying
Source: Preece et al., 2002
Evaluation choice considerations
• Why – Why are you conducting the evaluation?
• What – What do you have to evaluate (e.g. prototypes)?
• Who – Who is going to help you (users, experts)?
• When – When in the development process?
• Where – Do you need a 'clean' environment, or context?
• How – What method are you going to use?
Why evaluate?
• Ensure a 'user-centred design'
• Easy to learn, easy to use, efficient, useful, satisfying to use
• From a human factors perspective
• Safe (for the operator), safe for the system, optimal system performance (Hollnagel and Woods, 2006)
• Inform and evolve the design (saves time and
money); verify requirements (Chevalier and Kicka,
2006)
• Benchmarking and comparison
What data do you need to capture?
Satisfaction
Ease of learning
Usability
Performance /
efficiency
What have you got to evaluate?
Lo-fi prototypes
• Benefits: cheap; addresses layout; proof-of-concept; open to participatory design and comment (Erickson, 1995)
• Drawbacks: navigation and flow limitations for evaluation; does not support good quantitative measures (e.g. errors)
• Best used early on or for rapid re-designs
Hi-fi prototypes
• Benefits: complete functionality; supports quantitative evaluation (e.g. user error rates); marketing and sales tool; a living specification
• Drawbacks: expensive; time consuming; perceived limited scope for change
• Best used for quantitative user evaluation, and as part of proofs of concept crossing business functions
Who is going to be involved?
• Do you need to match against certain
characteristics?
• Age, gender, education, prior knowledge
• Physical, cognitive and attitudinal implications
• Do any of your users pose particular
challenges?
• Older adults, children, children with special needs
• Can you use novices, or HCI experts?
• And how many (depends on method)
When in the development process?
[Figure: evaluation effort plotted across the development process (Requirements, Concept, Design and Development, Implementation, Deployment) – continuous evaluation contrasted with "last minute panic testing!!!"]
Formative vs summative
• Formative
• To inform the design process
• Explorative, using partially completed artefacts
(prototypes)
• Maybe more qualitative or subjective
• Summative
• A confirmation exercise
• To ensure the design meets its intended aims
• Often against a recognised standard or set of
benchmarks (or initial requirements)
Where? (see Duh et al., 2005)
• Lab
• Simulation
• Real world
Evaluation as part of user experience
User experience involves technology, users, tasks and context:
Technology
• Is the technology new?
• Is there a novel input or output?
• How much does the technology influence the interaction?
Users
• Are the users experts or have prior knowledge?
• Do they have specific characteristics?
Tasks
• Is 'how' they do it important?
• Is performance relevant?
• Are you investigating functionality?
Context
• Does where they use the technology influence the interaction?
• Social or physical factors
• Temporality – short or long periods of use
EVALUATION METHODS
Evaluation Approaches
Read Sharp, Rogers & Preece, 2007
Chapters 14 & 15 for more information
Analytical
• Predictive evaluation methods
Field study
• Interpretive evaluation methods
• Collecting users' opinions
Lab study
• Experiments and benchmarking
• Usability studies
Analytical - Predictive evaluation
HCI experts use their knowledge of users and
technology to evaluate interface usability
• Inspection methods and heuristics
• Accessibility (WCAG, 1999)
• User modelling – GOMS and KLM
• Walkthroughs
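The Keystroke-Level Model (KLM) mentioned above predicts an expert's task time by summing standard operator times. A minimal sketch, using the commonly cited approximate operator values from Card, Moran and Newell (the example task sequence is hypothetical):

```python
# Keystroke-Level Model (KLM) sketch: predict expert task time by summing
# standard operator times (approximate values from Card, Moran & Newell).
OPERATOR_TIMES = {
    "K": 0.2,   # keystroke or key press
    "P": 1.1,   # point with mouse to a target
    "B": 0.1,   # mouse button press or release
    "H": 0.4,   # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_estimate(operators):
    """Return the predicted task time (seconds) for a sequence of operators."""
    return sum(OPERATOR_TIMES[op] for op in operators)

# Example: mentally prepare, point to a field, click, home hands to the
# keyboard, then type four characters.
task = ["M", "P", "B", "H"] + ["K"] * 4
print(round(klm_estimate(task), 2))  # 3.75
```

Such estimates are rough, but useful for comparing alternative interaction sequences before any users are recruited.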
Analytical - Heuristic evaluation
• ~ 5 HCI experts work independently
• General review of product
• Focus on specific features
• Structured expert reviewing against guidelines, e.g.
• “use simple and natural language”
• “provide shortcuts”
• Collate reviews to prioritise problems
• Five HCI experts typically find c. 75% of the usability problems in an interface
• BUT see Cockton and Woolrych, 2002
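The "five experts" rule of thumb comes from the Nielsen and Landauer problem-discovery curve: the expected proportion of problems found by n independent evaluators is 1 - (1 - L)^n, where L is the average per-evaluator detection rate. A small sketch (L varies by study; L ≈ 0.24 is an assumed value chosen here to reproduce the ~75% figure for five evaluators):

```python
# Problem-discovery curve (Nielsen & Landauer): expected proportion of
# usability problems found by n independent evaluators. L (the average
# per-evaluator detection rate) varies between studies; 0.24 is an assumed
# value that reproduces the "five experts find ~75%" rule of thumb.
def proportion_found(n_evaluators, detection_rate=0.24):
    return 1 - (1 - detection_rate) ** n_evaluators

for n in (1, 3, 5, 10):
    print(n, round(proportion_found(n), 2))
```

Note the diminishing returns: doubling from five to ten evaluators adds far fewer newly discovered problems than the first five did, which is part of the case for small expert panels (and part of Cockton and Woolrych's critique of over-relying on them).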
Analytical - Walkthroughs
• Cognitive walkthrough focus on ease of learning
• Scenario-based evaluation
• 3 main questions:
• Will the correct action be evident to the user?
• Will the user notice that the correct action is available?
• Will the user associate and interpret the response from the
action correctly?
• Pluralistic walkthrough (experts, experts + users)
• Participatory design
Analytical evaluation
Advantages
• Experienced reviewers
• Users not involved
• Good experts will have knowledge of users
• Easy to set up and run study
Disadvantages
• Can be difficult and expensive to find experts
• Experts may have biases
• Some problems may get missed, trivial problems identified
Field study - Interpretive evaluation
• Aims to enable designers to understand better how
users use systems in context
• Qualitative data
• Description of performance/outcome
Field study - Data collection
• Informal and naturalistic methods of data collection
• Observations, interviews, usage logging, focus groups
• Contextual Inquiry
• Originates from ethnography
• Observe the entire process of interface use, from switching
on computer to going home after task completion
• Co-operative and participative evaluation
• Focus groups
• Development of prototypes
• Iterative design process
Field study - Interviews vs focus groups?
• Do you want opinions or actual tasks?
• Can you get error / timing / task data from FGs?
• Are users familiar enough to remember usage?
• Do you have something to 'focus' on?
• Focus groups need careful planning and careful facilitation
• See Nielsen, 2000a
Field study methods
Advantages
• Reveals what really happens in context of use
• Description of performance or outcome
• Users directly involved
• Works well for formative evaluation of prototypes
Disadvantages
• May not be easy to recruit participants
• Can be disruptive to the working environment
• True ethnographic studies require evaluator expertise
• Quality of results variable
Lab study - Experiments and benchmarking
• Traditional approach to HCI
• Predicted relationship between variables
• Manipulate Independent Variable (IV), measure Dependent Variables (DV)
• Generally use time/error measurement
• Specific human factors measures
• Workload – NASA-TLX
• Situation Awareness – SAGAT
• Body Part Discomfort
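To make the IV/DV distinction concrete, a small sketch of how a lab experiment's data might be compared: the IV is the interface version (A vs B), the DV is task completion time, and Welch's t statistic (which does not assume equal variances) is computed with Python's standard library. The data values below are hypothetical.

```python
import statistics as st

# Hypothetical lab experiment: IV = interface version (A vs B),
# DV = task completion time in seconds, six participants per condition.
times_a = [42.1, 39.5, 45.0, 41.2, 43.8, 40.6]
times_b = [36.4, 38.0, 34.9, 37.2, 35.5, 36.8]

def welch_t(x, y):
    """Welch's t statistic for two independent samples (unequal variances)."""
    mx, my = st.mean(x), st.mean(y)
    vx, vy = st.variance(x), st.variance(y)   # sample variances
    return (mx - my) / ((vx / len(x) + vy / len(y)) ** 0.5)

print(f"mean A = {st.mean(times_a):.1f} s, mean B = {st.mean(times_b):.1f} s")
print(f"Welch t = {welch_t(times_a, times_b):.2f}")
```

In practice the statistic would be compared against a t distribution (or a library routine such as SciPy's would be used) to obtain a p-value; the point here is only the shape of the analysis: manipulate the IV, measure the DV, then test the predicted relationship.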
Lab study - Usability testing
• An essential part of the evaluation process
• Structured interview and activity
• Observed and recorded (eye tracking, facial
expressions, comments)
• Tends to be summative, towards the end of the
process
• At very least needs interactive prototype
• Can then be backed up with a survey, e.g. SUS
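SUS (the System Usability Scale) has a fixed scoring rule that is easy to show in code: each of the ten items is rated 1-5; odd-numbered (positively worded) items contribute (rating - 1), even-numbered (negatively worded) items contribute (5 - rating), and the sum is multiplied by 2.5 to give a 0-100 score. The example responses below are hypothetical.

```python
# System Usability Scale (SUS) scoring. Each of the 10 items is rated 1-5.
# Odd-numbered items (positively worded) contribute (rating - 1); even-
# numbered items (negatively worded) contribute (5 - rating). The sum of
# the ten contributions is multiplied by 2.5 to give a 0-100 score.
def sus_score(ratings):
    if len(ratings) != 10 or not all(1 <= r <= 5 for r in ratings):
        raise ValueError("SUS needs ten ratings, each between 1 and 5")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(ratings))
    return total * 2.5

# Hypothetical single participant's responses to items 1..10:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0
```

A study would average the per-participant scores; note that SUS gives a single overall usability figure, not a diagnosis of specific problems, which is why it backs up rather than replaces observed usability testing.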
Lab study methods
Advantages
• Studies conducted under controlled conditions
• Experiments provide quantitative measures
• Focus on specific aspects of design or user performance
• Usability testing provides qualitative results
• Highlights particular usability problems
Disadvantages
• Requires lab facilities and resources
• May require experimenter expertise
• Can be time consuming and expensive
• Unnatural setting may affect user behaviour
• Unrealistic tasks may not inform design
EVALUATION IN PRACTICE
DECIDE: a framework to guide
evaluation (Preece, Rogers and Sharp, 2002. Chapter 11)
• Determine the goals
• Explore the questions
• Choose the evaluation approach and methods
• Identify the practical issues
• Decide how to deal with the ethical issues
• Evaluate, analyze, interpret and present the data
Applying methods across a project
[Figure: evaluation effort mapped across project phases (Concept, Design and Development, Implementation, Deployment), with example activities: travel application concepts, indoor navigation prototype testing, presentation of privacy information, lab usability study, and live field trials]
Practical issues
• Selection and recruitment of participants
• Number of participants
• Find evaluators
• Control over environment, study set-up
• Equipment
• Budget constraints
• Schedule/deadline
• Managing the session
• Stepping back in interviews and focus groups
Ethical issues
• Develop an informed consent form
• Participants have a right to:
- Know the goals of the study
- Know what will happen to the findings
- Privacy of personal information
- Leave when they wish
- Be treated politely
Example evaluation exercise
You are required to propose an evaluation
programme to support the design of new voice
technologies to help older adults interact with
objects (e.g. furniture, electrical appliances) in
their homes
Summary (1)
There are many issues to consider before conducting an
evaluation study
These include the goals of the study, the approaches and
methods to use, practical issues, ethical issues, and how
the data will be collected, analysed and presented
Evaluation & design are closely integrated in user-
centered design
References
• Cockton, G., & Woolrych, A. (2002) Sale must end: should discount methods be cleared off HCI's shelves? Interactions, 9 (5), 13-18.
• Chevalier, A., & Kicka, M. (2006) Web designers and web users: Influence of the ergonomics quality of the web site on the information search. International Journal of Human-Computer Studies, 64 (10), 1031-1048.
• Duh, H. B-L., Tan, G. C. B., & Chen, V. H. (2005) Usability evaluation for mobile device: a comparison of laboratory and field tests. In Proceedings of the 8th conference on Human-computer interaction with mobile devices and services. pp 181-186. New York, NY.: ACM Press.
• Erickson, T. (1995) Notes on design practice: stories and prototypes as catalysts for communication. In J. M. Carroll (Ed.) Scenario-based design: Envisioning work technology in system development pp. 37-58. New York, NY: John Wiley & Sons
• Nielsen, J. (2000a) The use and misuse of focus groups. Retrieved from http://www.useit.com/papers.
• Nielsen, J. (2001) Ten Usability Heuristics. Retrieved from www.useit.com.
• Sharp, H., Rogers, Y. and Preece, J. (2007). Interaction Design, Beyond human-computer interaction (2nd edition). John Wiley and Sons:NY. Chapters 12, 13, 14 & 15.
• Shneiderman, B. (1998). Designing the User Interface (3rd edition). Addison-Wesley:MA.
• Standard Usability Measurement Inventory (SUMI) retrieved from http://sumi.ucc.ie/index.html January 2008.
• WCAG (1999) Web Content Accessibility Guidelines 1.0. Retrieved 28th Feb 2008, from http://www.w3.org/TR/1999/WAI-WEBCONTENT-19990505/
Summary (2)
Different evaluation approaches and methods are often combined in one study
Triangulation involves using a combination of techniques to gain different perspectives, or analysing data using different techniques
Dealing with constraints is an important skill for evaluators to
develop