Uploaded by eva-durall
Media Design course
Autumn 2016
Evaluation Methods
Usability
INDEX
1. Introduction
2. Usability metrics
3. Usability inspection methods
4. User-based evaluation
5. Usability evaluation methods
6. Planning user testing
1. INTRODUCTION

Definition of usability (ISO/IEC 9126):
"The effectiveness, efficiency and satisfaction with which specified users achieve specified goals in particular environments."
Usability quality components (Nielsen, 2012)
• Learnability
• Efficiency
• Memorability
• Errors
• Satisfaction
Why invest in usability?
Users have little tolerance for difficult designs or slow sites: if they don't understand how a site works, they leave.

Usability studies:
• Offer a good return on investment
• Can bring competitive advantage
• Help make the design adequate
2. USABILITY METRICS

Usability metrics (ISO/IEC 9126-4 – Metrics)

EFFECTIVENESS: the accuracy and completeness with which users achieve specified goals.
• Completion rate
• Number of errors: errors can be unintended actions, slips, mistakes or omissions that a user makes while attempting a task. Errors detected during user testing should be described, rated and classified.

EFFICIENCY: the resources expended in relation to the accuracy and completeness with which users achieve goals.
• Time-based efficiency: helps identify the average time spent on a task (whether or not the task was completed successfully).
• Overall relative efficiency: the ratio of the time taken by the users who successfully completed the task to the total time taken by all users.
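As a concrete illustration, the metrics above can be computed from raw session records. The following Python sketch uses an assumed per-attempt data structure; the field names and values are illustrative, not from the slides:

```python
# One record per (user, task) attempt: did the user succeed, and how long
# did the attempt take? Structure and numbers are invented for illustration.
attempts = [
    {"user": "u1", "success": True,  "time_s": 40.0},
    {"user": "u2", "success": True,  "time_s": 55.0},
    {"user": "u3", "success": False, "time_s": 90.0},
]

# Effectiveness: completion rate = successful attempts / all attempts.
completion_rate = sum(a["success"] for a in attempts) / len(attempts)

# Efficiency (time-based): average time spent on a task, whether or not
# the attempt was successful.
mean_time_on_task = sum(a["time_s"] for a in attempts) / len(attempts)

# Efficiency (overall relative): time taken by successful users relative
# to the total time taken by all users.
overall_relative_efficiency = sum(
    a["time_s"] for a in attempts if a["success"]
) / sum(a["time_s"] for a in attempts)

print(f"completion rate: {completion_rate:.0%}")
print(f"mean time on task: {mean_time_on_task:.1f} s")
print(f"overall relative efficiency: {overall_relative_efficiency:.0%}")
```

With these invented numbers, two of the three attempts succeed, and the successful users account for roughly half of the total time spent.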
SATISFACTION: the comfort and acceptability of use.
• Task-level satisfaction: just after users attempt a task, they should be given a questionnaire measuring how difficult that task was. A popular post-task questionnaire is the SEQ (Single Ease Question), which consists of a single question.
• Test-level satisfaction: at the end of the test, each participant fills in a questionnaire to assess their general level of satisfaction.

SYSTEM USABILITY SCALE (SUS)
1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.
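The ten SUS items above are scored with Brooke's (1996) standard procedure: each item is answered on a 1–5 scale, odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5 to yield a 0–100 score. A minimal Python sketch (the example responses are invented):

```python
def sus_score(responses):
    """responses: ten integers, 1 (strongly disagree) to 5 (strongly agree)."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses in the range 1-5")
    total = 0
    for i, r in enumerate(responses):
        # enumerate is 0-based, so even indices are the odd-numbered items
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5

# Example: a fairly positive participant.
print(sus_score([4, 2, 4, 1, 4, 2, 5, 1, 4, 2]))  # → 82.5
```

Note that a SUS score is not a percentage; a score around 68 is often cited as the average across studies.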
User experience versus user satisfaction
User experience is concerned with "all aspects of the user's experience when interacting with the product, service, environment or facility" (ISO 9241-210).
3. USABILITY INSPECTION METHODS

Expert assessment:
• Heuristic evaluation
• Cognitive walkthrough
• Feature inspection
• Consistency inspection
• Standards inspection
Heuristic evaluation
• A systematic inspection method for the GUI
• Checks whether basic design rules have been followed appropriately
• The goal is to find usability problems
• Often carried out before user testing
• Evaluators go through a list of ten rules (principles)
http://www.melodychou.com/spoter.html
Heuristic evaluation (Nielsen Norman Group)
• Use simple and natural dialogue
• Speak the users' language
• Minimise users' memory load
• Make the user interface consistent
• Give user feedback
• Mark exits clearly
• Make shortcuts available
• Give clear error messages
• Prevent errors
• Provide enough help and documentation
Cognitive walkthrough
• Useful for understanding the user's thought processes and decision-making when interacting with a system, especially for first-time or infrequent users.
• The aim is to "walk through" all the tasks in the service that the user is supposed to perform.
Pluralistic usability walkthrough
• The service is gone through jointly by experts, designers and users, who discuss its elements together.
4. USER-BASED EVALUATION
Aim
The purpose of user testing is not to understand users but to evaluate how particular users can carry out particular tasks (using your system) in a particular context.
User testing followed by a single design iteration improves measured usability by 38% on average.
Validity versus reliability

Validity
• Does the usability test measure something of relevance to the usability of real products in real use?
• Requires methodological understanding of the test method
• Typical problems concern user selection, task selection, and the omission of time constraints and social influences.

Reliability
• Would the same results be obtained if the tests were repeated?
• Reliability of usability tests is a problem because of the huge individual differences between test users.
Types: formative and summative

Formative
• Can take place at any point during development.
• The focus is on identifying problems and potential solutions (possibly with an indication of frequency).
• Designers can then use these frequencies to help rate the severity of the problems and prioritise.

Summative
• Assesses the success of a finished system or product (to find fixes before release and to assess future releases).
• Performance levels are measured using a set of predefined benchmark tasks.
Usability benchmarking
Basic usability study measurements:
• Success/failure within a time threshold
• Time on task
• Number of errors before completion
Image from Flickr user “Josep Ma Rosell”
Usability-lab studies
Usability-Lab Studies: participants are brought into a lab, one-on-one with a researcher, and given a set of scenarios that lead to tasks and usage of specific interest within a product or service.
Image from Flickr user "Yandle"
Image from Flickr user "Witflow"
Eyetracking
An eyetracking device measures where participants look as they perform tasks or interact naturally with websites, applications, physical products, or environments.
https://www.flickr.com/photos/cinteractionlab/4557712520
https://www.flickr.com/photos/rosenfeldmedia/10910197294
Card sorting
Users are asked to organize items into groups and assign categories to each group. This method helps create or refine the information architecture of a site by exposing users’ mental models.
Image from Flickr user “Yandle”
Remote studies
• Moderated remote usability studies: studies conducted remotely with tools such as screen sharing and remote control capabilities.
• Unmoderated remote panel studies: trained participants who have video recording and data-collection software installed on their own devices use a product while thinking aloud.
• Unmoderated UX studies: a research tool installed on participants' devices captures behaviors and attitudes, usually by giving users goals or scenarios to accomplish.
• True-intent studies: random site visitors are asked about their goals on a site and how successful they were.
• Intercept surveys: a survey triggered during the use of a site or application.
Clickstream analysis
Analysis of the record of screens or pages that users click on and see as they use a site or software product; it requires the site to be instrumented properly or the application to have telemetry data collection enabled.
http://www.userzoom.com/ux-benchmarking/7-reasons-why-userzooms-mobile-solution-makes-usability-benchmarking-easy-and-efficient/
A/B testing
Testing different designs on a site by randomly assigning groups of users to interact with each of the different designs and measuring the effect of these assignments on user behavior.
http://ecommerceconsulting.ca/2012/07/five-best-tools-for-ecommerce-diagnostics-part-2/
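As a hedged sketch of how such a comparison is often analyzed, the snippet below applies a standard two-proportion z-test to invented conversion counts; the variant names and numbers are assumptions, not from the slides:

```python
import math

# Hypothetical results for two design variants.
a_conversions, a_visitors = 120, 2000   # variant A
b_conversions, b_visitors = 150, 2000   # variant B

p_a = a_conversions / a_visitors
p_b = b_conversions / b_visitors

# Pooled two-proportion z-test: is the observed difference larger than
# what random assignment alone would plausibly produce?
p_pool = (a_conversions + b_conversions) / (a_visitors + b_visitors)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / a_visitors + 1 / b_visitors))
z = (p_b - p_a) / se

print(f"A: {p_a:.1%}  B: {p_b:.1%}  z = {z:.2f}")
```

As a rough rule, |z| > 1.96 corresponds to p < 0.05 two-sided; with these invented counts z ≈ 1.89, so the difference would not quite reach that threshold.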
Think aloud protocol
A verbal protocol used to understand what users are thinking while they are performing tasks. Verbal protocol has negative impacts on many measurable user performance metrics, such as time to complete tasks.
https://www.youtube.com/watch?v=-h8hUtwkMCE
Wizard of Oz
Research experiment in which subjects interact with a (computer) system that subjects believe to be autonomous, but which is actually being operated or partially operated by an unseen (hidden) human being.
http://www.ericsson.com/uxblog/2012/12/smokes-and-mirrors/
5. USABILITY EVALUATION METHODS

Discount usability engineering
A cost-effective usability evaluation method proposed by Jakob Nielsen, based on three techniques:
• Scenarios
• Simplified think-aloud
• Heuristic evaluation

The primary characteristics of discount usability engineering are small sample sizes, frequent repetition of these small tests, and reliance on direct observation rather than on statistically established findings.
6. PLANNING USER TESTING
1. Purpose of the test (formative/summative evaluation)
2. Test goals (e.g. overall usability, learnability, errors…)
3. Users: who are the users going to be, and how are we going to recruit them? How many users are needed?
4. Organization: who will facilitate the test? Where will the test take place? What equipment is required? How long will it take?
5. Tasks: what test tasks will the users be asked to perform? What criteria will be used to determine when the users have finished each test task correctly? To what extent will the experimenter be allowed to help the users during the test?
6. Methods: what methods are going to be used?
7. Data collection: what data is going to be collected? What data-collection instruments will be used?
8. Data analysis: what evaluation measures will be used? What data analysis methods will be adopted?
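The eight planning questions above can also be captured as a simple checklist data structure, so the team can review the plan for completeness before any session. A minimal Python sketch with invented field names and values:

```python
# Each key corresponds to one of the eight planning points; the contents
# are illustrative, not a real test plan.
test_plan = {
    "purpose": "formative",                      # 1. formative or summative
    "goals": ["overall usability", "errors"],    # 2. test goals
    "users": {"profile": "first-time visitors", "count": 5},            # 3.
    "organization": {"facilitator": "TBD", "location": "lab",
                     "duration_min": 60},                               # 4.
    "tasks": [{"task": "find the opening hours",
               "done_when": "hours page shown"}],                       # 5.
    "methods": ["think-aloud"],                  # 6. methods
    "data_collection": ["screen recording", "SEQ after each task"],     # 7.
    "data_analysis": ["completion rate", "time on task"],               # 8.
}

# A quick completeness check before running the test.
required = ["purpose", "goals", "users", "organization",
            "tasks", "methods", "data_collection", "data_analysis"]
missing = [k for k in required if not test_plan.get(k)]
print("plan complete" if not missing else f"missing: {missing}")
```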
Test stages

Ethical aspects
- Provide enough information to the participants (take informed written consent)
- Participants should be aware that they can withdraw at any point
- Participants' data should be anonymized

Background information
- How much does the user know about the product?
- User background information (age, occupation…)

Before the test
- Provide information to the user (what the test is about, its aims, the structure of the session…)
- Make clear: the purpose is to evaluate the software, not the user; all information is confidential; the user may pause or stop the test at any given moment

During the test
- The user should have a feeling of control
- Create a scenario of what the user should do, i.e. tasks
- Tasks are usually given one at a time; prefer natural tasks
- Observe, ask, and give guidance when needed

After the test
- Debriefing
- Questionnaire
Image from Flickr user "Leandro Agrò"
FURTHER READINGS
This material is licensed under a Creative Commons Attribution – ShareAlike license.
Brooke, J. (1996). SUS-A quick and dirty usability scale. Usability evaluation in industry, 189(194), 4-7.
Cockton, G. Usability evaluation. Interaction Design Foundation. https://www.interaction-design.org/literature/book/the-encyclopedia-of-human-computer-interaction-2nd-ed/usability-evaluation
Cooper, A. (2004). The inmates are running the asylum: Why high-tech products drive us crazy and how to restore the sanity. Indianapolis, IN, USA: Sams.
Dumas, Joseph S. and Redish, Janice C. (1993): A Practical Guide to Usability Testing. Norwood, NJ, Intellect
Forlizzi, Jodi (2008): The product ecology: Understanding social product use and supporting design culture. In International Journal of Design, 2 (1) pp. 11-20
Gould, J. D., & Lewis, C. (1985). Designing for usability: key principles and what designers think. Communications of the ACM, 28(3), 300-311.
International Standards Association (2001). ISO/IEC 9126-1:2001 Software engineering - Product quality - Part 1: Quality model,. Retrieved 1 December 2011 from International Standards Association: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=22749
International Standards Association (2004). ISO/IEC TR 9126-4:2004 Software engineering -- Product quality -- Part 4: Quality in use metrics. Retrieved 1 December 2011 from International Standards Association: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=39752
International Standards Association (2011). ISO/IEC 25010 Systems and software engineering -- Systems and software Quality Requirements and Evaluation (SQuaRE) -- System and software quality models. Retrieved 1 December 2011 from International Standards Association: http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=35733
Lavery, D., Cockton, G., and Atkinson, M. P. 1996. Cognitive Dimensions: Usability Evaluation Materials, Technical Report TR-1996-17, University of Glasgow. Accessed 15/9/11 at http://www.dcs.gla.ac.uk/asp/materials/CD_1.0/materials.rtf
Lavery, D., and Cockton, G. 1997. Cognitive Walkthrough: Usability Evaluation Materials, Technical Report TR-1997-20, Department of Computing Science, University of Glasgow. Edited version available 15/9/11 as http://www.dcs.gla.ac.uk/~pdg/teaching/hci3/cwk/cwk.html
Nielsen, J. (1994). Guerrilla HCI: Using discount usability engineering to penetrate the intimidation barrier. Cost-justifying usability, 245-272.
Nielsen, J., & Molich, R. (1990, March). Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 249-256). ACM.
Nielsen, J.; Norman, D. A. (14 January 2000). "Web-Site Usability: Usability On The Web Isn't A Luxury". JND.org. http://www.jnd.org/dn.mss/usability_is_not_a_l.html
Nielsen, J. (4 January 2012). "Usability 101: Introduction to Usability". Nielsen Norman Group. https://www.nngroup.com/articles/usability-101-introduction-to-usability/
Nielsen, J. (1 January 1995). "10 Usability Heuristics for User Interface Design". Nielsen Norman Group. https://www.nngroup.com/articles/ten-usability-heuristics/
Polson, P. G., Lewis, C., Rieman, J., & Wharton, C. (1992). Cognitive walkthroughs: a method for theory-based evaluation of user interfaces. International Journal of man-machine studies, 36(5), 741-773.
Ritter, F. E., Kim, J. W., Morgan, J. H., & Carlson, R. A. (2013). Running behavioral studies with human participants: A practical guide. Thousand Oaks, CA: Sage.
Whiteside, John, Bennett, John and Holtzblatt, Karen (1988): Usability Engineering: Our Experience and Evolution. In: Helander, Martin and Prabhu, Prasad V. (eds.). "Handbook of Human-Computer Interaction". pp. 791-817