
Transcript of Mirri w4a2012

Page 1: Mirri w4a2012

Silvia Mirri

Ludovico A. Muratori

Paola Salomoni

Matteo Battistelli

Department of Computer Science

University of Bologna

Getting one voice: tuning up experts' assessment in measuring accessibility

Page 2: Mirri w4a2012


Summary


Introduction

Automatic and manual accessibility evaluations

Our proposed metric

Conclusions and future work

Page 3: Mirri w4a2012


Introduction


Web accessibility evaluations: automatic tools + human assessment

Metrics quantify the accessibility level or the barriers, providing a numerical synthesis:

• automatic tools return binary values

• human assessments are subjective and can take values from a continuous range

Page 4: Mirri w4a2012


Our main goal


Providing a metric to measure how far a Web page is from its accessible version, taking into account:

• integration of human assessments with automatic evaluations on the same target

• many human assessments

Page 5: Mirri w4a2012


Steps


1. Mixing the manual evaluations together with the automatic ones

2. Combining the assessments coming from different human evaluators

• Values are distributed into a given range

• The more experts' assessments contribute to computing a value, the more stable and reliable that value is

Page 6: Mirri w4a2012


Automatic and manual evaluations: an example


Combination between the IMG element and its ALT attribute:

1. If the ALT attribute is omitted, the automatic check outputs 1

2. If the ALT attribute is present, the automatic check outputs 0

Manual evaluation might state that:

• there is no lack of information once the image is hidden (this can happen in case 1, if the image is purely decorative)

• there is a lack of information once the image is hidden
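A minimal sketch of the binary automatic check just described, assuming a plain HTML parse; the class and variable names are illustrative, not from the paper:

from html.parser import HTMLParser

class ImgAltCheck(HTMLParser):
    """Binary check: 1 if an IMG omits its ALT attribute, else 0."""
    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # 1 = ALT omitted (potential barrier), 0 = ALT present
            self.results.append(0 if "alt" in dict(attrs) else 1)

checker = ImgAltCheck()
checker.feed('<img src="logo.png"><img src="deco.png" alt="">')
print(checker.results)  # [1, 0]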

Page 7: Mirri w4a2012


Our metric


• A first version of our metric (Barriers Impact Factor) is computed on the basis of a barrier-error association table

• This table reports the list of assistive technologies/disabilities affected by each error:

  • screen reader/blindness

  • screen magnifier/low vision

  • color blindness

  • input device independence/movement impairments

  • deafness

  • cognitive disabilities

  • photosensitive epilepsy
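As an illustration, such an association table could be represented as a mapping from each automatically detectable error to the assistive technologies/disabilities it affects; the error names and pairings below are hypothetical examples, not the authors' actual table:

# Hypothetical sketch of a barrier-error association table
# (example entries only, not the table from the paper).
BARRIER_ERROR_TABLE = {
    "img-alt-missing": ["screen reader/blindness"],
    "insufficient-contrast": ["screen magnifier/low vision",
                              "color blindness"],
    "no-keyboard-access": ["input device independence/"
                           "movement impairments"],
}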

Page 8: Mirri w4a2012


Our metric


• Comparing automatic checks with WCAG 2.0 success criteria, we identified relationships between them

• Each barrier is related to one success criterion and to one level of conformance (A, AA or AAA)

• Manual evaluations take values on the [0, 1] real interval:

  • 1 means that an accessibility error occurs

  • 0 means the absence of that accessibility error

A check fails when a certain error occurs or when a manual control is necessary
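A sketch of how a single evaluated check could be recorded under the conventions above ([0, 1] manual values, binary automatic values); the type and field names are illustrative, not from the paper:

from dataclasses import dataclass

@dataclass
class CheckResult:
    barrier: str      # e.g. the related WCAG 2.0 success criterion
    level: str        # conformance level: "A", "AA" or "AAA"
    automatic: int    # binary: 1 = error detected, 0 = no error
    manual: float     # expert judgement in [0, 1]

    def fails(self) -> bool:
        # A check fails when an error occurs automatically or the
        # manual control reports the error to some degree.
        return self.automatic == 1 or self.manual > 0.0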

Page 9: Mirri w4a2012


Our metric

[Figure: the formula of the metric; not captured in this transcript.]

Page 10: Mirri w4a2012


Weighting automatic and manual checks


1. m(i) = a(i): the formula is a plain average of automatically and manually detected errors

2. m(i) > a(i): a failure in the manual assessment is considered more significant than an automatic one

3. m(i) < a(i): a failure in the automatic assessment is considered more significant than a manual one

[Figure: two quadrant diagrams with the MANUAL assessment in [0, 1] on one axis and the binary AUTOMATIC result (0 or 1) on the other; quadrants I-IV mark the possible combinations of manual and automatic outcomes, arranged differently in the two diagrams.]
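The exact formula appears on the slide-9 figure, which did not survive transcription; the weighted-average form below is reconstructed from the worked example on slide 12 (m = 2, a = 1, manual average 0.8, automatic 0 yields 0.53), so it may differ in detail from the authors' definition:

def combine(manual: float, automatic: int, m: float, a: float) -> float:
    """Weighted average of the manual (in [0,1]) and automatic (0/1)
    results for one barrier, with weights m and a respectively."""
    return (m * manual + a * automatic) / (m + a)

print(round(combine(0.8, 0, m=2, a=1), 2))  # 0.53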

Page 11: Mirri w4a2012


Some considerations


• The more human operators provide evaluations of an accessibility barrier, the more reliable the computed accessibility level is

• The behavior is similar to that of online rating systems

• A new user's rating can be influenced by evaluations already expressed by other users

• Variance must be considered so as to reinforce the computed accessibility level
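An illustrative sketch, not from the paper, of why more assessments stabilize the value: the standard error of the expert mean shrinks as the number of evaluators n grows (using the expert values from the next slide):

from statistics import pstdev

ratings = [0.7, 1.0, 0.8, 1.0, 0.5]       # slide 12's expert values
n = len(ratings)
standard_error = pstdev(ratings) / n ** 0.5
print(round(standard_error, 3))            # 0.085; shrinks as n grows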

Page 12: Mirri w4a2012


A first assessment


Page content under test: an IMG element with ALT="Image" (no link, no title).

MANUAL EVALUATIONS
Expert A: 0.7
Expert B: 1
Expert C: 0.8
Expert D: 1
Expert E: 0.5

AUTOMATIC EVALUATION
0 (no known errors, 1 alert: placeholder detected)

Weights: m = 2, a = 1
Average = 0.8
Variance = 0.036
CBIF = 0.53
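The slide's figures can be reproduced in a few lines (population variance; CBIF via the weighted-average form reconstructed on slide 10):

from statistics import mean, pvariance

manual = [0.7, 1.0, 0.8, 1.0, 0.5]   # experts A-E
automatic = 0                        # the automatic result (1 alert only)
m, a = 2, 1                          # weights from the example

avg = mean(manual)
var = pvariance(manual)
cbif = (m * avg + a * automatic) / (m + a)
print(round(avg, 2), round(var, 3), round(cbif, 2))  # 0.8 0.036 0.53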

Page 13: Mirri w4a2012


Conclusions


• We have defined an accessibility metric with the aim of evaluating barriers as a whole, combining results provided by automatic tools with manual evaluations done by experts

• The metric has been preliminarily tested by measuring accessibility barriers in several local public administration Web sites

• Five experts are manually evaluating barriers related to WCAG 2.0 success criterion 1.1.1 (using an automatic monitoring system to verify the page content and to collect data from the manual evaluations)

Page 14: Mirri w4a2012


Future Work


• Propose and discuss weights for the whole WCAG 2.0 set of barriers

• Investigate how the number of experts involved in the evaluation, together with their rating variance, could influence the reliability of the computed values

Page 15: Mirri w4a2012


Contacts

Thank you for your attention!

For further information:

[email protected]