Creating Accessible Research Papers

Erin Brady 1,2,4, Yu Zhong 1,2,3, and Jeffrey P. Bigham 1
1 Carnegie Mellon University, 2 University of Rochester, 3 Google, 4 IUPUI

Creating Accessible PDFs for Conference Proceedings

Research is Disseminated as PDFs

Our research is disseminated as PDFs. This is what appears in the ACM Digital Library, and it's what appears on our web pages.

People with disabilities are under-represented in science. Papers that cannot be easily accessed and used by people with disabilities are not helping.

Research is Disseminated as PDFs

But perhaps this is okay, because you can make a PDF accessible by adding structural tags and metadata to the document. While screenreaders can make basic automatic assumptions about how to read a document, they cannot generate information like alternative text for images or figures, and their assumptions are often incorrect.

Without Tags Reading Can Be Confusing

Video by NCSU IT Accessibility, available at https://www.youtube.com/watch?v=GaNwnsT4B5s

This is a video created by NCSU's accessibility team, which shows how a screenreader reads an untagged two-column document. The visual reading order for the text is [read paragraphs], but because of the two-column format, the screenreader reads each line across both columns as a whole, rather than reading each column separately.
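The difference between the two reading orders can be sketched in a few lines of Python. This is only an illustration: the page content and both function names are hypothetical, and a real screenreader works from the PDF's content stream rather than from a list like this.

```python
# Hypothetical two-column page: each entry is one visual line,
# split into its left-column text and its right-column text.
page_lines = [
    ("Left line 1", "Right line 1"),
    ("Left line 2", "Right line 2"),
    ("Left line 3", "Right line 3"),
]

def untagged_reading_order(lines):
    """Read each visual line across both columns, as in the untagged case."""
    order = []
    for left, right in lines:
        order.extend([left, right])
    return order

def intended_reading_order(lines):
    """Read the whole left column first, then the whole right column."""
    lefts = [left for left, _ in lines]
    rights = [right for _, right in lines]
    return lefts + rights

print(untagged_reading_order(page_lines))
print(intended_reading_order(page_lines))
```

In the untagged order, sentences from the two columns are interleaved line by line, which is the confusion the video demonstrates. Structural tags are what let a reader follow the second order.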

What is the state of accessibility for conference PDFs?

Automated check of all papers from 2011-2014: 1,811 papers
Manual checks on ASSETS, W4A papers from 2014: 26 papers

In order to see how to improve accessibility of conferences, we first wanted to check how accessible current conference proceedings are. We examined the proceedings of three conferences - CHI, W4A, and ASSETS from 2011 to 2014, looking at a total of 1,811 papers.

We chose these conferences specifically: CHI gave us access to a large amount of data, with close to 400 papers per year, while the communities of W4A and ASSETS are focused on accessibility and may be more representative of what accessible conference proceedings could look like when information is generated by the authors.

Automated Checks: More documents with tags

We ran both automated and manual checks. The automated checks mostly revealed improvements, with more documents being tagged.

Automated Checks: More documents with heading tags

Automated Checks: More documents with the language specified

Automated Checks

- Automated checks show improvement
  - drops may be due to expanding community, prevalence of Mac Word
- W4A now requires accessible PDFs
- However, automated checks are measuring metadata presence, not correctness
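To make "metadata presence" concrete, here is a minimal sketch of a presence-only check. It is not the tooling used for the study, which the slides do not describe. It assumes that scanning raw bytes for the structure-tree, marked-content, language, and heading markers defined in the PDF specification is good enough for illustration; markers stored inside compressed object streams would be missed, so a real check would use a PDF parsing library. The sample byte strings are synthetic, not real conference PDFs.

```python
import re

# Byte patterns for markers defined in the PDF specification.
# Assumption: raw byte scanning is used only for brevity; anything
# stored in a compressed object stream will not be found this way.
MARKERS = {
    "tagged": re.compile(rb"/StructTreeRoot\b"),
    "marked": re.compile(rb"/Marked\s+true\b"),
    "language": re.compile(rb"/Lang\s*[(<]"),
    "headings": re.compile(rb"/S\s*/H[1-6]?\b"),
}

def scan_pdf_bytes(data: bytes) -> dict:
    """Report which accessibility markers appear anywhere in the bytes."""
    return {name: bool(pattern.search(data)) for name, pattern in MARKERS.items()}

# Synthetic fragments standing in for real files.
tagged_sample = (
    b"<< /Type /Catalog /MarkInfo << /Marked true >> "
    b"/StructTreeRoot 5 0 R /Lang (en-US) >> "
    b"<< /Type /StructElem /S /H1 >>"
)
untagged_sample = b"<< /Type /Catalog /Pages 2 0 R >>"

print(scan_pdf_bytes(tagged_sample))
print(scan_pdf_bytes(untagged_sample))
```

A check like this can only say that a heading tag or a language entry exists. It cannot say whether the heading tag is on the right text or whether the declared language is the right one, which is the limitation noted on the slide.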

Manual Checks: Room to Improve

- 26 papers in 2014 (ASSETS, W4A technical track)
- 62% passed Acrobat's full accessibility check

We also performed manual accessibility checks on the 26 papers from ASSETS 2014 and the W4A technical track in 2014. We ran a full accessibility check in Adobe Acrobat, which 62% of papers passed; this check is usually the first indicator that things are accessible, and for papers that were not ...

Manual Checks: Room to Improve

- 26 papers in 2014 (ASSETS, W4A technical track)
- 62% passed Acrobat's full accessibility check
- Also examined specific metrics:
  - 73% had alt-text for all images
  - 85% had tab order specified
  - 11.5% had title tagged as an H1

We then checked for certain indicators of accessibility: alternative text on all images, tab order specified, and correct use of tags. Alt-text and tab order, two common things mentioned in ACM accessibility guides, were relatively well covered, each with a higher compliance rate (73% and 85%) than the 62% of papers that passed the full check.

However, things like structural tags were much less predictable; for example, only about 12% of papers had their title tagged as an H1. There's clearly room to improve on these types of manual accessibility tags.
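With only 26 papers, each percentage corresponds to a small number of documents. The sketch below converts the reported percentages back into approximate paper counts. The counts are inferred by rounding and are not figures stated in the talk; `implied_count` is a hypothetical helper name.

```python
TOTAL_PAPERS = 26  # ASSETS 2014 and W4A 2014 technical track, from the slide

# Percentages as reported on the slide.
reported = {
    "passed full check": 62.0,
    "alt-text on all images": 73.0,
    "tab order specified": 85.0,
    "title tagged as H1": 11.5,
}

def implied_count(percent, total=TOTAL_PAPERS):
    """Nearest whole number of papers consistent with a reported percentage."""
    return round(percent / 100 * total)

for label, percent in reported.items():
    count = implied_count(percent)
    print(f"{label}: about {count} of {TOTAL_PAPERS} ({count / TOTAL_PAPERS:.1%})")
```

Under this rounding assumption, the H1 figure corresponds to roughly 3 of the 26 papers, so a single additional correctly tagged paper would move that metric by almost four percentage points.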

Making CHI 2015 Accessible

- Tagged 25 papers for other authors
- Process could be done by non-author
  - structural tags, image and figure descriptions
- Time costs were mostly front-loaded

In order to explore this in practice, we began an effort to improve the accessibility of CHI 2015. We allowed authors to send camera-ready papers to us, which we tagged to make more accessible. We tagged 25 papers, and while we did not record any metadata about the papers to analyze, we can reflect on the experience here.

For the most part, the process of tagging could be done by a non-author of the paper: structural tags are clear from the ACM format, and image and figure descriptions were (for the most part) easy to generate, sometimes requiring a little reading through the text of the paper. The biggest time cost was the process of learning the intricacies of Acrobat in order to tag the documents well.

Making PDFs of research accessible is important,

but expensive and difficult.

So while we want to make sure our conference proceedings are accessible, we know it's a resource-intensive process and that many people may struggle to tag their documents correctly.

http://chi2015.acm.org/authors/guide-to-an-accessible-submission/

Financial Cost of PDF Accessibility

First is the financial cost: it's easiest to add additional metadata to a PDF using Adobe Acrobat, which is a relatively expensive piece of software for an individual. While Microsoft Word users can add metadata before exporting their documents to PDFs, this option is not available for LaTeX or other authoring tools, and there's no way to verify or edit the accessibility tags without Acrobat.

Time-Intensive to Learn

103 pages

160 pages

800+ pages

PDF accessibility can also result in time costs. It takes a long time to learn how to make PDFs accessible: there are long online guides, and many 100+ page books exist to teach the principles of accessible PDFs.

Time-Intensive and Confusing to Verify

Tagged as text

Tagged as H1

It's also time-intensive to verify that a document is accessible, since automated checkers can only catch simple errors (e.g., an image has no alternative text) but cannot verify the correctness of tags or metadata.
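The gap between catching simple errors and verifying correctness can be shown with a toy checker. This is a sketch under stated assumptions: the tag tree is modeled as nested Python dictionaries, and the `tag`, `alt`, `text`, and `children` keys are illustrative rather than the data model of Acrobat or any PDF library.

```python
# A simplified tag tree as nested dicts (illustrative structure only).
document = {
    "tag": "Document",
    "children": [
        # A paper title that was tagged as body text instead of H1.
        {"tag": "P", "text": "Creating Accessible Research Papers"},
        # A figure with alt text that is present but unhelpful.
        {"tag": "Figure", "alt": "image"},
        # A figure with no alt text at all.
        {"tag": "Figure"},
    ],
}

def figures_missing_alt(node):
    """Return every Figure node in the tree that has no non-empty alt text."""
    missing = []
    if node.get("tag") == "Figure" and not node.get("alt", "").strip():
        missing.append(node)
    for child in node.get("children", []):
        missing.extend(figures_missing_alt(child))
    return missing

print(len(figures_missing_alt(document)))
```

The checker flags only the figure with no alt text. It accepts the figure described as "image" and has no way to notice that the title is tagged as a paragraph, so both of those problems still need a person to find them.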

Embedding Tagged Fonts

1. Open Preflight

2. Run Embed fonts (even if text is invisible)

3. Preflight lists unembedded fonts

4. Locate each instance of an unembedded font

5. Delete artifact

Hard to Master

It's also hard to master these skills: even for this paper on PDF accessibility, we didn't get the tagging completely correct, and had to do some last-minute corrections.

Discussion

- Automated measures are improving
- Papers are not fully accessible

How can we keep improving?

- Better tools
- Alternative doc formats
- Include in publication process

Creating Accessible PDFs for Conference Proceedings
Erin Brady (University of Rochester/IUPUI)
Yu Zhong (University of Rochester/Google)
Jeffrey P. Bigham (Carnegie Mellon University)