Making the grade without Clippy – Use of automatic readability scoring

© 2014 – StoneRidge Corporation Making the grade without Clippy – Use of automatic readability scoring Donna Cannady Nall Emory University School of Law [email protected] Bryce Roberts, MS, MSPH StoneRidge Corporation [email protected]

#CSUC14

Transcript of Making the grade without Clippy – Use of automatic readability scoring

Page 1: Making the grade without Clippy – Use of automatic readability scoring

© 2014 – StoneRidge Corporation

Making the grade without Clippy – Use of automatic readability scoring

Donna Cannady Nall
Emory University School of Law
[email protected]

Bryce Roberts, MS, MSPH
StoneRidge Corporation
[email protected]

Page 2: Making the grade without Clippy – Use of automatic readability scoring

It looks like you’re writing a website.

Would you like help?

• Get help with it

• Just type away

At its most basic, [readability] is the ability to make readers continue from the top to the bottom of the page and then turn that page; and then make them do that 200 times … - John Curran

Page 3: Making the grade without Clippy – Use of automatic readability scoring

Presentation Overview

• Legibility vs Readability

• Readability Scores with Application to Websites

• Implementation

• Additional Uses, Thoughts, and Questions

Page 4: Making the grade without Clippy – Use of automatic readability scoring

Basic understanding of how standardizing ease of reading across a website impacts multiple strategic needs

How and why ease of reading measures are objective ways to evaluate the linguistic structure of content

Understand impact and application of ease of reading through better content management, analysis, and evaluation

Learning Objectives

Page 5: Making the grade without Clippy – Use of automatic readability scoring

Readability & Legibility

Ultimate Accessibility

“You can lead a user to a webpage, but you cannot force them to engage!”

Page 6: Making the grade without Clippy – Use of automatic readability scoring

Undoubtedly, comprehension of written language correlates with the linguistic structure, density of the visual typographic presentation, and the educational and cultural demographics of the reader.

Keep it simple, or readers won’t stay on the page for long.

Readability & Legibility

Page 7: Making the grade without Clippy – Use of automatic readability scoring

There is something to be said …

Legibility is another area where the designer can be misled by what seems like an obvious dictate in type selection and design. There can be no question about the readability of the message, but legibility and readability are not quite the same: a dull and uninteresting presentation in a highly legible typeface will not be widely read. There have been many studies of comparative legibility, and each study seems to surface with slightly different conclusions. For the designer, the best solution is to use his material in such a way that it arouses interest and invites reading.

- Allen Hurlburt, Layout: The Design of the Printed Page

Page 8: Making the grade without Clippy – Use of automatic readability scoring

So what is the difference?

Readability

The effort required to understand the semantic meaning of groups of characters

Factors
◦ Number of words
◦ Number of letters in words
◦ Number of syllables in words
◦ Number of words in sentences
◦ Number of sentences

Legibility

The effort required to recognize individual letters in relationship to other characters

Factors
◦ Typography: font (many facets), leading, kerning
◦ Visual elements: void space, distractors, contrast

Page 9: Making the grade without Clippy – Use of automatic readability scoring

Legibility (Design)

Page 10: Making the grade without Clippy – Use of automatic readability scoring

Why do we care?

ACCESSIBILITY

Accessibility is the degree to which a resource (often information) is available and understandable to a wide audience

Affected by both legibility and readability

Requires an effective marriage of structure, design, and content working in harmony

Not just for screen readers

Page 11: Making the grade without Clippy – Use of automatic readability scoring

What to do …

Legibility

• Hierarchy
• Higher contrast
• Appropriate font
• Goldilocks type
◦ Font size
◦ Line height
◦ Letter spacing
◦ Line length

Readability

• Headers
• Minimal jargon
• Infographics
• Goldilocks writing
◦ Word descent
◦ Average word length
◦ Semantic organization

Page 12: Making the grade without Clippy – Use of automatic readability scoring

And so …

The Great Divide

Academic Writing Style – Dense, jargon-heavy, recondite; understandable only by experts in the same field

Website Writing Style – Clear, concise, easy to read by a wide audience

Challenge: How to make an academic website readable and accessible to multiple audiences?

Page 13: Making the grade without Clippy – Use of automatic readability scoring

What is our purpose?

The Nature of Writing

Page 14: Making the grade without Clippy – Use of automatic readability scoring

What is our purpose?

The Nature of Writing

Page 15: Making the grade without Clippy – Use of automatic readability scoring

Ugly truths

• A Nielsen study for Sun Microsystems reports:
◦ 79% of users scan the page and then move on; only 16% read word-for-word
◦ Reading from computer screens is 25% slower than from paper
◦ Web content should have 50% of the word count of its paper equivalent

• Readability is related to both objectively measurable factors and subjective factors

• Most people read at levels lower than their highest academic attainment

• Without legibility, readability is meaningless (mobile)

Page 16: Making the grade without Clippy – Use of automatic readability scoring

A core value of website redesign was to improve accessibility through improvements in structure, readability, and legibility.

Emory Law School Website

Page 17: Making the grade without Clippy – Use of automatic readability scoring

Readability Scores with Application to Websites

Measuring the unmeasurable by standing on the shoulders of great intellects

Bright people created tools that can help us!

Page 18: Making the grade without Clippy – Use of automatic readability scoring

It looks like you’re evaluating the readability.

Would you like help?

• Get help with it

• Guess away

Automated Reading Scores

All the same but different

• Automated Readability Index
• SMOG Index
• Coleman–Liau Index
• Gunning Fog Score
• Flesch Reading Ease
• Flesch–Kincaid Grade Level

Page 19: Making the grade without Clippy – Use of automatic readability scoring

http://law.emory.edu/admission/juris-doctor/index.html

Good for most everybody

Page 20: Making the grade without Clippy – Use of automatic readability scoring

http://law.emory.edu/academics/clinics/turner-environmental-clinic.html

Wow this is hard

Page 21: Making the grade without Clippy – Use of automatic readability scoring

Subject matter experts (often our content contributors) struggle to understand the difficulty in reading their academic writing

Objective measures provide correlative tools with which to evaluate text

Different measures are sensitive to different writing characteristics

Why do we use tools?

Page 22: Making the grade without Clippy – Use of automatic readability scoring

Automated Readability Index

Approximate representation of the US grade level needed to comprehend the text

Relies on characters per word and words per sentence

Insensitive to overall text length and polysyllabic words

$$\mathrm{ARI} = 4.71\left(\frac{\text{characters}}{\text{words}}\right) + 0.5\left(\frac{\text{words}}{\text{sentences}}\right) - 21.43$$
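As a minimal sketch, the Automated Readability Index formula on this slide can be computed directly from three counts. The function name and the shape of the `counts` object are illustrative assumptions, not part of any library mentioned in the deck.

```javascript
// Automated Readability Index from raw counts.
// `counts` is a hypothetical shape: { characters, words, sentences }.
function automatedReadabilityIndex(counts) {
  var charactersPerWord = counts.characters / counts.words;
  var wordsPerSentence = counts.words / counts.sentences;
  return 4.71 * charactersPerWord + 0.5 * wordsPerSentence - 21.43;
}

// Worked example: 500 characters, 100 words, 5 sentences.
// 4.71 * 5 + 0.5 * 20 - 21.43 = 12.12
var ariExample = automatedReadabilityIndex({ characters: 500, words: 100, sentences: 5 });
console.log(ariExample.toFixed(2));
```

Because only characters, words, and sentences feed the score, a text full of short but obscure words would still score as easy.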

Page 23: Making the grade without Clippy – Use of automatic readability scoring

SMOG Index: Simple Measure of Gobbledygook

Estimates the years of education needed to understand a given text

Relies on the number of polysyllabic words (gobbledygook) and number of sentences

Highly inaccurate for unusually structured texts (non-traditional paragraph structures)

$$\mathrm{SMOG} = 1.043\sqrt{\text{number of polysyllables} \times \frac{30}{\text{number of sentences}}} + 3.1291$$
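A short sketch of the SMOG formula follows, assuming the polysyllable and sentence counts have already been collected. The function and parameter names are illustrative only.

```javascript
// SMOG Index from raw counts.
// polysyllables: number of words with three or more syllables.
function smogIndex(polysyllables, sentences) {
  return 1.043 * Math.sqrt(polysyllables * (30 / sentences)) + 3.1291;
}

// Worked example: 12 polysyllabic words in 10 sentences.
// 12 * 30 / 10 = 36, sqrt(36) = 6, 1.043 * 6 + 3.1291 = 9.3871
var smogExample = smogIndex(12, 10);
console.log(smogExample.toFixed(4));
```

The factor of 30 normalizes the count to a 30-sentence sample, which is one reason short or oddly structured pages score unreliably.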

Page 24: Making the grade without Clippy – Use of automatic readability scoring

Coleman–Liau Index

Approximate representation of the US grade level needed to comprehend the text

L – average number of letters per 100 words
S – average number of sentences per 100 words

Similar to Automated Readability Index
Insensitive to complex words

$$\mathrm{CLI} = 0.0588L - 0.296S - 15.8$$
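The per-100-words scaling of L and S is easy to get wrong, so here is a minimal sketch that derives both from raw counts before applying the formula. The function name and parameters are illustrative assumptions.

```javascript
// Coleman–Liau Index from raw counts.
function colemanLiauIndex(letters, words, sentences) {
  var L = (letters / words) * 100;   // average letters per 100 words
  var S = (sentences / words) * 100; // average sentences per 100 words
  return 0.0588 * L - 0.296 * S - 15.8;
}

// Worked example: 450 letters, 100 words, 5 sentences gives L = 450, S = 5.
// 0.0588 * 450 - 0.296 * 5 - 15.8 = 9.18
var cliExample = colemanLiauIndex(450, 100, 5);
console.log(cliExample.toFixed(2));
```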

Page 25: Making the grade without Clippy – Use of automatic readability scoring

Gunning Fog Index

Estimates the years of education needed to understand a given text

SMOG is related to this measure

Limitations
◦ Overly short sentences have a significant and outsized impact
◦ Overestimates the difficulty of familiar but polysyllabic words
◦ Debated whether commas should be considered full stops

$$\mathrm{Fog} = 0.4\left[\left(\frac{\text{words}}{\text{sentences}}\right) + 100\left(\frac{\text{complex words}}{\text{words}}\right)\right]$$
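A minimal sketch of the Gunning Fog formula, assuming the caller has already decided which words count as complex (commonly those with three or more syllables). Names are illustrative.

```javascript
// Gunning Fog Index from raw counts.
function gunningFog(words, sentences, complexWords) {
  var wordsPerSentence = words / sentences;
  var percentComplex = 100 * (complexWords / words);
  return 0.4 * (wordsPerSentence + percentComplex);
}

// Worked example: 100 words, 5 sentences, 10 complex words.
// 0.4 * (20 + 10) = 12
var fogExample = gunningFog(100, 5, 10);
console.log(fogExample.toFixed(1));
```

Sentence length and the share of complex words carry equal weight inside the bracket, which is why a few very short sentences can move the score so much.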

Page 26: Making the grade without Clippy – Use of automatic readability scoring

Flesch Reading Ease

Inverse of the grade-level scales: higher scores mean easier reading, running from about 100 down to negative values

Score         Meaning
90.0–100.0    Easily understood by an average 11-year-old student
60.0–70.0     Easily understood by 13- to 15-year-old students
0.0–30.0      Understood by university graduates
<0.0          Wow, you must not like your readers very much
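The slide lists score bands but not the formula. The standard published form is 206.835 − 1.015 (words / sentences) − 84.6 (syllables / words). The sketch below computes it and maps the result onto the bands from the slide; the function names are illustrative, and scores that fall between the listed bands are reported as such rather than guessed.

```javascript
// Flesch Reading Ease from raw counts (standard published coefficients).
function fleschReadingEase(words, sentences, syllables) {
  return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words);
}

// Map a score onto the bands listed on the slide.
function describeReadingEase(score) {
  if (score < 0) return 'below zero: very hard on readers';
  if (score <= 30) return 'understood by university graduates';
  if (score >= 60 && score <= 70) return 'easily understood by 13- to 15-year-old students';
  if (score >= 90) return 'easily understood by an average 11-year-old student';
  return 'between the bands listed on the slide';
}

// Worked example: 100 words, 5 sentences, 140 syllables.
// 206.835 - 1.015 * 20 - 84.6 * 1.4 = 68.095
var easeExample = fleschReadingEase(100, 5, 140);
console.log(easeExample.toFixed(3), describeReadingEase(easeExample));
```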

Page 27: Making the grade without Clippy – Use of automatic readability scoring

Flesch Reading Ease

Sensitive to both complex words and sentence length

Limitations
◦ Overestimates the difficulty of familiar but polysyllabic words
◦ Short passages with long sentences or complex words will be estimated to be overly difficult

Page 28: Making the grade without Clippy – Use of automatic readability scoring

Flesch–Kincaid Grade Level

Approximate representation of the US grade level needed to comprehend the text

Used extensively in education
Well validated
If you don’t know what else to look at, try this one

$$\mathrm{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$
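A minimal sketch of the Flesch–Kincaid Grade Level formula, assuming word, sentence, and syllable totals are already known. The function name is illustrative.

```javascript
// Flesch–Kincaid Grade Level from raw totals.
function fleschKincaidGradeLevel(totalWords, totalSentences, totalSyllables) {
  return 0.39 * (totalWords / totalSentences) +
         11.8 * (totalSyllables / totalWords) - 15.59;
}

// Worked example: 100 words, 5 sentences, 150 syllables.
// 0.39 * 20 + 11.8 * 1.5 - 15.59 = 9.91
var gradeExample = fleschKincaidGradeLevel(100, 5, 150);
console.log(gradeExample.toFixed(2));
```

The syllable term dominates: moving from 1.5 to 1.6 syllables per word adds more than a full grade level, while one extra word per sentence adds 0.39.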

Page 29: Making the grade without Clippy – Use of automatic readability scoring

All were created equal

Not really – this shows how two semantically equivalent texts (one being shorter) are scored differently by the various ease-of-reading scoring tools

Page 30: Making the grade without Clippy – Use of automatic readability scoring

But please don’t make us work that hard!

Your audience may try hard to understand

Page 31: Making the grade without Clippy – Use of automatic readability scoring

It looks like you need a readability tool set for Cascade.

Would you like help?

• Get help with it

• I like pen and paper

Automated Reading Scores

Making it easy on yourself

• Automated Readability Index
• SMOG Index
• Coleman–Liau Index
• Gunning Fog Score
• Flesch Reading Ease
• Flesch–Kincaid Grade Level

Page 32: Making the grade without Clippy – Use of automatic readability scoring

Implementations

The nuts and bolts

Cascade Server can help you do almost anything!

Page 33: Making the grade without Clippy – Use of automatic readability scoring

XSL or Velocity

XSL
• Access to full Java environment through extensions (JavaScript)
• Library for readability
• Regular expressions easy to implement
• No serializer tool for node
• Opaque object types

Velocity
• String Java tools available
• Easy to serialize node to usable string
• Easy to configure variables
• No regular expression tool easily accessible
• May require adding a Java class tool or two to your environment

Page 34: Making the grade without Clippy – Use of automatic readability scoring

And the winner is … XSL

Implementation considerations

Need to make a serializer tool

Need to understand object types and how they pass from XSL to JavaScript and back

Based on a great tool - https://github.com/cgiffard/TextStatistics.js

Page 35: Making the grade without Clippy – Use of automatic readability scoring

Assumptions make an @$$ of everyone but we need to do it from time to time

Website is not the same as a printed work…

Full stops
◦ li
◦ p
◦ h1-h6
◦ dd
◦ . ! ?

Most punctuation becomes a space
◦ , : ; ( ) -

Not all full stops are sentences (Mr., U.S.)

Capitalized words in the middle of a sentence may not be proper names

Some words are forever a problem
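The assumptions on this slide can be sketched as a small text-cleaning pass. This is an illustration only, not the code used on the project: the `cleanText` body, the `countSentences` helper, and the tiny abbreviation list are all assumptions made for the example.

```javascript
// Hypothetical sketch: normalise an HTML fragment into plain sentences.
function cleanText(html) {
  return String(html)
    // Closing li, p, h1-h6 and dd tags act as full stops.
    .replace(/<\/(li|p|h[1-6]|dd)\s*>/gi, '. ')
    // Drop every remaining tag.
    .replace(/<[^>]+>/g, ' ')
    // Known abbreviations are not sentence ends (tiny illustrative list).
    .replace(/\b(Mr|Mrs|Ms|Dr)\./g, '$1')
    .replace(/\bU\.S\./g, 'US')
    // Most punctuation becomes a space.
    .replace(/[,:;()\-]/g, ' ')
    // ! and ? count as full stops; collapse runs of stops into one.
    .replace(/[!?]/g, '.')
    .replace(/(\s*\.)+/g, '.')
    // Collapse whitespace.
    .replace(/\s+/g, ' ')
    .trim();
}

// Count sentences in cleaned text by splitting on the remaining full stops.
function countSentences(cleaned) {
  return cleaned.split('.').filter(function (part) {
    return part.trim() !== '';
  }).length;
}

var sampleHtml = '<h1>Clinics</h1><p>Mr. Smith teaches law, ethics; and policy!</p>' +
                 '<ul><li>Apply now</li></ul>';
var cleaned = cleanText(sampleHtml);
console.log(cleaned);                 // Clinics. Mr Smith teaches law ethics and policy. Apply now.
console.log(countSentences(cleaned)); // 3
```

Treating the heading and the list item as sentences keeps a page of short bullets from being scored as a single enormous sentence.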

Page 36: Making the grade without Clippy – Use of automatic readability scoring

Getting it to JavaScript

Not all XPath objects are the same
◦ *
◦ node()
◦ current()
◦ org.w3c.dom.traversal.NodeIterator, org.w3c.dom.NodeList, org.w3c.dom.Node or its subclasses

Extension element
◦ org.apache.xalan.extensions.XSLProcessorContext

Function
◦ Maps one of the many possible XPath objects

Page 37: Making the grade without Clippy – Use of automatic readability scoring

Function it is with current()

current() passed to a function gives org.apache.xml.dtm.ref.DTMNodeList

This type exposes simple interfaces for serializing the node, and it is easy to follow across the XML-to-JavaScript boundary

Page 38: Making the grade without Clippy – Use of automatic readability scoring

And a little Java (serializer)

    // currentNode is the DTMNodeList that XSL passes in for current()
    var iterator = currentNode.getDTMIterator();
    var rootHandle = iterator.getRoot();
    // Resolve the DTM handle to a W3C DOM node
    var rootNode = iterator.getDTM(rootHandle).getNode(rootHandle);

    // Identity transform: serialize the node into a string buffer
    var transformerFactory = new Packages.org.apache.xalan.processor.TransformerFactoryImpl();
    var transformer = transformerFactory.newTransformer();
    var buffer = new Packages.java.io.StringWriter();
    transformer.transform(
        new Packages.javax.xml.transform.dom.DOMSource(rootNode),
        new Packages.javax.xml.transform.stream.StreamResult(buffer)
    );
    var textToProcess = buffer.toString();

Page 39: Making the grade without Clippy – Use of automatic readability scoring

Before you JavaScript your Java

JavaScript String is not the same as a Java string

    // Copy a java.lang.String into a native JavaScript string, one char at a time
    function javaToJavaScriptString(javaStr) {
        try {
            var len = javaStr.length();
            var tmpStr = '';
            for (var i = 0; i < len; i++) {
                tmpStr += String.fromCharCode(javaStr.charAt(i));
            }
            return tmpStr;
        } catch (e) {
            // On failure, return the error text instead of throwing
            return e.toString();
        }
    }

Page 40: Making the grade without Clippy – Use of automatic readability scoring

JavaScript

Based on Christopher Giffard’s library

Functions
◦ cleanText – gets the text ready for processing using our assumptions about HTML structures
◦ TextStatistics – class-like object that takes a string in its constructor
◦ Each statistic and measurement is a function
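The "class-like object with one function per statistic" pattern can be sketched as below. This is a stripped-down illustration under stated assumptions, not the TextStatistics.js source: the `SimpleTextStats` name, its methods, and the word and sentence splitting rules are all hypothetical and far cruder than a real library.

```javascript
// Hypothetical class-like object: takes the text in its constructor.
function SimpleTextStats(text) {
  this.text = String(text);
}

// Each statistic is its own function on the prototype.
SimpleTextStats.prototype.words = function () {
  return this.text.match(/[A-Za-z0-9']+/g) || [];
};

SimpleTextStats.prototype.wordCount = function () {
  return this.words().length;
};

SimpleTextStats.prototype.letterCount = function () {
  return this.words().join('').replace(/[^A-Za-z0-9]/g, '').length;
};

SimpleTextStats.prototype.sentenceCount = function () {
  var parts = this.text.split(/[.!?]+/).filter(function (part) {
    return part.trim() !== '';
  });
  return Math.max(1, parts.length); // never divide by zero later
};

// Measurements build on the basic statistics.
SimpleTextStats.prototype.automatedReadabilityIndex = function () {
  var words = this.wordCount();
  return 4.71 * (this.letterCount() / words) +
         0.5 * (words / this.sentenceCount()) - 21.43;
};

var stats = new SimpleTextStats('The cat sat. The dog ran.');
console.log(stats.wordCount());      // 6
console.log(stats.sentenceCount());  // 2
console.log(stats.automatedReadabilityIndex().toFixed(2)); // -5.80
```

Keeping every measure as a separate function means a caller pays only for the statistics it asks for, and new measures can be added without touching the constructor.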

Page 41: Making the grade without Clippy – Use of automatic readability scoring

Create Objects and Access

Evaluate text statistics
◦ <xsl:if test="text-statistics:processNode(current())">

Get individual statistic
◦ <xsl:value-of select="text-statistics:getStatistic('fleschKincaidReadingEase')"/>
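On the JavaScript side, the two calls above suggest a pair of functions that share state: one evaluates the text and returns a boolean for the test, the other looks up a statistic by name. The sketch below shows that pattern only. It is an assumption about the shape of the code, it takes a plain string where the real function would receive the node list from current() and serialize it first, and the statistics it offers are placeholders.

```javascript
// Shared state between the two extension functions (hypothetical).
var currentStatistics = null;

// Evaluate the text and remember the result; returns true when there is
// something to report, so it can drive a test such as <xsl:if>.
function processNode(text) {
  var source = String(text);
  var words = source.match(/[A-Za-z0-9']+/g) || [];
  var sentences = source.split(/[.!?]+/).filter(function (part) {
    return part.trim() !== '';
  });
  if (words.length === 0) {
    currentStatistics = null;
    return false;
  }
  var sentenceTotal = Math.max(1, sentences.length);
  currentStatistics = {
    wordCount: function () { return words.length; },
    sentenceCount: function () { return sentenceTotal; },
    averageWordsPerSentence: function () { return words.length / sentenceTotal; }
  };
  return true;
}

// Look up one statistic by name; unknown names yield an empty string.
function getStatistic(name) {
  if (!currentStatistics || typeof currentStatistics[name] !== 'function') {
    return '';
  }
  return currentStatistics[name]();
}

if (processNode('One two three. Four five six.')) {
  console.log(getStatistic('wordCount'));               // 6
  console.log(getStatistic('averageWordsPerSentence')); // 3
}
```

Looking statistics up by name keeps the stylesheet side to a single extension function, at the cost of typos in the name failing silently.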

Page 42: Making the grade without Clippy – Use of automatic readability scoring

It looks like you want to do great things with your knowledge.

Would you like help?

• Get help with it

• I like pen and paper

Ideas …

Musings …

Thoughts …

Opinions …

We got them all.

Page 43: Making the grade without Clippy – Use of automatic readability scoring

Additional Uses, Thoughts, and Questions

There really is a lot to do with all this information, and we promise you can use it too.

Page 44: Making the grade without Clippy – Use of automatic readability scoring

Readability use will vary based on audience and subject matter.

Not all audiences are the same

Page 45: Making the grade without Clippy – Use of automatic readability scoring

http://linchpinseo.com/seo-reading-level-college-websites

Page 46: Making the grade without Clippy – Use of automatic readability scoring

To Infinity and Beyond

Currently
• Displayed in Cascade
• Help content contributors evaluate content
• Can be used as part of workflow management

Future
• Embedded on the page
• Integration with analytics tools like GA for understanding of user engagement
• Content adaptation based on audience

Page 47: Making the grade without Clippy – Use of automatic readability scoring

Thank you and acknowledgements

I wouldn’t be here without you

Page 48: Making the grade without Clippy – Use of automatic readability scoring

Thank you!

Emory University
◦ For being a great client whose challenging needs and great ideas drive great solutions

Hannon Hill
◦ For continuing to develop and add wonderful features to Cascade Server
◦ For nurturing a wonderfully vibrant user community

Page 49: Making the grade without Clippy – Use of automatic readability scoring

Contact Information

Bryce Roberts, MS, MSPH
VP of Software
StoneRidge Corporation
1050 E Piedmont Rd.
Suite E-222
Marietta GA, 30062
[email protected]

Donna Cannady Nall
Digital Communications Manager
1301 Clifton Road NE
Atlanta GA, 30322
[email protected]

Page 50: Making the grade without Clippy – Use of automatic readability scoring

Thank you and good morning!

And so we say goodbye to our friend