Complexities of Practical Web Automation

Yury Puzis, Yevgen Borodin, I.V. Ramakrishnan

Stony Brook University, 2015

NSF Grant No. IIS-1218570

Contents

Goal: help design practical web automation tools by sharing observational experience

❖ Human-Computer Interaction Perspective

❖ Technical Perspective

❖ Example: Automation Assistant

❖ Conclusion

Why Web Automation?

❖ Problem: non-visual browsing is hard

❖ It is hard (or impossible) to find relevant information and easy to become overwhelmed by what is irrelevant

❖ There are many shortcuts (gestures) to learn, and it is hard (or impossible) to accomplish non-trivial tasks

❖ Web automation has the potential to enable visually impaired users to breeze through Web browsing tasks that were previously slow, hard, or even impossible

Observation

[Diagram: User, Environment, Web Automation Tool, connected by "Browsing Actions" and "Events"]

Automation

[Diagram: User, Environment, Web Automation Tool, connected by "Events" and "Automation Instructions"]

Maximizing Trust

❖ Gaining and maintaining user trust is the cornerstone of web automation: even a few disasters are a big problem

❖ The user needs to know and influence what will happen (review, parameterize, choose) and what has happened (review, revert, recover) at all times

❖ Failure is inevitable and has to be graceful: terminate automation, ignore the failed action, take corrective action, or suggest that the user take corrective action

Minimizing End-To-End Cost

❖ Cognitive load and operation time must be lower end-to-end when using automation than otherwise

❖ Web automation costs: managing creation, execution and consequences of automation; context switching

❖ Screen-reader and browser costs: many and well known, including the need to plan complex sequences of actions from memory, or to search exhaustively and guess, guess, guess

❖ Minimizing cost is in conflict with the need to maximize trust

Dealing with Uncertainty

Goal: automate user intent without resorting to handcrafted scripts (programming), and interpret the environment's reaction

Problem: we can only guess

❖ Semantics of user browsing actions

❖ Semantics of environment events

❖ Semantics of webpage elements

Making Observations

❖ Goal: make meaningful observations from events

❖ Problem: browsing actions can trigger multiple (including cascading) events, and there are different types of events: e.g., shortcut press -> JavaScript call -> DOM mutation -> virtual cursor movement

❖ Problem: over time, an event may change its semantics (same event - different results) or implementation (different event - same results)
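One way to picture the cascading-event problem is a small sketch that attributes follow-up events to the user action that most likely caused them. Everything here is an assumption made for illustration: the event kinds, the `Observation` type, the `group_events` helper, and the time-gap heuristic (`max_gap`) are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass, field

# Hypothetical set of event kinds that come directly from the user.
USER_ACTION_KINDS = frozenset({"shortcut"})


@dataclass
class Observation:
    """One user action plus the events it appears to have triggered."""
    action: str
    consequences: list = field(default_factory=list)


def group_events(events, max_gap=0.5):
    """Group a time-ordered stream of (timestamp, kind, detail) tuples.

    Heuristic (an assumption): an event belongs to the latest user action
    if it follows the previous event of that chain within max_gap seconds.
    Events that arrive later are treated as unrelated and end the chain.
    """
    observations = []
    current = None
    last_time = 0.0
    for timestamp, kind, detail in events:
        if kind in USER_ACTION_KINDS:
            current = Observation(action=detail)
            observations.append(current)
            last_time = timestamp
        elif current is not None and timestamp - last_time <= max_gap:
            current.consequences.append(kind)
            last_time = timestamp
        else:
            current = None  # stray event, e.g. a scheduled script
    return observations


if __name__ == "__main__":
    stream = [
        (0.00, "shortcut", "Ctrl+Enter"),
        (0.02, "js_call", "submit()"),
        (0.10, "dom_mutation", "#results"),
        (0.15, "cursor_move", "#results"),
        (3.00, "dom_mutation", "#ad"),
    ]
    for observation in group_events(stream):
        print(observation.action, "->", observation.consequences)
```

A time gap is only a guess about causality, which is the point of the slide: the same grouping rule can be wrong as soon as an event changes its timing or implementation.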

Addressing Webpage Elements

❖ Goal: identify target webpage element

❖ Problem: most addressing approaches are designed to query the DOM for the elements at a specified address, but we need to query the DOM for the address of a specified element

❖ Solutions: sloppy programming, machine learning, etc., but no unbreakable approach exists
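The "reverse" query described above can be sketched with Python's standard `xml.etree.ElementTree`: given an element, derive a positional path that finds it again. The `address_of` helper is hypothetical, and a purely positional address is just one simple (and brittle) option, not one of the approaches named on the slide.

```python
import xml.etree.ElementTree as ET


def address_of(root, target):
    """Return a positional path from root that selects exactly target.

    Raises KeyError if target is not a descendant of root.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        same_tag = [child for child in parent if child.tag == node.tag]
        # 1-based index among siblings that share the same tag.
        index = next(i for i, child in enumerate(same_tag, 1) if child is node)
        steps.append(f"{node.tag}[{index}]")
        node = parent
    return "./" + "/".join(reversed(steps)) if steps else "."


if __name__ == "__main__":
    page = ET.fromstring(
        "<html><body>"
        "<div><a>Home</a></div>"
        "<div><a>Cart</a><a>Checkout</a></div>"
        "</body></html>"
    )
    checkout = page.findall(".//a")[2]
    path = address_of(page, checkout)
    print(path)
    print(page.find(path) is checkout)
```

Such an address stops working as soon as the page inserts or reorders a sibling, which illustrates why no addressing scheme is unbreakable.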

Detecting Action Completion

❖ Goal: wait for action to complete (succeed or fail) before continuing to interact with the user & the environment

❖ Problem: no standard way to specify action completion; cascading, asynchronous and scheduled JavaScript events make things harder

❖ Solutions: listen to all relevant JavaScript events through callback functions; timeout; wait for predefined DOM mutations / value changes (success or failure)
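The timeout and "wait for a predefined change" ideas can be combined in a small polling loop. This is a minimal sketch under stated assumptions: `wait_for_completion` and its `succeeded` / `failed` callables are hypothetical names, and a real tool would more likely react to event callbacks than poll.

```python
import time


def wait_for_completion(succeeded, failed, timeout=2.0, interval=0.05):
    """Poll until the action succeeds, fails, or the timeout expires.

    succeeded and failed are zero-argument callables that inspect the
    page state (for example, a DOM mutation or a value change).
    Returns "success", "failure", or "timeout".
    """
    deadline = time.monotonic() + timeout
    while True:
        if succeeded():
            return "success"
        if failed():
            return "failure"
        if time.monotonic() >= deadline:
            return "timeout"
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated page state: the result appears on the third poll.
    polls = {"count": 0}

    def results_loaded():
        polls["count"] += 1
        return polls["count"] >= 3

    print(wait_for_completion(results_loaded, lambda: False, timeout=1.0))
```

The timeout is the fallback for the case the slide highlights: when no reliable completion signal exists, the tool can only stop waiting and report the outcome as unknown.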

Example: Automation Assistant

❖ Observes everything the user is doing (no macros)

❖ Guides the user through browsing tasks step-by-step

❖ suggests several alternative browsing actions based on user’s prior actions

❖ automates only one action at a time

❖ each set of suggestions is explicitly requested, each action is explicitly chosen, each outcome is reviewed

❖ No context switch between automation and screen-reading

Puzis Y., Borodin Y., Puzis R., Ramakrishnan I. V., Predictive Web Automation Assistant for people with vision impairments. WWW '13.
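To make "suggests several alternative browsing actions based on the user's prior actions" concrete, here is a deliberately simple sketch that ranks next actions by how often they followed the last action in earlier sessions. It is an illustrative stand-in, not the predictive model of the cited paper; `build_model`, `suggest`, and the action labels are hypothetical.

```python
from collections import Counter, defaultdict


def build_model(sessions):
    """Count, for every action, which actions followed it in past sessions."""
    followers = defaultdict(Counter)
    for session in sessions:
        for previous, following in zip(session, session[1:]):
            followers[previous][following] += 1
    return followers


def suggest(model, last_action, k=3):
    """Return up to k candidate next actions, most frequent first."""
    counts = model.get(last_action, Counter())
    return [action for action, _ in counts.most_common(k)]


if __name__ == "__main__":
    history = [
        ["open inbox", "open message", "reply"],
        ["open inbox", "open message", "archive"],
        ["open inbox", "open message", "reply"],
        ["open inbox", "compose"],
    ]
    model = build_model(history)
    # The user asks for suggestions, picks one, and reviews the outcome.
    print(suggest(model, "open message", k=2))
```

The function only returns candidates and never executes anything, which mirrors the one-action-at-a-time, explicitly-chosen design on the slide.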

Conclusion

❖ There are some successes, but automation is not there yet

❖ The biggest technical challenge is uncertainty, which stems from a lack of standardization

❖ The biggest HCI challenges are building trust and keeping things “cheap”

❖ The HCI aspect of this talk is, to a large extent, applicable to all automation tools, not just web automation. It is also applicable to all users, not just visually impaired users (think handheld and wrist devices)

Thank You!
