What Recent Research Says about the Design of Effective Speech Menus -- You Might Be Surprised

James R. Lewis, Ph.D., CHFP

IBM Software Group

jimlewis@us.ibm

August 13, 2012


Introduction

Historical role of spoken menus

Recent arguments against using menus in speech recognition IVR apps

However -- speech menus occur during normal human-human dialogs

For the foreseeable future, crafting effective speech menus will be part of practical VUI design – not a sexy topic, but subtle

Two important design topics

•Number of options

•Timing for extensions

Which dressing? We’ve got Thousand Island, French, honey mustard, oil and vinegar, raspberry vinaigrette, creamy Italian, ranch, or blue cheese.


Number of Options

Common guideline – no more than 4-5 options/menu

Recommendation from earliest days of IVR design

Two key characteristics of a menu

•Breadth

•Depth

For a given number of options presented in an auditory menu, which is the better strategy?

•Fewer menus with more options/menu (Broad)

•More menus with fewer options/menu (Deep)


Broad and Deep Menu Structures

[Figure: two menu trees for the same eight options.]

Broader Menu Structure: “Welcome. Please select:” leads directly to Option 1 through Option 8.

Deeper Menu Structure: “Welcome. Please select:” leads to “Option A. Next, choose:” and “Option B. Next, choose:”, which lead on to Option 1 through Option 8.


Basis for Limiting the Number of Options

Short-term memory limitation (Magic Number 7 ± 2)

Miller’s famous paper from 1956

Described experiments demonstrating that people have trouble holding more than about 7 (plus or minus 2) items at a time in working memory

The clear application of this for menu design is to limit the number of options presented in a menu – assuming that callers memorize menu options

For a given total number of options, however, restricting the number of options per menu necessarily leads to a deeper rather than a broader menu structure
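
The trade-off between options per menu and menu depth can be checked with a little arithmetic. The following is a minimal Python sketch, not something from the slides: `menu_depth` is a hypothetical helper, and it assumes every menu in the structure offers the same number of options.

```python
def menu_depth(total_options: int, options_per_menu: int) -> int:
    """Return the fewest menu levels needed to reach every terminal option.

    Assumes a uniform tree: each menu offers `options_per_menu` choices.
    """
    if total_options < 1 or options_per_menu < 2:
        raise ValueError("need at least 1 option and at least 2 options per menu")
    depth = 0
    reachable = 1  # terminal options reachable with `depth` levels
    while reachable < total_options:
        reachable *= options_per_menu
        depth += 1
    return depth


# Same number of terminal options, different breadth per menu.
for total, breadth in ((8, 8), (8, 4), (81, 9), (81, 3)):
    print(f"{total} options, {breadth} per menu: {menu_depth(total, breadth)} level(s)")
```

Halving the options per menu from 8 to 4 adds a level for 8 terminal options, and dropping from 9 to 3 per menu doubles the levels for 81 terminal options. Each extra level is another prompt the caller has to listen to and answer.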


However …

1997 – Huguenard, Lerch, Junker, Patz, & Kass

•Created a cognitive model of phone-based interaction

•“It is not the number of options per menu level that determines the magnitude of WM [working memory] load, but rather the amount of processing and storage required to evaluate the ‘goodness’ of each individual option”

•Study of 87 participants compared touchtone performance with broad (9x9) and deep (3x3x3x3) menu structures (81 terminals)

•Fewer navigation errors in broad condition


What about a “More Options” Choice?

1997 – Virzi & Huitema

•“More Options” choice lists remaining options

•24 participants interacted with broad and deep versions of four different touchtone IVR menu applications, each of which had eight target options

•Broad: 8 options

•Deep: 4 options + “More Options” link to other 4

•Selection was significantly (10-20 sec) faster with the broad version

•“There was clearly no advantage for the deep menus.”


Replication of Virzi & Huitema (1997)

2001 – Suhm, Freeman, & Getty

•Captured data from 2834 calls with a broad menu (7 options) and 2909 calls with a deep version (4+other)

•Rates of timeouts and invalid responses for broad and deep versions were comparable

•The reprompt rate was significantly higher for the deep version (5.1% for deep, 1.7% for broad)

•“Presenting more choices in a menu allows designers of touch-tone voice interfaces to avoid multi-layered menus, which are clearly one of the most dreaded characteristics of touch-tone voice interfaces.”
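
As a rough sanity check on the reprompt-rate difference, the reported rates and call counts can be fed into a standard two-proportion z-test. The slide does not say which test Suhm et al. used, so the Python sketch below is only an illustration; `two_proportion_z` is a hypothetical helper, and the inputs are the rounded percentages from the slide, not raw counts.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion z statistic for the difference p1 - p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err


# Deep menu: 5.1% reprompts over 2909 calls; broad menu: 1.7% over 2834 calls.
z = two_proportion_z(0.051, 2909, 0.017, 2834)
print(f"z = {z:.2f}")
```

A z value well above 1.96 is in line with the slide's description of the difference as significant.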


Are Fewer Options Better when Memory Span is Low?

2008 – Commarford, Lewis, Al-Awar Smither, & Gentzler

•Compared broad and deep versions of a VUI menu

•Broad menu contained list of 8–11 options

•Deep had one layer of higher-level categories

•All participants completed test of working memory

•Users of the broad structure IVR performed better (completed more tasks in less time) and were more satisfied than users of the deep-structure IVR

•Effect more pronounced for those with low working memory capacity


Commarford et al. (2008), continued

2008 – Commarford, Lewis, Al-Awar Smither, & Gentzler

•Key finding: interaction between menu structure and working memory capacity (WMC) for task completion time (n = 58)


Model of Auditory Menu Selection (Commarford et al., 2008)

[Flowchart; steps and decisions as they appear on the slide:]

User calls IVR with specific goal in mind

IVR presents first menu option

Perceive Item 1, process and evaluate Item 1, store Item 1

Confident in stored item? (Yes/No) If yes: select stored item

New item presented? (Yes/No) If yes: perceive new item, process and evaluate new item

New better than stored? (Yes/No) If yes: discard stored item and keep new item. If no: discard new item and keep stored item
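
The selection model can be read as a loop that holds only one candidate in memory at a time. The Python sketch below is one interpretation of the flowchart, not the authors' implementation: `select_from_menu`, the `goodness` scoring function, and the `confident_at` threshold are all hypothetical, and the sketch assumes the caller selects the stored item once the menu runs out of new items.

```python
from typing import Callable, Sequence


def select_from_menu(
    options: Sequence[str],
    goodness: Callable[[str], float],
    confident_at: float = 0.9,
) -> str:
    """Listen to options in order, keeping only the best candidate so far."""
    if not options:
        raise ValueError("menu must contain at least one option")
    # Perceive, evaluate, and store Item 1.
    stored = options[0]
    stored_score = goodness(stored)
    for new_item in options[1:]:
        # Confident in stored item? Then select it without listening further.
        if stored_score >= confident_at:
            break
        # Perceive and evaluate the new item, then keep the better of the two.
        new_score = goodness(new_item)
        if new_score > stored_score:
            stored, stored_score = new_item, new_score
    return stored


# Hypothetical caller whose goal matches "repairs" best.
scores = {"billing": 0.2, "repairs": 0.95, "new service": 0.5}
choice = select_from_menu(list(scores), scores.__getitem__)
print(choice)
```

Under this reading, working memory load stays at one stored item no matter how long the menu is, which fits the finding that broad menus do not penalize callers with low working memory capacity.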


Replication of Commarford et al. (2008)

2009 – Wolters, Georgila, Moore, Logie, MacPherson, & Watson

•Wizard of Oz (WOZ) study, n = 49, broad vs. deep voice menus

•Assessed participants’ working memory span (WMS)

•When the application presented more options per turn and avoided explicit confirmations, participants booked appointments more quickly

•“Thus, our results complement Commarford et al.’s (2008) finding that users with a lower WMS benefit from being presented with more options at a time, because at each step in the interaction, they are more likely to be presented with the correct choice.”


A Big, Fat (34-Option) Main Menu

2010 – Hura

•You can say. … Company Directory. Computer Assistance. Say Field Support to hear those options. For health benefits you can say Unified Healthcare, Rogers, Dental, … Or, you can say Representative if you know you need a person.

•“What we have observed in practice is that frequent users barge in and never hear the menu. Infrequent users who have a term in mind for what they need also barge in without hearing much of the menu. It’s only when the user does not know what word to say that they bother listening to the menu. And in that situation, having a menu is an asset, not a punishment.”


Sometimes the Simple Answer is Hard to See

2007 – Wilkie, McInnes, Jack, & Littlewood

•Activated a “hidden” overdraft option in a banking main menu to avoid increasing menu breadth

•Informed callers about the new overdraft option in one of three places: in the introduction, after caller identification, or at the completion of the first transaction

•37% of 114 participants failed the overdraft task

•“A perhaps obvious solution … would be to simply add an overdraft option to the main menu listing. … This approach was employed in a follow-up experiment … which resulted in all participants successfully obtaining an overdraft. However, adding service options to the main menu in this way is not an ideal solution ... . An alternative method would be to … revisit the wording of the system-initiated proposal.”


Number of Options: Conclusions

No known experimental evidence of improved usability due to shorter menus and deeper structures

7 studies since 1997 support broad over deep menus

Apparently the cognitive demand of navigation exceeds that of selecting an option from a menu

Designers must decide how best to organize options

Menu length is not the only factor

Need unambiguous labels and logical option grouping

If the options fall nicely into groups of four or fewer, it is reasonable to organize them in this manner

Do not artificially limit the number of options

Prefer broad over deep menu structures


Timing Extensions to Menus and Prompts

Sometimes designers provide extensions to menus and prompts (Weegels, 2000)

•Which would you like? Intake Interviews, Genetic Testing, Court Date Info, or Upcoming Appointments? <pause> Or say, “It’s none of these.”

How long should the pre-extension pause be?

•If much too short, not an effective turntaking cue

•If slightly too short, deceptive turntaking cue that interrupts caller and can cause stuttering effect

•If much too long, callers who are not sure how to respond may begin guessing – and never hear the extension

Timing is critical, but typically underspecified


Findings from Analyses of Conversations

Pauses in a face-to-face setting rarely last more than 1 second (Clark, 1996; Wilson & Zimmerman, 1986)

When a participant in a telephone conversation pauses longer than 1 second and the other participant does not take the turn, the first speaker usually interprets this as a problem (Roberts, Francis, & Morgan, 2006)

The mean pause duration for dialog turns in service-based telephone conversations was 426 ms, with a 95% confidence interval ranging from 264–588 ms, an estimated 95th percentile of 1010 ms, and an estimated 99th percentile of 1303 ms (Beattie & Barnard, 1979; Lewis, 2011)


Findings from Analyses of Interactions with IVRs

100 survey respondents provided information about how they knew when it was their turn to speak (Margulies, 2005)

•Pause in the dialog (41.6%), prompt syntax (26.2%), inflection (21.0%), and earcons (11.2%)

Analysis of several dozen people using speech IVR – poorly timed pauses caused 18.5% of observed task failures (Margulies, 2005)

•“when the machine seemingly yields a turn but then continues with instructions or a repeat of the declarative or interrogatory prompt—coincident with the subject either preparing to respond or in the act of responding”


Findings from Analyses of Interactions with IVRs

Commarford & Lewis, 2005

•Detailed analysis of six callers at task-terminal points in the completion of tasks with two different speech IVRs

•Goal to find the optimal pause between presentation of a menu at a task-terminal point and the presentation of global navigation commands as an extension

•Interesting differences in the distributions of caller response latencies as a function of whether the terminal menu included or did not include the target option for the task and whether it was the first or a subsequent presentation of the menu to the caller


Findings from Analyses of Interactions with IVRs

Commarford & Lewis, 2005


Timing Extensions to Menus and Prompts: Conclusions

Analyses of human-human conversations indicate:

•Turntaking pauses should be at least 1 sec in duration

•Longer pauses (1300 ms) provide better turntaking cues

•500 ms pause likely to cause conversational collisions

•250 ms pause not likely to trigger turntaking

Analyses of human-IVR interaction indicate:

•A 2000 ms pause should balance tradeoffs

•Try to design menus/prompts that do not need extension

•Need for more research (open-ended prompt, familiarity)
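
One way to make the pre-extension pause explicit, instead of leaving it underspecified, is to write it into the prompt markup. The Python sketch below builds an SSML fragment using the standard `<break time="..."/>` element. The helper name `prompt_with_extension` is hypothetical, the 2000 ms default simply mirrors the value suggested above, and a real platform may require additional attributes on `<speak>` (such as a version and namespace).

```python
from xml.sax.saxutils import escape


def prompt_with_extension(menu: str, extension: str, pause_ms: int = 2000) -> str:
    """Return an SSML fragment: menu, timed pause, then the extension."""
    if pause_ms < 0:
        raise ValueError("pause_ms must not be negative")
    return (
        "<speak>"
        f"{escape(menu)}"
        f'<break time="{pause_ms}ms"/>'
        f"{escape(extension)}"
        "</speak>"
    )


ssml = prompt_with_extension(
    "Which would you like? Intake Interviews, Genetic Testing, "
    "Court Date Info, or Upcoming Appointments?",
    "Or say, It's none of these.",
)
print(ssml)
```

Keeping the pause as a single parameter makes it easy to try other durations during usability testing.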


Questions?


References

Beattie, G. W., & Barnard, P. J. (1979). The temporal structure of natural telephone conversations (directory enquiry calls). Linguistics, 17, 213–229.

Clark, H. H. (1996). Using language. Cambridge, UK: Cambridge University Press.

Commarford, P. M., & Lewis, J. R. (2005). Optimizing the pause length before presentation of global navigation commands. In Proceedings of HCI International 2005: Volume 2—The management of information: E-business, the Web, and mobile computing (pp. 1–7). St. Louis, MO: Mira Digital Publication.

Commarford, P. M., Lewis, J. R., Al-Awar Smither, J. & Gentzler, M. D. (2008). A comparison of broad versus deep auditory menu structures. Human Factors, 50(1), 77–89.

Huguenard, B. R., Lerch, F. J., Junker, B. W., Patz, R. J., & Kass, R. E. (1997). Working memory failure in phone-based interaction. ACM Transactions on Computer-Human Interaction, 4(2), 67–102.

Hura, S. L. (2010). My big fat main menu: The case for strategically breaking the rules. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 113–116). Victoria, Canada: TMA Associates.

Lewis, J. R. (2011). Practical speech user interface design. Boca Raton, FL: Taylor & Francis.

Margulies, E. (2005). Adventures in turn-taking: Notes on success and failure in turn cue coupling. In AVIOS 2005 proceedings (pp. 1–10). San Jose, CA: AVIOS.

Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review, 63, 81–97.

Roberts, F., Francis, A. L., & Morgan, M. (2006). The interaction of inter-turn silence with prosodic cues in listener perceptions of “trouble” in conversation. Speech Communication, 48, 1079–1093.

Suhm, B., Freeman, B., & Getty, D. (2001). Curing the menu blues in touch-tone voice interfaces. In Proceedings of CHI 2001 (pp. 131–132). The Hague, Netherlands: ACM.

Virzi, R. A., & Huitema, J. S. (1997). Telephone-based menus: Evidence that broader is better than deeper. In Proceedings of the Human Factors and Ergonomics Society 41st Annual Meeting (pp. 315–319). Santa Monica, CA: Human Factors and Ergonomics Society.

Weegels, M. F. (2000). Users’ conceptions of voice-operated information services. International Journal of Speech Technology, 3, 75–82.

Wilkie, J., McInnes, F., Jack, M. A., & Littlewood, P. (2007). Hidden menu options in automated human-computer telephone dialogues: Dissonance in the user’s mental model. Behaviour & Information Technology, 26(6), 517–534.

Wilson, T. P., & Zimmerman, D. H. (1986). The structure of silence between turns in two-party conversation. Discourse Processes, 9, 375–390.

Wolters, M., Georgila, K., Moore, J. D., Logie, R. H., MacPherson, S. E., & Watson, M. (2009). Reducing working memory load in spoken dialogue systems. Interacting with Computers, 21, 276–287.