Use Amazon Polly to Create Apps that Talk - April 2017 AWS Online Tech Talks

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Rafal Kuklinski; Josiah Jordan, Steve Suhy

Amazon Text-to-Speech; Amazon Rapids

04/10/2017

What to Expect from the Session

Amazon Polly

• What is Amazon Polly?

• Polly out of the box

• How to get the most out of Polly

• Polly use cases

• Polly levels of complexity

Amazon Rapids

• What is Amazon Rapids?

• Integrating with Polly

• Best Practices

• Polly/Rapids in Action

• Lessons learned

Q & A

What is Polly?

Amazon Polly

• A service that converts text into lifelike speech

• 47 voices, 24 languages

• You can store, replay and distribute generated speech

Polly out of the box

• Natural sounding speech: a subjective measure of how close TTS output is to human speech.

• Accurate text processing: the system interprets common text formats such as abbreviations, numerical sequences, and homographs. Examples:

1 PT 8 OZ (24FL OZ) 710 mL.

St. Mary’s Church is on 226 St. Mary’s St.

• Highly intelligible: a measure of how comprehensible speech is. Example:

“Peter Piper picked a peck of pickled peppers.”

How to get the most out of Polly

• March 27th Webinar

• Punctuation example (commas/periods example)

• Lexicon example (first name example)

• SSML tags (speech rate example)
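To make the lexicon item concrete, here is a minimal sketch of what a pronunciation lexicon for a first name could look like. It builds a document in the W3C Pronunciation Lexicon Specification (PLS) format, which is the format Polly lexicons use. The function name `build_lexicon`, the name "Siobhan", and its respelling are assumptions chosen for illustration; they are not taken from the webinar.

```python
from xml.sax.saxutils import escape

# Namespace defined by the W3C Pronunciation Lexicon Specification (PLS).
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(aliases, lang="en-US"):
    """Return a PLS document mapping written forms to spoken aliases.

    `aliases` is a dict of {written form: how it should be spoken}.
    This helper is hypothetical; it only builds the XML text.
    """
    lexemes = "\n".join(
        "  <lexeme>\n"
        f"    <grapheme>{escape(written)}</grapheme>\n"
        f"    <alias>{escape(spoken)}</alias>\n"
        "  </lexeme>"
        for written, spoken in aliases.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NS}" '
        f'alphabet="ipa" xml:lang="{lang}">\n'
        f"{lexemes}\n"
        "</lexicon>\n"
    )


# Illustrative first-name entry; the respelling is a guess, not from the talk.
lexicon_xml = build_lexicon({"Siobhan": "Shivawn"})
print(lexicon_xml)
```

With the AWS SDK for Python, a document like this would typically be uploaded with `put_lexicon` and then referenced through the `LexiconNames` parameter of `synthesize_speech`. Those calls are left out here so the sketch runs without credentials.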

Polly Use Cases

• Contact Center

• Training materials

• Education/E-learning

• Content Creation

• Gaming/Entertainment

• Internet of Things

Polly Levels of Complexity

                Simple                         Complex
Languages       One (e.g. US English)          Many (US English, Spanish…)
Voices          One (e.g. Joanna)              Many (Joanna, Salli, Miguel…)
Lexicons        None                           One+ (e.g. medical terms)
SSML tags       None (out-of-the-box)          Many (i.e. speech optimization)
Automation      No (Console - small volume)    Yes (API - speech at scale)
Audio Storage   No (Regenerate speech)         Yes (Cache and reuse speech)
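The "Audio Storage" row contrasts regenerating speech on every request with caching and reusing it. One way this could be done is sketched below, assuming audio is stored on local disk under a key derived from the voice, output format, and text. `cache_key`, `get_speech`, and `fake_synthesize` are hypothetical names; the fake synthesizer stands in for a real Polly call so the example runs offline.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(text, voice_id, output_format="mp3"):
    """Derive a stable file name from everything that affects the audio."""
    payload = "\x1f".join((voice_id, output_format, text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest() + "." + output_format


def get_speech(text, voice_id, synthesize, cache_dir, output_format="mp3"):
    """Return cached audio if present; otherwise synthesize and store it."""
    path = Path(cache_dir) / cache_key(text, voice_id, output_format)
    if path.exists():
        return path.read_bytes()
    audio = synthesize(text, voice_id, output_format)
    path.write_bytes(audio)
    return audio


calls = []


def fake_synthesize(text, voice_id, output_format):
    """Stand-in for a real text-to-speech call; records each invocation."""
    calls.append((voice_id, text))
    return f"{voice_id}:{text}".encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    first = get_speech("Hello", "Joanna", fake_synthesize, tmp)
    second = get_speech("Hello", "Joanna", fake_synthesize, tmp)

# The second request is served from the cache, so only one call is recorded.
print(len(calls))
```

Including the voice and format in the key means that changing either produces a new file instead of silently reusing stale audio.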

Introducing Amazon Rapids

• Contact Center

• Training materials

• Education/E-learning

• Content Creation

• Gaming/Entertainment

• Internet of Things

                Simple                         Complex
Languages       One (e.g. US English)          Many (US English, UK English…)
Voices          One (e.g. Joanna)              Many (Joanna, Salli, Amy…)
Lexicons        None                           One+ (e.g. medical terms)
SSML tags       None (out-of-the-box)          Many (i.e. speech optimization)
Automation      No (Console - small volume)    Yes (API - speech at scale)
Audio Storage   No (Regenerate speech)         Yes (Cache and Reuse Speech)

Introducing Amazon Rapids

What is Amazon Rapids?

Amazon Rapids

• Subscription-based reading app for kids 12 and under

• Original short stories, perfect for kids on the go

Amazon Rapids Intro Video

Amazon Rapids Features

Read Along

• Introduced to help younger readers

• Complaints: “Sounds too robotic”, “My kids don’t like the computer voice”

• Our needs:

  • Scalable

  • Platform agnostic

  • 2-4 speakers per story, 500+ stories (and counting)

  • Entertaining

How Rapids uses Polly

Integrating with Polly

Implementing a UI

1. Upload manuscript to admin tool

Automatic conversion process

2. Add art for story

3. Fill out story metadata

4. Assign voices to characters

5. Generate speech files

6. Proof-listen to story

7. Tweaks and customizations
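The "assign voices to characters" and "generate speech files" steps can be sketched as a small mapping pass. This is not the Rapids admin tool, whose internals the talk does not show; `build_requests`, the script format, and the sample lines are assumptions for illustration. Kendra and Justin are used as an example cast.

```python
def build_requests(script, cast, output_format="mp3"):
    """Turn (character, line) pairs into one synthesis request per line.

    `cast` maps a character name to a voice name. Lines that start with
    <speak> are marked as SSML; everything else is treated as plain text.
    """
    jobs = []
    for character, line in script:
        if character not in cast:
            raise KeyError(f"no voice assigned to character {character!r}")
        is_ssml = line.lstrip().startswith("<speak>")
        jobs.append({
            "Text": line,
            "VoiceId": cast[character],
            "OutputFormat": output_format,
            "TextType": "ssml" if is_ssml else "text",
        })
    return jobs


# Example cast; both script lines are hypothetical placeholders.
cast = {"Mom": "Kendra", "Boy": "Justin"}
script = [
    ("Mom", "Good morning!"),
    ("Boy", "<speak>I should get dressed!</speak>"),
]

jobs = build_requests(script, cast)
for job in jobs:
    print(job["VoiceId"], job["TextType"])
```

The dictionary keys mirror the keyword arguments of `synthesize_speech` in the AWS SDK for Python, so each job could be passed on as `polly.synthesize_speech(**job)`. The call itself is omitted so the sketch runs without an AWS account. Failing fast on an unassigned character keeps a missing voice from surfacing only at proof-listening time.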

Implementing a UI

• Existing tool for managing stories

• Process manuscripts, add images, assign metadata

• Goals

• Incorporate voice integration without breaking the flow

• Eliminate developer involvement

• Automate as much as possible

• Enable rapid iteration on generated speech

Empowering Content Creators

Ask yourself…

1. Who: Who is your customer?

2. What: What does your customer expect?

3. Where: Gauge Emotional vs Informational presentation

4. What: What does your target output sound like?

5. How: Develop an integration percentage breakdown

Polly/Rapids Relationship

Our goals…

1. Customer: Readers with mobile means, ages 5-12

2. Customer Expectations: Reading is both educational and entertaining

3. Presentation: 70% Emotional, 30% Informational

4. Audio Example:

5. Development: 90% Polly out of the box, 10% customizations

Polly/Rapids Best Practices

• Identify ‘ideal cast’ for roles

• Written presentation differs from verbal presentation

• Customizations

• SSML

• Punctuation

• Variations in pronunciation

• Combination phrases (“About you vs Aboutju vs Aboutchoo”)

• Phonetic Creativity

Polly/Rapids In Action

• “PreCoffee” Example

• Default Justin voice

• <speak>But I thought I wouldn't like it at first, remember? You made me try it, and now I love it.</speak>

Voice Cast:

• Kendra (mom), Justin (Boy)

“Coffee!!!” by Carl Bowen

Polly/Rapids In Action

• “On Coffee” Example

• Faster rate, higher pitch

• <speak><prosody rate="130%"><prosody pitch="20%">Am I talking loud? I can't even tell! I should get dressed! I'll be right back! Bye Mom!</prosody></prosody></speak>

Voice Cast:

• Kendra (mom), Justin (Boy)

“Coffee!!!” by Carl Bowen

Polly/Rapids In Action

• “Coffee Crash” Example

• Slower rate, lower pitch

• <speak><prosody rate="-40%"><prosody pitch="-20%">Oh man.</prosody></prosody> There was so much I wanted to do today.</speak>

Voice Cast:

• Kendra (mom), Justin (Boy)

“Coffee!!!” by Carl Bowen
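The three examples above differ only in their prosody settings, so the markup can be generated rather than typed by hand. The sketch below is one possible helper, not the tooling used by the Rapids team. `prosody` and `speak` are hypothetical names, and the helper puts `rate` and `pitch` on a single `<prosody>` element instead of nesting two elements as the slides do. The attribute values are copied from the slides; accepted ranges should be checked against the Polly SSML documentation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def prosody(text, rate=None, pitch=None):
    """Escape `text` and wrap it in <prosody> if any setting is given."""
    attrs = ""
    if rate is not None:
        attrs += f" rate={quoteattr(rate)}"
    if pitch is not None:
        attrs += f" pitch={quoteattr(pitch)}"
    body = escape(text)
    return f"<prosody{attrs}>{body}</prosody>" if attrs else body


def speak(*fragments):
    """Join already-escaped fragments into one <speak> document."""
    ssml = "<speak>" + " ".join(fragments) + "</speak>"
    ET.fromstring(ssml)  # raises ParseError if the markup is malformed
    return ssml


# Faster rate and higher pitch, as in the "On Coffee" example.
on_coffee = speak(
    prosody("Am I talking loud? I can't even tell!", rate="130%", pitch="20%")
)

# Slower rate and lower pitch on the first sentence only ("Coffee Crash").
coffee_crash = speak(
    prosody("Oh man.", rate="-40%", pitch="-20%"),
    prosody("There was so much I wanted to do today."),
)

print(on_coffee)
print(coffee_crash)
```

Escaping the text before wrapping it keeps characters such as `&` or `<` in a manuscript from breaking the SSML, which matters when content creators rather than developers supply the lines.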

Lessons Learned

• Integrating Polly and building a UI to support it was easy

• With voice modifications we were able to support multi-character conversations

• Proof listening worked wonders for us with static content

• We’d still be in the free tier with our usage, even with hundreds of stories.

• Contact us with any questions about this webinar or Polly in general

[email protected]

• Contact us about Amazon Rapids

https://rapids.amazon.com/

• Introducing Amazon Polly at re:Invent 2016

https://www.youtube.com/watch?v=zjMqimHis3U&t=2s

• Other Polly webinars:

https://www.youtube.com/user/AWSwebinars/search?query=polly

• Amazon Polly/Rapids video:

https://www.youtube.com/watch?v=Q8lGMQDR_zI

Next Amazon Polly Webinar (June 19th): Title – Coming soon

Thank You!