Building an AI-based service with Rekognition, Polly and Lex

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Rebeker Choi, Solutions Architect 15-Sep, 2017 Building an AI-based service with Rekognition, Polly, and Lex

Transcript of Building an AI-based service with Rekognition, Polly and Lex

Page 1: Building an AI-based service with Rekognition, Polly and Lex

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Rebeker Choi, Solutions Architect

15-Sep, 2017

Building an AI-based service with Rekognition, Polly, and Lex

Page 2: Building an AI-based service with Rekognition, Polly and Lex

The Challenge for Artificial Intelligence: SCALE

Data: PBs of existing data

Training: Tons of GPUs

Prediction: Tons of GPUs and CPUs

Page 3: Building an AI-based service with Rekognition, Polly and Lex

AWS is the Center of Gravity for Artificial Intelligence

Page 4: Building an AI-based service with Rekognition, Polly and Lex

Amazon AI

Intelligent Services Powered by Deep Learning

Page 5: Building an AI-based service with Rekognition, Polly and Lex

Amazon AI: New Deep Learning Services

AI Enabled Managed API Services: Polly, Lex, Rekognition

DIY Deep Learning for Custom Models: Deep Learning Frameworks (MXNet, TensorFlow, Theano, Caffe, Torch)

Axis labels: Control; Usability & Simplicity

Page 6: Building an AI-based service with Rekognition, Polly and Lex

Running AI in Production on AWS Today

Page 7: Building an AI-based service with Rekognition, Polly and Lex

Recommendation & Ranking at Netflix

Personalized ranking, page generation, search, similarity, ratings

In 140 new countries simultaneously

Page 8: Building an AI-based service with Rekognition, Polly and Lex

Autonomous Driving System

Page 9: Building an AI-based service with Rekognition, Polly and Lex

Pinterest Visual Search Pinterest Lens

Page 10: Building an AI-based service with Rekognition, Polly and Lex

Amazon AI: New Deep Learning Services

Polly: Life-like Speech

Lex: Conversational Engine

Rekognition: Image Analysis

Page 11: Building an AI-based service with Rekognition, Polly and Lex

Amazon Lex

Conversational interfaces for your applications, powered by the same Natural Language Understanding (NLU) & Automatic Speech Recognition (ASR) models as Alexa

Page 12: Building an AI-based service with Rekognition, Polly and Lex

Lex: Build Natural, Conversational Interactions

Fully Managed

Voice & Text “Chatbots”

Continually improving ASR & NLU models

Trigger AWS Lambda functions

Text interaction with Slack & Messenger

Enterprise connectors: Salesforce, Microsoft Dynamics, Marketo, Zendesk

Improving human interactions…

• Contact, service, and support center interfaces (text + voice)

• Employee productivity and collaboration (minutes into seconds)

Page 13: Building an AI-based service with Rekognition, Polly and Lex

Intents: A particular goal that the user wants to achieve

Utterances: Spoken or typed phrases that invoke your intent

Slots: Data the user must provide to fulfill the intent

Prompts: Questions that ask the user to input data

Fulfillment: The business logic required to fulfill the user’s intent

Example intent: BookHotel
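The five concepts above can be pictured as a small data model. This is a minimal sketch in plain Python, not the Lex API: the class names, the slot names (`Location`, `CheckInDate`), the sample utterances, and the prompts are all hypothetical, chosen only to illustrate how an intent such as BookHotel ties utterances, slots, prompts, and fulfillment together.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Slot:
    name: str                     # data the user must provide
    prompt: str                   # question asked while the slot is empty
    value: Optional[str] = None   # filled in as the conversation progresses


@dataclass
class Intent:
    name: str                     # the goal the user wants to achieve
    utterances: List[str]         # phrases that invoke the intent
    slots: List[Slot] = field(default_factory=list)

    def next_prompt(self) -> Optional[str]:
        """Return the prompt for the first empty slot, or None if all are filled."""
        for slot in self.slots:
            if slot.value is None:
                return slot.prompt
        return None

    def fulfill(self) -> Dict[str, Optional[str]]:
        """Stand-in for the fulfillment business logic: returns the collected slots."""
        if self.next_prompt() is not None:
            raise ValueError("cannot fulfill: some slots are still empty")
        return {slot.name: slot.value for slot in self.slots}


# Hypothetical definition of the BookHotel intent named on the slide.
book_hotel = Intent(
    name="BookHotel",
    utterances=["Book a hotel", "I need a room"],
    slots=[
        Slot("Location", "Which city are you staying in?"),
        Slot("CheckInDate", "What day do you want to check in?"),
    ],
)
```

In the managed service these pieces are configured in the Lex console or API, and the fulfillment step typically runs in an AWS Lambda function.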

Page 14: Building an AI-based service with Rekognition, Polly and Lex

Flight Booking

User: “Book a flight to London from Seattle”

Automatic Speech Recognition, utterances: Book Flight, London, Seattle

Natural Language Understanding, intent / slot model: Flight booking

Origin: Seattle

Destination: London Heathrow

Departure Date: (empty)

Page 15: Building an AI-based service with Rekognition, Polly and Lex

Flight Booking

User: “Book a flight to London from Seattle”

Automatic Speech Recognition, utterances: Book Flight, London, Seattle

Natural Language Understanding, intent / slot model: Flight booking

Origin: Seattle

Destination: London Heathrow

Departure Date: (empty)

Prompt: “When would you like to fly?” (spoken by Polly)

Page 16: Building an AI-based service with Rekognition, Polly and Lex

Flight Booking

Origin: Seattle

Destination: London Heathrow

Departure Date: (empty)

Prompt: “When would you like to fly?” (spoken by Polly)

User: “Next Friday”

Page 17: Building an AI-based service with Rekognition, Polly and Lex

Flight Booking

User: “Next Friday”

Automatic Speech Recognition, utterances: Next Friday

Natural Language Understanding, intent / slot model: Flight booking, 02/24/2017

Origin: Seattle

Destination: London Heathrow

Departure Date: 02/24/2017
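The step from “Next Friday” to 02/24/2017 is handled inside Lex; the slides do not show how. As a rough illustration only, the sketch below resolves a relative weekday against a reference date. The reference date (Monday, 20 February 2017) and the rule "first Friday strictly after the reference date" are assumptions made for this example.

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbering: Monday is 0, Friday is 4


def next_weekday(reference: date, weekday: int) -> date:
    """Return the first date strictly after `reference` that falls on `weekday`."""
    days_ahead = (weekday - reference.weekday() - 1) % 7 + 1
    return reference + timedelta(days=days_ahead)


# Assumed reference date: Monday, 20 February 2017.
departure = next_weekday(date(2017, 2, 20), FRIDAY)
print(departure.strftime("%m/%d/%Y"))  # 02/24/2017
```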

Page 18: Building an AI-based service with Rekognition, Polly and Lex

Flight Booking

User: “Next Friday”

Automatic Speech Recognition, utterances: Next Friday

Natural Language Understanding, intent / slot model: Flight booking, 02/24/2017

Origin: Seattle

Destination: London Heathrow

Departure Date: 02/24/2017

Confirmation: “Your flight is booked for next Friday” (spoken by Polly)
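The prompt-then-confirm flow on these slides maps naturally onto a Lambda fulfillment function. The sketch below assumes the Lex (V1) Lambda event and response layout (`currentIntent`, `dialogAction` with `ElicitSlot` and `Close`) as I understand it; check it against the current Lex documentation before relying on it. The slot names and the prompt texts other than “When would you like to fly?” are hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical prompts, one per slot of the flight-booking intent.
PROMPTS = {
    "Origin": "Where are you flying from?",
    "Destination": "Where would you like to fly to?",
    "DepartureDate": "When would you like to fly?",
}


def plain_text(content: str) -> Dict[str, str]:
    """Wrap a string in the message structure Lex expects."""
    return {"contentType": "PlainText", "content": content}


def lambda_handler(event: Dict[str, Any], context: Optional[Any] = None) -> Dict[str, Any]:
    intent = event["currentIntent"]
    slots = intent["slots"]

    # Ask for the first slot that is still empty.
    for name, prompt in PROMPTS.items():
        if not slots.get(name):
            return {
                "dialogAction": {
                    "type": "ElicitSlot",
                    "intentName": intent["name"],
                    "slots": slots,
                    "slotToElicit": name,
                    "message": plain_text(prompt),
                }
            }

    # All slots are filled: the real booking logic would run here.
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": plain_text("Your flight is booked for " + slots["DepartureDate"]),
        }
    }
```

When the bot is used by voice, Lex returns the message as synthesized speech, which is where Polly appears in the diagram.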

Page 19: Building an AI-based service with Rekognition, Polly and Lex

Flight Booking

User: “Next Friday”

Automatic Speech Recognition, utterances: Next Friday

Natural Language Understanding, intent / slot model: Flight booking, 02/24/2017

Origin: Seattle

Destination: London Heathrow

Departure Date: 02/24/2017

Hotel Booking

Page 20: Building an AI-based service with Rekognition, Polly and Lex

Amazon Polly

Turn text into lifelike speech using deep learning technologies to synthesize speech that sounds like a human voice

Page 21: Building an AI-based service with Rekognition, Polly and Lex

Amazon Polly

Amazon Polly: Text In, Life-like Speech Out

Text in: “The temperature in WA is 75°F”

Speech out: “The temperature in Washington is 75 degrees Fahrenheit”

Page 22: Building an AI-based service with Rekognition, Polly and Lex

Polly: Life-like Speech Service

Converts text to life-like speech

47 voices

24 languages

Low latency, real time

Fully managed

What is supported?

• Supports all programming languages included in the AWS SDKs (Java, Python, Node.js, etc.) as well as an HTTP API

• Audio stream formats: MP3, Vorbis, raw PCM

• Choose your sampling rate to optimize bandwidth & quality

• Customized pronunciation

Articles and Blogs

Training Material

Chatbots (Lex)

Public Announcements

Page 23: Building an AI-based service with Rekognition, Polly and Lex

Polly: SSML and Lexicons

• Use version 1.1 SSML tags to adjust the speech rate, pitch, or volume, e.g.

• <break time="1s"/> pauses for 1 second between the first two sentences

• <sub alias="World Wide Web Consortium">W3C</sub> substitutes "World Wide Web Consortium" for the acronym "W3C"

• <amazon:effect name="whispered">Score</amazon:effect> says the second "Score" in a whispered voice
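The three tags listed above can be assembled programmatically. This is a small sketch using only the Python standard library; the helper names (`pause`, `substitute`, `whisper`, `speak`) are made up for this example and are not part of any AWS SDK. Text is XML-escaped so that characters such as `&` do not break the SSML.

```python
from xml.sax.saxutils import escape, quoteattr


def pause(seconds: int) -> str:
    """<break> tag that pauses for the given number of seconds."""
    return '<break time="{}s"/>'.format(seconds)


def substitute(text: str, alias: str) -> str:
    """<sub> tag: `alias` is spoken in place of `text`."""
    return "<sub alias={}>{}</sub>".format(quoteattr(alias), escape(text))


def whisper(text: str) -> str:
    """Amazon-specific effect tag that speaks `text` in a whispered voice."""
    return '<amazon:effect name="whispered">{}</amazon:effect>'.format(escape(text))


def speak(*parts: str) -> str:
    """Wrap already-built SSML fragments in the root <speak> element."""
    return "<speak>" + "".join(parts) + "</speak>"


ssml = speak(
    escape("He was caught up in the game."),
    pause(1),
    escape(" In the middle of the 10/3/2014 "),
    substitute("W3C", "World Wide Web Consortium"),
    escape(' meeting he shouted, "Score!" quite loudly.'),
    escape(" When his boss stared at him, he repeated "),
    whisper('"Score"'),
    escape(" in a whisper."),
)
```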

<speak>He was caught up in the game.<break time="1s"/> In the middle of the 10/3/2014 <sub alias="World Wide Web Consortium">W3C</sub> meeting he shouted, "Score!" quite loudly. When his boss stared at him, he repeated <amazon:effect name="whispered">"Score"</amazon:effect> in a whisper.</speak>

• Pronunciation lexicons enable you to customize the pronunciation of words

<lexeme>
  <grapheme>Bob</grapheme>
  <alias>Robert</alias>
</lexeme>

aws polly synthesize-speech \
    --lexicon-names LexA LexB \
    --output-format mp3 \
    --text 'Hello, my name is Bob' \
    --voice-id Justin \
    bobAB.mp3

“Hello, my name is Robert”
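The same request can be made from Python with boto3. The sketch below separates building the request (plain data, easy to inspect) from sending it. `build_speech_request` and `synthesize_to_file` are hypothetical helper names; the call itself, `synthesize_speech` with `Text`, `VoiceId`, `OutputFormat`, `TextType`, and `LexiconNames`, is the boto3 Polly client operation. Running `synthesize_to_file` assumes AWS credentials are configured and that lexicons named LexA and LexB have already been uploaded.

```python
from typing import Any, Dict, Optional, Sequence


def build_speech_request(
    text: str,
    voice_id: str = "Justin",
    lexicon_names: Optional[Sequence[str]] = None,
    output_format: str = "mp3",
    ssml: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for the Polly synthesize_speech call."""
    request: Dict[str, Any] = {
        "Text": text,
        "VoiceId": voice_id,
        "OutputFormat": output_format,
        "TextType": "ssml" if ssml else "text",
    }
    if lexicon_names:
        request["LexiconNames"] = list(lexicon_names)
    return request


def synthesize_to_file(request: Dict[str, Any], path: str) -> None:
    """Send the request to Polly and write the returned audio stream to `path`."""
    import boto3  # imported here so the request builder works without boto3 installed

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**request)
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


request = build_speech_request("Hello, my name is Bob", lexicon_names=["LexA", "LexB"])
# synthesize_to_file(request, "bobAB.mp3")  # needs AWS credentials and the two lexicons
```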

Page 24: Building an AI-based service with Rekognition, Polly and Lex

"Our Mapbox Navigation SDK offers a complete turn-by-turn navigation solution that you can easily add to your iOS or Android application, and having clear, well-understood voice guidance is critical to the user experience. Therefore, we’re excited to offer natural-sounding pronunciation with highly intelligible and pleasant voices in our users’ most widely used languages with Amazon Polly’s Text-to-Speech service."

– Paul Veugen, VP of Mobile, Mapbox.

Page 25: Building an AI-based service with Rekognition, Polly and Lex

Amazon Rekognition

Image recognition and analysis powered by deep learning, which lets you search, verify, and organize millions of images

Page 26: Building an AI-based service with Rekognition, Polly and Lex

Amazon Rekognition

Deep learning-based image recognition service

Search, verify, and organize millions of images

Object and Scene Detection

Facial Analysis

Face Comparison

Facial Recognition

Integrated with S3, Lambda, Polly, Lex

Page 27: Building an AI-based service with Rekognition, Polly and Lex

Object and Scene Detection

• Search, filter, and curate image libraries

• Smart searches for user generated content

• Photo, travel, real estate, vacation rental applications

Detected labels: Maple, Plant, Villa, Garden, Water, Swimming Pool, Tree, Potted Plant, Backyard

Page 28: Building an AI-based service with Rekognition, Polly and Lex

Object and Scene Detection – DetectLabels API

Generate labels for thousands of objects, scenes, and concepts, each with a confidence score

Input image: S3 bucket

Detected labels: Maple, Plant, Villa, Garden, Water, Swimming Pool, Tree, Potted Plant, Backyard

Request

{
  "Image": {
    "Bytes": blob,
    "S3Object": {
      "Bucket": "string",
      "Name": "string",
      "Version": "string"
    }
  },
  "MaxLabels": number,
  "MinConfidence": number
}

Response

{
  "Labels": [
    {"Confidence": 95.78783416748047, "Name": "Villa"},
    {"Confidence": 68.914794921875, "Name": "Swimming Pool"},
    {"Confidence": 59.24593734741211, "Name": "Backyard"},
    {"Confidence": 59.24593734741211, "Name": "Yard"}
  ],
  "OrientationCorrection": "ROTATE_0"
}
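A minimal Python sketch of the same call through boto3, assuming the image already sits in an S3 bucket. `build_detect_labels_request`, `label_names`, and `detect_labels` are hypothetical helper names; `detect_labels` with `Image`, `MaxLabels`, and `MinConfidence` is the boto3 Rekognition client operation. The sample response reuses the values shown on the slide.

```python
from typing import Any, Dict, List


def build_detect_labels_request(
    bucket: str, key: str, max_labels: int = 10, min_confidence: float = 50.0
) -> Dict[str, Any]:
    """Build keyword arguments for DetectLabels on an image stored in S3."""
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MaxLabels": max_labels,
        "MinConfidence": min_confidence,
    }


def label_names(response: Dict[str, Any], min_confidence: float) -> List[str]:
    """Return the names of labels whose confidence is at least `min_confidence`."""
    return [
        label["Name"]
        for label in response["Labels"]
        if label["Confidence"] >= min_confidence
    ]


def detect_labels(bucket: str, key: str) -> Dict[str, Any]:
    """Call Rekognition; requires boto3 and configured AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3 installed

    client = boto3.client("rekognition")
    return client.detect_labels(**build_detect_labels_request(bucket, key))


# Response values copied from the slide.
sample_response = {
    "Labels": [
        {"Confidence": 95.78783416748047, "Name": "Villa"},
        {"Confidence": 68.914794921875, "Name": "Swimming Pool"},
        {"Confidence": 59.24593734741211, "Name": "Backyard"},
        {"Confidence": 59.24593734741211, "Name": "Yard"},
    ],
    "OrientationCorrection": "ROTATE_0",
}
print(label_names(sample_response, 60.0))  # ['Villa', 'Swimming Pool']
```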

Page 29: Building an AI-based service with Rekognition, Polly and Lex

Facial Analysis

Demographic Data

Facial Landmarks

Sentiment Expressed

• Smart searches for user generated content

• Photo, travel, real estate, vacation rental applications

• Targeted marketing

• Dynamic, personalized ads

• Improve online dating match recommendations

Page 30: Building an AI-based service with Rekognition, Polly and Lex

Facial Analysis: DetectFaces

"AgeRange": {"High": 38, "Low": 23},
"BoundingBox": {
  "Height": 0.42500001192092896,
  "Left": 0.1433333307504654,
  "Top": 0.11666666716337204,
  "Width": 0.2822222113609314
},
"Confidence": 99.8899917602539,
"Emotions": [
  {"Confidence": 93.29251861572266, "Type": "HAPPY"},
  {"Confidence": 28.57428741455078, "Type": "CALM"},
  {"Confidence": 1.4989674091339111, "Type": "ANGRY"}
],
"Eyeglasses": {"Confidence": 99.99998474121094, "Value": true},
"Gender": {"Confidence": 100, "Value": "Female"},
"Smile": {"Confidence": 99.47274780273438, "Value": true},
"Sunglasses": {"Confidence": 97.63555145263672, "Value": true}

smart cropping & ad overlays

sentiment capture

demographic analysis

face editing & pixelation
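Two of the uses named above, sentiment capture and smart cropping, come down to simple post-processing of a DetectFaces face detail. The sketch below is illustrative only: `dominant_emotion` and `to_pixels` are hypothetical helpers, and the face detail is trimmed to the fields they read. It relies on the bounding box values being ratios of the image width and height, as in the response shown on the slide.

```python
from typing import Any, Dict, Optional, Tuple


def dominant_emotion(face_detail: Dict[str, Any]) -> Optional[str]:
    """Return the emotion type with the highest confidence, or None if absent."""
    emotions = face_detail.get("Emotions", [])
    if not emotions:
        return None
    return max(emotions, key=lambda emotion: emotion["Confidence"])["Type"]


def to_pixels(box: Dict[str, float], image_width: int, image_height: int) -> Tuple[int, int, int, int]:
    """Convert a ratio-based bounding box to (left, top, width, height) in pixels."""
    return (
        round(box["Left"] * image_width),
        round(box["Top"] * image_height),
        round(box["Width"] * image_width),
        round(box["Height"] * image_height),
    )


# Face detail trimmed from the slide's response.
face = {
    "Emotions": [
        {"Confidence": 93.29251861572266, "Type": "HAPPY"},
        {"Confidence": 28.57428741455078, "Type": "CALM"},
        {"Confidence": 1.4989674091339111, "Type": "ANGRY"},
    ]
}
print(dominant_emotion(face))  # HAPPY
```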

Page 31: Building an AI-based service with Rekognition, Polly and Lex

Face Comparison

Measure the likelihood that faces in two images are of the same person

• Add face verification to applications and devices

• Extend physical security controls

• Provide guest access to VIP-only facilities

• Verify users for online exams and polls

Page 32: Building an AI-based service with Rekognition, Polly and Lex

CompareFaces

{
  "FaceMatches": [
    {
      "Face": {
        "BoundingBox": {
          "Height": 0.4601006507873535,
          "Left": 0.32827046513557434,
          "Top": 0.18212316930294037,
          "Width": 0.3135717809200287
        },
        "Confidence": 99.99964141845703
      },
      "Similarity": 93
    },
    {
      "Face": {
        "BoundingBox": {
          "Height": 0.2383333295583725,
          "Left": 0.6233333349227905,
          "Top": 0.3016666769981384,
          "Width": 0.15888889133930206
        },
        "Confidence": 99.71249389648438
      },
      "Similarity": 0
    }
  ],
  "SourceImageFace": {
    "BoundingBox": {
      "Height": 0.23983436822891235,
      "Left": 0.28333333134651184,
      "Top": 0.351423978805542,
      "Width": 0.1599999964237213
    },
    "Confidence": 99.99344635009766
  }
}

Similarity 93%

Similarity 0%
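For face verification, an application usually keeps only the matches above a similarity threshold. A minimal sketch under that assumption: `matches_above` and `compare_faces` are hypothetical helper names, the threshold of 80 is an arbitrary example value, and `compare_faces` with `SourceImage`, `TargetImage`, and `SimilarityThreshold` is the boto3 Rekognition client operation. The sample response is trimmed from the slide.

```python
from typing import Any, Dict, List


def matches_above(response: Dict[str, Any], threshold: float) -> List[Dict[str, Any]]:
    """Keep only the face matches whose similarity is at least `threshold`."""
    return [
        match
        for match in response["FaceMatches"]
        if match["Similarity"] >= threshold
    ]


def compare_faces(source_bytes: bytes, target_bytes: bytes, threshold: float = 80.0) -> Dict[str, Any]:
    """Call Rekognition CompareFaces; requires boto3 and AWS credentials."""
    import boto3  # imported here so matches_above works without boto3 installed

    client = boto3.client("rekognition")
    return client.compare_faces(
        SourceImage={"Bytes": source_bytes},
        TargetImage={"Bytes": target_bytes},
        SimilarityThreshold=threshold,
    )


# Similarity and confidence values copied from the slide; bounding boxes omitted.
sample_response = {
    "FaceMatches": [
        {"Face": {"Confidence": 99.99964141845703}, "Similarity": 93},
        {"Face": {"Confidence": 99.71249389648438}, "Similarity": 0},
    ],
    "SourceImageFace": {"Confidence": 99.99344635009766},
}
verified = matches_above(sample_response, 80.0)
print(len(verified))  # 1
```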

Page 33: Building an AI-based service with Rekognition, Polly and Lex

Celebrity Recognition

More Rekognition Capabilities

Image Moderation

Page 34: Building an AI-based service with Rekognition, Polly and Lex

Facial Recognition

Identify people in images by finding the closest match for an input face image against a collection of stored face vectors

• Add friend tagging to social and messaging apps

• Help public safety officers find missing persons

• Identify employees as they access sensitive locations

• Identify celebrities in historical media archives
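Finding the closest match in a face collection can be sketched as below. `identify` is a hypothetical helper that takes the Rekognition client as a parameter, so it can be exercised with a stand-in object; in real use the client would come from `boto3.client("rekognition")` and the collection would have been populated beforehand (for example with IndexFaces). The collection name, threshold, and `FakeRekognition` stub are assumptions for illustration.

```python
from typing import Any, Dict, Optional


def identify(client: Any, collection_id: str, image_bytes: bytes, threshold: float = 90.0) -> Optional[str]:
    """Return an identifier for the best-matching stored face, or None if no match."""
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    if not matches:
        return None
    face = matches[0]["Face"]
    # ExternalImageId is the caller-supplied label, when one was set at indexing time.
    return face.get("ExternalImageId", face["FaceId"])


class FakeRekognition:
    """Stand-in for the real client, returning a canned response for the demo."""

    def search_faces_by_image(self, **kwargs: Any) -> Dict[str, Any]:
        return {
            "FaceMatches": [
                {"Similarity": 97.5, "Face": {"FaceId": "face-123", "ExternalImageId": "employee-42"}}
            ]
        }


print(identify(FakeRekognition(), "employees", b"image bytes"))  # employee-42
```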

Page 35: Building an AI-based service with Rekognition, Polly and Lex

Media Case Study

Identify who is on camera at what time for each of 8 networks so that recorded video streams can be indexed and searched

Video frame-sampling facial recognition solution using Amazon Rekognition:

• Indexed 97,000 people into a face collection in 1 day

• Sample frames every 6 secs and test for image variance

• Upload images to S3 and call Rekognition to find best facial match

• Store time stamp and faceID metadata
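The sampling step can be sketched in a few lines. The slide does not say how image variance was tested, so this example assumes a simple rule: a sampled frame is kept only if its mean absolute pixel difference from the last kept frame reaches a threshold. The frame representation (a flat sequence of grayscale values), the threshold, and the function names are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Frame = Sequence[int]  # assumed: flattened grayscale pixel values


def sample_times(duration_s: float, interval_s: float = 6.0) -> List[float]:
    """Timestamps at which to grab a frame: 0, 6, 12, ... seconds."""
    times = []
    t = 0.0
    while t < duration_s:
        times.append(t)
        t += interval_s
    return times


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_frames(frames: Sequence[Tuple[float, Frame]], min_diff: float = 10.0) -> List[float]:
    """Return timestamps of frames that differ enough from the last kept frame."""
    kept: List[float] = []
    last: Optional[Frame] = None
    for timestamp, pixels in frames:
        if last is None or mean_abs_diff(pixels, last) >= min_diff:
            kept.append(timestamp)
            last = pixels
    return kept


print(sample_times(20))  # [0.0, 6.0, 12.0, 18.0]
```

Only the kept frames would then be uploaded to S3 and sent to Rekognition, with the timestamp stored next to the returned face ID.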

Page 36: Building an AI-based service with Rekognition, Polly and Lex

Demo

Page 37: Building an AI-based service with Rekognition, Polly and Lex

Amazon AI Services

• Leveraging Amazon internal experiences with AI / ML

• Managed API services with embedded AI for maximum accessibility and simplicity

• Full stack of platforms and engines for specialized deep learning applications

Page 38: Building an AI-based service with Rekognition, Polly and Lex

Thank you!