DSR 09 Demo day presentation - Karthick Perumal

Page 1: DSR 09 Demo day presentation - Karthick Perumal

What do you call a fish with no eyes?

Page 3: DSR 09 Demo day presentation - Karthick Perumal

Instilling a Sense of Humour in Computers

Dr. Karthick Perumal

Page 4: DSR 09 Demo day presentation - Karthick Perumal

Karthick Perumal | Instilling a Sense of Humour in Computers | April 7th, 2017 | Page 4

Outline

● Language modeling

● Choice of algorithm

● Creating a Joke dataset

● Training model

● Generated output

Page 5: DSR 09 Demo day presentation - Karthick Perumal

Language Modeling

Probability distribution over sequences of words

P( Data science is the future )

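$$P(\text{Data science is the future}) = P(\text{Data}) \times P(\text{science} \mid \text{Data}) \times P(\text{is} \mid \text{Data science}) \times P(\text{the} \mid \text{Data science is}) \times P(\text{future} \mid \text{Data science is the})$$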

> P( Data science is the Berlin )

> P( Data science ist die Zukunft)

> P(Data Science is the Zukunft)
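To make the chain rule concrete, here is a minimal Python sketch that estimates each conditional probability from raw counts over a tiny corpus. The corpus, the function names, and the count-based estimator are all assumptions made for illustration; the talk itself uses a neural model, not counts. The value returned is the probability that a sentence starts with the given words.

```python
from collections import Counter

# Hypothetical toy corpus; a real model would be estimated from far more text.
CORPUS = [
    "data science is the future",
    "data science is fun",
    "data science is the future",
    "berlin is the capital",
]

def prefix_counts(sentences):
    """Count every sentence prefix, including the empty prefix ()."""
    counts = Counter()
    for sentence in sentences:
        words = tuple(sentence.lower().split())
        for i in range(len(words) + 1):
            counts[words[:i]] += 1
    return counts

def sentence_probability(sentence, counts):
    """Chain rule: multiply P(w_i | w_1 ... w_{i-1}) over all positions."""
    words = tuple(sentence.lower().split())
    prob = 1.0
    for i in range(1, len(words) + 1):
        history, extended = words[:i - 1], words[:i]
        if counts[history] == 0:
            return 0.0  # history never seen, so the count-based estimate is zero
        prob *= counts[extended] / counts[history]
    return prob

counts = prefix_counts(CORPUS)
p_future = sentence_probability("Data science is the future", counts)
p_berlin = sentence_probability("Data science is the Berlin", counts)
```

With this toy corpus, `p_future` is larger than `p_berlin`, which mirrors the ordering on the slide. Unsmoothed counts assign zero to any unseen word sequence, which is one reason practical language models use smoothing or neural networks.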

Page 6: DSR 09 Demo day presentation - Karthick Perumal

Why Language Modeling?

How to wreck a nice beach or How to recognize speech

Page 7: DSR 09 Demo day presentation - Karthick Perumal

Why Language Modeling?

● Speech Recognition

P(How to recognize speech) > P(How to wreck a nice beach)

● Spelling correction/prediction

P(win a contest) > P(win a context)

● Machine Translation

P(give a high five) > P(give a large five)

● Text summarization, question answering, etc.
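The comparisons above can be sketched as a ranking problem: score each candidate sentence with a language model and keep the most probable one. The snippet below is only an illustration, assuming a bigram model with add-one smoothing trained on a made-up corpus; the names `train_bigrams`, `log_prob`, and `best_candidate` are hypothetical and not from the talk.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for real training text.
CORPUS = [
    "how to recognize speech",
    "how to recognize a face",
    "we walked on a nice beach",
    "how to win a contest",
]

def train_bigrams(sentences):
    """Return (history counts, bigram counts, vocabulary size)."""
    histories, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        histories.update(tokens[:-1])           # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
    return histories, bigrams, len(vocab) + 1   # +1 slot reserved for unseen words

def log_prob(sentence, histories, bigrams, vocab_size):
    """Sum of add-one smoothed log P(word | previous word)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size))
    return total

def best_candidate(candidates, model):
    """Pick the candidate with the highest log probability."""
    return max(candidates, key=lambda c: log_prob(c, *model))

model = train_bigrams(CORPUS)
speech = best_candidate(["how to wreck a nice beach", "how to recognize speech"], model)
spelling = best_candidate(["win a context", "win a contest"], model)
```

Log probabilities are summed instead of multiplying raw probabilities, which avoids numerical underflow on longer sentences.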

Page 8: DSR 09 Demo day presentation - Karthick Perumal

Choice of Algorithm?

● Recurrent neural networks

- Have feedback loops that let the network use information from previous passes, which acts as memory

- We specifically use LSTM (long short-term memory), which addresses the vanishing gradient problem

● Highly effective for language modeling and time-series analysis

● Computationally expensive, with long training times

Page 9: DSR 09 Demo day presentation - Karthick Perumal

Creating a Joke Dataset

● Extracted only jokes with good ratings

● Many redundant jokes across the various websites

Page 10: DSR 09 Demo day presentation - Karthick Perumal

Interesting information about the dataset

● 310,967 jokes, including duplicates and inappropriate words

● 219,873 cleaned jokes
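The talk does not describe the cleaning procedure, so the following is only a sketch of one plausible approach: normalize each joke, drop exact duplicates, and drop jokes that contain a blocked word. The `BLOCKLIST` contents, the function names, and the choice to drop whole jokes (as opposed to masking individual words) are assumptions.

```python
import re

# Hypothetical placeholder list; the talk does not say which words were filtered.
BLOCKLIST = {"badword"}

def normalize(joke):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", joke.strip().lower())

def clean_jokes(jokes, blocklist=BLOCKLIST):
    """Keep the first copy of each joke and skip jokes with blocked words."""
    seen = set()
    cleaned = []
    for joke in jokes:
        key = normalize(joke)
        if not key or key in seen:
            continue  # empty line or exact duplicate after normalization
        if set(re.findall(r"[a-z']+", key)) & blocklist:
            continue  # contains at least one blocked word
        seen.add(key)
        cleaned.append(joke.strip())
    return cleaned
```

Exact matching after normalization only catches copies that differ in case or spacing; jokes that differ by a word or two survive this step.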

Page 11: DSR 09 Demo day presentation - Karthick Perumal

Interesting information about the dataset

● Found some redundant jokes after cleaning

● What do you call a fish with no eye? Fsh.

● What do you call a fish with no eyes? A fsh.

● What do you call a fish with no eyes? A fsh. What do you call a fish with four eyes? NEEEERRRRD

● Meaningless text was also scraped and remains in the dataset

● "Hey whatcha eating ? "A pluot" Wtf is a pluot ? "A cross between a plum & an apricot" That 's really stupid. rides off on a liger"

● Alfijnbahkfnbsbbakrbbjdnebzk hzueonyvag macarena yrvixndvwhkga ndhwkdbcbe hayvektoubabrjnahor HEYYYY MACARENA
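Near-duplicates like the fish jokes above slip past exact matching. One possible way to catch them, shown here purely as an illustration and not as the method used in the talk, is a character-level similarity ratio from Python's standard `difflib` module. The function names and the 0.85 threshold are arbitrary assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def drop_near_duplicates(jokes, threshold=0.85):
    """Keep a joke only if it is not too similar to any joke already kept."""
    kept = []
    for joke in jokes:
        if all(similarity(joke, other) < threshold for other in kept):
            kept.append(joke)
    return kept
```

On the three fish jokes, this drops the second variant but keeps the third, because the appended follow-up question makes the third joke much longer than the first. The pairwise comparison is also quadratic in the number of jokes, so a dataset of this size would likely need a cheaper scheme such as hashing of word shingles.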

Page 12: DSR 09 Demo day presentation - Karthick Perumal

Hyperparameter Tuning

Page 13: DSR 09 Demo day presentation - Karthick Perumal

Summary

● What do you call a cow with no eyes?

● I have a problem with my mom. It's gonna be so great

● What do you call a Mexican who runs for Christmas? A secret enemy.

● Why did the blonde stare at her windows for hours? First she liked it.

● A zombie walks into a bar. the bartender says, “Hey, we don't serve food in here”.

● I was going to make a joke about the movie Titanic, but I didn't want to go on.

● How many hipsters does it take to change a light bulb? Only one but I have no idea how they got in there

Disappointment

Page 14: DSR 09 Demo day presentation - Karthick Perumal

Thanks

Any Questions?

Page 15: DSR 09 Demo day presentation - Karthick Perumal

Page 16: DSR 09 Demo day presentation - Karthick Perumal

Model pipeline

Page 17: DSR 09 Demo day presentation - Karthick Perumal

State of NLP

Page 18: DSR 09 Demo day presentation - Karthick Perumal

Page 19: DSR 09 Demo day presentation - Karthick Perumal

Language Modeling

Probability distribution over sequences of words

P( Data science is the future )

$$P(w_1 w_2 \ldots w_n) = \prod_i P(w_i \mid w_1 w_2 \ldots w_{i-1})$$

$$P(A, B, C, D) = P(A) \times P(B \mid A) \times P(C \mid A, B) \times P(D \mid A, B, C)$$

> P( Data science is the Berlin )

> P( Data science ist die Zukunft )

> P(Data Science is the Zukunft)

Page 20: DSR 09 Demo day presentation - Karthick Perumal

Long Short Term Memory

1) Forget gate layer

2) Input gate layer

3) Tanh layer to update new candidate value

4) Output information relevant to the subject
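The four steps can be sketched as a single LSTM time step in plain Python. This follows the standard LSTM formulation and is not taken from the talk's code; the `params` layout, the helper names, and the use of Python lists in place of a tensor library are assumptions chosen to keep the example self-contained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(weights, bias, vector):
    """Return weights @ vector + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (weights, bias)."""
    z = h_prev + x  # concatenate previous hidden state and current input
    f = [sigmoid(a) for a in affine(*params["forget"], z)]        # 1) what to forget
    i = [sigmoid(a) for a in affine(*params["input"], z)]         # 2) what to write
    g = [math.tanh(a) for a in affine(*params["candidate"], z)]   # 3) candidate values
    o = [sigmoid(a) for a in affine(*params["output"], z)]        # 4) what to expose
    # New cell state: keep part of the old memory, add part of the candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # New hidden state: a gated view of the squashed cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

# Hypothetical parameters: hidden size 2, input size 1, all weights zero.
zero = ([[0.0] * 3 for _ in range(2)], [0.0, 0.0])
params = {name: zero for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
```

With all-zero weights every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved. Trained weights would make each gate depend on the input and the previous hidden state.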