Keepr presentation at MIT CMSW

Keepr: Algorithm for Extracting Entities, Eyewitnesses and Amplifiers. Hong Qu, September 19, 2013

audio and live blog notes http://cmsw.mit.edu/liveblog-hong-qu-keepr/

Transcript of Keepr presentation at MIT CMSW

Page 1: Keepr presentation at MIT CMSW

Keepr: Algorithm for Extracting Entities, Eyewitnesses and Amplifiers

Hong Qu, September 19, 2013

Page 25: Keepr presentation at MIT CMSW

Humans vs Machines

Pattern Recognition
Value Judgment

Page 26: Keepr presentation at MIT CMSW

Computers count really, really fast!

Page 27: Keepr presentation at MIT CMSW

What is a Tweet?

140 characters:
● words
● @user mentions
● #hashtags
● links

Page 28: Keepr presentation at MIT CMSW

How does Keepr process tweets?

140 characters * 100 tweets = 14,000 characters

❏ Parse it
❏ Count it
❏ Visualize it
❏ Zoom in
❏ Archive it
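The "parse it" and "count it" steps can be illustrated with a minimal Python sketch. It is not taken from the Keepr source: the regular expressions and the names `parse_tweet` and `count_parts` are assumptions made for illustration, and real tweet parsing has more edge cases than these patterns cover.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration only; not Keepr's own rules.
LINK = re.compile(r"https?://\S+")
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")
WORD = re.compile(r"[a-z']+")


def parse_tweet(text):
    """Split one tweet into words, @user mentions, #hashtags and links."""
    links = LINK.findall(text)
    rest = LINK.sub(" ", text)  # drop links before looking for other parts
    mentions = [m.lower() for m in MENTION.findall(rest)]
    hashtags = [h.lower() for h in HASHTAG.findall(rest)]
    # Remove mentions and hashtags so only plain words remain.
    rest = HASHTAG.sub(" ", MENTION.sub(" ", rest))
    words = WORD.findall(rest.lower())
    return {"words": words, "mentions": mentions,
            "hashtags": hashtags, "links": links}


def count_parts(tweets):
    """Tally every kind of part across a batch of tweets."""
    counts = {kind: Counter()
              for kind in ("words", "mentions", "hashtags", "links")}
    for text in tweets:
        for kind, items in parse_tweet(text).items():
            counts[kind].update(items)
    return counts


tweets = [
    "Fire near #Boston reported by @alice http://example.com/a",
    "@alice says #boston fire is out",
]
counts = count_parts(tweets)
```

The resulting counters are the kind of input that the later "visualize it" and "zoom in" steps could work from, for example by ranking the most mentioned accounts.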

Page 29: Keepr presentation at MIT CMSW

https://canvas.instructure.com/courses/812708/assignments/syllabus

Natural Language Processing

Page 31: Keepr presentation at MIT CMSW

Humans are way better at making value judgements and telling stories

Page 33: Keepr presentation at MIT CMSW

Keepr’s Algorithm

➔ Entity extraction
    ◆ Topics
➔ Media extraction
    ◆ images, videos
➔ Link expansion
    ◆ articles
➔ Conversation analysis
    ◆ @ mentions
    ◆ source discovery
    ◆ amplification velocity
➔ Source verification
    ◆ geo-location
    ◆ social media profiles
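The slides name "amplification velocity" without defining it. One possible reading, sketched below in Python, is the number of retweets a source receives per minute over the observed time window. The function name `amplification_velocity`, the `(source_user, timestamp)` input shape, and the one-minute minimum window are all assumptions for illustration, not Keepr's actual method.

```python
from collections import Counter
from datetime import datetime


def amplification_velocity(retweets):
    """Return retweets per minute for each source.

    `retweets` is a list of (source_user, timestamp) pairs, one per
    retweet. The window runs from the earliest to the latest timestamp
    in the batch and is treated as at least one minute long.
    """
    retweets = list(retweets)
    if not retweets:
        return {}
    stamps = [ts for _, ts in retweets]
    window_minutes = (max(stamps) - min(stamps)).total_seconds() / 60
    window_minutes = max(window_minutes, 1.0)  # avoid dividing by zero
    counts = Counter(source for source, _ in retweets)
    return {source: n / window_minutes for source, n in counts.items()}


sample = [
    ("alice", datetime(2013, 9, 19, 12, 0)),
    ("alice", datetime(2013, 9, 19, 12, 2)),
    ("alice", datetime(2013, 9, 19, 12, 4)),
    ("bob", datetime(2013, 9, 19, 12, 0)),
]
velocity = amplification_velocity(sample)
```

Under this reading, sources whose tweets are retweeted fastest within the window rank highest, which is one way to surface candidate eyewitnesses or amplifiers for a human to review.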

Page 34: Keepr presentation at MIT CMSW

Topic Extraction by Term Frequency

http://trimc-nlp.blogspot.com/2013/04/tfidf-with-google-n-grams-and-pos-tags.html
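Topic extraction by term frequency can be sketched in a few lines of Python: count how often each word appears across a batch of tweets and keep the most frequent ones. The stopword list, the minimum word length, and the name `top_terms` are assumptions for illustration. The linked post describes a fuller TF-IDF approach, which this sketch does not implement.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not Keepr's own).
STOPWORDS = {"a", "an", "the", "is", "in", "at", "of", "on",
             "to", "and", "by", "rt"}


def top_terms(tweets, n=5):
    """Return the n most frequent non-stopword terms with their counts."""
    counts = Counter()
    for text in tweets:
        for word in re.findall(r"[a-z]+", text.lower()):
            # Skip stopwords and very short tokens.
            if word not in STOPWORDS and len(word) > 2:
                counts[word] += 1
    return counts.most_common(n)


tweets = [
    "Explosion at the marathon",
    "Marathon explosion reported",
    "The marathon is stopped",
]
topics = top_terms(tweets, n=2)
```

Raw counts favor words that are common everywhere, which is why a weighting such as TF-IDF against a background corpus is usually the next refinement.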

Page 36: Keepr presentation at MIT CMSW

Journalists want

❏ Source discovery and curation

❏ Passive monitoring and alerts

❏ Saving and archiving

❏ Visualizations

❏ Parity with TweetDeck user interface

Page 37: Keepr presentation at MIT CMSW

People want

“I want to catch up with a summary of key information about the breaking news story.”

“I want to get a list of Twitter accounts that belong to official organizations related to that story.”

Page 40: Keepr presentation at MIT CMSW

What’s next for Keepr?

➔ refine algorithm
➔ source classification
➔ conversation analysis and visualization
➔ archiving search results and tweets

Rolling out a Beta program for newsrooms
Sign up at www.keepr.com/beta

Page 42: Keepr presentation at MIT CMSW

https://github.com/hqu/keepr