Rating Online Content (Movies)

Your Reactions Suggest You Liked the Movie: Automatic Content Rating via Reaction Sensing
Xuan Bao, Songchun Fan, Romit Roy Choudhury, Alexander Varshavsky, Kevin A. Li


Transcript of Rating Online Content (Movies)

Page 1: Rating Online Content (Movies)

Your Reactions Suggest You Liked the Movie Automatic Content Rating via Reaction Sensing

Xuan Bao, Songchun Fan, Romit Roy Choudhury, Alexander Varshavsky, Kevin A. Li

Page 2: Rating Online Content (Movies)

Rating Online Content (Movies)

Manual rating not incentivized, not easy … does not reflect experience

Page 3: Rating Online Content (Movies)

Our Vision

Overall star rating

Reaction tags

Reaction-based

highlights

Page 4: Rating Online Content (Movies)

Our Vision

Overall star rating

Reaction tags

Reaction-based

highlights

Automatically

Page 5: Rating Online Content (Movies)

Key Intuition Multi-modal sensing / learning

Reactions / Ratings

02:43 - Action
09:21 - Hilarious

Overall – 5 Stars

……

12:01 - Suspense

Page 6: Rating Online Content (Movies)

Specific Opportunities

• Visual
– Facial expressions, eye movements, lip movements …

• Audio
– Laughter, talking

• Motion
– Device stability

• Touch screen activities
– Fast forward, rewind, checking emails and IM chats …

• Cloud
– Aggregate knowledge from others’ reactions
– Labeled scores from some users

Page 7: Rating Online Content (Movies)

Pulse: System Sketch

Page 8: Rating Online Content (Movies)

Applications (Beyond Movie Ratings)

• Annotated movie timeline
– Slide forward to the action scenes

• Platform for ad analytics
– Assess which ads are grabbing attention …
– Customize ads based on scenes that the user reacts to

• Personalized replays and automatic highlights
– User reacts to a specific tennis shot, TV shows a personalized replay
– Highlights of all exciting moments in the Super Bowl game

• Online video courses (MOOCs)
– May indicate which parts of a lecture need clarification

• Early disease symptom identification
– ADHD among young children, and other syndromes

Page 9: Rating Online Content (Movies)

First Step: A Sensor Assisted Video Player

Page 10: Rating Online Content (Movies)

• Developed on Samsung Galaxy tablet (Android)
– Sensor meta-data layered on video as output

Sensing threads control

Observe the user from front cam

Media player control functions monitored

Pulse Media Player

Page 11: Rating Online Content (Movies)

Basic Design

Features from Raw Sensor Readings: Microphone, Camera, Acc, Gyro, Touch, Clicks

Reactions: Laugh, Giggle, Doze, Still, Music …

Signals to Reactions (S2R)

Reaction to Rating & Adjective (R2RA)

English Adjectives

Numeric Rating

Tag Cloud Final Rating

Data Distillation Process

Page 12: Rating Online Content (Movies)

Basic Design

Features from Raw Sensor Readings: Microphone, Camera, Acc, Gyro, Touch, Clicks

Reactions: Laugh, Giggle, Doze, Still, Music …

Signals to Reactions (S2R)

Reaction to Rating & Adjective (R2RA)

English Adjectives

Numeric Rating

Tag Cloud Final Rating

Data Distillation Process

Cloud

Page 13: Rating Online Content (Movies)

Visual Reactions

• Facial expressions (face size, eye size, blink, etc.)
– Track viewers’ face through the front camera
– Track eye position and size (challenging with spectacles)
– Track partial faces (via SURF points matching)

[Figure panels: Face Tracking; Eye Tracking (Green), Blink (Red); Partial Face]

Page 14: Rating Online Content (Movies)

Visual Reactions

• Facial expressions (face size, eye size, blink, etc.)
– Track viewers’ face through the front camera
– Track eye position and size (challenging with spectacles)
– Track partial faces (via SURF points matching)
– Detect blinks, lip size

Look for difference between frames
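The frame-difference idea can be sketched in a few lines. This is only an illustration: the slides do not say how frames are compared, so the mean-absolute-difference measure, the threshold value, and the names `frame_difference` and `detect_changes` are all assumptions. Frames are plain nested lists of grayscale pixel values.

```python
def frame_difference(prev, curr):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(p - c)
            count += 1
    return total / count


def detect_changes(frames, threshold=20.0):
    """Return indices i where frame i differs from frame i-1 by more than threshold."""
    return [
        i
        for i in range(1, len(frames))
        if frame_difference(frames[i - 1], frames[i]) > threshold
    ]


# Toy 2x2 "eye region" frames (hypothetical values): open, open, closed, open.
open_eye = [[200, 200], [200, 200]]
closed_eye = [[120, 120], [120, 120]]
frames = [open_eye, open_eye, closed_eye, open_eye]
changes = detect_changes(frames)  # frames where the eye region changed abruptly
```

A blink would show up as two changes close together (eye closes, then reopens), so a real detector would likely pair nearby change indices rather than report them one by one.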

Page 15: Rating Online Content (Movies)

Acoustic Reactions

• Laughter, Conversation, Shout-outs …
– Cancel out (known) movie sound from recorded sound
– Laughter detection, conversation detection

Even with knowledge of the original movie audio (Blue), it is hard to identify user conversation (distinguish Red and Green)

Page 16: Rating Online Content (Movies)

Acoustic Reactions

• Separating movie from user’s audio
– Spectral energy density comparison not adequate
– Different techniques for different volume regimes

[Figure panels: High Volume, Low Volume]

Page 17: Rating Online Content (Movies)

Acoustic Reactions

• Laughter, Conversation, Shout-outs …
– Cancel out (known) movie sound from recorded sound
– Laughter detection, conversation detection

Early results demonstrate promise of detecting acoustic reactions

Page 18: Rating Online Content (Movies)

Motion Reactions

• Reactions also leave footprint on motion dimensions
– Motionless during intense scene
– Fidget during boredom

[Figure labels: Intense scene, Calm scene, Time to stretch]

Page 19: Rating Online Content (Movies)

Motion Reactions

• Reactions also leave footprint on motion dimensions
– Motionless during intense scene
– Fidget during boredom

Page 20: Rating Online Content (Movies)

Motion Reactions

• Reactions also leave footprint on motion dimensions
– Motionless during intense scene
– Fidget during boredom

Motion readings correlate with changes in ratings …

Page 21: Rating Online Content (Movies)

Motion Reactions

• Reactions also leave footprint on motion dimensions
– Motionless during intense scene
– Fidget during boredom

Motion readings correlate with changes in ratings …

Timing of motions also correlates with timing of scene changes
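One simple way to turn "motionless" versus "fidget" into a feature is the variance of the motion signal over short windows. The sketch below assumes a stream of gyroscope magnitudes; the window size, the threshold, and the function names are hypothetical and not taken from the slides.

```python
from statistics import pvariance


def window_variances(samples, window):
    """Variance of the motion magnitude in consecutive non-overlapping windows."""
    return [
        pvariance(samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]


def label_windows(samples, window, still_threshold):
    """Label each window 'still' or 'fidget' from its variance."""
    return [
        "still" if v < still_threshold else "fidget"
        for v in window_variances(samples, window)
    ]


# Toy gyroscope magnitudes (made up): a quiet stretch followed by a restless one.
gyro = [0.01, 0.02, 0.01, 0.02, 0.9, 0.1, 1.2, 0.2]
labels = label_windows(gyro, window=4, still_threshold=0.01)
```

A fixed threshold like this is exactly what breaks across users and environments (a bus commute is never "still"), which is why the deck later argues against a single generalized model.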

Page 22: Rating Online Content (Movies)

Extract Reaction Features – Player control

• Collect users’ player control operations
• Pause, fast forward, jump, roll back, …
• All slider movement

Seek bar
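Player-control operations can be aggregated per movie segment before learning. The sketch below is an assumption about how that might look: the event names, the `(position, type)` log format, and the 60-second segment length are illustrative choices, not the Pulse player's actual logging format.

```python
from collections import Counter

SEGMENT_SECONDS = 60  # assumed segment length


def events_per_segment(events, segment_seconds=SEGMENT_SECONDS):
    """Count player-control events per (segment index, event type).

    `events` is a list of (playback_position_seconds, event_type) pairs.
    """
    counts = Counter()
    for position, kind in events:
        counts[(int(position // segment_seconds), kind)] += 1
    return counts


# Hypothetical event log for one viewing session.
log = [(12.0, "pause"), (45.5, "fast_forward"), (70.0, "rewind"), (75.0, "rewind")]
counts = events_per_segment(log)
```

Here two rewinds land in segment 1, which might hint that the viewer wanted to see that part again.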

Page 23: Rating Online Content (Movies)

Challenges in Learning

Page 24: Rating Online Content (Movies)

Problem – A Generalized Model Does Not Work

• Directly trained model does not capture the rating trend

Why?

Page 25: Rating Online Content (Movies)

The Reason it Does Not Work is …

• Human behaviors are heterogeneous
– Users are different
– Environments are different even for same user (home vs. commute)

[Figure panels: home, commute]

Sensed motion patterns are very different when the same movie is watched during a bus commute vs. in bed at home.

Page 26: Rating Online Content (Movies)

The Reason it Does Not Work is …

• Human behaviors are heterogeneous
– Users are different
– Environments are different even for same user (home vs. commute)

• Gyroscope readings from same user (at home and office)

Page 27: Rating Online Content (Movies)

The Reason it Does Not Work is …

• Human behaviors are heterogeneous
– Users are different
– Environments are different even for same user (home vs. commute)

• Gyroscope readings from same user (at home and office)

• Naïve solution: build specific models one by one
– Impossible to acquire data for all <User, Context, Movie> tuples

[Figure labels: Office, Home, Commute, …]

Page 28: Rating Online Content (Movies)

Challenges in Learning

Approach: Bootstrap from Reaction Agreements

Page 29: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

• Thoughts
– What behavior means positive/negative for a particular setting
– How do we acquire data without explicitly asking the user every time

• Approach: Utilize reactions that most people agree on

[Figure labels: Time; Climax, Boring; Cloud Knowledge (Other users’ opinions); Sensor Reading; Ratings]
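The agreement idea can be sketched as follows: find segments where other users' ratings are both consistent and extreme, then use this user's sensor readings during those segments as labeled examples. The spread and rating thresholds, the data layout, and the function names are assumptions for illustration; the slides do not specify how agreement is measured.

```python
from statistics import mean, pstdev


def consensus_segments(ratings_by_segment, max_spread=0.5, high=4.0, low=2.0):
    """Pick segments where other users agree and the mean rating is extreme.

    Returns {segment: "climax" | "boring"}; segments without agreement are skipped.
    """
    labels = {}
    for segment, ratings in ratings_by_segment.items():
        if pstdev(ratings) > max_spread:
            continue  # users disagree, so this segment is not a safe anchor
        avg = mean(ratings)
        if avg >= high:
            labels[segment] = "climax"
        elif avg <= low:
            labels[segment] = "boring"
    return labels


def bootstrap_training_set(labels, features_by_segment):
    """Pair this user's sensed features with the consensus label per segment."""
    return [
        (features_by_segment[s], label)
        for s, label in sorted(labels.items())
        if s in features_by_segment
    ]


# Made-up cloud ratings from four other users, and this user's features per segment.
others = {0: [5, 5, 4, 5], 1: [1, 5, 3, 2], 2: [1, 2, 1, 1], 3: [3, 3, 3, 3]}
features = {0: [0.02, 1], 1: [0.40, 0], 2: [0.75, 0], 3: [0.30, 0]}
labels = consensus_segments(others)
training = bootstrap_training_set(labels, features)
```

Segment 1 is dropped because opinions diverge, and segment 3 is dropped because everyone agrees it is merely average; only the clear "climax" and "boring" moments become training data for this user in this context.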

Page 30: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

• Solution: spawn from consensus
– Learn user reactions during the “climax” and the “boring” moments
– Generalize this knowledge of positive/negative reactions
– Gaussian process regression (ratings) and SVM (labels)

GPR SVM
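To make the regression step concrete, here is a minimal pure-Python sketch of the Gaussian process posterior mean. It is not the Pulse implementation: the RBF kernel, the length scale, the noise term, the one-dimensional feature, and the toy data are all assumptions (the slides name GPR but give no kernel or hyperparameters).

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gpr_predict(train_x, train_y, test_x, noise=1e-6):
    """Posterior mean of a GP whose prior mean is the training average."""
    offset = sum(train_y) / len(train_y)
    centred = [y - offset for y in train_y]
    n = len(train_x)
    gram = [
        [rbf(train_x[i], train_x[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    weights = solve(gram, centred)  # (K + noise * I)^-1 (y - mean)
    return [
        offset + sum(w * rbf(x, xi) for w, xi in zip(weights, train_x))
        for x in test_x
    ]


# Toy data: a 1-D reaction feature and the rating at consensus segments.
feature = [0.0, 1.0, 2.0, 3.0]
rating = [1.0, 2.0, 4.0, 5.0]
predicted = gpr_predict(feature, rating, [0.0, 1.5, 3.0])
```

With almost no noise the prediction passes through the training ratings and interpolates smoothly between them; a larger `noise` value would smooth the fit, which may suit noisy sensor features better.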

Page 31: Rating Online Content (Movies)

Evaluation

Page 32: Rating Online Content (Movies)

User Experiment Setting

• 11 participants watch preloaded movies (~50 movies)
• 2 comedies, 2 dramas, 1 horror movie, 1 action movie
• Users provide manual ratings and labels
– For ground truth
• We compare Pulse’s ratings with manual ratings

Page 33: Rating Online Content (Movies)

Pulse Truth   1   2   3   4   5
1             0   1   0   0   0
2             0   4   2   0   1
3             0   1  17   0   1
4             0   0   2   5   2
5             0   0   2   1   7

Preliminary Results – Final (5 Star) Rating
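A few summary numbers can be derived directly from the confusion matrix. The sketch below transcribes the matrix from the slide, reading the flattened last line as two rows of five values (an assumption about the original layout); the metrics chosen here are illustrative and are not reported in the deck. They do not depend on which axis is Pulse and which is the manual rating.

```python
# Confusion matrix as transcribed from the slide (rows and columns are 1-5 stars).
matrix = [
    [0, 1, 0, 0, 0],
    [0, 4, 2, 0, 1],
    [0, 1, 17, 0, 1],
    [0, 0, 2, 5, 2],
    [0, 0, 2, 1, 7],
]

total = sum(sum(row) for row in matrix)
exact = sum(matrix[i][i] for i in range(5))
within_one = sum(
    matrix[i][j] for i in range(5) for j in range(5) if abs(i - j) <= 1
)
abs_error = sum(
    abs(i - j) * matrix[i][j] for i in range(5) for j in range(5)
)

print(f"ratings compared: {total}")
print(f"exact match: {exact / total:.1%}")
print(f"within one star: {within_one / total:.1%}")
print(f"mean absolute error: {abs_error / total:.2f} stars")
```

On the matrix as transcribed, this gives 46 rated viewings, 33 exact matches (71.7%), 42 within one star (91.3%), and a mean absolute error of 0.39 stars.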

Page 34: Rating Online Content (Movies)

Difference with true 5 star manual rating

Preliminary Results – Final (5 Star) Rating

Page 35: Rating Online Content (Movies)

Preliminary Results – Myth behind the Error

• Final ratings can deviate significantly from the average segment ratings
• User-given scores may not be linearly related to quality

Page 36: Rating Online Content (Movies)

Preliminary Results – Lower Segment Rating Error

• Final ratings come from averaging segment ratings
• Our system outperforms other methods

[Figure: Mean Error (5-point scale) for Random ratings, Collaborative filtering, Our system]

Per-segment ratings: 3 4 4 2 2 2 5
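The averaging step, applied to the seven per-segment ratings shown on the slide, can be written out as below. Rounding to the nearest whole star and clamping to the 1-5 range are assumptions; the slide only says the final rating comes from averaging.

```python
# Per-segment ratings copied from the slide.
segment_ratings = [3, 4, 4, 2, 2, 2, 5]

# Plain average of the segment ratings.
final = sum(segment_ratings) / len(segment_ratings)

# Assumed post-processing: nearest whole star, kept within 1..5.
rounded = max(1, min(5, round(final)))
```

The average is 22/7, roughly 3.14, so this viewing would get 3 stars under these assumptions.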

Page 37: Rating Online Content (Movies)

Preliminary Results – Better Tag Quality

• Tags capture users’ feelings better than SVM alone

Happy Intense Warm

Happy Intense Warm

Page 38: Rating Online Content (Movies)

Preliminary Results – Reasonable Energy Overhead

• Reasonable energy overhead compared to without sensing

More tolerable on tablets. May need duty-cycling on smart phones

Page 39: Rating Online Content (Movies)

Closing Thoughts

• Human reactions are in the mind
– However, they manifest as bodily gestures and activities

• Rich, multi-modal sensors on mobile devices
– A wider net for “catching” these reactions

• Pulse is an attempt to realize this opportunity
– Distilling semantic meanings from sensor streams
– Rating movies … tagging any content with reaction meta data

• Enabler for
– Recommendation engines
– Content/video search
– Information retrieval, summarization

Page 40: Rating Online Content (Movies)

Thoughts?

Page 41: Rating Online Content (Movies)

Backup – potential questions

• Privacy concern
– Like every technology, Pulse may attract early adopters
– If only final ratings are uploaded, the privacy level is similar to current ratings

• Why not just emotion sensing/just laughter detection
– Emotion sensing is a broad and challenging problem … but the goal is different from ours (rating) …
– Explicit signs like laughter usually only account for a small duration of movie viewing, so we need to explore other opportunities (motion)
– Our approach takes advantage of the specific task: 1. we know the user is watching a movie; 2. we can observe the user for a longer duration (than most emotion sensing work); 3. we know other users’ opinions

• How is this possible … human mind is too complex
– Human thoughts are complicated … but they may produce footprints in behaviors
– Using collaborative filtering explicitly uses knowledge of other users’ thoughts to bootstrap our algorithm

• The sample size is small … only 11 users
– The sample size is limited, but
– Each user watched multiple movies (50+ movies viewed) … segment ratings are for 1-minute segments (thousands of points)
– Collaborative filtering shows that even within this data set, the ratings can diverge and the naïve solution does not work as well as ours

Page 42: Rating Online Content (Movies)

Preliminary Results – Better Retrieval Accuracy

• Viewers care more about the highlights of a movie
• Find the contribution by using sensing

[Figure labels: Gain, Additional error, Total goal, Overall achieved performance]

Page 43: Rating Online Content (Movies)

Challenges in Learning

Page 44: Rating Online Content (Movies)

Problem – A Generalized Model Does Not Work

• Directly trained model does not capture the rating trend

Why?

Page 45: Rating Online Content (Movies)

The Reason it Does Not Work is …

• Human behaviors are heterogeneous
– Users are different
– Environments are different (e.g., home vs. commute)

[Figure panels: home, commute]

Sensed motion patterns are very different when the same movie is watched during a bus commute vs. in bed at home.

Page 46: Rating Online Content (Movies)

The Reason it Does Not Work is …

• Human behaviors are heterogeneous
– Users are different
– Environments are different (e.g., home vs. commute)

• Impact on sensor readings histograms

Page 47: Rating Online Content (Movies)

The Reason it Does Not Work is …

• Human behaviors are heterogeneous
– Users are different
– Environments are different (e.g., home vs. commute)

• Impact on sensor readings histograms

• Naïve solution: build specific models one by one
– Impossible to acquire data for all <User, Context, Movie> tuples

[Figure labels: Office, Home, Commute, …]

Page 48: Rating Online Content (Movies)

Challenges in Learning

Approach: Bootstrap from Reaction Agreements

Page 49: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

• Thoughts
– What behavior means positive/negative for a particular setting
– How do we acquire data without explicitly asking the user every time

• Approach: Utilize reactions that most people agree on

[Figure labels: Time; Climax, Boring; Cloud Knowledge (Other users’ opinions); Sensor Reading; Ratings]

Page 50: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

• Solution: spawn from consensus
– Learn user reactions during the “climax” and the “boring” moments
– Generalize this knowledge of positive/negative reactions
– Gaussian process regression (ratings) and SVM (labels)

GPR SVM

Page 51: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

GPR

A Simple Example of GPR

Page 52: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

• On GPR and SVM - SVM
– SVM is a supervised learning method for classification
– Identify hyperplanes in high-dimensional space that can best separate observed samples
– For our purpose, we used non-linear SVM with RBF kernel for its wide applicability
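For reference, the standard textbook form of an RBF-kernel SVM classifier is given below. The notation is an assumption and does not come from the slides: x_i are the support vectors, y_i in {-1, +1} their labels, alpha_i the learned weights, b the bias, and gamma > 0 the kernel width parameter.

```latex
K(x, x') = \exp\!\left(-\gamma \,\lVert x - x' \rVert^{2}\right),
\qquad
f(x) = \operatorname{sign}\!\left(\sum_{i} \alpha_i \, y_i \, K(x_i, x) + b\right)
```

The kernel implicitly maps the sensor features into a high-dimensional space, so the separating hyperplane there corresponds to a non-linear boundary in the original feature space.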

Page 53: Rating Online Content (Movies)

User Experiment Setting

• 11 participants watch preloaded movies (46 movies)

• Two comedies, two dramas, one horror movie, one action movie

• Users give manual ratings and labels

• Evaluate by comparing generated ratings with manual ratings

Page 54: Rating Online Content (Movies)

Evaluation

Page 55: Rating Online Content (Movies)

Pulse Truth   1   2   3   4   5
1             0   1   0   0   0
2             0   4   2   0   1
3             0   1  17   0   1
4             0   0   2   5   2
5             0   0   2   1   7

Preliminary Results – Good Final Rating

Page 56: Rating Online Content (Movies)

Preliminary Results – Myth behind the Error

• Final ratings can deviate significantly from segment rating
• User-given scores may not be linearly related to quality

Page 57: Rating Online Content (Movies)

Preliminary Results – Lower Segment Rating Error

• Final ratings come from averaging ratings for each segment
• Our system outperforms other methods

[Figure: Mean Error (5-point scale) for Random ratings, Collaborative filtering, Our system]

Movie segments: 3 4 4 2 2 2 5

Page 58: Rating Online Content (Movies)

Preliminary Results – Better Retrieval Accuracy

• Viewers care more about the highlights of a movie
• Find the contribution by using sensing

[Figure labels: Gain, Additional error, Total goal, Overall achieved performance]

Page 59: Rating Online Content (Movies)

Preliminary Results – Better Tag Quality

• Generated tags capture users’ feelings much better than using SVM alone

Happy Intense Warm

Happy Intense Warm

Page 60: Rating Online Content (Movies)

Preliminary Results – Reasonable Energy Overhead

• Reasonable energy overhead compared to without sensing

More tolerable on tablets. May need duty-cycling on smart phones

Page 61: Rating Online Content (Movies)

Closing Thoughts

• Human reactions are in the mind
– However, they manifest as bodily gestures and activities

• Rich, multi-modal sensors on mobile devices
– Opportunity for “catching” these activities
– Multi-modal capability – whole is greater than sum of parts

• Pulse is an attempt to realize this opportunity
– Distilling semantic meanings from sensor streams
– Rating movies … tagging any content with reaction meta data

• Enabler for
– Recommendation engines
– Content/video search
– Information retrieval, summarization

Page 62: Rating Online Content (Movies)

Questions?

Page 63: Rating Online Content (Movies)

Extract Reaction Features – Player control

• Player control and taps
– Pause, fast forward, jump, roll back, …
– All slider movement

Seek bar

Page 64: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

GPR

A Simple Example of GPR

Page 65: Rating Online Content (Movies)

Approach: Bootstrap from Agreement

• On GPR and SVM - SVM

– SVM is a supervised learning method for classification

– Identify hyperplanes in high-dimensional space that can best separate observed samples

– For our purpose, we used non-linear SVM with RBF kernel for its wide applicability