Crowdsourcing for Social Multimedia Task: Crowdsorting Timed Comments about Music
Karthik Yadati, Pavala S. N. Chandrasekaran Ayyanathan, Martha Larson
1
Crowdsourcing
• Crowdsourcing uses collective human knowledge to solve tasks that are difficult for machines to solve
• Challenges:
  – Designing a task that is understandable to the crowd
  – Recruiting workers
  – Quality control
  – Compiling results
2
The objective of the crowdsourcing task is to combine human computation and conventional computation to solve problems
3
Hybrid human/conventional computation pipeline
[Diagram: multiple worker annotations (Annotation 1, Annotation 2, …, Annotation n) are fused into a single consensus label]
4
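As a concrete instance of the pipeline above, here is a minimal Python sketch in which the consensus step is a simple majority vote (the label names are illustrative, not from the task):

from collections import Counter

def consensus_label(annotations):
    """Fuse n worker annotations for one item into a consensus label.

    Majority vote is the simplest possible consensus step; a smarter
    fusion method (e.g., the EM approach on the SAIL slide) slots in
    here without changing the rest of the pipeline.
    """
    return Counter(annotations).most_common(1)[0][0]

# Three noisy worker labels for one music segment.
print(consensus_label(["full_drop", "full_drop", "partial_drop"]))  # -> full_drop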
Open Science Framework (OSF)
• Open-source software project that facilitates:
  – Collaboration
  – Version control
  – Sharing research materials (data, results, code, etc.)
• Main goals:
  – Improve research practices
  – Increase transparency
  – Encourage reproducibility of experimental results
5
Crowdsorting Timed Comments about Music
• Classification of timed comments
  – Using labels collected from the crowd
• Basic goal: develop an algorithm that generates a single accurate label for a comment, given multiple noisy labels collected on a crowdsourcing platform
• Additionally, explore hybrid computation (see the sketch after this slide):
  – Human computation (crowdsourcing)
  – Conventional computation (text/audio analysis)
6
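The slides do not prescribe a specific hybrid design; one plausible sketch is to accept the crowd's answer when workers are unanimous and fall back to conventional computation on the comment text when they disagree. Here classify_comment_text is a hypothetical stand-in for any text/audio model:

def hybrid_label(worker_labels, comment_text, classify_comment_text):
    """Hybrid human/conventional computation (illustrative only)."""
    if len(set(worker_labels)) == 1:
        return worker_labels[0]  # unanimous crowd: trust the humans
    return classify_comment_text(comment_text)  # disagreement: machine breaks the tie

# Trivial stand-in classifier, for illustration only.
guess_from_text = lambda text: "full_drop" if "drop" in text.lower() else "no_drop"
print(hybrid_label(["full_drop", "partial_drop", "full_drop"],
                   "here comes the drop!", guess_from_text))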
Dataset
• Built on users’ timed comments on Electronic Dance Music (EDM) tracks on SoundCloud
• Focus on segments where the user mentions the term “drop”
• Drop: a point of emotional release. Musically, it involves three aspects:
  – Build-up of tension
  – Drop in intensity
  – Reintroduction of the bassline
8
Dataset
• A set of 591 music segments of 15 seconds each, drawn from 382 music tracks
  – All with a Creative Commons license
• Metadata
  – Track (title, likes, shares, etc.)
  – Comments (user_id, text, timestamp, etc.)
• Crowd labels
  – 3 worker labels for each of the 15-second segments
9
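The slides do not show the file format; one segment record might look like the following, with field names assumed from the metadata listed above:

# Hypothetical shape of one dataset record; the field names are
# assumptions, not the task's actual schema.
segment = {
    "track": {"title": "...", "likes": 0, "shares": 0},
    "comment": {"user_id": "...", "text": "...", "timestamp": 120.0},
    "start_time": 120.0,  # segment start in seconds; segments are 15 s long
    "crowd_labels": ["full_drop", "partial_drop", "full_drop"],  # 3 workers
}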
Basic Human Labels
• Use Amazon Mechanical Turk (AMT) to collect labels from the crowd
• In each microtask, we ask the worker to label 3 music segments
  – Average time spent on a single microtask: 2 min.
• Workers were recruited based on their familiarity with EDM
11
Basic Human Labels
• Listen to the track from 02:00 to 02:15 and pick the best answer:
  – I can hear the drop (including the build-up) within the 15-second window
  – I can hear only a part of the drop within the 15-second window
  – I cannot hear a drop within the 15-second window
12
Expert annotations & Evaluation
• Panel of 7 experts label the comments on AMT
• 3 experts per comment
• Ground truth is obtained through majority vote
• Evaluation: weighted F1-score (see the sketch after this slide)
  – Each class’s F1-score is weighted by the number of true examples in that class
13
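The weighting described above matches scikit-learn’s weighted average, where each class’s F1 is weighted by its support; a small example (label names are illustrative stand-ins for the three answer options):

from sklearn.metrics import f1_score

# Expert majority-vote ground truth vs. one run's predicted labels.
y_true = ["full", "full", "partial", "none", "none", "full"]
y_pred = ["full", "partial", "partial", "none", "none", "full"]

# average="weighted": per-class F1 weighted by true-example count.
print(f1_score(y_true, y_pred, average="weighted"))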
Submissions
Team                                       Weighted F1-score
Simula (4 runs)                            0.72
SAIL (2 runs)                              0.73
Emotion in Music organizing team (4 runs)  0.71
Baseline (majority class)                  0.38

Runs varied in their use of crowd annotations, audio/text content, track metadata, and additional crowdsourcing.
15
SAIL
• Model each worker as a noisy channel that corrupts the true label
• Use EM to solve for the true label (a minimal sketch follows this slide)
  – Random initialization (F-score = 0.16)
  – Initialization using majority vote (F-score = 0.73)
16
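SAIL’s exact formulation is not given in the slides; a noisy-channel worker model solved with EM is the classic Dawid–Skene approach, sketched below under the simplifying assumption that the same workers label every item. The posteriors start from vote counts, i.e., a soft majority vote, the initialization that worked well above:

import numpy as np

def dawid_skene(votes, n_classes, n_iter=50):
    """Dawid-Skene-style EM: each worker is a noisy channel with an
    unknown confusion matrix; EM alternates between estimating the
    true labels and the per-worker channels.

    votes: int array of shape (n_items, n_workers) with class indices.
    Returns the estimated true class index per item.
    """
    n_items, n_workers = votes.shape
    # Initialize posteriors over true labels from vote counts
    # (soft majority vote).
    post = np.zeros((n_items, n_classes))
    for i in range(n_items):
        for v in votes[i]:
            post[i, v] += 1
    post /= post.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class priors and per-worker confusion matrices
        # conf[w, true, observed], with a little smoothing.
        prior = post.mean(axis=0)
        conf = np.full((n_workers, n_classes, n_classes), 1e-6)
        for w in range(n_workers):
            for i in range(n_items):
                conf[w, :, votes[i, w]] += post[i]
            conf[w] /= conf[w].sum(axis=1, keepdims=True)
        # E-step: posterior over each item's true label.
        log_post = np.tile(np.log(prior), (n_items, 1))
        for i in range(n_items):
            for w in range(n_workers):
                log_post[i] += np.log(conf[w, :, votes[i, w]])
        post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)

    return post.argmax(axis=1)

# 3 segments x 3 workers; entries are class indices.
votes = np.array([[0, 0, 1],
                  [1, 1, 1],
                  [0, 1, 0]])
print(dawid_skene(votes, n_classes=2))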
Observations
• In contrast to last year, there is a trend towards consensus computation algorithms beating majority vote
• Results are preliminary and need further investigation
  – What question to ask the experts/workers?
  – Small dataset
• Potential of OSF to support benchmarking
– Open Notebook Science
17