Daniel Müllensiefen
Department of Psychology
Goldsmiths, University of London
Similarity Algorithms for Music Plagiarism

Structure
1 Introduction
2 Method and Examples
3 Algorithms
4 Empirical Study
5 Summary and Future Research

Introduction: The Background
Huge public interest:
- Importance for the pop industry, interesting emphasis on tunes (melodic plagiarism)
- Ambiguity surrounding UK law in defining a substantial part
Lack of research. Exceptions:
- Timothy English: Sounds Like Teen Spirit (2008)
- Stan Soocher: They Fought The Law (1999)
- Charles Cronin: Concepts of Melodic Similarity in Music-Copyright (1998)

Introduction: No hard and fast rules!
- The 2-second rule
- The 1-bar rule
- The 5-notes rule
It is more complicated than that!
What has been taken needs to be at least a substantial part of the copyright work (Lord Millett, Designers Guild v. Williams [2000] 1 WLR 2426)

Introduction: Müllensiefen & Pendzich (2009)
The first quantitative study on music plagiarism
Questions:
- Can computer algorithms predict plagiarism incidents from the melodic similarity of tunes?
- What is the predictive power of these algorithms when superimposed on documented case law?
- What is the frame of reference (directionality of comparisons)?
- How can prior musical knowledge be taken into account?

Method: Outline
- Selection of 20 cases* from 1970 to 2005, with a focus on melodic aspects of copyright infringement
- Transcription to monophonic MIDI files
- Analysis of judges' written opinions
- Reduction of court decisions to two categories:
  - “pro plaintiff” = melodic plagiarism
  - “contra plaintiff” = no infringement
- Use of context: corpus of 14,063 pop songs representing pop music history from 1950 to 2006
*From the UCLA/Columbia copyright infringement database: http://cip.law.ucla.edu/

Example 1: Bright Tunes v. Harrisongs (1976, 420 F. Supp)
- The Chiffons, “He's So Fine”, 1963. No. 1 in the US, highest UK position 11
- George Harrison, “My Sweet Lord”, published in 1971. No. 1 hit in the US, UK and (West) Germany

Example 2: Selle v. Gibb (1983, 567 F.Supp 1173)
- Ronald Selle, “Let It End”, unreleased (1975)
- Bee Gees, “How Deep Is Your Love” (1977)

Algorithms
Statistically informed similarity algorithms: the frequency of melodic elements matters for similarity assessment
Inspired by computational linguistics (Baayen, 2001) and text processing (Manning & Schütze, 1999; Jurafsky & Martin, 2000)
Conceptual components:
- m-types (aka n-grams) as melodic elements
- Frequency counts: Type Frequency (TF) and Inverted Document Frequency (IDF)

Melodic elements: m-types

| Word type t | Frequency f(t) | Melodic type τ (pitch interval, length 2) | Frequency f(τ) |
| --- | --- | --- | --- |
| Twinkle | 2 | 0, +7 | 1 |
| little | 1 | +7, 0 | 1 |
| star | 1 | 0, +2 | 1 |
| How | 1 | +2, 0 | 1 |
| I | 1 | 0, -2 | 3 |
| wonder | 1 | -2, -2 | 1 |
| what | 1 | -2, 0 | 2 |
| you | 1 | 0, -1 | 1 |
| are | 1 | -1, 0 | 1 |
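
As a rough illustration of how the melodic types in this table can be obtained, the sketch below slides a window of length 2 over the pitch intervals of a melody and counts each pair. The note list is an assumption (the opening of "Twinkle, Twinkle, Little Star" written as MIDI pitches; the slide does not give the notes), and the function names are made up for this example.

```python
from collections import Counter

# Assumption: the tune as MIDI pitches (C C G G A A G | F F E E D D C).
# The slide only shows the resulting m-types, not the notes themselves.
TWINKLE = [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60]


def pitch_intervals(pitches):
    """Signed semitone steps between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def m_types(pitches, n=2):
    """All overlapping interval n-grams (m-types) of length n, as tuples."""
    intervals = pitch_intervals(pitches)
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]


def m_type_counts(pitches, n=2):
    """Frequency f(tau) of every m-type tau in one melody."""
    return Counter(m_types(pitches, n))


counts = m_type_counts(TWINKLE)
for tau, freq in counts.items():
    print(tau, freq)
```

With this note list the counts agree with the frequencies in the table, for instance three occurrences of (0, -2) and two of (-2, 0), out of twelve m-type tokens in total.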

Type and Inverted Document Frequencies

$$\mathrm{IDF}_C(\tau) = \log\frac{|C|}{|m : \tau \in m|}$$

$$\mathrm{TF}(m,\tau) = \frac{f_m(\tau)}{\sum_{i=1}^{T} f_m(\tau_i)}$$

C: corpus of melodies
m: melody
τ: melodic type
T: number of different melodic types
|m: τ ∈ m|: number of melodies containing τ
f_m(τ): frequency of τ in melody m
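
The two definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-melody corpus is hypothetical, each melody is represented by a `Counter` of its m-types, and the natural logarithm is used because the slide does not state the base of the log.

```python
import math
from collections import Counter


def tf(counts, tau):
    """TF(m, tau): share of the melody's m-type tokens that are tau."""
    total = sum(counts.values())
    return counts[tau] / total if total else 0.0


def idf(corpus_counts, tau):
    """IDF_C(tau) = log(|C| / number of melodies in C that contain tau).

    Natural log is an assumption; the slide does not give the base.
    """
    containing = sum(1 for counts in corpus_counts if counts[tau] > 0)
    if containing == 0:
        raise ValueError("m-type does not occur in the corpus")
    return math.log(len(corpus_counts) / containing)


# Hypothetical toy corpus: one Counter of m-type frequencies per melody.
corpus = [
    Counter({(0, 7): 1, (0, -2): 3, (-2, 0): 2}),
    Counter({(0, -2): 1, (2, 2): 2}),
    Counter({(0, -2): 2, (-2, 0): 1, (2, 2): 1}),
]

print(tf(corpus[0], (0, -2)))   # share of (0, -2) in the first melody
print(idf(corpus, (0, -2)))     # occurs in every melody
print(idf(corpus, (0, 7)))      # occurs in only one melody
```

An m-type that occurs in every melody of the corpus gets an IDF of zero, while a rare one gets a high value, which is what lets uncommon melodic elements count for more in the similarity measures.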

TF, IDF and TF-IDF values for the example melody:

| Melodic type τ (pitch interval, length 2) | Frequency f(τ) | TF(m, τ) | IDF_C(τ) | TFIDF_{m,C}(τ) |
| --- | --- | --- | --- | --- |
| 0, +7 | 1 | 0.083 | 1.57 | 0.13031 |
| +7, 0 | 1 | 0.083 | 1.36 | 0.11288 |
| 0, +2 | 1 | 0.083 | 0.23 | 0.01909 |
| +2, 0 | 1 | 0.083 | 0.28 | 0.02324 |
| 0, -2 | 3 | 0.25 | 0.16 | 0.04 |
| -2, -2 | 1 | 0.083 | 0.19 | 0.01577 |
| -2, 0 | 2 | 0.167 | 0.22 | 0.03674 |
| 0, -1 | 1 | 0.083 | 0.51 | 0.04233 |
| -1, 0 | 1 | 0.083 | 0.74 | 0.06142 |
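
The TF-IDF column is consistent with the product TF(m, τ) · IDF_C(τ). The sketch below recomputes it from the frequencies and IDF values shown on the slide; the dictionary layout and function name are assumptions made for this example. Small differences in the fourth decimal place arise because the slide multiplies the rounded TF values (for instance 0.083 instead of 1/12).

```python
# m-type -> (frequency f, IDF value), both copied from the slide's table.
TABLE = {
    (0, 7): (1, 1.57), (7, 0): (1, 1.36), (0, 2): (1, 0.23),
    (2, 0): (1, 0.28), (0, -2): (3, 0.16), (-2, -2): (1, 0.19),
    (-2, 0): (2, 0.22), (0, -1): (1, 0.51), (-1, 0): (1, 0.74),
}


def tfidf_column(table):
    """TF-IDF weight per m-type: (f / total tokens) * IDF."""
    total = sum(freq for freq, _ in table.values())  # 12 tokens here
    return {tau: (freq / total) * idf for tau, (freq, idf) in table.items()}


weights = tfidf_column(TABLE)
for tau, w in sorted(weights.items(), key=lambda item: -item[1]):
    print(tau, round(w, 4))
```

The most frequent m-type, (0, -2), ends up with a modest weight because it is common in the corpus, while the single leap (0, +7) receives the highest weight.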

TF-IDF Correlation

$$\sigma_C(s,t) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{TFIDF}_{s,C}(\tau) \cdot \mathrm{TFIDF}_{t,C}(\tau)}{\sqrt{\sum_{\tau \in s_n \cap t_n} \left(\mathrm{TFIDF}_{s,C}(\tau)\right)^2 \cdot \sum_{\tau \in s_n \cap t_n} \left(\mathrm{TFIDF}_{t,C}(\tau)\right)^2}}$$
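
A minimal sketch of this measure, assuming each melody is already represented as a dictionary that maps m-types to TF-IDF weights. It reads the formula as a cosine-style correlation over the m-types both melodies share, with a square root in the denominator; the weights in the example are hypothetical.

```python
import math


def tfidf_correlation(w_s, w_t):
    """Correlation of two TF-IDF weight dicts over their shared m-types."""
    shared = w_s.keys() & w_t.keys()
    if not shared:
        return 0.0
    numerator = sum(w_s[tau] * w_t[tau] for tau in shared)
    denominator = math.sqrt(
        sum(w_s[tau] ** 2 for tau in shared)
        * sum(w_t[tau] ** 2 for tau in shared)
    )
    return numerator / denominator if denominator else 0.0


# Hypothetical TF-IDF weights for two melodies s and t.
s = {(0, 7): 0.13, (0, -2): 0.04, (2, 2): 0.05}
t = {(0, 7): 0.04, (0, -2): 0.13, (-1, 0): 0.06}

print(tfidf_correlation(s, t))
print(tfidf_correlation(s, s))
```

Because the sums run only over shared m-types, two melodies with no m-type in common are given a similarity of zero here, which is a choice of this sketch rather than something the slide specifies.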

Feature-based Similarity
Ratio Model (Tversky, 1977): similarity σ(s,t) is related to
- the number of features s and t have in common
- the salience of features f()

$$\sigma(s,t) = \frac{f(s_n \cap t_n)}{f(s_n \cap t_n) + \alpha f(s_n \setminus t_n) + \beta f(t_n \setminus s_n)}, \quad \alpha, \beta \geq 0$$

- features => m-types
- salience => IDF and TF
- different values of α, β to change frame of reference
Variable m-type lengths (n = 1,…,4), entropy-weighted average
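
The ratio model can be sketched as follows for a single m-type length n. The sketch assumes that the salience f of a set of m-types is the sum of a per-type salience value (here hypothetical IDF values); the entropy-weighted average over several lengths is not covered because the slide gives no details of it.

```python
def tversky(s_types, t_types, salience, alpha=1.0, beta=1.0):
    """Tversky ratio model over two sets of m-types.

    salience maps each m-type to a non-negative weight (assumed: its IDF).
    """
    s_types, t_types = set(s_types), set(t_types)

    def f(types):
        return sum(salience[tau] for tau in types)

    common = f(s_types & t_types)
    denominator = (common
                   + alpha * f(s_types - t_types)
                   + beta * f(t_types - s_types))
    return common / denominator if denominator else 0.0


# Hypothetical IDF values and m-type sets for two melodies.
idf = {(0, 7): 1.5, (0, -2): 0.2, (2, 2): 0.5, (-1, 0): 0.7}
s = {(0, 7), (0, -2), (2, 2)}
t = {(0, 7), (0, -2), (-1, 0)}

print(tversky(s, t, idf))                      # alpha = beta = 1
print(tversky(s, t, idf, alpha=1.0, beta=0.0))  # only s's unmatched m-types count
print(tversky(s, t, idf, alpha=0.0, beta=1.0))  # only t's unmatched m-types count
```

Setting one of the two parameters to zero ignores the unmatched material of one melody, which is how the model can shift the frame of reference from one song to the other.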

Feature-based Similarity

Tversky.equal measure (with α = β = 1)

$$\sigma(s,t) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau)}{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in s_n \setminus t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in t_n \setminus s_n} \mathrm{IDF}_C(\tau)}$$

Tversky.plaintiff.only measure (with α = 1, β = 0)

$$\sigma_{\text{plaintiff.only}}(s,t) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau)}{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in s_n \setminus t_n} \mathrm{IDF}_C(\tau)}$$

Tversky.defendant.only measure (with α = 0, β = 1)

$$\sigma_{\text{defendant.only}}(t,s) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau)}{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in t_n \setminus s_n} \mathrm{IDF}_C(\tau)}$$

Tversky.weighted measure with

$$\alpha = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{TF}_s(\tau)}{\sum_{\tau \in s_n} \mathrm{TF}_s(\tau)} \quad \text{and} \quad \beta = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{TF}_t(\tau)}{\sum_{\tau \in t_n} \mathrm{TF}_t(\tau)}$$
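
For the weighted variant, α and β are not fixed but derived from the type frequencies of the two melodies. The sketch below is one reading of the slide's definitions: each weight is the share of a melody's TF mass that falls on m-types it shares with the other melody, and salience is the summed IDF. All input values and function names are hypothetical.

```python
def tf_share(tf_own, shared):
    """Share of a melody's TF mass carried by the m-types in `shared`."""
    total = sum(tf_own.values())
    return sum(tf_own[tau] for tau in shared) / total if total else 0.0


def tversky_weighted(tf_s, tf_t, idf):
    """Return (similarity, alpha, beta) for the TF-weighted ratio model."""
    s, t = set(tf_s), set(tf_t)
    shared = s & t
    alpha = tf_share(tf_s, shared)
    beta = tf_share(tf_t, shared)

    def f(types):
        return sum(idf[tau] for tau in types)

    common = f(shared)
    denominator = common + alpha * f(s - t) + beta * f(t - s)
    similarity = common / denominator if denominator else 0.0
    return similarity, alpha, beta


# Hypothetical TF values per melody and IDF values from a corpus.
tf_s = {(0, 7): 0.25, (0, -2): 0.5, (2, 2): 0.25}
tf_t = {(0, 7): 0.2, (0, -2): 0.2, (-1, 0): 0.6}
idf = {(0, 7): 1.5, (0, -2): 0.2, (2, 2): 0.5, (-1, 0): 0.7}

similarity, alpha, beta = tversky_weighted(tf_s, tf_t, idf)
print(similarity, alpha, beta)
```

The fixed variants on the slide correspond to skipping the `tf_share` step and passing (α, β) = (1, 1), (1, 0) or (0, 1) directly.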

Evaluation
Ground truth: 20 cases with pro/contra decision (7/13)
Evaluation metrics:
- Accuracy (% correct at optimal cut-off on similarity scale)
- AUC (Area Under receiver operating characteristic Curve)
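
Both metrics can be computed from a list of similarity scores and the corresponding court decisions. The following is a plain-Python sketch, not the study's evaluation code: the six scores and labels are invented for illustration (1 = pro plaintiff, 0 = contra plaintiff), and the AUC is computed as the proportion of positive/negative pairs that the score ranks correctly.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_accuracy(scores, labels):
    """Best accuracy over all cut-offs; a case is predicted positive
    when its score is greater than or equal to the cut-off."""
    cutoffs = sorted(set(scores)) + [float("inf")]
    best_acc, best_cutoff = 0.0, None
    for cutoff in cutoffs:
        correct = sum((s >= cutoff) == bool(y) for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:
            best_acc, best_cutoff = acc, cutoff
    return best_acc, best_cutoff


# Hypothetical similarity scores and decisions, not data from the study.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

print(auc(scores, labels))
print(best_accuracy(scores, labels))
```

Accuracy depends on where the cut-off is placed, whereas AUC summarises how well the similarity scale separates the two groups of cases across all possible cut-offs.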

Evaluation
[Figure: Accuracy and AUC (scale 0.0 to 1.0) per algorithm: Edit Dist., Sum Com., Ukkonen, TF-IDF correl., Tv.equal, Tv.plaintiff, Tv.defendant, Tv.weighted]

Evaluation: ROC Curves

A Qualitative Look
Observations:
- Decision sometimes based on ‘characteristic motives’ (incl. rhythm)
- High-level form can be important (e.g. call-and-response structure)
- Frame of reference can be different
Ronald Selle, “Let It End”
Bee Gees, “How Deep Is Your Love”

Summary
- Court decisions can be related closely to melodic similarity
- Plaintiff’s song is often the frame of reference when determining the question of substantial part (“It depends upon its importance to the copyright work. It does not depend upon its importance to the defendants”, per Lord Millett)
- Statistical information about commonness of melodic elements is important

Future Research
- Conduct analogous studies with cases from the UK and Germany and utilize additional US cases
- Include rhythm in m-types
- Use higher-level features of melodies in addition to m-types
- Conduct listening experiment to determine cognitive validity