Model Free Music Similarity Measure (courses.cecs.anu.edu.au/courses/CS_PROJECTS/10S2/Final...)

Only for presentational or informational purposes. Copyright © 2010 Wen Shao @ ANU. All rights reserved.

AI Project Presentation: Model Free Music Similarity Measure. Wen Shao (u4717714), under the supervision of Prof. Tom Gedeon.

Transcript of Model Free Music Similarity Measure

Page 1: Model Free Music Similarity Measure

AI Project Presentation

Wen Shao (u4717714), under the supervision of Prof. Tom Gedeon

Page 2: Music Similarity Measure Settings

• Content-based retrieval (query by singing and humming)
  – A natural way to search
  – Sometimes the only possible way
  – Applications:

[Figure: Album? Artist? Lyrics? Language? Publication Date? VS. …]

Page 3: Music Similarity Measure Settings

• Content-based retrieval (query by singing and humming)
• Similarity measure
  – One of the key problems

[Figure: Album? Artist? Lyrics? Language? Publication Date? VS. Singing Clips, Humming Clip]

Page 4: Music Similarity Measure Settings

• Content-based retrieval (query by singing and humming)
• Similarity measure
• MIREX (Music Information Retrieval Evaluation eXchange)
  – Similarity measurement is one of its tasks
  – Subjective and objective evaluation

[Figure: Album? Artist? Lyrics? Language? Publication Date? VS.]

Page 5: Music Similarity Measure Settings

• Content-based retrieval (query by singing and humming)
• Similarity measure
• MIREX
• Model free approach
  – No music knowledge assumed

[Figure: Album? Artist? Lyrics? Language? Publication Date? VS.]

Page 6: Literature review & My work

• WAV-based approach [LA01] [AP02]
  – Feature: Mel-Frequency Cepstral Coefficients…
  – Signature: K-Means, Gaussian Mixture Model
  – Distance: signature-based
• MIDI-based approach [LHC99] [RCP04]
  – Transformed to string matching problems
  – "ZIP" method

 

Page 7: Literature review & My work

• WAV-based approach [LA01] [AP02]
• MIDI-based approach [LHC99] [RCP04]
• My work
  – Evaluated two state-of-the-art approaches
    • ZIP method (MIDI-based)
    • MFCCs + GMM + likelihood (WAV-based)
  – Proposed a Neural Network and GMM based method

Page 8: ZIP method [RCP04]

• Kolmogorov complexity K(x): the length of the shortest program that outputs x
  – Low for a regular string: abababababababababababababababababababababababababababababababab
  – High for a random-looking string: 4c1j5b2p0cv4w1x8rx2y39umgw5q85s
• Conditional Kolmogorov complexity K(x|y)
  – The difficulty of constructing x from y
  – A small number if y is of great help in constructing x; otherwise it is close to K(x)
  – An indicator of the degree of similarity between x and y
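The two ideas on this slide can be tried out with an off-the-shelf compressor. The sketch below is only an illustration: it assumes zlib as the compressor, and the helper names `c` and `c_cond` are hypothetical, not taken from the talk or from [RCP04]. The compressed size stands in for K(x), and the extra bytes needed to compress x after y stand in for K(x|y).

```python
import os
import zlib


def c(data: bytes) -> int:
    """Compressed size in bytes: a computable stand-in for K(data)."""
    return len(zlib.compress(data, 9))


def c_cond(x: bytes, y: bytes) -> int:
    """Rough stand-in for K(x|y): extra bytes needed for x once y is known."""
    return c(y + x) - c(y)


regular = b"ab" * 32        # highly regular, like the first string on the slide
noise = os.urandom(64)      # 64 random bytes, which should barely compress

print(c(regular), c(noise))         # the regular string compresses far better
print(c_cond(regular, regular))     # y already contains x, so little is added
print(c_cond(regular, noise))       # y is of no help in constructing x
```

A real compressor adds fixed header overhead, so on strings this short the numbers are only indicative.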

Page 9: ZIP method (cont.)

• Kolmogorov complexity is only semi-computable
• Use a compressor to approximate K

[Figure: compressed sizes] K(S1) = 56 B, K(S2) = 45 B, K(S1S2) = 90 B
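As a worked example with the three sizes above: the conditional complexity can be approximated by the extra bytes the concatenation needs, and a similarity score can be formed with the normalized compression distance (NCD) that is commonly paired with compressors. Whether the talk used exactly this normalization is an assumption.

```latex
\begin{align*}
K(S_2 \mid S_1) &\approx K(S_1 S_2) - K(S_1) = 90 - 56 = 34\ \text{B} \\
K(S_1 \mid S_2) &\approx K(S_1 S_2) - K(S_2) = 90 - 45 = 45\ \text{B} \\
\mathrm{NCD}(S_1, S_2) &= \frac{K(S_1 S_2) - \min\{K(S_1), K(S_2)\}}{\max\{K(S_1), K(S_2)\}}
  = \frac{90 - 45}{56} \approx 0.80
\end{align*}
```

An NCD near 0 would indicate that one clip is almost fully described by the other; a value near 1 indicates that they share little.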

Page 10: MFCCs + GMMs [AP02]

MFCCs (Mel-frequency cepstral coefficients): a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear Mel scale of frequency.
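The definition above reads as a four-step pipeline (frame, power spectrum, Mel-scaled log energies, cosine transform), which the following NumPy sketch follows step by step. It is a minimal illustration, not the feature extractor used in the project: the frame length, hop size, number of filters, the common 2595·log10(1 + f/700) Mel formula, and the choice of 8 coefficients are all assumptions.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):
            bank[i - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            bank[i - 1, k] = (right - k) / max(right - centre, 1)
    return bank


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=8):
    # 1. Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Short-term power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2 / frame_len
    # 3. Energies on the nonlinear Mel scale, then the logarithm.
    energies = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # 4. Linear cosine transform (DCT-II); keep the first n_coeffs coefficients.
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), np.arange(n_filters) + 0.5) / n_filters)
    return log_energies @ basis.T


# Usage: one second of a 440 Hz tone sampled at 8 kHz.
sr = 8000
tone = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
features = mfcc(tone, sr)
print(features.shape)  # (61, 8): one 8-dimensional vector per frame
```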

Page 11: MFCCs + GMMs (cont.)

MFCCs → GMM signature:

• GMM over MFCCs
• Adaptive component number
• Each component carries a mean vector and a covariance matrix

 

Page 12: MFCCs + GMMs (cont.)

MFCCs → GMM signature → Likelihood: how likely the samples from one song are under the GMM of the other.
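The likelihood step can be sketched as the average log-likelihood of one song's feature vectors under the other song's GMM. The function below is an illustrative NumPy implementation of the standard Gaussian mixture density, not the project's code; the function name and the toy one-component "GMM" in the usage lines are assumptions.

```python
import numpy as np


def gmm_log_likelihood(samples, weights, means, covs):
    """Average log-likelihood of samples (n, d) under a GMM.

    weights: (k,), means: (k, d), covs: (k, d, d).
    """
    n, d = samples.shape
    log_terms = np.empty((len(weights), n))
    for j, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        diff = samples - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = np.einsum("ni,ni->n", diff @ np.linalg.inv(cov), diff)
        log_terms[j] = np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    # Log-sum-exp over components for numerical stability.
    m = log_terms.max(axis=0)
    return float(np.mean(m + np.log(np.exp(log_terms - m).sum(axis=0))))


# Usage with synthetic 2-D "features" for two songs.
rng = np.random.default_rng(0)
song_a = rng.normal(0.0, 1.0, size=(200, 2))
song_b = rng.normal(5.0, 1.0, size=(200, 2))
# One-component model of song A from its sample mean and covariance.
w, mu, cov = np.array([1.0]), song_a.mean(axis=0)[None], np.cov(song_a.T)[None]
ll_same = gmm_log_likelihood(song_a, w, mu, cov)
ll_other = gmm_log_likelihood(song_b, w, mu, cov)
print(ll_same, ll_other)  # song A's own samples score higher than song B's
```

A higher value means the second song's samples are easier to explain with the first song's model, which is the sense of "similar" used on the slide.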

Page 13: Experimental Results

[Figure: bar chart "Top N hits", y-axis from 0 to 140, for Top5, Top10, Top20, Top50 and Top100; series: Top N hits (GMM) and Top N hits (Complexity)]

• 4431 singing/humming clips, on average 12 clips per song
• 100 clips drawn at random
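A top-N evaluation of this kind could be computed as sketched below. The slide does not define a "hit", so the sketch assumes a query counts as a hit when at least one of its N nearest clips belongs to the same song; the function name, the data layout and the toy distance are likewise hypothetical.

```python
def top_n_hits(queries, database, distance, n_values=(5, 10, 20, 50, 100)):
    """Count queries whose n nearest database clips include a same-song clip.

    queries, database: lists of (song_id, features) pairs; the caller is
    expected to keep the query clips out of the database.
    distance: function of two feature objects (smaller means more similar).
    """
    hits = {n: 0 for n in n_values}
    for q_song, q_feat in queries:
        ranked = sorted(database, key=lambda item: distance(q_feat, item[1]))
        ranked_songs = [song for song, _ in ranked]
        for n in n_values:
            if q_song in ranked_songs[:n]:
                hits[n] += 1
    return hits


# Usage with one-dimensional toy "features".
db = [("a", 1.0), ("a", 1.2), ("b", 5.0), ("b", 5.3), ("c", 8.0)]
qs = [("a", 1.1), ("c", 5.1)]
result = top_n_hits(qs, db, lambda x, y: abs(x - y), n_values=(1, 3))
print(result)  # {1: 1, 3: 2}
```

Any of the distances discussed earlier (compression-based or GMM likelihood-based) could be passed as `distance`.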

Page 14: Neural network based method

• Main idea: feed the NN with two GMMs, train the NN, and use the output (0 to 1) as the degree of similarity.
• Price paid: a fixed number of GMM components (3).

[Figure: Gaussian components $\mathcal{N}(\vec{\mu}_k, \Sigma_k)$ and $\mathcal{N}(\vec{\mu}_j, \Sigma_j)$]

Page 15: Neural network based method

• Architecture: 2 hidden layers, 264 inputs, 1 output, 9/16/27 hidden neurons
  – Inputs 1-24: means of each component in the GMM of the first song
  – Inputs 25-132: covariances in the GMM of the first song
  – Inputs 133-156: means of each component in the GMM of the second song
  – Inputs 157-264: covariances in the GMM of the second song
• Error function: cross-entropy (instead of SSE)
• Activation function: logistic sigmoid function
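The input layout can be made concrete with a small forward-pass sketch. Several things here are assumptions rather than facts from the slides: that the 24 mean inputs are 3 components of 8 dimensions each, that the 108 covariance inputs are the 36 unique entries of each symmetric 8×8 matrix, and that the quoted hidden-neuron count applies to each of the two hidden layers. Training is omitted.

```python
import numpy as np

K, D = 3, 8                   # assumed: 3 components, 8-dimensional MFCCs
TRIU = np.triu_indices(D)     # 36 unique entries of a symmetric 8x8 covariance


def gmm_to_vector(means, covs):
    """Flatten one GMM into 24 means followed by 108 covariance entries."""
    cov_part = np.concatenate([cov[TRIU] for cov in covs])
    return np.concatenate([np.ravel(means), cov_part])       # length 132


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_hidden, rng):
    sizes = [2 * 132, n_hidden, n_hidden, 1]   # 264 inputs, 2 hidden layers, 1 output
    return [(rng.normal(0, 0.1, (a, b)), np.zeros(b)) for a, b in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    for w, b in params:
        x = sigmoid(x @ w + b)                 # logistic sigmoid at every layer
    return x                                   # similarity score in (0, 1)


def cross_entropy(y_pred, y_true):
    eps = 1e-12
    return -np.sum(y_true * np.log(y_pred + eps) + (1 - y_true) * np.log(1 - y_pred + eps))


# Usage with two random GMMs standing in for two songs.
rng = np.random.default_rng(0)


def random_gmm():
    a = rng.normal(size=(K, D, D))
    return rng.normal(size=(K, D)), a @ a.transpose(0, 2, 1)  # symmetric covariances


x = np.concatenate([gmm_to_vector(*random_gmm()), gmm_to_vector(*random_gmm())])
score = forward(init_params(9, rng), x)
print(x.shape, score, cross_entropy(score, np.array([1.0])))
```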

Page 16: Experimental result

Training: 48,440 'yes' patterns and 48,440 'no' patterns. Testing: 24,220 'yes' patterns and 24,220 'no' patterns. No third validation dataset was used.

                               9 hidden neurons   16 hidden neurons   27 hidden neurons
Total cross-entropy error      46667.92           45799.54            44086.60
Mean cross-entropy error       0.96               0.95                0.91
[row label lost in extraction] 0.38               0.39                0.40
[row label lost in extraction] 0.62               0.62                0.60
Difference < 0.5               25593 (52.83%)     23416 (48.34%)      20117 (41.52%)
Difference < 0.4               21858 (45.12%)     18615 (38.43%)      13320 (27.50%)
Difference < 0.3               17749 (36.64%)     13594 (28.06%)      7065 (14.59%)
Difference < 0.2               13016 (26.87%)     7379 (15.23%)       2395 (4.94%)
Difference < 0.1               6910 (14.26%)      1126 (2.32%)        199 (0.41%)
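The mean error row appears to be the total error divided by the number of test patterns. The short check below assumes that relationship and that percentages are taken over all 48,440 test patterns; both are readings of the table, not statements from the slides.

```python
n_test = 24220 + 24220   # 'yes' + 'no' test patterns
totals = {9: 46667.92, 16: 45799.54, 27: 44086.60}

# Mean cross-entropy error per test pattern, rounded as in the table.
means = {hidden: round(total / n_test, 2) for hidden, total in totals.items()}
print(means)  # {9: 0.96, 16: 0.95, 27: 0.91}

# Share of test patterns with difference < 0.5 for the 9-neuron network.
share = round(100 * 25593 / n_test, 2)
print(share)  # 52.83
```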

Page 17: Thanks

Questions?