Similarity Matrix Processing for Music Structure Analysis. Yu Shiu, Hong Jeng, C.-C. Jay Kuo. ACM Multimedia 2006


Similarity Matrix Processing for Music Structure Analysis

Yu Shiu, Hong Jeng

C.-C. Jay Kuo

ACM Multimedia 2006


System Framework


Pitch Class Profile (PCP)

• The PCP vector is a 12-dimensional vector that shows the relative intensities of the 12 pitch classes {C, C#, D, D#, E, F, F#, G, G#, A, A#, B}

• Normalized to a unit vector
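
The two bullets above can be illustrated with a minimal sketch. It assumes the input is a list of spectral peak frequencies and magnitudes from one analysis window, and that energy is accumulated per pitch class; the function names and the A4 = 440 Hz reference are assumptions for illustration, not details taken from the slides.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, ref_a4=440.0):
    """Map a frequency to a pitch-class index (0 = C, ..., 11 = B)."""
    # MIDI note number: A4 is 69, and 69 % 12 == 9, the index of A.
    midi = round(69 + 12 * math.log2(freq_hz / ref_a4))
    return midi % 12

def pcp_vector(freqs_hz, magnitudes):
    """Accumulate spectral energy per pitch class, then normalize to unit length."""
    pcp = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f > 0:  # skip the DC component
            pcp[pitch_class(f)] += m * m
    norm = math.sqrt(sum(v * v for v in pcp))
    # An all-zero (silent) frame is returned unchanged.
    return [v / norm for v in pcp] if norm > 0 else pcp
```

For example, `pcp_vector([261.63, 329.63, 392.0], [1.0, 1.0, 1.0])` puts equal weight on C, E and G, the notes of a C major chord.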


Pitch Class Profile (PCP)


Measure-based Similarity Matrix

• Previous similarity matrix

– Pre-defined window size

– Results in a large similarity matrix that makes further processing more expensive

• In this paper

– Use the measure as the element of the similarity matrix


Measure-based Similarity Matrix

• PCP vector generation

– Choose a window size equal to the duration of one half beat

– Detect the onset signal

• Compute the change of the spectral content between two adjacent shifting windows, each 20 ms long with 50% overlap


Measure-based Similarity Matrix

– The autocorrelation function (ACF) of the onset signal is calculated to determine the beat period

– Example:

• 100 BPM → length of half beat is 300 ms

• Longer than the window size commonly used in previous work
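
The beat-period step can be sketched as follows. This is a simplified illustration, assuming the onset signal is a plain list of non-negative values and that the beat period is the lag with the largest autocorrelation inside a caller-supplied search range; the function names and the search range are assumptions.

```python
def half_beat_ms(bpm):
    """Duration of half a beat in milliseconds at the given tempo."""
    return 60_000.0 / bpm / 2.0

def beat_period_from_acf(onset, min_lag, max_lag):
    """Return the lag (in frames) within [min_lag, max_lag] that maximizes
    the unnormalized autocorrelation of the onset signal."""
    n = len(onset)
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        acf = sum(onset[t] * onset[t + lag] for t in range(n - lag))
        if acf > best_val:
            best_lag, best_val = lag, acf
    return best_lag
```

Restricting the lag range matters because the ACF of a periodic onset signal also peaks at multiples of the beat period.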


Measure-based Similarity Matrix

• Grouping N successive PCP vectors

• Since PCP vectors are unit vectors, 0 <= sij <= 1

• dynamic time warping (DTW) can be used to enhance the sij value
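
The formula for sij did not survive in this transcript. As a hedged sketch, the code below assumes sij is the average inner product between corresponding PCP vectors of measures i and j, which stays in [0, 1] for non-negative unit vectors. The names `measure_similarity` and `similarity_matrix` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def measure_similarity(measure_i, measure_j):
    """Assumed definition: mean inner product of the k-th PCP vectors
    of two measures, each given as a list of N unit vectors."""
    if len(measure_i) != len(measure_j):
        raise ValueError("measures must contain the same number of PCP vectors")
    total = sum(dot(u, v) for u, v in zip(measure_i, measure_j))
    return total / len(measure_i)

def similarity_matrix(measures):
    """Build the full matrix of pairwise measure similarities."""
    return [[measure_similarity(a, b) for b in measures] for a in measures]
```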


Dynamic Time Warping
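
The slide content here was a figure that is not in the transcript. For orientation only, this is the textbook DTW recurrence, not necessarily the variant used by the authors; the `dist` callback is an assumption, and for PCP vectors it could be one minus the inner product.

```python
def dtw_cost(seq_a, seq_b, dist):
    """Minimum cumulative alignment cost between two sequences using the
    standard step pattern (insertion, deletion, match)."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

A lower alignment cost means the two sequences match well once small timing differences are absorbed by the warping.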


Measure-based Similarity Matrix

• After the simplification, a 3-minute song with a tempo of 100BPM can form a 75 × 75 similarity matrix
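
The 75 × 75 figure follows from simple arithmetic if one assumes four beats per measure; the time signature is an assumption, since the slide does not state it.

```python
def num_measures(duration_s, bpm, beats_per_measure=4):
    """Number of whole measures in a song of the given length and tempo."""
    beats = duration_s * bpm / 60.0
    return int(beats // beats_per_measure)

# 3 minutes at 100 BPM is 300 beats, i.e. 75 measures in 4/4 time.
size = num_measures(180, 100)
```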

• The measure-based similarity matrix (MSM) reveals chord similarity more than melody similarity


Two MSM Examples

• Johnny Cash’s Hurt repeatedly uses the chord succession {Am, Am, C, D} in the 1st and 3rd sections and {G, A, F, C} in the 2nd and 4th sections.

• The Beatles’ Yesterday does not have short-period chord successions. Its musical form is P = {I V V C V C V O}


Detection of Local Similarity

• Using a 2D moving window


Detection of Local Similarity

• Move the 2D moving window along the diagonal line of the MSM


Detection of Long Range Similarity

• The Viterbi algorithm is used to find segments with consecutive large similarity values along the 45-degree direction

• The output of the second module, which provides the chord succession similarity, can be exploited to enhance long range similarity detection.


Detection of Long Range Similarity

• Interpret the x-axis of the MSM as the “time” and the y-axis as the “state”


Detection of Long Range Similarity

• Use “scores” instead of “probabilities”

• The score of a path is defined as the product of the similarity values of all states and the scores of all state transitions


Detection of Long Range Similarity

• PT0 > PT1 to guarantee the preference along the 45-degree direction.

– The larger the ratio, the more strongly the path is favored to proceed along the 45-degree direction.

– In our experiment, the ratio PT0/PT1 is chosen to be 1.5
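
The score-based Viterbi search can be sketched as below. This is an illustration under stated assumptions, not the authors' exact formulation: columns of the matrix are time steps, rows are states, a transition from state i to state i + 1 (one step along the 45-degree direction) scores `pt0`, every other transition scores `pt1`, and the accumulated score is Q(t, j) = s[j][t] · max over i of Q(t − 1, i) · T(i, j). Only the ratio 1.5 comes from the slides; the absolute values of `pt0` and `pt1` are assumed.

```python
def viterbi_scores(s, pt0=1.5, pt1=1.0):
    """Best-scoring path through s, where s[j][t] is the similarity of
    state j at time t. Returns (path, q) with q[t][j] the best score."""
    n_states, n_times = len(s), len(s[0])
    q = [[0.0] * n_states for _ in range(n_times)]
    back = [[0] * n_states for _ in range(n_times)]
    for j in range(n_states):
        q[0][j] = s[j][0]
    for t in range(1, n_times):
        for j in range(n_states):
            best_i, best = 0, -1.0
            for i in range(n_states):
                # Favor the diagonal step i -> i + 1 (assumed transition model).
                trans = pt0 if j == i + 1 else pt1
                cand = q[t - 1][i] * trans
                if cand > best:
                    best_i, best = i, cand
            q[t][j] = s[j][t] * best
            back[t][j] = best_i
    # Back-track from the state with the highest final score.
    j = max(range(n_states), key=lambda k: q[-1][k])
    path = [j]
    for t in range(n_times - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path, q
```

Because scores are multiplied rather than summed, a long matrix may call for working with logarithms of the scores to keep the values in a comfortable numeric range.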


Detection of Long Range Similarity

• Pruning with Chord Succession Information

– Sections with repetitive chord successions of a certain period should be similar to sections of the same period

– A period value p is tagged to a measure


Detection of Long Range Similarity


Post-processing

• We begin with the state j that gives the highest Q(L, j) at time L, and perform a back-tracking process.

• Segments with length smaller than φ measures are removed

– In our implementation, φ = 8.

• Segments whose mean similarity value is less than a threshold τ are removed

– τ = mean + standard deviation (over all sij)
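
The two removal rules can be sketched as a simple filter. The sketch assumes each segment is represented as the list of similarity values along its path and that the standard deviation is the population standard deviation; both choices are assumptions, as the slides do not specify them.

```python
from statistics import mean, pstdev

def filter_segments(segments, s, phi=8):
    """Keep segments that are at least phi measures long and whose mean
    similarity reaches tau = mean + standard deviation over all s[i][j]."""
    all_values = [v for row in s for v in row]
    tau = mean(all_values) + pstdev(all_values)
    return [seg for seg in segments if len(seg) >= phi and mean(seg) >= tau]
```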


Post-processing

• Each segment should be divided

– if its two corresponding sections in the song overlap with each other

– if there is a significant difference between similarity values before and after a certain point in the segment.

• If there are conflicts between sections, the one with the higher similarity value has priority in keeping its boundaries

• For songs in verse-chorus form, similarity values are clustered into two classes

– sections with high similarity values are taken to be the chorus
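
The slides do not name the clustering method. As one plausible stand-in, the sketch below uses one-dimensional k-means with k = 2 on the per-segment similarity values; the function name and the initialization at the minimum and maximum are assumptions.

```python
def two_means(values, iters=50):
    """Split values into a low and a high cluster with 1-D 2-means.
    Returns one boolean per value: True if it falls in the high cluster."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values identical: nothing to separate
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return [abs(v - lo) > abs(v - hi) for v in values]
```

Under this reading, segments labelled `True` would be the chorus candidates and the rest the verse candidates.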


Experiment

• A collection of 120 pop, country and rock songs from the 1960s onward.

• 100 of them are in verse-chorus form and 20 are in AAA or another form

• Mono audio sampled at 22,050 Hz, with 16 bits per sample.


Experimental Results

• The pattern extraction of a song is counted as correct if all patterns in the song are extracted, without distinguishing between verse and chorus

• The accurate detection rate is 112/120 = 93.33%.


Experimental Results