Post on 26-Dec-2015
Instrument Recognition in Polyphonic Music
Jana Eggink
Supervisor: Guy J. Brown
University of Sheffield
j.eggink@dcs.shef.ac.uk
HOARSE 20.02.04 Jana Eggink: Instrument Recognition in Polyphonic Music 2 / 12
Previous Work
• Missing feature approach
• Frequency regions in which partials of a non-target tone were found were excluded from the recognition process
• Requires the knowledge of all F0s
• Worked well only for low numbers of simultaneous F0s
[Figure: four time-frequency panels (frequency against time): a) target tone, b) interfering tone, c) mixture, d) mixture with mask]
New Approach
• Spectral energy of instrument sounds is concentrated in their harmonics
• The harmonics are therefore less likely to be masked by interfering sounds
• Build a recogniser based on harmonics only to minimise mismatch between training and test data
Instrument Recognition in Accompanied Sonatas and Concertos
• A solo instrument is commonly played louder than the accompaniment
• This causes the corresponding harmonic series to stand out in a spectral representation (hopefully)
• Requires only the extraction of the most prominent F0, which will most often belong to the solo instrument
[Diagram: audio signal → spectral peaks → F0 and harmonics → features → flute / clarinet / oboe / violin / cello]
Find Spectral Peaks
• Convolve spectrum with a differentiated Gaussian
• Spectrum is smoothed and peaks are transformed into zero crossings which are easy to detect
• The frequency of a peak is defined by the frequency of the corresponding FFT bin (for more accuracy a highly zero-padded FFT is used)
[Figure: spectrum, power against frequency]
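The peak-picking step above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the Hann window, the padding factor, the kernel width `sigma_bins` and the relative power threshold `min_rel_power` are all assumed values chosen for the example.

```python
import numpy as np

def power_spectrum(frame, sample_rate, pad_factor=8):
    """Power spectrum of one Hann-windowed frame using a zero-padded FFT."""
    n_fft = pad_factor * len(frame)  # zero-padding gives a finer frequency grid
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)), n=n_fft)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs, np.abs(spectrum) ** 2

def find_spectral_peaks(freqs, power, sigma_bins=3.0, min_rel_power=0.01):
    """Locate peaks as positive-to-negative zero crossings of the smoothed slope."""
    half = int(np.ceil(4 * sigma_bins))  # truncate the kernel at +/- 4 sigma
    x = np.arange(-half, half + 1)
    # Derivative of a Gaussian: convolving with it smooths and differentiates.
    d_gauss = -x / sigma_bins**2 * np.exp(-0.5 * (x / sigma_bins) ** 2)
    slope = np.convolve(power, d_gauss, mode="same")
    left = np.flatnonzero((slope[:-1] > 0) & (slope[1:] <= 0))
    # Of the two bins around each crossing, keep the one with the flatter slope.
    peaks = np.where(np.abs(slope[left]) <= np.abs(slope[left + 1]), left, left + 1)
    # Assumed clean-up step: drop peaks far weaker than the strongest bin.
    peaks = peaks[power[peaks] >= min_rel_power * power.max()]
    return freqs[peaks], power[peaks]

# Usage: a synthetic frame with partials at 440 Hz and 880 Hz.
sample_rate = 8000
t = np.arange(1024) / sample_rate
frame = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
freqs, power = power_spectrum(frame, sample_rate)
peak_freqs, peak_powers = find_spectral_peaks(freqs, power)
```

Because the peak frequency is simply the frequency of the FFT bin, its resolution is set by the padded FFT length, which is why the padding factor matters here.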
Find Most Prominent F0
• Pattern matching using ‘harmonic sieves’
• One sieve for every possible F0, with ‘slots’ for every partial
• The more spectral peaks pass through the slots of a sieve, the more likely the F0
• Problem: octave confusions
• Solution: for every sieve compute a ‘match’ as the sum of the weighted power of all allocated partials, with higher weights for lower partials
[Figure: spectrum, power against frequency]
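A harmonic sieve with a weighted match could look something like the sketch below. The slides only say that lower partials get higher weights; the `1/k` weighting, the 3% slot width (`tolerance`) and the candidate grid are assumptions made for illustration.

```python
import numpy as np

def most_prominent_f0(peak_freqs, peak_powers, f0_candidates,
                      n_partials=15, tolerance=0.03):
    """Return (f0, match) for the sieve that collects the most weighted power."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    peak_powers = np.asarray(peak_powers, dtype=float)
    best_f0, best_match = None, -np.inf
    for f0 in f0_candidates:
        match = 0.0
        for k in range(1, n_partials + 1):
            slot = k * f0  # expected frequency of partial k
            in_slot = np.abs(peak_freqs - slot) <= tolerance * slot
            if in_slot.any():
                # Assumed weighting: power of the strongest peak in the slot,
                # divided by the partial number so lower partials count more.
                match += peak_powers[in_slot].max() / k
        if match > best_match:
            best_f0, best_match = f0, match
    return best_f0, best_match

# Usage: four harmonics of 440 Hz. The sieves for 220 Hz and 440 Hz both
# catch all four peaks, but the weighting favours 440 Hz.
peaks_hz = [440.0, 880.0, 1320.0, 1760.0]
peaks_pow = [1.0, 0.8, 0.6, 0.4]
f0, match = most_prominent_f0(peaks_hz, peaks_pow, [220.0, 440.0, 880.0])
```

With a plain peak count, the 220 Hz and 440 Hz sieves would tie in this example, which is the octave confusion the weighting is meant to resolve.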
F0 Restriction
• Especially for woodwinds, many of the estimated F0s were below the range of the instrument
• These F0s were either erroneous or belonged to the accompaniment, the latter inevitable in sections where the solo instrument is silent
• Only the highest 50% of all estimated F0s were used for instrument recognition
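One simple way to apply this restriction is to threshold at the median of the F0 estimates. The sketch below assumes the 50% is taken over all frames of one recording; the slides do not say how the split was computed.

```python
import numpy as np

def keep_highest_f0s(f0_track, fraction=0.5):
    """Boolean mask selecting frames whose F0 lies in the top `fraction`."""
    f0_track = np.asarray(f0_track, dtype=float)
    threshold = np.quantile(f0_track, 1.0 - fraction)
    return f0_track >= threshold

# Usage: low estimates (likely accompaniment or errors) are discarded.
f0_track = np.array([110.0, 131.0, 440.0, 494.0, 523.0, 98.0, 587.0, 659.0])
mask = keep_highest_f0s(f0_track)
kept = f0_track[mask]
```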
Compute Features
• Frequency and power of first 15 partials
• Frame to frame differences (deltas and delta-deltas) within tones of continuous F0
                frequency             power
partials        220  442  658 ...     60  50  44 ...
deltas           +2   -1   +5 ...      0  +5  -3 ...
delta-deltas     +1    0   -1 ...      0  +3  +1 ...
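The delta computation can be illustrated as below. This is a sketch under assumptions: tones are segmented wherever the F0 jumps by more than 3% between frames (`max_jump`), and deltas are plain first differences; neither detail is given on the slides.

```python
import numpy as np

def split_into_tones(f0_track, max_jump=0.03):
    """Split frame indices into runs in which F0 changes by at most max_jump."""
    f0_track = np.asarray(f0_track, dtype=float)
    rel_change = np.abs(np.diff(f0_track)) / f0_track[:-1]
    boundaries = np.flatnonzero(rel_change > max_jump) + 1
    return np.split(np.arange(len(f0_track)), boundaries)

def add_deltas(features, f0_track):
    """Stack static features, deltas and delta-deltas for every usable frame."""
    features = np.asarray(features, dtype=float)
    rows = []
    for tone in split_into_tones(f0_track):
        if len(tone) < 3:  # three frames are needed for one delta-delta
            continue
        static = features[tone]
        delta = np.diff(static, axis=0)   # frame-to-frame differences
        delta2 = np.diff(delta, axis=0)   # differences of the deltas
        rows.append(np.hstack([static[2:], delta[1:], delta2]))
    if not rows:
        return np.empty((0, 3 * features.shape[1]))
    return np.vstack(rows)

# Usage: two features per frame (frequency and power of one partial).
f0_track = [220.0, 221.0, 220.0, 330.0, 331.0]
features = [[220, 60], [222, 60], [221, 65], [330, 50], [331, 52]]
full = add_deltas(features, f0_track)
```

Differences are never taken across a tone boundary, so the jump from 220 Hz to 330 Hz in the usage example does not leak into the deltas.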
Instrument Recognition
• Gaussian mixture models (GMMs), trained on solo music and isolated tone samples
• One model for every F0 of every instrument
• Very homogeneous training data:
  • only one centre per model needed
  • very fast convergence during training
• Recognition efficient because models can be restricted to those trained on the current F0
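Since only one centre per model is needed, each model reduces to a single Gaussian. The sketch below illustrates the idea with one diagonal-covariance Gaussian per (instrument, F0) pair. The diagonal covariance, the variance floor, the use of an integer F0 bin (such as a MIDI note number) as the key, and the summing of frame log-likelihoods over a recording are all assumptions, and the class name `F0Gaussians` is hypothetical.

```python
import numpy as np

class F0Gaussians:
    """One diagonal Gaussian per (instrument, F0 bin)."""

    def __init__(self, var_floor=1e-3):
        self.var_floor = var_floor  # guards against zero variance
        self.models = {}

    def fit(self, instrument, f0_bin, frames):
        frames = np.asarray(frames, dtype=float)
        mean = frames.mean(axis=0)
        var = np.maximum(frames.var(axis=0), self.var_floor)
        self.models[(instrument, f0_bin)] = (mean, var)

    def log_likelihood(self, instrument, f0_bin, frame):
        mean, var = self.models[(instrument, f0_bin)]
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (frame - mean) ** 2 / var)

    def classify(self, frames, f0_bins):
        """Sum log-likelihoods per instrument, using only models of each frame's F0."""
        totals = {}
        for frame, f0_bin in zip(np.asarray(frames, dtype=float), f0_bins):
            for instrument, model_bin in self.models:
                if model_bin != f0_bin:
                    continue  # skip models trained on other F0s
                score = self.log_likelihood(instrument, f0_bin, frame)
                totals[instrument] = totals.get(instrument, 0.0) + score
        return max(totals, key=totals.get) if totals else None

# Usage with toy two-dimensional features at a single F0 bin (69).
clf = F0Gaussians()
clf.fit("flute", 69, [[0, 0], [1, 0], [0, 1], [1, 1]])
clf.fit("violin", 69, [[3, 3], [4, 3], [3, 4], [4, 4]])
label = clf.classify([[3.4, 3.6], [3.7, 3.2]], [69, 69])
```

Restricting the inner loop to models of the current F0 is what keeps recognition cheap: each frame is scored against one model per instrument rather than against every model.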
Results I
• Isolated tone samples (monophonic)
• Average recognition accuracy: 67%
Confusion matrix, isolated tones (rows: stimulus, columns: response):
            flute  clarinet  oboe  violin  cello
flute        76%      9%      3%    12%     1%
clarinet     16%     64%      9%     8%     2%
oboe          6%     16%     57%    13%     7%
violin        5%      1%      5%    71%    18%
cello         3%      2%      7%    21%    68%
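As a quick check on how the summary figure relates to the matrix, the sketch below takes the unweighted mean of the diagonal and finds the most frequent confusion for each instrument. Treating the average as an unweighted mean over instruments is an assumption; it does reproduce the reported 67%.

```python
import numpy as np

instruments = ["flute", "clarinet", "oboe", "violin", "cello"]
# Rows: stimulus, columns: response (percent), from the isolated-tone matrix.
confusion = np.array([
    [76,  9,  3, 12,  1],
    [16, 64,  9,  8,  2],
    [ 6, 16, 57, 13,  7],
    [ 5,  1,  5, 71, 18],
    [ 3,  2,  7, 21, 68],
])

# Unweighted mean of the per-instrument accuracies on the diagonal.
average_accuracy = np.diag(confusion).mean()

# Most frequent wrong response per stimulus: mask the diagonal, take the row argmax.
off_diag = confusion.copy()
np.fill_diagonal(off_diag, -1)
most_confused = {
    instruments[i]: instruments[j]
    for i, j in enumerate(off_diag.argmax(axis=1))
}
```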
Results II
• Realistic monophonic phrases, 2-10 sec.: 84% correct
• Solo instrument with accompaniment (piano or orchestra), 2-3 min.: 86% correct
Confusion matrix, accompanied solo instruments (rows: stimulus, columns: response):
            flute  clarinet  oboe  violin  cello
flute        75%      0%      0%    25%     0%
clarinet      6%     88%      0%     6%     0%
oboe          0%      0%     82%    18%     0%
violin        0%      0%      0%    88%    12%
cello         0%      0%      0%     6%    92%
Conclusions and Future Work
• Recognition accuracy comparable to that of other systems designed to deal with monophonic music only
• Phrases were classified better than isolated tones, a common phenomenon: in longer and more varied examples, isolated random errors are more likely to be evened out
• No drop in recognition accuracy between monophonic phrases and those with accompaniment
• Very good results when averaging over whole sound files, but not accurate enough for note-by-note classifications
• Use knowledge about the solo instrument to extract the melody line
• Distinction between solo instrument present / silent necessary