Investigation of Pitch Detection Characteristics from Different Audio Context.
Transcript of Investigation of Pitch Detection Characteristics from Different Audio Context.
![Page 1: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/1.jpg)
Investigation of Pitch Detection Characteristics from Different Audio Context
![Page 2: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/2.jpg)
Part 1: Introduction
![Page 3: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/3.jpg)
Pitch Detection Characteristics from Different Audio Context
Motivations: Testing pitch detection algorithms using imperfect audio material
• A musical note can itself be very complex.
• Much audio material is recorded under imperfect conditions, for example with noise or with interference from other instruments in an ensemble recording.
• Existing source separation algorithms usually provide incomplete separation.
Testing and Evaluation Goals:
• Pitch detection performance analysis using synthesized notes
• Pitch detection performance analysis using real musical notes
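Testing Framework:
[Diagram: the source audio signal is passed to MIR Toolbox, giving pitch detection result 1. Distortion, an interference note, or noise is added to the source at a set SNR to form a combined audio signal (simulating imperfect audio), which is passed to MIR Toolbox, giving pitch detection result 2.]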
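The mixing step of the framework can be sketched as follows. This is a minimal illustration in Python, not the authors' code (MIR Toolbox is a MATLAB toolbox). The function names `mix_at_snr` and `measure_snr_db` are hypothetical, SNR is assumed to mean the ratio of mean source power to mean disturbance power, and the two pure sinusoids are stand-ins for the real notes used in the slides.

```python
import numpy as np

def mix_at_snr(source, disturbance, target_snr_db):
    """Scale `disturbance` so the source-to-disturbance power ratio equals
    target_snr_db, then add it to `source`. Returns (combined, scaled)."""
    source = np.asarray(source, dtype=float)
    disturbance = np.asarray(disturbance, dtype=float)
    n = min(len(source), len(disturbance))      # trim to a common length
    source, disturbance = source[:n], disturbance[:n]
    p_source = np.mean(source ** 2)
    p_dist = np.mean(disturbance ** 2)
    if p_dist == 0:
        raise ValueError("disturbance signal is silent")
    # Choose the gain g so that p_source / (g^2 * p_dist) = 10^(SNR/10).
    gain = np.sqrt(p_source / (p_dist * 10 ** (target_snr_db / 10)))
    scaled = gain * disturbance
    return source + scaled, scaled

def measure_snr_db(source, disturbance):
    """SNR in dB as a ratio of mean powers (assumed definition)."""
    return 10 * np.log10(np.mean(np.square(source)) / np.mean(np.square(disturbance)))

# Stand-in signals: pure sinusoids at the f0 values reported in the slides.
fs = 44100                                   # assumed sample rate
t = np.arange(int(0.6 * fs)) / fs
source = np.sin(2 * np.pi * 596.90 * t)
interference = np.sin(2 * np.pi * 1040.83 * t)

combined, scaled = mix_at_snr(source, interference, 3.5)
print(measure_snr_db(source, scaled))        # should be very close to 3.5
```

The combined signal would then go to the pitch detector, and its result would be compared with the result for the clean source.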
![Page 4: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/4.jpg)
Part 2: Pitch Detection Performance on Synthesized Notes
![Page 5: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/5.jpg)
• Synthesized tone of 440 Hz.
Input: 440 Hz → MIR Toolbox → detected pitch: 440.5227 Hz
[Figure: Spectrogram; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 2 × 10^4]
Synthesized Notes of Different Complexity
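The slides do not say which MIR Toolbox pitch method or settings were used. As a rough stand-in for a pitch detector, here is a small autocorrelation-based f0 estimator in Python, applied to a plain 440 Hz sine. The function name `estimate_f0_autocorr`, the search range, and the sample rate are assumptions for illustration only; this is not MIR Toolbox's algorithm.

```python
import numpy as np

def estimate_f0_autocorr(x, fs, fmin=50.0, fmax=2000.0):
    """Estimate f0 from the highest autocorrelation peak in [fmin, fmax]."""
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    n = len(x)
    # Linear autocorrelation via FFT (zero-padded to avoid wrap-around).
    nfft = 1 << (2 * n - 1).bit_length()
    spectrum = np.fft.rfft(x, nfft)
    acf = np.fft.irfft(np.abs(spectrum) ** 2, nfft)[:n]
    # Restrict the lag search to the allowed pitch range.
    lag_min = max(1, int(fs / fmax))
    lag_max = min(int(fs / fmin), n - 2)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    # Parabolic interpolation around the peak for sub-sample lag resolution.
    a, b, c = acf[lag - 1], acf[lag], acf[lag + 1]
    denom = a - 2 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return fs / (lag + shift)

fs = 44100                                   # assumed sample rate
t = np.arange(int(0.9 * fs)) / fs
tone = np.sin(2 * np.pi * 440.0 * t)
print(estimate_f0_autocorr(tone, fs))        # should be close to 440 Hz
```

A small offset from the nominal 440 Hz, like the 440.5227 Hz reported in the slide, is expected from any estimator working on a finite-length signal with finite lag or frequency resolution.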
![Page 6: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/6.jpg)
• Synthesized tone of 440 Hz.
Input: 440 Hz → MIR Toolbox → detected pitch: 440.5227 Hz
[Figure: Spectrogram, zoomed in; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 1500]
Synthesized Notes of Different Complexity
![Page 7: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/7.jpg)
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to make the note more complex.
• AM index = 0.5, maximum frequency deviation = 10 Hz at f1
Input: 440 Hz → MIR Toolbox → detected pitch: 443.3253 Hz
[Figure: Spectrogram; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 2 × 10^4]
Synthesized Notes of Different Complexity
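A note of this kind could be synthesized roughly as below. This is a sketch under stated assumptions, not the authors' synthesis code. The slides give only the AM index and the maximum frequency deviation at f1, so the number of partials, their 1/k amplitudes, the modulation rates, and the scaling of the deviation with partial number are all hypothetical choices. The name `synth_note` is also made up for this example.

```python
import numpy as np

def synth_note(f0=440.0, fs=44100, duration=0.9, n_partials=10,
               am_index=0.5, max_dev_hz=10.0, am_rate_hz=5.0, fm_rate_hz=5.0):
    """Harmonic tone with sinusoidal AM and FM applied to every partial.

    Assumed model for partial k:
      instantaneous frequency  k * (f0 + max_dev_hz * cos(2*pi*fm_rate_hz*t))
      amplitude envelope       (1/k) * (1 + am_index * cos(2*pi*am_rate_hz*t))
    """
    t = np.arange(int(duration * fs)) / fs
    note = np.zeros_like(t)
    for k in range(1, n_partials + 1):
        if k * (f0 + max_dev_hz) >= fs / 2:   # stay below the Nyquist frequency
            break
        # Phase is the time integral of 2*pi times the instantaneous frequency.
        phase = (2 * np.pi * k * f0 * t
                 + k * (max_dev_hz / fm_rate_hz) * np.sin(2 * np.pi * fm_rate_hz * t))
        envelope = 1.0 + am_index * np.cos(2 * np.pi * am_rate_hz * t)
        note += (1.0 / k) * envelope * np.cos(phase)
    return t, note / np.max(np.abs(note))     # normalize to a peak of 1

# Settings from the slides: AM index 0.5, deviation 10 Hz (or 40 Hz) at f1.
t_mod, note_10hz = synth_note(max_dev_hz=10.0)
t_mod, note_40hz = synth_note(max_dev_hz=40.0)
```

Setting `am_index=0`, `max_dev_hz=0`, and `n_partials=1` reduces the model to a plain 440 Hz cosine, which gives a simple baseline for comparing detector output as the modulation depth grows.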
![Page 8: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/8.jpg)
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to make the note more complex.
• AM index = 0.5, maximum frequency deviation = 10 Hz at f1
Input: 440 Hz → MIR Toolbox → detected pitch: 443.3253 Hz
[Figure: Spectrogram, zoomed in; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 1500]
Synthesized Notes of Different Complexity
![Page 9: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/9.jpg)
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to make the note more complex.
• AM index = 0.5, maximum frequency deviation = 10 Hz at f1
Input: 440 Hz → MIR Toolbox → detected pitch: 443.3253 Hz
Synthesized Notes of Different Complexity
![Page 10: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/10.jpg)
![Page 11: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/11.jpg)
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to make the note more complex.
• AM index = 0.5, maximum frequency deviation = 40 Hz at f1
Input: 440 Hz → MIR Toolbox → detected pitch: 449.7704 Hz
[Figure: Spectrogram; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 2 × 10^4]
Synthesized Notes of Different Complexity
![Page 12: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/12.jpg)
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to make the note more complex.
• AM index = 0.5, maximum frequency deviation = 40 Hz at f1
Input: 440 Hz → MIR Toolbox → detected pitch: 449.7704 Hz
[Figure: Spectrogram, zoomed in; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 1500]
Synthesized Notes of Different Complexity
![Page 13: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/13.jpg)
Part 3: Pitch Detection Performance on Real Musical Notes
![Page 14: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/14.jpg)
Interference from Another Music Note
[Figure: spectrograms of the target signal, the interference signal, and the combined test signal (target + interference); x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
• Source note → MIR Toolbox → f0 = 596.90 Hz
• Interference note → MIR Toolbox → f0 = 1040.83 Hz
• Combined note (source note + interference note, SNR = 3.5 dB) → MIR Toolbox → f0 = 1034.02 Hz (wrong)
![Page 15: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/15.jpg)
Interference from Another Music Note
• Combined note → MIR Toolbox → f0 = 1034.02 Hz (wrong)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 2.5 × 10^4; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 7000; y-axis: magnitude, 0 to 700]
![Page 16: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/16.jpg)
Interference from Another Music Note
• Combined note → MIR Toolbox → f0 = 1034.02 Hz (wrong)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 7000; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, zoomed in, combined test signal; x-axis: frequency (Hz), 1000 to 1070; y-axis: magnitude, 0 to 600]
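The "FFT of sustained part" views can be approximated as below: take a segment from the middle of the note, window it, compute the magnitude spectrum, and look for the strongest peak inside a zoom band. This is an illustrative Python sketch. The function name `strongest_peak_hz`, the segment boundaries, the Hann window, and the zero-padding factor are assumptions, and two pure sinusoids stand in for the real notes.

```python
import numpy as np

def strongest_peak_hz(x, fs, start_s, end_s, band_hz):
    """Return (frequency, magnitude) of the largest spectral peak of the
    segment x[start_s:end_s] within the band (low_hz, high_hz)."""
    segment = np.asarray(x, dtype=float)[int(start_s * fs):int(end_s * fs)]
    segment = segment * np.hanning(len(segment))   # reduce spectral leakage
    nfft = 8 * len(segment)                        # zero-pad for a finer grid
    magnitude = np.abs(np.fft.rfft(segment, nfft))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    low_hz, high_hz = band_hz
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    idx = int(np.argmax(np.where(in_band, magnitude, 0.0)))
    return freqs[idx], magnitude[idx]

# Stand-in combined signal: sinusoids at the two f0 values from the slides.
fs = 44100                                         # assumed sample rate
t = np.arange(int(0.6 * fs)) / fs
combined = np.sin(2 * np.pi * 596.90 * t) + np.sin(2 * np.pi * 1040.83 * t)

# Assume the sustained part spans 0.2 s to 0.5 s of the note.
peak_high, _ = strongest_peak_hz(combined, fs, 0.2, 0.5, (1000.0, 1070.0))
peak_low, _ = strongest_peak_hz(combined, fs, 0.2, 0.5, (560.0, 630.0))
print(peak_high, peak_low)   # peaks should sit near 1040.83 Hz and 596.90 Hz
```

Comparing the peak heights in the two bands gives a quick check of which note dominates the combined spectrum at a given SNR.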
![Page 17: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/17.jpg)
Interference from Another Music Note
[Figure: spectrograms of the target signal, the interference signal, and the combined test signal (target + interference); x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
• Source note → MIR Toolbox → f0 = 596.90 Hz
• Interference note → MIR Toolbox → f0 = 1040.83 Hz
• Combined note (source note + interference note, SNR = 8.61 dB) → MIR Toolbox → f0 = 596.47 Hz (right)
![Page 18: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/18.jpg)
Interference from Another Music Note
• Combined note → MIR Toolbox → f0 = 596.47 Hz (right)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 2.5 × 10^4; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 7000; y-axis: magnitude, 0 to 700]
![Page 19: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/19.jpg)
Interference from Another Music Note
• Combined note → MIR Toolbox → f0 = 596.47 Hz (right)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 7000; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, zoomed in, combined test signal; x-axis: frequency (Hz), 560 to 630; y-axis: magnitude, 0 to 500]
![Page 20: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/20.jpg)
Interference from Another Music Note
[Figure: Pitch detection change with SNR; x-axis: SNR (dB), -10 to 25; y-axis: detected pitch (Hz), 550 to 1050]
[Figure: Pitch detection change with SNR; x-axis: SNR (dB), -10 to 25; y-axis: detected pitch (Hz), 0 to 1200]
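A sweep of this kind can be sketched as follows: rescale the interference for each SNR, mix, and record the detected pitch. This Python example is a toy under stated assumptions. It uses the strongest spectral peak as a crude stand-in for a pitch detector and pure sinusoids as stand-ins for the notes, so its flip point between the two pitches need not match the one observed with MIR Toolbox on real notes. The names `dominant_frequency` and `sweep_snr` are hypothetical.

```python
import numpy as np

def dominant_frequency(x, fs):
    """Frequency of the largest magnitude-spectrum peak (crude pitch proxy)."""
    windowed = np.asarray(x, dtype=float) * np.hanning(len(x))
    nfft = 4 * len(x)                              # zero-pad for a finer grid
    magnitude = np.abs(np.fft.rfft(windowed, nfft))
    return np.fft.rfftfreq(nfft, d=1.0 / fs)[int(np.argmax(magnitude))]

def sweep_snr(target, interference, fs, snr_values_db):
    """Return a list of (snr_db, detected_hz) pairs for the mixed signal."""
    p_target = np.mean(target ** 2)
    p_interf = np.mean(interference ** 2)
    results = []
    for snr_db in snr_values_db:
        # Gain that sets the target-to-interference power ratio to snr_db.
        gain = np.sqrt(p_target / (p_interf * 10 ** (snr_db / 10)))
        mixed = target + gain * interference
        results.append((snr_db, dominant_frequency(mixed, fs)))
    return results

fs = 44100                                         # assumed sample rate
t = np.arange(int(0.6 * fs)) / fs
target = np.sin(2 * np.pi * 596.90 * t)
interference = np.sin(2 * np.pi * 1040.83 * t)

results = sweep_snr(target, interference, fs, list(range(-10, 26, 5)))
for snr_db, detected_hz in results:
    print(f"SNR {snr_db:>4} dB -> detected {detected_hz:.2f} Hz")
```

At low SNR the proxy should lock onto the interference near 1040.83 Hz, and at high SNR onto the target near 596.90 Hz, which mirrors the step shape of the plotted curve.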
![Page 21: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/21.jpg)
Interference from Noise
[Figure: spectrograms of the target signal, the interference signal (noise), and the combined test signal (target + noise); x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
• Source note → MIR Toolbox → f0 = 596.90 Hz
• Combined note (source note + noise, SNR = 3.29 dB) → MIR Toolbox → f0 = 596.65 Hz (right)
![Page 22: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/22.jpg)
Interference from Noise
• Combined note → MIR Toolbox → f0 = 596.76 Hz (right)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 2.5 × 10^4; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 6000; y-axis: magnitude, 0 to 700]
![Page 23: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/23.jpg)
Interference from Noise
• Combined note → MIR Toolbox → f0 = 596.76 Hz (right)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 6000; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, zoomed in, combined test signal; x-axis: frequency (Hz), 560 to 630; y-axis: magnitude, 0 to 500]
![Page 24: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/24.jpg)
Interference from Noise
[Figure: spectrograms of the target signal, the interference signal (noise), and the combined test signal (target + noise); x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
• Source note → MIR Toolbox → f0 = 596.90 Hz
• Combined note (source note + noise, SNR = -2.63 dB) → MIR Toolbox → f0 = 600.06 Hz (right)
![Page 25: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/25.jpg)
Interference from Noise
• Combined note → MIR Toolbox → f0 = 600.06 Hz (right)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 2.5 × 10^4; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 6000; y-axis: magnitude, 0 to 700]
![Page 26: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/26.jpg)
Interference from Noise
• Combined note → MIR Toolbox → f0 = 600.06 Hz (right)
[Figure: Spectrogram of combined test signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]
[Figure: FFT of sustained part, combined test signal; x-axis: frequency (Hz), 0 to 6000; y-axis: magnitude, 0 to 700]
[Figure: FFT of sustained part, zoomed in, combined test signal; x-axis: frequency (Hz), 560 to 630; y-axis: magnitude, 0 to 500]
![Page 27: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/27.jpg)
Interference from Noise
[Figure: Pitch detection change with SNR (two panels); x-axis: SNR (dB), -25 to 20; y-axis: detected pitch (Hz), 0 to 1200]
![Page 28: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/28.jpg)
Conclusions
![Page 29: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/29.jpg)
Conclusions
• We implemented a framework for evaluating pitch detection algorithms on audio of varying quality.
• We tested the MIR Toolbox pitch detection algorithms on both synthesized and real musical notes.
• We investigated three factors that affect pitch detection performance: the complexity of the note, interference from a concurrent note, and noise.
![Page 30: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/30.jpg)
Q&A
![Page 31: Investigation of Pitch Detection Characteristics from Different Audio Context.](https://reader035.fdocuments.us/reader035/viewer/2022062619/55163533550346b2068b4e61/html5/thumbnails/31.jpg)
Thank you!