Reporter : Chia-Cheng Chen
Advisor : Wen-Ping Chen
Network Application Laboratory
Department of Electrical Engineering
National Kaohsiung University of Applied Sciences
A Study of Single Channel Blind Source Separation and Recognition
Based on Mixed-State Prediction
Outline
Introduction and Motivation
Background
Research Methods
Experimental Results
Conclusion and Future Works
Research Results
Introduction
The applications of voiceprint recognition systems
• Call routing (1997)
• Jupiter (1997)
• Let’s Go! (2002)
• Siri (2010)
• Skyvi (2011)
• Vlingo (2011)
Introduction
Current status of ecological surveys:
• Sensor networks
• Wireless networks
• Database
• Voiceprint recognition system
Advantage
• Reduce the cost of human resources and time
• Save and share the raw data conveniently
Introduction
Blind Source Separation
http://metadata.froghome.org/about.php (species description data for amphibians of Taiwan)
Introduction
?Blind Source Separation
Introduction
Voiceprint recognition
• C.J. Huang, Y.J. Yang, D.X. Yang and Y.J. Chen, “Frog classification
using machine learning techniques,” Expert Systems with Applications,
Vol. 36, No. 2, pp. 3737-3743, 2009. (SCI)
• S.C. Hsieh, W.P. Chen, W.C. Lin, F.S. Chou, and J.R. Lai, “Endpoint
detection of frog croak syllables with using average energy entropy
method,” Taiwan Journal of Forest Science, Vol.27, No.2, pp.149-161,
Jun. 2012. (EI)
• W.P. Chen, S.S. Chen, C.C. Lin, Y.Z. Chen and W.C. Lin, “Automatic
recognition of frog call using multi-stage average spectrum,” Computers
& Mathematics with Applications, Vol. 64, No. 5, pp. 1270-1281, Sep.
2012. (SCI)
Introduction
Single channel source separation
• M.N. Schmidt and M. Mørup, “Nonnegative matrix factor 2-D
deconvolution for blind single channel source separation,” Proceedings of
International Conferences Independent Component Analysis and Blind
Signal Separation, Vol. 3889, pp. 700-707, Mar. 2006. (SCI)
• S. Kırbız and B. Gunsel, “Perceptually weighted non-negative matrix
factorization for blind single-channel music source separation,” 21st
International Conference on Pattern Recognition, Nov. 2012. (EI)
Motivation
Automatic frog species voiceprint recognition system
• Predicting the number of mixed signals
• Single channel blind source separation
• Biologists
• People
Outline
Introduction and Motivation
Background
Research Methods
Experimental Results
Conclusion and Future Works
Research Results
Background
• Blind Source Separation: Non-negative Matrix Factor 2-D Deconvolution
• Matching: Adaptive Multi-stage Average Spectrum
• Feature Extraction: Mel-frequency Cepstrum Coefficient
• Endpoint Detection: Time Domain, Frequency Domain
• Signal Processing: Pre-emphasis, Frame, Window
Background
Voiceprint Recognition
Signal Processing → Syllable Segmentation → Feature Extraction → Matching
Signal Processing
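Frog Signal
• Pre-emphasis: $\hat{s}(n) = s(n) - \alpha\, s(n-1)$
• Frame
• Hamming Window: $w(n) = \hat{s}(n) \left[ 0.54 - 0.46 \cos\left( \frac{2\pi n}{N-1} \right) \right]$
• Resample: 44100 Hz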
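The pre-emphasis, framing, and Hamming-window steps on this slide can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the coefficient `alpha = 0.95` and all function names are assumptions, while the 512-sample frame and 50% overlap are the values listed in the parameter table under Experimental Results. Resampling is omitted.

```python
import numpy as np

def pre_emphasis(s, alpha=0.95):
    """s_hat(n) = s(n) - alpha * s(n-1). alpha = 0.95 is an assumed value."""
    s = np.asarray(s, dtype=float)
    s_hat = s.copy()
    s_hat[1:] = s[1:] - alpha * s[:-1]   # the first sample has no predecessor
    return s_hat

def frame_signal(s_hat, frame_len=512, overlap=0.5):
    """Cut the signal into overlapping frames; leftover tail samples are dropped."""
    s_hat = np.asarray(s_hat, dtype=float)
    if len(s_hat) < frame_len:
        raise ValueError("signal is shorter than one frame")
    hop = int(frame_len * (1.0 - overlap))
    n_frames = 1 + (len(s_hat) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return s_hat[idx]                    # shape: (n_frames, frame_len)

def hamming_window(frames):
    """w(n) = s_hat(n) * (0.54 - 0.46 * cos(2*pi*n / (N-1))) applied per frame."""
    N = frames.shape[1]
    n = np.arange(N)
    window = 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (N - 1))
    return frames * window

if __name__ == "__main__":
    t = np.arange(44100) / 44100.0       # one second at 44100 Hz
    signal = np.sin(2.0 * np.pi * 2000.0 * t)
    frames = hamming_window(frame_signal(pre_emphasis(signal)))
    print(frames.shape)
```

Each row of the result is one windowed frame, ready for the FFT used in the later slides.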
Syllable Segmentation
Endpoint Detection Algorithm
• Energy
  • Time Domain
  • Simple
  • Square of the Amplitude or Absolute Value of the Amplitude
  • Vulnerable to Noise Impact
• Entropy
  • Frequency Domain
  • Complex
  • Noise Immunity
Average Energy Entropy
Signal Transform
$X(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N-1$
s(n) : windowed signal
N : frame size
k : frequency component
Average Energy
$u = \sum_{n=0}^{N-1} \frac{A(n)}{N}$
u : the mean energy of the input signal
A(n) : the amplitude value of the input signal
N : total number of samples in the input signal
Average Energy Entropy
Probability Density Function
$p'_i = \frac{E(f_i) + \beta u}{\sum_{m=0}^{M-1} \left( E(f_m) + \beta u \right)}, \quad 0 \le i \le M-1$
E(f_i) : the spectral energy for the frequency f_i
p'_i : the corresponding probability density
M : total number of frequency components in FFT
β : multiplier
Average Energy Entropy
$H' = \sum_{i=0}^{N-1} p'_i \log p'_i$
H' : the negative entropy for each frame
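The three AEE steps (FFT, probability density, negative entropy) can be sketched for a single frame as follows. Several points are assumptions rather than facts from the slides: the spectral energy is taken as E(f_i) = |X(i)|², the term β·u is assumed to be added to every spectral energy before normalisation, A(n) is taken as the absolute sample value, and the natural logarithm is used. Function names are hypothetical.

```python
import numpy as np

def mean_amplitude(signal):
    """u = sum(A(n)) / N, with A(n) assumed to be the absolute sample value."""
    return float(np.mean(np.abs(np.asarray(signal, dtype=float))))

def average_energy_entropy(frame, u, beta=1.0):
    """Negative entropy H' of one windowed frame.

    Assumptions: E(f_i) = |X(i)|^2, beta * u is ADDED to each E(f_i),
    and log is the natural logarithm.
    """
    X = np.fft.fft(np.asarray(frame, dtype=float))   # X(k), k = 0..N-1
    E = np.abs(X) ** 2                               # spectral energy per bin
    shifted = E + beta * u
    p = shifted / shifted.sum()                      # probability density p'_i
    p = p[p > 0]                                     # skip bins where log is undefined
    return float(np.sum(p * np.log(p)))              # H' = sum p'_i log p'_i

if __name__ == "__main__":
    n = np.arange(512)
    tone = np.cos(2.0 * np.pi * 8.0 * n / 512.0)     # energy in two bins only
    noise = np.random.default_rng(0).standard_normal(512)
    print(average_energy_entropy(tone, mean_amplitude(tone)))
    print(average_energy_entropy(noise, mean_amplitude(noise)))
```

Under these assumptions a flat spectrum gives the most negative H', while a spectrum concentrated in a few bins gives a value closer to zero.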
Endpoint Detection Algorithm
[Figure: endpoint detection on the same recording. Panels: Signal, AEE, Absolute Energy, Square Energy]
Feature Extraction
$C_m = \sum_{k=1}^{M} E_k \cos\left[ m \left( k - \frac{1}{2} \right) \frac{\pi}{M} \right], \quad m = 1, \ldots, L$
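The cosine transform that turns M filter-bank energies E_k into L cepstral coefficients C_m can be sketched as below. The slide does not define E_k; this sketch assumes it is the (log) energy of the k-th Mel filter, and the choice of M = 26 filters in the example is hypothetical. L = 15 matches the feature dimension in the parameter table.

```python
import numpy as np

def cepstral_coefficients(E, L=15):
    """C_m = sum_{k=1..M} E_k * cos(m * (k - 1/2) * pi / M), for m = 1..L.

    E is assumed to hold the (log) Mel filter-bank energies of one frame.
    """
    E = np.asarray(E, dtype=float)
    M = E.shape[0]
    k = np.arange(1, M + 1)                      # filter index 1..M
    m = np.arange(1, L + 1)[:, None]             # coefficient index 1..L
    basis = np.cos(m * (k - 0.5) * np.pi / M)    # shape (L, M)
    return basis @ E

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_energies = np.log(rng.uniform(0.1, 1.0, size=26))  # hypothetical frame
    print(cepstral_coefficients(log_energies).shape)
```

A constant energy vector yields coefficients that are all numerically zero, which is a quick sanity check on the cosine basis.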
Adaptive Multi-stage Average Spectrum
Adaptive Clustering
[Figure: adaptive clustering into Cluster A and Cluster B]
Adaptive Multi-stage Average Spectrum
Template Training
[Figure: Frame 1 to Frame 7 grouped into Stage 1, Stage 2, and Stage 3]
$S_i(k) = \sum_{n=0}^{L_i - 1} \frac{X_n(k)}{L_i}$
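The stage template S_i(k) is the mean of the L_i frame spectra assigned to stage i. A minimal sketch, assuming the stage boundaries are already known (the adaptive clustering that chooses them is not reproduced here) and using a hypothetical 2/3/2 split of the seven frames shown in the figure:

```python
import numpy as np

def stage_average_spectra(X, stage_lengths):
    """Average consecutive frame spectra into one spectrum per stage.

    X: array of shape (num_frames, num_bins), X[n, k] = X_n(k).
    stage_lengths: number of frames L_i in each stage (assumed given).
    Returns an array of shape (num_stages, num_bins) holding S_i(k).
    """
    X = np.asarray(X, dtype=float)
    if sum(stage_lengths) != X.shape[0]:
        raise ValueError("stage lengths must cover every frame exactly once")
    stages, start = [], 0
    for L_i in stage_lengths:
        stages.append(X[start:start + L_i].mean(axis=0))  # sum X_n(k) / L_i
        start += L_i
    return np.vstack(stages)

if __name__ == "__main__":
    frames = np.arange(7, dtype=float)[:, None] * np.ones((1, 4))
    print(stage_average_spectra(frames, [2, 3, 2]))
```

The resulting rows serve as the per-stage template for one syllable; how stage lengths are chosen is left to the clustering step.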
Adaptive Multi-stage Average Spectrum
Template Training
[Figure: Minimum Cumulative Difference]
Adaptive Multi-stage Average Spectrum
Template Matching
[Figure: Unknown Audio, frames 1 to 7 plotted against Stage 1, Stage 2, and Stage 3; Minimum Cumulative Difference]
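Blind Source Separation
Non-negative Matrix Factor 2-D Deconvolution
• α basis matrix and β coefficient matrix
• Obtain the relations between the time and the pitch
• Shift operator
NMF2D model (Schmidt and Mørup, 2006): $V \approx \Lambda = \sum_{\tau} \sum_{\phi} \overset{\downarrow\phi}{W^{\tau}}\, \overset{\rightarrow\tau}{H^{\phi}}$
$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \quad \overset{\rightarrow 1}{A} = \begin{pmatrix} 0 & 1 & 2 \\ 0 & 4 & 5 \\ 0 & 7 & 8 \end{pmatrix}, \quad \overset{\downarrow 1}{A} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$
V : Original Signal
Λ : Reconstructed Signal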
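The shift operators from this slide, and the way NMF2D combines shifted factors into a reconstruction, can be sketched as follows. The reconstruction follows the model of the cited Schmidt and Mørup paper (a sum over shifts of down-shifted basis matrices times right-shifted coefficient matrices); the array layout and function names are assumptions for illustration, and no factor update rules are shown.

```python
import numpy as np

def shift_right(A, tau):
    """Move columns tau steps to the right, zero-filling from the left."""
    A = np.asarray(A)
    out = np.zeros_like(A)
    if tau < A.shape[1]:
        out[:, tau:] = A[:, :A.shape[1] - tau]
    return out

def shift_down(A, phi):
    """Move rows phi steps down, zero-filling from the top."""
    A = np.asarray(A)
    out = np.zeros_like(A)
    if phi < A.shape[0]:
        out[phi:, :] = A[:A.shape[0] - phi, :]
    return out

def nmf2d_reconstruct(W, H):
    """Lambda = sum over tau, phi of shift_down(W[tau], phi) @ shift_right(H[phi], tau).

    Assumed layout: W has shape (n_tau, M, d), H has shape (n_phi, d, N).
    """
    W = np.asarray(W, dtype=float)
    H = np.asarray(H, dtype=float)
    Lam = np.zeros((W.shape[1], H.shape[2]))
    for tau in range(W.shape[0]):
        for phi in range(H.shape[0]):
            Lam += shift_down(W[tau], phi) @ shift_right(H[phi], tau)
    return Lam

if __name__ == "__main__":
    A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(shift_right(A, 1))
    print(shift_down(A, 1))
```

With a single shift in each direction (n_tau = n_phi = 1) the reconstruction reduces to the ordinary product W H of plain NMF.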
Non-negative Matrix Factor 2-D Deconvolution
[Figure: factor blocks $W_d^{11}$, $W_d^{21}$, $W_d^{12}$, $W_d^{22}$ and $H_d^{11}$, $H_d^{12}$ used to approximate V from W and H]
Non-negative Matrix Factor 2-D Deconvolution
• Cost function
  • Based on Euclidean Distance
    $C_{ED} = \sum_{m,n} \left( V_{m,n} - \Lambda_{m,n} \right)^2$
  • Based on Kullback-Leibler Divergence
    $C_{KL} = \sum_{m,n} V_{m,n} \log \frac{V_{m,n}}{\Lambda_{m,n}} - V_{m,n} + \Lambda_{m,n}$
V : Original Signal, Λ : Reconstructed Signal
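Both cost functions compare the original spectrogram V with its reconstruction element by element. A minimal sketch, in which the small constant `eps` guarding the logarithm and the function names are assumptions added for numerical safety and illustration:

```python
import numpy as np

def cost_euclidean(V, Lam):
    """C_ED = sum over m, n of (V[m, n] - Lam[m, n])^2."""
    V = np.asarray(V, dtype=float)
    Lam = np.asarray(Lam, dtype=float)
    return float(np.sum((V - Lam) ** 2))

def cost_kl(V, Lam, eps=1e-12):
    """C_KL = sum over m, n of V*log(V/Lam) - V + Lam (eps avoids log of zero)."""
    V = np.asarray(V, dtype=float)
    Lam = np.asarray(Lam, dtype=float)
    return float(np.sum(V * np.log((V + eps) / (Lam + eps)) - V + Lam))

if __name__ == "__main__":
    V = np.array([[1.0, 2.0], [3.0, 4.0]])
    print(cost_euclidean(V, V + 1.0), cost_kl(V, 2.0 * V))
```

Both costs are zero when the reconstruction equals V and grow as the two matrices diverge.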
Outline
Introduction and Motivation
Background
Research Methods
Experimental Results
Conclusion and Future Works
Research Results
Research Methods
Mixed-State Prediction voiceprint recognition method
• Training
  • Mixed signal states
• Testing
  • Two-stage voiceprint recognition
  • Mixed-State Prediction
[Figure: flowchart of the proposed system. Shared front end: Signal Processing, Syllable Segmentation (AEE Endpoint Detection), Feature Extraction. Audio Training branch: AMASA Template Training, Standard Sample Template, Mixed States. Audio Testing branch: First Stage (Template Matching, Mix-State Prediction), decision "Number of Audio > 1" (Yes / No), Source Separation into Source 1, Source 2, …, Source m, Second Stage (Template Matching), output Species.]
[Figure: MFCC of an independent signal (Latouche's frog) and of a mixed signal (Moltrecht's green tree frog + Latouche's frog); pipeline: Signal Processing → Syllable Segmentation → Feature Extraction → Matching]
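Mixed signal states
[Figure: state tree rooted at S. Single-signal states A, B, C, D, … with branch labels a, b, c, d, …, i. Two-signal states AB (R21), AC (R22), BC (R23), AD (R24), BD (R25), CD (R26), …, R2n. Three-signal states ABC (R31), ABD (R32), ACD (R33), BCD (R34), …, R3n. Level labels i1, i2, i3; state counts iC2 and iC3.]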
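The states in the tree are the combinations of species taken one, two, or three at a time, which is why the two- and three-signal levels hold iC2 and iC3 states. A short sketch that enumerates them; the function name and the limit of three simultaneous sources (the deepest level shown in the figure) are assumptions for illustration:

```python
from itertools import combinations
from math import comb

def mixed_states(species, max_sources=3):
    """Return {k: list of state names} for k = 1..max_sources simultaneous callers."""
    return {
        k: ["".join(group) for group in combinations(species, k)]
        for k in range(1, max_sources + 1)
    }

if __name__ == "__main__":
    states = mixed_states("ABCD")
    for k, names in states.items():
        # the number of states at level k equals C(i, k) with i species
        print(k, len(names), comb(4, k), names)
```

For the four species A to D this yields six two-signal states and four three-signal states, matching the labels R21 to R26 and R31 to R34 in the figure.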
Mixed States
Average Energy
$E = \frac{1}{N} \sum_{k=0}^{N} \left| X(k) \right|^2$
E : the average energy of the spectrum X(k)
N : the length of the syllable
[Figure: Mixed signal and Independent signal]
Predicting the number of mixed signals
$dist = (E - a)^2$
E : the mean spectral energy for the test syllable
a : the mean energy of the training data
dist is compared with T
T : the separation threshold
[Figure: states containing A in the state tree: AB (R21), AC (R22), AD (R24), ABC (R31), ABD (R32), ACD (R33)]
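One way to read this slide is as a nearest-state search: compute the average energy E of the test syllable, measure dist = (E − a)² against the training mean a of every state, and take the closest state. The sketch below follows that reading, which is an assumption; so are the rule that a match is accepted when dist ≤ T, the state names with one letter per species, and all function names. T = 0.3 is the separation threshold from the parameter table.

```python
import numpy as np

def average_energy(X):
    """E = (1/N) * sum |X(k)|^2 over the spectrum of one syllable."""
    X = np.asarray(X)
    return float(np.sum(np.abs(X) ** 2) / len(X))

def predict_mixed_state(E, state_means, T=0.3):
    """Return (state, dist, accepted) for the state whose mean energy is closest to E.

    state_means maps a state name such as "AB" to its training mean energy a.
    accepted = dist <= T is an ASSUMED reading of the threshold test.
    """
    best_state, best_dist = None, float("inf")
    for state, a in state_means.items():
        dist = (E - a) ** 2
        if dist < best_dist:
            best_state, best_dist = state, dist
    return best_state, best_dist, best_dist <= T

if __name__ == "__main__":
    means = {"A": 1.0, "AB": 2.0, "ABC": 3.2}   # hypothetical training means
    state, dist, accepted = predict_mixed_state(2.1, means)
    print(state, len(state), accepted)          # len(state) = predicted number of sources
```

With one letter per species, the length of the returned state name gives the predicted number of mixed signals, which decides whether source separation runs before the second matching stage.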
Outline
Introduction and Motivation
Background
Research Methods
Experimental Results
Conclusion and Future Works
Research Results
Experimental Results
Parameters
Parameter             Value
Frame Length          512 samples
Frame Overlapping     50%
Window Function       Hamming Window
Frequency Bin         512
Feature Parameters    Mel-Frequency Cepstral Coefficient
Feature Dimensions    15
Separation Threshold  0.3
Experimental Results
Recognition Experiment
• Independent signals
Method  Total Syllable  Error Mixed  Correct Syllable  Accuracy (%)
DTW     373             31           282               75.6
AMSAS   373             31           317               84.71
Experimental Results
Recognition Experiment
• Mixed signals
Method  Total Syllable  Correct Syllable  Accuracy (%)
DTW     269             183               68.02
AMSAS   269             211               78.43

Total Syllable  Error Mixed  Correct Syllable  Accuracy (%)
167             36           131               78.44
Conclusion and Future Works
The proposed method
• Improves the mixed signal recognition rate
• Predicts the number of mixed signals
Conclusion and Future Works
Future Works
• Study of de-noise methods
• Collect more features between independent and mixed signals
• Mixed signals recognition within same species
• Collect various sounds of species, then improve the system performance
• Adopt Support Vector Machines (SVM), Neural Network…
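Research Results
Competition
• 7th Digital Signal Processing Creative Design Contest: finalist
  • Frog species voiceprint recognition system
• Project assistance
Project No.                  Title
NSC 100-2221-E-151-0117      A study of dynamic wavelength and bandwidth allocation and quality of service in WDM-EPON
NSC1002101010508-080702G1    A study of applying ecological informatics techniques to forest management
NSC1002101050511-060101G4    Application and study of wireless sensor networks in forest disaster monitoring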
Thank you for your attention !!