NOISE DETECTION AND CLASSIFICATION IN SPEECH SIGNALS WITH BOOSTING
Nobuyuki Miyake, Tetsuya Takiguchi and Yasuo Ariki
Department of Computer and System Engineering, Kobe University
Research purpose
Purpose: Detecting and classifying sudden and short-period noises.
Background: Sudden and short-period noises often degrade speech recognition systems in real environments. Noise reduction improves speech recognition performance. However, it is difficult to remove sudden and short-period noises because we do not know where the noise overlaps the speech or what kind of noise it is.
System overview
[Figure: system overview. Example input: a telephone call ("Well, I believe that you will ...") overlapped by a sudden clatter. Blocks shown: feature extraction, noise detection using AdaBoost (clean speech vs. noisy speech overlapped by sudden noises), noise classification using AdaBoost, smoothing, and final results.]
AdaBoost
$H(x) = \mathrm{sgn}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$
$h_t(x)$: weak classifier; $\alpha_t$: classifier's weight.
[Figure: weak classifier $h_1(x)$.]
The training data are labeled with {-1, +1}: +1 is the label for noisy speech data and -1 is the label for clean speech data.
AdaBoost is one of the boosting methods. It selects the weak classifiers and determines their weights.
Multi-class classification using AdaBoost
We perform multi-class classification using AdaBoost in order to determine the noise class. Because the AdaBoost described here separates only two classes, it must be extended to handle multiple classes.
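[Figure: the feature vector is passed to K AdaBoost classifiers (class 1 or other class, ..., class K or other class), each trained with its own labels (label of class 1, label of class 2, label of class 3, ...).]
Output of the $k$-th classifier: $H^k(x) = \sum_{t=1}^{T} \alpha_t^k h_t^k(x)$
Find the maximum value among the outputs: $C(x) = \arg\max_k H^k(x)$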
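The one-against-the-rest scheme can be illustrated with a minimal Python sketch. It assumes that each two-class classifier is available as a function returning a real-valued score; the helper names `one_vs_rest_labels` and `classify` and the toy scorers are hypothetical and are not taken from the poster.

```python
def one_vs_rest_labels(labels, k):
    """Relabel multi-class data for the k-th two-class problem:
    +1 for class k, -1 for every other class."""
    return [1 if c == k else -1 for c in labels]


def classify(scorers, x):
    """Return the class whose two-class score is largest.

    scorers maps a class name to a function giving the real-valued
    (unthresholded) output of that class's classifier.
    """
    return max(scorers, key=lambda k: scorers[k](x))


# Toy scorers standing in for trained classifiers (illustrative only).
toy_scorers = {
    "class 1": lambda x: -abs(x - 0.0),
    "class 2": lambda x: -abs(x - 5.0),
    "class 3": lambda x: -abs(x - 10.0),
}
print(classify(toy_scorers, 4.2))
print(one_vs_rest_labels(["a", "b", "c", "b"], "b"))
```

Using the unthresholded score rather than its sign is what makes the maximum meaningful: several two-class classifiers may all output +1 for the same frame.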
The weak classifiers are combined into the strong classifier $H(x)$.
If a speech recognition system can detect sudden noises, it can ask the speaker to repeat the utterance. If it can also determine what kind of noise overlaps the speech and where, this information will be useful for noise reduction or model composition.
$h_t(x) \in \{-1, +1\}$
Noise detection using AdaBoost
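[Figure: frames of clean speech and noise-overlapped speech; a feature vector is extracted from each frame and passed to AdaBoost.]
AdaBoost determines whether each frame is clean speech or speech overlapped by noise.
[Figure: frame sequence $x_1, x_2, x_3, x_4, x_5, \ldots$, each frame marked s or n.]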
Learning
AdaBoost builds a strong classifier that separates clean speech frames from noisy speech frames using these data. Each weak classifier is a one-dimensional linear classifier.
For multi-class classification, multiple two-class classifiers are created, each of which distinguishes one class from all the other classes. The class with the largest output value is selected.
Detection
$H(x) = \mathrm{sgn}\left(\sum_{t=1}^{T} \alpha_t h_t(x) - \eta\right)$
By changing $\eta$ in this equation, we adjust the balance between false positives and false negatives.
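The effect of η can be illustrated with a minimal Python sketch. It assumes that η acts as a threshold subtracted from the weighted vote (the sign convention is an assumption) and that a value of exactly zero counts as noisy. The scores and labels are made-up values, and `detect` and `count_errors` are hypothetical names.

```python
def detect(score, eta=0.0):
    """Return +1 (noisy) if the weighted vote reaches the threshold, else -1 (clean)."""
    return 1 if score - eta >= 0 else -1


def count_errors(scores, labels, eta):
    """Count false positives and false negatives for a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if detect(s, eta) == 1 and y == -1)
    fn = sum(1 for s, y in zip(scores, labels) if detect(s, eta) == -1 and y == 1)
    return fp, fn


# Made-up weighted votes and true labels (+1 noisy, -1 clean).
scores = [-2.0, -0.5, 0.3, 0.8, 1.5, -0.2]
labels = [-1, -1, -1, 1, 1, 1]

for eta in (-0.5, 0.0, 0.5):
    print(eta, count_errors(scores, labels, eta))
```

Under this convention, raising η makes the detector more conservative (fewer false positives, more false negatives) and lowering it does the opposite.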
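[Figure: weak classifiers $h_1(x)$, $h_2(x)$, ... separating red and blue samples; the data weights are changed after each round, and the weighted outputs $\alpha_1 h_1(x)$, $\alpha_2 h_2(x)$, ... are combined.]
Algorithm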
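Input: $n$ examples $Z = \{(x_1, y_1), \ldots, (x_n, y_n)\}$
Initialize: $w_1(z_i) = 1/n$ for all $i = 1, \ldots, n$
Do for $t = 1, \ldots, T$:
1. Train a base learner with respect to the weighted example distribution $w_t$ and obtain hypothesis $h_t: x \mapsto \{-1, +1\}$.
2. Calculate the training error $\varepsilon_t$ of $h_t$: $\varepsilon_t = \sum_{i=1}^{n} w_t(z_i) \frac{I(h_t(x_i) \neq y_i) + 1}{2}$
3. Set $\alpha_t = \log \frac{1 - \varepsilon_t}{\varepsilon_t}$
4. Update the example distribution $w_t$: $w_{t+1}(z_i) = \frac{w_t(z_i) \exp\{\alpha_t I(h_t(x_i) \neq y_i)\}}{\sum_{j=1}^{n} w_t(z_j) \exp\{\alpha_t I(h_t(x_j) \neq y_j)\}}$
Final hypothesis: $H(x) = \mathrm{sgn}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$
Here $I(\cdot)$ is $+1$ when its argument is true and $-1$ otherwise.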
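The training loop can be illustrated with a minimal Python sketch. It is not the poster's exact procedure: it uses the common textbook form of discrete AdaBoost, with $\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}$ and the update $w \leftarrow w \exp(-\alpha_t y_i h_t(x_i))$, which may differ in constants from the formulation above. A threshold on a single feature dimension stands in for the one-dimensional linear weak classifier, and all function names are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Pick the single-dimension threshold classifier with the lowest weighted error."""
    best = None
    for d in range(len(X[0])):
        for thr in sorted({x[d] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (sign if xi[d] >= thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, d, thr, sign)
    return best  # (weighted error, dimension, threshold, sign)


def stump_predict(stump, x):
    _, d, thr, sign = stump
    return sign if x[d] >= thr else -sign


def adaboost_train(X, y, T):
    """Return a list of (alpha_t, weak classifier) pairs after T rounds."""
    n = len(X)
    w = [1.0 / n] * n                      # uniform initial data weights
    model = []
    for _ in range(T):
        stump = train_stump(X, y, w)
        eps = min(max(stump[0], 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - eps) / eps)
        model.append((alpha, stump))
        # Misclassified data get larger weights, correct data smaller weights.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_score(model, x):
    """Weighted vote of the weak classifiers (before taking the sign)."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in model)


def adaboost_predict(model, x):
    return 1 if adaboost_score(model, x) >= 0 else -1


# Toy data that no single threshold can separate.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_train(X, y, T=3)
print([adaboost_predict(model, x) for x in X])
```

On this toy set a single threshold cannot separate the two labels, but the weighted vote of three thresholds can, which is the sense in which boosting builds a more complex boundary from simple classifiers.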
We use AdaBoost for noise detection and classification because it can form complex decision boundaries.
Each weak classifier is weighted according to its performance.
The weights of misclassified data become larger; the weights of correctly classified data become smaller.
Comparative approach
Detection
We use the log-likelihood ratio of GMMs, a popular method for VAD (voice activity detection).
$H(x) = \mathrm{sgn}\left(\log \frac{P(x \mid \text{noisy\_model})}{P(x \mid \text{speech\_model})}\right)$
Classification
We select the class with the maximum likelihood among the noisy speech GMMs.
$C(x) = \arg\max_k P(x \mid \text{noise}_k)$
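The GMM baseline can be illustrated with a minimal Python sketch. It assumes diagonal-covariance mixtures given as lists of (weight, mean, variance) components. The covariance type, the toy parameters, and the function names are assumptions, and a ratio of exactly zero is treated as noisy.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, gmm):
    """log P(x | gmm), with gmm a list of (weight, mean, var) components."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(logs)                       # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))


def gmm_detect(x, noisy_gmm, speech_gmm):
    """+1 (noisy) if the log-likelihood ratio is non-negative, else -1 (clean)."""
    llr = gmm_log_likelihood(x, noisy_gmm) - gmm_log_likelihood(x, speech_gmm)
    return 1 if llr >= 0 else -1


def gmm_classify(x, noise_gmms):
    """Return the noise class whose GMM gives the highest likelihood."""
    return max(noise_gmms, key=lambda k: gmm_log_likelihood(x, noise_gmms[k]))


# Toy one-dimensional models (illustrative parameters only).
speech_gmm = [(1.0, [0.0], [1.0])]
noisy_gmm = [(0.5, [3.0], [1.0]), (0.5, [5.0], [1.0])]
noise_gmms = {"bell": [(1.0, [3.0], [1.0])], "horn": [(1.0, [-3.0], [1.0])]}

print(gmm_detect([4.0], noisy_gmm, speech_gmm))
print(gmm_classify([2.5], noise_gmms))
```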
Experiments
Summary Future work
Classification
[Figure: a frame judged to be noisy in the detection stage is passed to K two-class classifiers (noise class 1 or other class, noise class 2 or other class, ..., noise class K or other class), which output noise class k.]
Learning
The noises are divided into several classes in advance. Classifiers are trained with AdaBoost to distinguish these classes.
Classification
Classification is applied only to the frames that were determined to be noisy in the detection stage. The classifiers decide the noise class of each noisy speech frame.
Smoothing
[Figure: per-frame outputs labeled noise 1 and noise 2.]
A signal interval detected by AdaBoost may consist of only a few frames.
Experimental conditions
Window: 20-msec Hamming window, shifted every 10 msec
Features: 24th-order log-Mel filter bank and 12th-order MFCC
Number of AdaBoost weak classifiers: 500
SNR of training data: -5 dB to 5 dB
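The framing condition can be illustrated with a minimal Python sketch. It covers only the windowing step (a 20-msec Hamming window every 10 msec) and assumes the 16 kHz sampling rate given for the speech data. The filter bank and MFCC computation are not shown, and `frame_signal` is a hypothetical name.

```python
import math


def frame_signal(signal, sample_rate=16000, win_ms=20, shift_ms=10):
    """Split a signal into overlapping frames and apply a Hamming window."""
    win = sample_rate * win_ms // 1000      # 320 samples at 16 kHz
    shift = sample_rate * shift_ms // 1000  # 160 samples at 16 kHz
    hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (win - 1))
               for n in range(win)]
    frames = []
    for start in range(0, len(signal) - win + 1, shift):
        chunk = signal[start:start + win]
        frames.append([s * h for s, h in zip(chunk, hamming)])
    return frames


one_second = [1.0] * 16000                  # dummy constant signal
frames = frame_signal(one_second)
print(len(frames), len(frames[0]))
```

Trailing samples that do not fill a whole window are dropped in this sketch; padding the last frame would be an equally reasonable choice.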
[Figure: bar chart for SNR of 0 dB comparing AdaBoost and GMM on recall, precision, F-measure, classification, and accuracy; vertical axis from 0.70 to 1.00. Bar values as extracted: 0.973, 0.958, 0.914, 0.896, 0.965, 0.962, 0.950, 0.951, 0.989, 0.973.]
These frames are removed by smoothing. We use majority voting for smoothing.
When smoothing one frame, the three preceding and the three following frames are also taken into consideration.
$c'_N = \arg\max_c \sum_{i=N-3}^{N+3} I(c_i = c)$
$c_i$: the $i$-th frame's classification output.
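The majority-voting smoothing can be illustrated with a minimal Python sketch over a window of three frames on each side. Truncating the window at the beginning and end of the sequence is an assumption, as is the name `smooth_labels`.

```python
from collections import Counter


def smooth_labels(labels, half_width=3):
    """Replace each frame label by the most frequent label among the frame
    itself and up to half_width frames on each side."""
    smoothed = []
    for n in range(len(labels)):
        lo = max(0, n - half_width)                    # truncate at the start
        hi = min(len(labels), n + half_width + 1)      # truncate at the end
        counts = Counter(labels[lo:hi])
        smoothed.append(counts.most_common(1)[0][0])
    return smoothed


# An isolated "noise 2" frame inside a run of "noise 1" frames.
frame_labels = ["noise 1"] * 5 + ["noise 2"] + ["noise 1"] * 5
print(smooth_labels(frame_labels))
```

With a seven-frame window, a label that appears in only one or two consecutive frames is outvoted by its neighbours, so very short detected intervals disappear.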
Criteria of evaluation
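$\text{Recall} = \frac{tp}{tp + fn}$
$\text{Precision} = \frac{tp}{tp + fp}$
$\text{F-measure} = \frac{2 \cdot \text{Recall} \cdot \text{Precision}}{\text{Recall} + \text{Precision}}$
$\text{Classification} = \frac{tp - ce}{tp}$
$\text{Accuracy} = \frac{tp - fp - ce}{tp + fn}$
tp: the number of true positive frames
fp: the number of false positive frames
fn: the number of false negative frames
ce: the number of classification error frames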
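The evaluation criteria can be illustrated with a minimal Python sketch. The formulas used here are one reading of the definitions above, and the accuracy formula $(tp - fp - ce)/(tp + fn)$ in particular is an assumption. The frame counts in the example are made up, and `evaluate` is a hypothetical name.

```python
def evaluate(tp, fp, fn, ce):
    """Compute the five criteria from frame counts.

    tp, fp, fn: true positive, false positive, and false negative frames
    ce: detected frames that were assigned the wrong noise class
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * recall * precision / (recall + precision)
    classification = (tp - ce) / tp
    accuracy = (tp - fp - ce) / (tp + fn)   # assumed form, see lead-in
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "classification": classification,
        "accuracy": accuracy,
    }


# Made-up counts for illustration.
scores = evaluate(tp=90, fp=10, fn=10, ce=9)
for name, value in scores.items():
    print(name, round(value, 3))
```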
[Figure: bar chart for SNR of 5 dB comparing AdaBoost and GMM on recall, precision, F-measure, classification, and accuracy; vertical axis from 0.70 to 1.00. Bar values as extracted: 0.923, 0.842, 0.804, 0.973, 0.949, 0.915, 0.950, 0.900, 0.947, 0.932.]
Experimental results
[Figure: bar chart for SNR of -5 dB comparing AdaBoost and GMM on recall, precision, F-measure, classification, and accuracy; vertical axis from 0.70 to 1.00. Bar values as extracted: 0.974, 0.973, 0.972, 0.973, 0.973, 0.974, 0.989, 0.989, 0.937, 0.933.]
Changing $\eta$ in the detection equation adjusts the numbers of false positives and false negatives.
Summary: We proposed sudden noise detection and classification with boosting. Both detection and classification achieve high performance at low SNR. The performance with AdaBoost is better than that of the GMM-based method.
Future work: We will detect more kinds of noise by combining this method with a clustering method such as k-means. We will also combine noise detection and classification with a noise reduction method.
Speech data (16 kHz)
Training: 210 utterances of 21 men
Testing: 2104 utterances of 5 men
Noise data
6 kinds of noise: "spray," "telephone," "tearing paper," "pouring of a granular substance," "bell-ringing," and "horn."
Each kind has 50 samples: 20 for training and 30 for testing.