
Reporter ： Chia-Cheng Chen

Advisor ： Wen-Ping Chen

Network Application Laboratory

Department of Electrical Engineering

National Kaohsiung University of Applied Sciences

A Study of Single Channel Blind Source Separation and Recognition Based on Mixed-State Prediction

Outline

Introduction and Motivation

Background

Research Methods

Experimental Results

Conclusion and Future Works

Research Results


Introduction


Applications of voiceprint recognition systems
• Call routing (1997)
• Jupiter (1997)
• Let’s Go! (2002)
• Siri (2010)
• Skyvi (2011)
• Vlingo (2011)

Introduction: Current Status of Ecological Surveys

• Sensor networks
• Wireless networks
• Database
• Voiceprint recognition system

Advantages
• Reduce the cost of human resources and time
• Save and share the raw data conveniently

Introduction


Blind Source Separation

http://metadata.froghome.org/about.php (descriptive data on the amphibian species of Taiwan)

Introduction

Blind Source Separation

Introduction: Voiceprint recognition

• C.J. Huang, Y.J. Yang, D.X. Yang and Y.J. Chen, “Frog classification using machine learning techniques,” Expert Systems with Applications, Vol. 36, No. 2, pp. 3737-3743, 2009. (SCI)

• S.C. Hsieh, W.P. Chen, W.C. Lin, F.S. Chou, and J.R. Lai, “Endpoint detection of frog croak syllables with using average energy entropy method,” Taiwan Journal of Forest Science, Vol. 27, No. 2, pp. 149-161, Jun. 2012. (EI)

• W.P. Chen, S.S. Chen, C.C. Lin, Y.Z. Chen and W.C. Lin, “Automatic recognition of frog call using multi-stage average spectrum,” Computers & Mathematics with Applications, Vol. 64, No. 5, pp. 1270-1281, Sep. 2012. (SCI)

Introduction: Single channel source separation

• M.N. Schmidt and M. Mørup, “Nonnegative matrix factor 2-D deconvolution for blind single channel source separation,” Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation, Vol. 3889, pp. 700-707, Mar. 2006. (SCI)

• S. Kırbız and B. Gunsel, “Perceptually weighted non-negative matrix factorization for blind single-channel music source separation,” 21st International Conference on Pattern Recognition, Nov. 2012. (EI)

Motivation: Automatic frog species voiceprint recognition system

• Predicting the number of mixed signals
• Single channel blind source separation
• Biologists
• People

Outline


Background

Research Methods



Research Results


Background

Blind Source Separation: Non-negative Matrix Factor 2-D Deconvolution

Matching: Adaptive Multi-stage Average Spectrum

Feature Extraction: Mel-frequency Cepstral Coefficients

Endpoint Detection: Time Domain, Frequency Domain

Signal Processing: Pre-emphasis, Frame, Window

Background

Signal Processing

Syllable Segmentation

Feature Extraction

Matching

Voiceprint Recognition

Signal Processing

Frog Signal

Pre-emphasis: $\hat{s}(n) = s(n) - \alpha\, s(n-1)$

Frame

Hamming Window: $w(n) = \hat{s}(n) \left[ 0.54 - 0.46 \cos\left( \frac{2 \pi n}{N-1} \right) \right]$

Resample: 44100 Hz
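The pre-emphasis, framing, and Hamming-window steps on this slide can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `preprocess` and the pre-emphasis coefficient `alpha=0.97` are assumptions (the slides do not state a value for α), while the 512-sample frame and 50% overlap are taken from the parameter table later in the deck. Resampling is not shown.

```python
import numpy as np

def preprocess(signal, frame_len=512, overlap=0.5, alpha=0.97):
    """Pre-emphasize, frame, and Hamming-window a 1-D signal.

    alpha=0.97 is an assumed value; frame_len and overlap follow the
    parameter table in the slides.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    # Pre-emphasis: s_hat(n) = s(n) - alpha * s(n-1); first sample kept as-is.
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    hop = int(frame_len * (1.0 - overlap))
    n_frames = 1 + (emphasized.size - frame_len) // hop
    # Row i of idx selects samples [i*hop, i*hop + frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N - 1)).
    n = np.arange(frame_len)
    window = 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (frame_len - 1))
    return frames * window
```

Each row of the returned array is one windowed frame, ready for the FFT used in the following slides.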

Syllable Segmentation: Endpoint Detection Algorithm

• Energy
  • Time domain
  • Simple
  • Square of the amplitude or absolute value of the amplitude
  • Vulnerable to noise

• Entropy
  • Frequency domain
  • Complex
  • Noise immunity

Average Energy Entropy: Signal Transform

$$X(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j 2 \pi k n / N}, \quad 0 \le k \le N-1$$

s(n): windowed signal
N: frame size
k: frequency component

Average Energy

$$u = \sum_{n=0}^{N-1} \frac{A(n)}{N}$$

u: the mean energy of the input signal
A(n): the amplitude value of the input signal
N: total number of samples in the input signal

Average Energy Entropy: Probability Density Function

$$p'_i = \frac{E(f_i) + \beta u}{\sum_{m=0}^{M-1} \left( E(f_m) + \beta u \right)}, \quad 0 \le i \le M-1$$

E(f_i): the spectral energy for the frequency f_i
p'_i: the corresponding probability density
M: total number of frequency components in the FFT
β: multiplier

Average Energy Entropy

$$H' = \sum_{i=0}^{N-1} p'_i \log p'_i$$

H': the negative entropy for each frame
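As a rough illustration of the average energy entropy (AEE) feature, the sketch below computes the negative entropy H' of one windowed frame. It assumes that the spectral energies are offset by β times the mean amplitude u before normalization, that u is taken over absolute amplitudes, and that `beta=1.0`; none of these values is stated on the slides, and the function name is hypothetical.

```python
import numpy as np

def average_energy_entropy(frame, beta=1.0):
    """Return H' = sum_i p'_i * log(p'_i) for one windowed frame.

    beta=1.0 and the use of absolute amplitudes for u are assumptions.
    """
    frame = np.asarray(frame, dtype=float)
    # u: mean amplitude of the frame.
    u = np.mean(np.abs(frame))
    # E(f_i): spectral energy of each FFT component.
    energy = np.abs(np.fft.fft(frame)) ** 2
    # p'_i: energies offset by beta * u, normalized to sum to 1.
    shifted = energy + beta * u
    total = np.sum(shifted)
    if total == 0.0:
        # Silent frame: fall back to a uniform distribution.
        p = np.full(frame.size, 1.0 / frame.size)
    else:
        p = shifted / total
    # The small floor avoids log(0) for empty bins.
    return float(np.sum(p * np.log(np.maximum(p, 1e-12))))
```

A flat spectrum gives the most negative value, while a frame dominated by a few frequency components gives a value closer to zero, which is what makes the measure usable for separating syllables from background.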

Endpoint Detection Algorithm

[Figure: endpoint detection comparison with panels Signal, AEE, Absolute Energy, and Square Energy]

Feature Extraction

$$C_m = \sum_{k=1}^{M} E_k \cos\left[ m \left( k - \frac{1}{2} \right) \frac{\pi}{M} \right], \quad m = 1, \ldots, L$$
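The cepstral coefficient formula on this slide is a cosine transform of the filter-bank energies. The sketch below assumes that E_k are the log energies of M mel filters, as in the standard MFCC definition, and uses L = 15 to match the feature dimension in the parameter table; the function name and the way the filter energies are obtained are not taken from the slides.

```python
import numpy as np

def cepstral_coefficients(filter_energies, n_coeffs=15):
    """C_m = sum_k E_k * cos[m * (k - 1/2) * pi / M], for m = 1..L.

    filter_energies is assumed to hold the M log mel filter-bank energies.
    """
    E = np.asarray(filter_energies, dtype=float)
    M = E.shape[0]
    k = np.arange(1, M + 1)            # filter index k = 1..M
    m = np.arange(1, n_coeffs + 1)     # coefficient index m = 1..L
    # basis[m-1, k-1] = cos(m * (k - 1/2) * pi / M), shape (L, M).
    basis = np.cos(np.outer(m, k - 0.5) * np.pi / M)
    return basis @ E
```

A constant energy vector maps to all-zero coefficients, which is a quick sanity check that the cosine basis is set up correctly.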

Adaptive Multi-stage Average Spectrum: Adaptive Clustering

[Figure: adaptive clustering, showing Cluster A and Cluster B]

Adaptive Multi-stage Average Spectrum: Template Training

[Figure: Frames 1 to 7 grouped into Stage 1, Stage 2, and Stage 3]

$$S_i(k) = \sum_{n=0}^{L_i - 1} \frac{X_n(k)}{L_i}$$
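To make the stage averaging concrete, the sketch below averages the frame spectra inside each stage to form one template spectrum per stage. It assumes that stages are runs of consecutive frames whose lengths L_i are already known; the adaptive clustering that chooses those lengths is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def stage_average_spectra(frame_spectra, stage_lengths):
    """Average consecutive frame spectra within each stage.

    frame_spectra: array of shape (frames, bins), one spectrum X_n(k) per row.
    stage_lengths: assumed list of L_i values, one per stage.
    """
    X = np.asarray(frame_spectra, dtype=float)
    if sum(stage_lengths) != X.shape[0]:
        raise ValueError("stage lengths must cover every frame exactly once")
    templates, start = [], 0
    for L_i in stage_lengths:
        # S_i(k) = (1 / L_i) * sum of the L_i frame spectra in stage i.
        templates.append(X[start:start + L_i].mean(axis=0))
        start += L_i
    return np.vstack(templates)   # shape (stages, bins)
```

For example, seven frames split as `(2, 3, 2)` yield three averaged spectra, mirroring the three stages drawn on the slide.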



Minimum Cumulative Difference

Adaptive Multi-stage Average Spectrum: Template Matching

[Figure: unknown audio, frames 1 to 7, divided into Stage 1, Stage 2, and Stage 3]

Minimum Cumulative Difference
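Template matching by minimum cumulative difference could look roughly like the sketch below. The slides do not give the difference formula, so the definition used here (absolute differences summed over all stages and frequency bins) is an assumption, as are the function name and the dictionary layout of the templates.

```python
import numpy as np

def match_template(unknown_stages, templates):
    """Return (label, score) of the template with the smallest cumulative difference.

    unknown_stages: array of shape (stages, bins) for the unknown syllable.
    templates: dict mapping a species label to an array of the same shape.
    """
    unknown = np.asarray(unknown_stages, dtype=float)
    best_label, best_score = None, float("inf")
    for label, template in templates.items():
        # Assumed definition: absolute differences summed over stages and bins.
        score = float(np.sum(np.abs(unknown - np.asarray(template, dtype=float))))
        if score < best_score:
            best_label, best_score = label, score
    return best_label, best_score
```

The species whose template accumulates the least difference across the stages is reported as the match.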

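Blind Source Separation: Non-negative Matrix Factor 2-D Deconvolution

• α basis matrix and β coefficient matrix
• Obtain the relations between the time and the pitch
• Shift operator

$$V \approx \Lambda = \sum_{\tau} \sum_{\phi} \overset{\downarrow \phi}{W^{\tau}}\, \overset{\rightarrow \tau}{H^{\phi}}$$

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \quad \overset{\rightarrow 1}{A} = \begin{pmatrix} 0 & 1 & 2 \\ 0 & 4 & 5 \\ 0 & 7 & 8 \end{pmatrix}, \quad \overset{\downarrow 1}{A} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$$

V: Original Signal
Λ: Reconstructed Signal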
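The shift operators used by non-negative matrix factor 2-D deconvolution move a matrix right or down and fill the vacated entries with zeros. The sketch below implements both shifts and the double-sum reconstruction in the notation of the cited Schmidt and Mørup paper; the function names and the list-of-matrices layout for W and H are assumptions made for illustration.

```python
import numpy as np

def shift_right(A, tau):
    """Shift the columns of A right by tau, zero-filling on the left."""
    A = np.asarray(A)
    out = np.zeros_like(A)
    if tau < A.shape[1]:
        out[:, tau:] = A[:, :A.shape[1] - tau]
    return out

def shift_down(A, phi):
    """Shift the rows of A down by phi, zero-filling at the top."""
    A = np.asarray(A)
    out = np.zeros_like(A)
    if phi < A.shape[0]:
        out[phi:, :] = A[:A.shape[0] - phi, :]
    return out

def nmf2d_reconstruct(W, H):
    """Sum over tau and phi of shift_down(W[tau], phi) @ shift_right(H[phi], tau).

    W: list of basis matrices, one per time shift tau.
    H: list of coefficient matrices, one per pitch shift phi.
    """
    total = None
    for tau, W_tau in enumerate(W):
        for phi, H_phi in enumerate(H):
            term = shift_down(W_tau, phi) @ shift_right(H_phi, tau)
            total = term if total is None else total + term
    return total
```

With a single basis matrix and a single coefficient matrix (no shifts), the reconstruction reduces to the ordinary product used in plain non-negative matrix factorization.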

Non-negative Matrix Factor 2-D Deconvolution

[Figure: NMF2D factorization of V into W and H factors]

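Non-negative Matrix Factor 2-D Deconvolution

• Cost function

  • Based on Euclidean Distance

$$C_{ED} = \sum_{n,m} \left( V_{n,m} - \Lambda_{n,m} \right)^2$$

  • Based on Kullback-Leibler Divergence

$$C_{KL} = \sum_{n,m} \left( V_{n,m} \log \frac{V_{n,m}}{\Lambda_{n,m}} - V_{n,m} + \Lambda_{n,m} \right)$$

V: original signal, Λ: reconstructed signal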
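Both cost functions compare the original spectrogram V with its reconstruction Λ element by element. The following sketch evaluates them with NumPy; the function names and the small `eps` guard against division by zero and log(0) are additions for illustration and are not part of the slides.

```python
import numpy as np

def cost_euclidean(V, Lam):
    """Squared Euclidean distance: sum over all entries of (V - Lam)^2."""
    V = np.asarray(V, dtype=float)
    Lam = np.asarray(Lam, dtype=float)
    return float(np.sum((V - Lam) ** 2))

def cost_kl(V, Lam, eps=1e-12):
    """Generalized Kullback-Leibler divergence:
    sum over all entries of V * log(V / Lam) - V + Lam.
    eps is an assumed numerical guard, not part of the definition.
    """
    V = np.asarray(V, dtype=float)
    Lam = np.asarray(Lam, dtype=float)
    return float(np.sum(V * np.log((V + eps) / (Lam + eps)) - V + Lam))
```

Both costs are zero when the reconstruction equals the original and grow as the two diverge.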

Outline


Background

Research Methods



Research Results


Research Methods: Mixed-State Prediction voiceprint recognition method

• Training
  • Mixed-signal states

• Testing
  • Two-stage voiceprint recognition
  • Mixed-State Prediction

[Flowchart: Mixed-State Prediction voiceprint recognition system. Labels: Audio Training; Audio Testing; Signal Processing; Syllable Segmentation (AEE Endpoint Detection); Feature Extraction; AMASA Template Training; Standard Sample Template; Mixed States; Mix-State Prediction; decision "Number of Audio > 1" (Yes / No); Source Separation (Source 1, Source 2, ..., Source m); Template Matching (First Stage and Second Stage); Species]

[Figure: MFCC of an independent signal (Latouche's frog) and of a mixed signal (Moltrecht's green tree frog + Latouche's frog)]

Signal Processing

Syllable Segmentation

Feature Extraction

Matching

Mixed signals states

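[Diagram: tree of mixed states rooted at S
• Single-species states: A, B, C, D, ... (labels a, b, c, d, ..., i)
• Two-species states: AB (R21), AC (R22), BC (R23), AD (R24), BD (R25), CD (R26), ..., R2n; count C(i, 2)
• Three-species states: ABC (R31), ABD (R32), ACD (R33), BCD (R34), ..., R3n; count C(i, 3)
• Level labels: i1, i2, i3]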
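The mixed-state tree enumerates every combination of species that can call at the same time, so with i species there are C(i, 2) two-species states and C(i, 3) three-species states. A small sketch with Python's standard `itertools.combinations` reproduces those sets; the function name and the cap of three simultaneous species are assumptions based on the diagram.

```python
from itertools import combinations
from math import comb

def mixed_states(species, max_mix=3):
    """Enumerate all states that mix 1..max_mix distinct species.

    Returns a dict: mixture size -> list of state labels such as "AB".
    """
    states = {}
    for size in range(1, max_mix + 1):
        states[size] = ["".join(group) for group in combinations(species, size)]
    return states

states = mixed_states("ABCD")
# Two-species and three-species counts equal C(4, 2) and C(4, 3).
assert len(states[2]) == comb(4, 2)
assert len(states[3]) == comb(4, 3)
```

For four species this yields the six pairs and four triples shown in the diagram; the number of states grows quickly as more species are added.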

Mixed States: Average Energy

$$E = \frac{1}{N} \sum_{k=0}^{N} \left| X(k) \right|^2$$

E: the average energy of the frequency spectrum X(k)
N: the length of the syllable

[Figure: mixed signal versus independent signal]

Predicting the number of mixed signals

$$dist = (E - a)^2$$

E: the mean spectral energy of the test syllable
a: the mean energy of the training data
T: the separation threshold, compared with dist

[Diagram: mixed states containing A, rooted at S: A (a); AB (R21), AC (R22), AD (R24); ABC (R31), ABD (R32), ACD (R33)]
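One way to read the prediction step is as a nearest-state search over the trained mean energies. The sketch below is a guess at that logic rather than the author's procedure: it assumes the state with the smallest dist = (E - a)^2 is chosen, that a match is accepted only when dist does not exceed the threshold T (0.3 in the parameter table), and that states are stored as tuples of species labels. The function name and data layout are hypothetical.

```python
def predict_mixture(test_energy, state_energies, threshold=0.3):
    """Pick the trained state whose mean energy is closest to the test syllable.

    state_energies: dict mapping a tuple of species labels to the mean
    training energy a of that state.
    Returns (state, number_of_sources, dist); state is None when no trained
    state lies within the threshold (an assumed acceptance rule).
    """
    best_state, best_dist = None, float("inf")
    for state, a in state_energies.items():
        dist = (test_energy - a) ** 2     # dist = (E - a)^2
        if dist < best_dist:
            best_state, best_dist = state, dist
    if best_state is None or best_dist > threshold:
        return None, 0, best_dist
    return best_state, len(best_state), best_dist
```

When the predicted number of sources is greater than one, the syllable would be passed to source separation before the second matching stage, as the system flowchart indicates.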

Outline


Background

Research Methods



Research Results


Parameter               Value
Frame Length            512 samples
Frame Overlapping       50%
Window Function         Hamming Window
Frequency Bin           512
Feature Parameters      Mel-Frequency Cepstral Coefficient
Feature Dimensions      15
Separation Threshold    0.3

Experimental Results: Recognition Experiment

• Independent signals

Method    Total Syllable    Error Mixed    Correct Syllable    Accuracy (%)
DTW       373               31             282                 75.6%
AMSAS     373               31             317                 84.71%

Experimental Results: Recognition Experiment

• Mixed signals

Method    Total Syllable    Correct Syllable    Accuracy (%)
DTW       269               183                 68.02%
AMSAS     269               211                 78.43%

Total Syllable    Error Mixed    Correct Syllable    Accuracy (%)
167               36             131                 78.44%

Conclusion and Future Works: The proposed method

• Improves the mixed-signal recognition rate
• Provides a method to predict the number of mixed signals

Conclusion and Future Works: Future Works

• Study de-noising methods
• Collect more features that distinguish independent signals from mixed signals
• Recognize mixed signals within the same species
• Collect sounds from more species to improve system performance
• Adopt Support Vector Machines (SVM), Neural Network…

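Research Results: Competition

• 7th Digital Signal Processing Creative Design Competition: Finalist
  • Frog species voiceprint recognition system

• Project assistance

Form                         Heading
NSC 100-2221-E-151-0117      A Study of Dynamic Wavelength and Bandwidth Allocation and Quality of Service in WDM-EPON
NSC1002101010508-080702G1    A Study of Ecoinformatics Techniques Applied to Forest Management
NSC1002101050511-060101G4    Application and Study of Wireless Sensor Networks in Forest Disaster Monitoring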

Thank you for your attention!
