
Automatic Infant Cry Analysis ~ An Acoustic Approach ~

Author: Voon Hian Lee
Supervisor: Dr. Hadis M. Nosratighods
Student ID: 3195964
Assessor: Dr. Julien Epps

2010
School of Electrical Engineering & Telecommunications, UNSW
ENGINEERING @ UNSW

1. Introduction

Crying is the only means by which infants can express their physical and psychological status. For many years, paediatricians have been searching for non-invasive tools to measure the brain function of infants; meanwhile, there is growing evidence that infants with medical complications have identifiable cries. Hence, cry signals may carry features that reflect the medical status of an infant. Moreover, cry analysis can be automated with the advent of digital signal processing. Being cheap, easy to perform and completely non-invasive, automatic acoustic analysis of infant crying is a potential prognostic or diagnostic tool for certain pathologies.


2. Aim

To implement an automatic cry recogniser that classifies the cries of normal and pathological (asphyxiated and hearing-impaired) infants by extracting relevant acoustic features.

3. Background

3.1. The Infant Cry Production Model

The infant cry production mechanism can be described using the physio-acoustic model (Figure 1), which consists of:
• the infant control organiser
• the independent source-filter model

The infant control organiser is a three-level processing hierarchy:
• The upper processor is the Central Nervous System (CNS), which determines the states of action.
• The middle processors represent all vegetative states, e.g. crying.
• The lower processors coordinate different groups of muscles.

5. Simulation Results

Average accuracy of classification is 94.1%.

4. Methodology and Experiments

Implementation of automatic classification involves two phases.

Figure 3: Automatic infant cry recogniser

Class       Normal    Asphyxia    Hearing Impairment
Accuracy    84.2%     92.3%       98.1%

Figure 1: Physio-acoustic infant cry production model

Figure 4: Process of decision making

4.3. Leave-One-Out Training and Testing

Training samples are randomly selected from 16 of the 17 infants. The samples of the remaining infant are used for accuracy testing. Tests are repeated, holding out a different infant each time.
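The leave-one-out procedure can be sketched as follows. This is a minimal illustration, not the poster's implementation: the infant identifiers are hypothetical placeholders that only mirror the database sizes (5 normal, 6 asphyxiated, 6 hearing-impaired), and the random selection of training samples within each fold is left out.

```python
def leave_one_infant_out(infant_ids):
    """Yield (held_out, training) pairs, holding out each infant exactly once."""
    for i, held_out in enumerate(infant_ids):
        training = infant_ids[:i] + infant_ids[i + 1:]
        yield held_out, training


# Hypothetical identifiers; only the group sizes come from the poster.
infants = (
    [f"normal_{k}" for k in range(5)]
    + [f"asphyxia_{k}" for k in range(6)]
    + [f"hearing_{k}" for k in range(6)]
)

folds = list(leave_one_infant_out(infants))
```

Splitting by infant rather than by recording keeps all cries of the test infant out of training, so the reported accuracy reflects performance on an unseen infant.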

4.4. Model Training

Models are trained using Gaussian mixture modelling (GMM).
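As a rough illustration of GMM training, the sketch below fits a mixture to one-dimensional data with the expectation-maximisation (EM) algorithm. Several things here are assumptions rather than details from the poster: the use of a single scalar feature, the number of components, the quantile-based initialisation, the fixed iteration count, and the toy data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_components=2, n_iter=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(grand_var / n_components ** 2, var_floor)] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            if total > 0.0:
                resp.append([j / total for j in joint])
            else:
                # Density underflow: fall back to the current mixture weights.
                resp.append(list(weights))
        # M-step: re-estimate weights, means and variances.
        for k in range(n_components):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, var_floor)
    return weights, means, variances


# Toy data: two well-separated clusters (illustrative values only).
low = [0.0, 0.5, 1.0, -0.5, -1.0, 0.2, -0.2, 0.8]
data = low + [x + 10.0 for x in low]
weights, means, variances = fit_gmm_1d(data)
```

In a recogniser of this kind, one such model would typically be trained per class on that class's feature vectors; real acoustic features are multi-dimensional, which this one-dimensional sketch does not cover.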

4.5. Decision Making

Classifications are made using the maximum likelihood criterion.
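A maximum likelihood decision can be sketched as follows: the feature frames of a cry are scored against each class model, and the class with the highest total log-likelihood is chosen. The one-dimensional models and their parameter values below are arbitrary placeholders chosen for illustration; they are not values from the poster.

```python
import math


def gmm_log_likelihood(frames, model):
    """Total log-likelihood of scalar frames under a 1-D Gaussian mixture."""
    weights, means, variances = model
    total = 0.0
    for x in frames:
        density = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        # Guard against log(0) when a frame is far from every component.
        total += math.log(max(density, 1e-300))
    return total


def classify(frames, class_models):
    """Return the label whose model gives the highest log-likelihood."""
    return max(class_models, key=lambda label: gmm_log_likelihood(frames, class_models[label]))


# Placeholder models: (weights, means, variances) with made-up values.
class_models = {
    "normal": ([1.0], [0.0], [1.0]),
    "asphyxia": ([1.0], [5.0], [1.0]),
    "hearing impairment": ([1.0], [10.0], [1.0]),
}

decision = classify([4.8, 5.3, 5.1], class_models)
```

Summing log-likelihoods over frames treats the frames as independent, which is a common simplifying assumption in GMM-based classifiers.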

4.1. Database

A set of cry recordings from 5 normal, 6 asphyxiated and 6 hearing-impaired infants was recorded and labelled by paediatricians.

4.2. Feature Extraction

• Fundamental frequency (f0): f0, or pitch, is the quasi-periodic vibration rate of the vocal folds in the larynx: f0 = 1 / T0, where T0 is the vibration period.
• Formants (Fi; F1, F2, F3): Formants represent the resonant frequencies of the vocal tract.
• Spectral centroids (SCi): A spectral centroid indicates the dominant frequency of a given frequency sub-band and is calculated as the average frequency weighted by amplitudes.
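The spectral centroid definition, an amplitude-weighted average frequency over a sub-band, can be written as SC_i = sum(f * A(f)) / sum(A(f)), with both sums taken over the frequencies f in sub-band i. The sketch below assumes that A(f) is a magnitude spectrum sampled at discrete frequencies and that each sub-band is a half-open interval; the poster does not specify either, and the sample values are made up.

```python
def spectral_centroid(freqs, amplitudes, f_low, f_high):
    """Amplitude-weighted mean frequency within the sub-band [f_low, f_high)."""
    weighted_sum = 0.0
    amplitude_sum = 0.0
    for f, a in zip(freqs, amplitudes):
        if f_low <= f < f_high:
            weighted_sum += f * a
            amplitude_sum += a
    if amplitude_sum == 0.0:
        return None  # No energy in this sub-band, so the centroid is undefined.
    return weighted_sum / amplitude_sum


# Illustrative spectrum: frequencies in Hz with made-up amplitudes.
freqs = [100.0, 200.0, 300.0, 400.0, 600.0, 700.0]
amplitudes = [1.0, 3.0, 0.0, 0.0, 2.0, 2.0]

sc_low = spectral_centroid(freqs, amplitudes, 0.0, 500.0)
sc_high = spectral_centroid(freqs, amplitudes, 500.0, 1000.0)
```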

The CNS controls the sub-glottal (respiratory system), glottal (larynx) and supra-glottal (nasal and vocal tracts) systems independently. It is assumed that pathologies affect the functionality of the CNS. Consequently, a malfunction in any of these muscle groups is directly reflected in the cry sound produced, so acoustic anomalies can be correlated with physiological pathologies.

3.2. Infant Vocalisation Modes

Infant cry signals present both voiced and unvoiced structures:
• Voiced cry, when the vocal folds are vibrating.
  o Phonation: vibration rate < 700 Hz
  o Hyper-phonation: vibration rate > 700 Hz
• Unvoiced cry, when the vocal folds are inactive.
  o Dys-phonation

[Figure 2 segment labels, in order: Hyper-phonation, Dys-phonation, Phonation, Phonation, Phonation, Dys-phonation]

Figure 2: The three basic infant cry modes
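The three modes can be expressed as a simple rule on the vocal fold vibration rate. The sketch below assumes that an unvoiced segment is represented by `None` and that a rate of exactly 700 Hz counts as phonation; the poster leaves that boundary case unspecified. The function names are hypothetical.

```python
def f0_from_period(period_s):
    """Fundamental frequency in Hz from the vibration period T0 in seconds (f0 = 1 / T0)."""
    return 1.0 / period_s


def cry_mode(f0_hz):
    """Label a cry segment from its vibration rate; None means unvoiced."""
    if f0_hz is None:
        return "dys-phonation"
    if f0_hz > 700.0:
        return "hyper-phonation"
    return "phonation"  # Assumption: exactly 700 Hz is treated as phonation.


modes = [
    cry_mode(f0_from_period(0.002)),  # T0 = 2 ms gives 500 Hz
    cry_mode(f0_from_period(0.001)),  # T0 = 1 ms gives 1000 Hz
    cry_mode(None),                   # unvoiced segment
]
```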


6. Conclusion and Future Work

The average accuracy attained is 94.1%. The results show that automatic classification of infant cry signals via acoustic analysis is feasible.

In the future, we will investigate other acoustic features to:
• identify other pathologies
• determine the causes of crying (e.g. pain, hunger and discomfort)
