Acoustic measurements on prosody using Praat

Bert Remijsen

Universiteit Leiden & University of Edinburgh

Overview

Brief motivation

Introduction to Praat scripting

Measurement of –

> Vowel quality

> Fundamental frequency

> Voice quality and intensity

Overview

Topics in relation to measurements:

[> Data collection and processing]

> How to measure it in Praat

> (Semi-)automating measurements

> Displaying the descriptive statistics

> Inferential statistics

Motivation

Why quantitative analysis of prosody?

> quantitative results can be used to test hypotheses

Motivation

> humans are bad at determining the acoustic cause of prosodic variation by ear

E.g.: - controversy on lexical stress

- perception of pitch-accent

Motivation

> Prosodic contrasts are often realized in terms of ‘packages’ of prosodic correlates.

E.g.: stress: duration, vowel quality, intensity

complementary quantity: duration, vowel quality

pitch-accent: fundamental frequency (f0), duration, etc.

Motivation

Why quantitative analysis of prosody with Praat?

> Allows for measurement, manipulation, and representation of the full range of acoustic parameters.

> Relatively easy to (semi-)automate procedures by means of scripts.

How to write a Praat script?

A. Try to start out from an existing script

> For example, check on:

http://uk.groups.yahoo.com/group/praat-users

> Praat scripts introduced in this presentation can be found at:

http://www.ling.ed.ac.uk/~bert/praatscripts

B. Writing (part of) a script from scratch

> Do the steps by hand for one item

> Display them using Paste history

> Combine these steps with control structures, guided by the manual.

An annotated script:

Script: msr_duration.psc

Function: collects durations for the onset, nucleus and coda of a target word, for each file in a list. Automatic.

Common components:

> User interface (form … endform)

> Getting the input files (Read …)

> Finding point of measurement (using TextGrid)

> Measurements

> Writing output to file (e.g. fappendinfo)
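
The components listed above can also be illustrated outside Praat. The following is a minimal Python sketch of the measurement and output steps only, assuming the TextGrid intervals of one target word have already been read into (label, start, end) tuples. The function names, the tier labels and the output layout are hypothetical and are not taken from msr_duration.psc.

```python
def segment_durations(intervals, wanted=("onset", "nucleus", "coda")):
    """Return the duration in seconds of each wanted interval label."""
    durations = {}
    for label, start, end in intervals:
        if label in wanted:
            durations[label] = end - start
    return durations


def output_line(item_name, durations, wanted=("onset", "nucleus", "coda")):
    """Format one tab-separated output line with durations in milliseconds.

    Segments that are absent from the annotation are written as NA.
    """
    fields = [item_name]
    for label in wanted:
        if label in durations:
            fields.append(f"{durations[label] * 1000:.1f}")
        else:
            fields.append("NA")
    return "\t".join(fields)


# Hypothetical annotation of one target word (times in seconds).
intervals = [("onset", 0.50, 0.58), ("nucleus", 0.58, 0.75), ("coda", 0.75, 0.80)]
line = output_line("d2_2_012_s_1", segment_durations(intervals))
print(line)
```

In a batch run, one such line would be appended to the output file for every item in the list.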

The dataset:

> One long sound file – e.g. the whole recording session, with information on sections in the TextGrid.

> One item-per-file. If so, it is best to encode as much useful information as possible in the filename, in a structured way, preferably fixed-width.

Example filename: d2_2_012_s_1

d2 = dataset_code

2 = speaker_no

012 = item_no

s = s(ingular) / p(lural) [S&R]

1 = repetition_no

Reasons:

> Saves work coding in statistics package

> The fields in the name can be searched with a Praat script (using string pattern matching).
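
To make the benefit of structured filenames concrete, here is a minimal Python sketch that splits a name such as d2_2_012_s_1 into its fields and selects files by item number. The field order (dataset code, speaker, item, number, repetition) is an assumption read off the example name, and the function names are hypothetical.

```python
from collections import namedtuple

# Assumed field order, based on the example name d2_2_012_s_1.
ItemName = namedtuple(
    "ItemName", "dataset_code speaker_no item_no number repetition_no"
)


def parse_item_name(filename):
    """Split an underscore-separated item name into its five fields."""
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    fields = stem.split("_")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields in {filename!r}, got {len(fields)}")
    return ItemName(*fields)


def select_items(filenames, item_no):
    """Return the filenames whose third field equals item_no."""
    return [name for name in filenames if parse_item_name(name).item_no == item_no]


names = ["d2_2_012_s_1.wav", "d2_2_012_p_1.wav", "d2_3_014_s_2.wav"]
print(select_items(names, "012"))
```

The same fields can be imported as separate columns in a statistics package, which is what saves the manual coding work.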

Script: openlist.psc

Function: open specific objects associated with item in list

Script: openlist_specificitem.psc

Function: This script searches on the item code – the third field in the name.

Measuring vowel quality

Vowel quality: Measurement in Praat

How to measure formants in Praat?

I. The point of measurement

II. An algorithm and a protocol

III. Semi-automating measurements

I. The point of measurement – possibilities:

> Where F1 reaches its maximum

> Small domain centered on temporal mid point

> Averaged over (middle x% of) vowel.
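
The three options differ only in how the measurement point or window is derived from the vowel interval. A minimal Python sketch, assuming the vowel boundaries and a formant track are already available as plain numbers; the function names and the 20 ms default width are illustrative assumptions.

```python
def time_of_f1_maximum(times, f1_values):
    """Return the time at which F1 reaches its maximum."""
    index = max(range(len(f1_values)), key=lambda i: f1_values[i])
    return times[index]


def midpoint_window(start, end, width=0.02):
    """Return a window of `width` seconds centered on the temporal midpoint.

    The window is clipped so that it never extends beyond the vowel.
    """
    mid = (start + end) / 2
    half = width / 2
    return max(start, mid - half), min(end, mid + half)


def middle_portion(start, end, percent=50):
    """Return the bounds of the middle `percent` % of the vowel."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    margin = (end - start) * (1 - percent / 100) / 2
    return start + margin, end - margin


print(midpoint_window(1.0, 1.2))
print(middle_portion(0.0, 0.2, percent=50))
```

Formant values would then be averaged over the returned window, or queried at the single time point.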

II. An algorithm and a protocol

1. Produce Formant object using default algorithm (Burg) and parameters (5 formants below 5000 Hz [male] / 5500 Hz [female])

2. Track using default values (male values = female values – 10 %).

3. Protocol for when the value is incorrect:

E.g.: weak F2 of high back vowels often missed; F3 reported as F2

Options: - Use LPC with more coefficients

- Retrack with changed F1/F2 reference values

The strategy is to be fixed within a single study.

III. Semi-automating the measurements

> Formant measurements should be checked. A fully-automated procedure is not an option.

> Instead: automate all the repetitive actions.

Script: msr&check_f1f2_indiv_interv.psc

Function: Makes the measurement as proposed above. Point of measurement: midpoint of an interval; suitable for the analysis of monophthongs.

Script: msr&check_f1f2_indiv_point.psc

Function: Makes the measurement as proposed above. Point of measurement: points on a point tier; suitable for the analysis of di/triphthongs.

> These scripts can easily be modified to process a batch in one go – still with check.

Vowel quality: Scaling

The formant values, once collected, can be scaled in a number of ways:

1. Individual frequencies or frequency differences?

> Vowel height: F1-F0 or F1

> Advancement: F2-F1 or F2

2. F1 x F2, or other formants as well?

> F1 x F2

> F0 x F1 x F2 x F3

3. Acoustic / psycho-perceptual scale?

> hertz (Hz)

> Logarithmic (ST)

> Bark

4. Cross-speaker comparisons?

> z-transformation (Lobanov)

> Gerstman

> Constant Log Interval Hypothesis
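
As an illustration of the psycho-perceptual option in point 3, the sketch below converts formant frequencies from hertz to Bark. It uses the approximation by Traunmüller (1990), which is one of several published Bark formulas and is chosen here as an assumption; it is not necessarily the formula Praat applies. The example formant values are invented.

```python
def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark (Traunmueller 1990 approximation)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


# Hypothetical F1 and F2 values for one vowel token (Hz).
f1_hz, f2_hz = 300.0, 2200.0
f1_bark, f2_bark = hz_to_bark(f1_hz), hz_to_bark(f2_hz)

# Advancement expressed as a difference (point 1), here in Bark.
advancement = f2_bark - f1_bark
print(round(f1_bark, 2), round(f2_bark, 2), round(advancement, 2))
```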

Ideal set-up for normalization (Adank 2003):

> Individual frequencies rather than Δ’s

> hertz (Hz) rather than a psycho-acoustic scale

> No need to consider F0 and F3

> between-speaker variation: z-transformation
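
The z-transformation (Lobanov) recommended above expresses each formant value relative to the mean and standard deviation of that formant for the same speaker. A minimal Python sketch with invented values; the function name is hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def lobanov_by_speaker(tokens):
    """z-transform F1 and F2 within each speaker.

    tokens: list of (speaker, vowel, f1_hz, f2_hz).
    Returns a list of (speaker, vowel, z_f1, z_f2) in the same order.
    """
    by_speaker = defaultdict(list)
    for speaker, _vowel, f1, f2 in tokens:
        by_speaker[speaker].append((f1, f2))

    stats = {}
    for speaker, pairs in by_speaker.items():
        f1s = [pair[0] for pair in pairs]
        f2s = [pair[1] for pair in pairs]
        stats[speaker] = (mean(f1s), stdev(f1s), mean(f2s), stdev(f2s))

    result = []
    for speaker, vowel, f1, f2 in tokens:
        m1, s1, m2, s2 = stats[speaker]
        result.append((speaker, vowel, (f1 - m1) / s1, (f2 - m2) / s2))
    return result


# Invented data: speaker B has higher formants overall than speaker A.
tokens = [
    ("A", "i", 300, 2000), ("A", "a", 700, 1500), ("A", "o", 500, 1000),
    ("B", "i", 400, 2400), ("B", "a", 800, 1800), ("B", "o", 600, 1200),
]
for row in lobanov_by_speaker(tokens):
    print(row)
```

In this toy dataset the two speakers end up with identical z-scores per vowel, which is the intended effect of the normalization.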

Vowel quality: Analysis / vowel plots

The formant values can be interpreted best in a vowel plot (F1 x F2).

Characteristics of a good vowel plot:

> Inverted axes

> Over speakers (so z-transformed)

> Categories labeled using IPA

Praat can do it.

Example: - The vowels of Dinka: /i, e, ɛ, a, ɔ, o, u/

- Ellipses encircle 1 standard deviation (68%)

- Separate ellipses for complementary quantity

- Values averaged over 2 repetitions of 36 items uttered by 5 speakers.

[Figure: vowel plot; x-axis: F2 (z-transformed)]

1. Create a TableOfReal, with, for each token:

> Praat code for the IPA label (e.g. ‘ɛ’ is ‘\ep’)

> z-transformed F1 and F2; sign inverted (I do this in SPSS)

> Header contains axis labels and no. of tokens

Example: formants_tor.txt:

File type = "ooTextFile"

Object class = "TableOfReal"

numberOfColumns = 2

columnLabels []:

"F2 (z-transformed)" "F1 (z-transformed)"

numberOfRows = 341

row [1]: "i^C" -1.6595 1.2794

row [2]: "i^C" -1.9973 1.2538

row [341]: "o^C" 0.6245 0.0380

2. Open the TableOfReal in Praat, and use either:

> Draw scatter plot: plots individual values; each token is marked by its (IPA) label.

> Draw sigma ellipses: draws ellipses, sized by the user in terms of standard deviations (sigma), with the (IPA) label plotted at the center.

Either way, plot with ‘no’ for Garnish and Discriminant plane.

3. In the Picture window, add marks on the x and y axes, inverting the inverted sign back to normal. For example:

One mark left... -2 no yes no 2

This draws a mark at -2 on the y-axis that is labeled with the z-score ‘2’, without plotting ‘-2’.
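
Step 1 above is done in SPSS in the original workflow. As an alternative, the sketch below builds the TableOfReal text in Python from z-scores, inverting the signs and using the same layout as formants_tor.txt. The function name is hypothetical, and it is an assumption that Praat reads this exact layout; that should be checked on a small file first.

```python
def table_of_real_text(rows, column_labels=("F2 (z-transformed)", "F1 (z-transformed)")):
    """Build the text of a TableOfReal file for a vowel plot.

    rows: list of (label, z_f1, z_f2). Signs are inverted on output,
    and F2 is written before F1, as in formants_tor.txt.
    """
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TableOfReal"',
        "",
        f"numberOfColumns = {len(column_labels)}",
        "columnLabels []:",
        " ".join(f'"{label}"' for label in column_labels),
        f"numberOfRows = {len(rows)}",
    ]
    for number, (label, z_f1, z_f2) in enumerate(rows, start=1):
        lines.append(f'row [{number}]: "{label}" {-z_f2:.4f} {-z_f1:.4f}')
    return "\n".join(lines) + "\n"


# Hypothetical z-scores for two tokens (before sign inversion).
text = table_of_real_text([("i^C", -1.2794, 1.6595), ("o^C", -0.0380, -0.6245)])
print(text)
```

The resulting string can be written to a text file and opened in Praat as described in step 2.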

Vowel quality: Analysis / inferential tests

Characteristic inferential test: ANOVA

> within-subjects

> multivariate (dependents zF1 and zF2)

> factor(s) vowel quality (and e.g. lexical stress / intonational accent / position in phrase / etc.).

Measuring fundamental frequency

F0: Overview

> Issues in measuring F0

> Scaling

> Descriptive stats

F0: Issues in measuring F0

I. For a detailed study of the realization of tonal contrasts, the consonants in target words should be, in order of preference from most (+) to least (–) suitable:

+ nasals

liquids

approximants, rhotics

voiced fricatives

– unvoiced fricatives, stops

BUT: other considerations may be more important, such as the availability of minimal-set data:

/ba1/ Low level ‘to remain’

/ba3/ High level ‘ancestor’

/ba121/ Rise-fall ‘stiff’

/ba12[p]/ Low Rise ‘father’

/ba41/ Extra High Fall ‘to hit’

/ba21/ Low Fall ‘to blow’

/ba31/ High Fall ‘when’

II. F0 measurements need to be checked for octave jumps etc.

> suggestion: use a semi-automated procedure

Script: lst2f0&check.psc

Function: This script automates all the repetitive actions involved in the checking of F0 tracks. It calculates the F0 track (Pitch object), plots it in the Picture window, gives the opportunity to fix errors if need be, and then writes the (fixed) Pitch object to a file. Batch processing using file list.

III. The point of measurement – turning points can be determined:

> by eye

> using mathematical modelling. MOMEL (Hirst & Espesser) is implemented in Praat. See also recent work by Grabe & Kochanski.

Script: momel_modif.psc

Function: Praat implementation of the MOMEL algorithm. (Original implementation in the MES signal processing package)

F0: Scaling

From physical F0 trace to psycho-acoustic track.

1. Normalization for the logarithmic nature of pitch perception:

> hertz (Hz)

> semitone (ST)

> Equivalent Rectangular Bandwidth (ERB)

Latest news: semitone is best (Nolan 2003).
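
For reference, the two non-linear scales can be computed from hertz as follows. This is a minimal Python sketch: the 100 Hz semitone reference is an arbitrary assumption, and the ERB-rate formula is the one by Glasberg & Moore (1990), which may differ from the approximation Praat uses.

```python
import math


def hz_to_semitones(f_hz, ref_hz=100.0):
    """Convert Hz to semitones relative to ref_hz (12 ST per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def hz_to_erb_rate(f_hz):
    """Convert Hz to ERB-rate (Glasberg & Moore 1990)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


# The same 100 Hz rise in a low and in a high register.
low_rise_st = hz_to_semitones(200.0) - hz_to_semitones(100.0)
high_rise_st = hz_to_semitones(400.0) - hz_to_semitones(300.0)
low_rise_erb = hz_to_erb_rate(200.0) - hz_to_erb_rate(100.0)
high_rise_erb = hz_to_erb_rate(400.0) - hz_to_erb_rate(300.0)
print(low_rise_st, high_rise_st)
print(low_rise_erb, high_rise_erb)
```

On both scales a 100 Hz rise counts for less in the higher register than in the lower one, which is the point of leaving the hertz scale; the two scales differ in how strongly they compress.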

F0: Scaling

2. Normalization across speakers:

> No need to normalize for slope differences expressed in ERB or ST.

> Absolute values can be normalized using the z-transformation.

F0: Analysis / Plotting tracks

How to interpret the data, and communicate tendencies to others? The problem:

> Averages of F0 measures expressed as numbers in Hz are hard to interpret. ST, ERB and z-scores are even harder to interpret.

> Visual illustration by means of F0 tracks of individual cases fails to exploit the dataset.

The solution:

> Represent F0 visually across speakers, by means of tracks normalized for time.

> I.e.: graph used as a descriptive stat (reports average)
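
Time normalization can be sketched as follows: each checked F0 track is resampled at a fixed number of equidistant points between its first and last sample, and the resampled tracks are then averaged point by point. This is a minimal Python sketch with hypothetical function names; it assumes fully voiced tracks with at least two samples and strictly increasing times, and it averages in whatever unit the input is in.

```python
def resample_track(times, f0_values, n_points=8):
    """Linearly interpolate a track at n_points equidistant time points."""
    start, end = times[0], times[-1]
    resampled = []
    j = 0  # index of the left-hand sample of the current segment
    for k in range(n_points):
        t = start + (end - start) * k / (n_points - 1)
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        weight = (t - t0) / (t1 - t0)
        resampled.append(f0_values[j] + weight * (f0_values[j + 1] - f0_values[j]))
    return resampled


def average_tracks(tracks, n_points=8):
    """Average several (times, f0_values) tracks in normalized time."""
    resampled = [resample_track(t, f, n_points) for t, f in tracks]
    return [sum(column) / len(column) for column in zip(*resampled)]


# Two invented rise-fall tracks with different durations and registers.
tracks = [
    ([0.0, 0.1, 0.2], [100.0, 120.0, 100.0]),
    ([0.0, 0.2, 0.4], [200.0, 240.0, 200.0]),
]
print(average_tracks(tracks, n_points=3))
```

For averaging across speakers, the input values would first be converted to a normalized scale (for example z-scores or semitones) rather than raw hertz.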

Example 1: - The 6 lexical tones of Matbat

- Normalized time

- Utterance-medial position, following low target

- Tracks averaged over 2 repetitions of 48 items uttered by 8 speakers. (784 tokens)

Example 2: - The 3 word-prosodic patterns of Papiamentu.

- Normalized time

- Whole sentence represented.

- Tracks averaged over 2 repetitions of 2 items uttered by 8 speakers. (96 tokens)

[Figure: time-normalized F0 tracks over the sentence SUBJ COP O1 V1 O2 V2 PREP. Legend: word-acc. I, penult. stress; word-acc. II, penult. stress; word-acc. II, final stress]

Script: pp_show_series10.psc (example)

Function: on the basis of checked tracks, the script produces a text file with an F0 value for each of 8 points of measurement. Takes voicing at edges into consideration.

Measuring overall / selective intensity (dB)

dB: Introduction

Variation in perceived voice quality (breathy, modal, creaky) correlates with distribution of energy in spectrum.

dB: Introduction

Functions include:

1. Utterance-level contrasts

Example: creaky voice correlates with low F0 –

Q: The slugs ate the dahlias, didn’t they?

A: No, that’s not true / the rabbits ate the dahlias, not the slugs.

(Thanks to Mariko Sugahara for the example)

dB: Introduction

Functions include:

2. Word-level contrasts – on its own…

Dinka example – breathy vs. modal

raal raall ltt lt

‘vein-sg.’ ‘vein-pl.’ ‘insult-sg.’ ‘insult-pl.’

… or as a package (register tone – e.g. Mon-Khmer languages, Chamic languages)

dB: Introduction

Variation in perceived loudness correlates with:

> the distribution of energy in the spectrum (spectral balance)

> overall intensity

Functions include:

> Lexical stress (cf. Sluijter & van Heuven 1996)

> Phrasal accent (cf. Heldner 2003)

dB: Introduction

In summary:

> Selective intensity marks distinctions in voice quality AND distinctions in loudness.

> Loudness contrasts may also correlate with overall intensity.

> It remains unclear whether / to what extent loudness and voice quality have separate correlates.

dB: Measuring overall intensity

> No need for checking. Automated procedure is possible, cf. measurement of duration.

> Important issue: controlling for variation in irrelevant factors in the course of session.

> Relate intensity of target segment to the intensity of (part of) the carrier utterance.
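
The reason for relating the target to the carrier can be shown with a small calculation. The Python sketch below computes the intensity of the target segment relative to the carrier from raw sample values; the function names are hypothetical, and the sample values are invented. Because the result is a difference in dB, any overall change in recording level (microphone distance, gain) cancels out.

```python
import math


def rms_db(samples, reference=2e-5):
    """Return the RMS level of the samples in dB relative to `reference`."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square / reference ** 2)


def relative_intensity_db(target_samples, carrier_samples):
    """Intensity of the target relative to the carrier, in dB."""
    return rms_db(target_samples) - rms_db(carrier_samples)


# Invented samples: the target has twice the amplitude of the carrier.
target = [0.2, -0.2, 0.2, -0.2]
carrier = [0.1, -0.1, 0.1, -0.1]
relative = relative_intensity_db(target, carrier)

# Halving the recording level of the whole utterance leaves the result unchanged.
relative_quieter = relative_intensity_db(
    [s * 0.5 for s in target], [s * 0.5 for s in carrier]
)
print(round(relative, 2), round(relative_quieter, 2))
```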

dB: Measuring selective intensity

Abundance of possible measurements, including:

> H1-H2

> H1-A1, H1-A2, H1-A3

> Dynamic filter (Heldner)

> Average within a range (Sluijter & van Heuven)

See thematic issue of JPhon 29:4 (2001)

Recommendations:

> For detailed acoustic study: a measure of specific spectral properties is best (explanatory adequacy).

> In relation to a big corpus, Heldner’s filter-based measure seems best (relatively vowel-independent; easy to automate)

> Try out several / Make your own variation

How to:

> Point of measurement (cf. vowel quality)

> Semi-automating measurements using script

Script: msr&check_spectr_indiv_interv.psc

Goal: Semi-automated procedure for measurement of H1, H2, A1, A2, A3. Extension of vowel quality script.

Acknowledgements

> The organizers, for inviting me.

> Thanks to Patti Adank and Alice Turk, for discussions on measurements of vowel quality; and to Helen Hanson, for discussions on voice quality.

> The Netherlands Organization for Scientific Research (NWO), for funding my research by means of a postdoc grant to Vincent van Heuven.

Conference announcement:

Between Stress and Tone

Topic: Typology of prosodic systems

When/where: 16-18 June 2005 / Leiden

Abstracts due: 1 November 2004

Details: http://www.iias.nl/iias/agenda/best/
