Semioccluded Vocal Tract Exercises - TUNI

MARCO GUZMAN

Semioccluded Vocal Tract ExercisesA physiologic approach for voice training and therapy

Acta Universitatis Tamperensis 2265

MA

RC

O G

UZM

AN

Sem

ioccluded Vocal Tract Exercises

AU

T 2265

MARCO GUZMAN

Semioccluded Vocal Tract Exercises

A physiologic approach for voice training and therapy

ACADEMIC DISSERTATIONTo be presented, with the permission of

the Board of the School of Education of the University of Tampere,for public discussion in the auditorium Virta 109,

Åkerlundinkatu 5, Tampere,on 24 February 2017, at 12 o’clock.

UNIVERSITY OF TAMPERE

MARCO GUZMAN

Semioccluded Vocal Tract Exercises

A physiologic approach for voice training and therapy

Acta Universi tati s Tamperensi s 2265Tampere Universi ty Pres s

Tampere 2017

ACADEMIC DISSERTATIONUniversity of TampereFaculty of Education Finland

Copyright ©2017 Tampere University Press and the author

Cover design byMikko Reinikka

Acta Universitatis Tamperensis 2265 Acta Electronica Universitatis Tamperensis 1766ISBN 978-952-03-0390-7 (print) ISBN 978-952-03-0391-4 (pdf )ISSN-L 1455-1616 ISSN 1456-954XISSN 1455-1616 http://tampub.uta.fi

Suomen Yliopistopaino Oy – Juvenes PrintTampere 2017 441 729

Painotuote

The originality of this thesis has been checked using the Turnitin OriginalityCheck service in accordance with the quality management system of the University of Tampere.

3

CONTENTS

1. Introduction ……………………………………………………………….. 14

1.1 Approaches of voice therapy ………………………………………… 14

1.2 Physiologic voice therapy programs ……………………………. 16

1.3 Semioccluded vocal tract exercises ……………………………… 19

1.4 Semiocclusions of vocal tract and energy transfer ……………… 23

1.5 Radiological imaging methods ………………………………….. 25

1.6 Aerodynamic measurements of phonation ……………………… 27

1.7 Electroglottography ……………………………………………… 28

1.8 High speed digital imaging ……………………………………… 30

1.9 Naso-laryngeal fiberoscopy …………………………………….. 32

1.10 Aims of the study ……………………………………………… 34

2. Material and methods …………………………………………………….. 36

2.1 Article I …………………………………………………………. 36

2.2 Article II ………………………………………………………… 44

2.3 Article III ……………………………………………………….. 48

2.4 Article IV ……………………………………………………….. 52

3. Results ……………………………………………………………………. 56

3.1 Article I …………………………………………………………. 56

3.2 Article II ………………………………………………………… 57

3.3 Article III ………………………………………………………... 59

3.4 Article IV ………………………………………………………... 60

4. Discussion ………………………………………………………………… 62

4.1 Influence of semioccluded exercises on aerodynamic variables ... 62

4

4.2 Influence of semioccluded exercises on glottal function ……… 66

4.3 Influence of semioccluded exercises on vocal tract function ….. 71

4.4 Possible sources of error ……………………………………….. 80

4.5 Future research …………………………………………………. 81

5. Conclusions ………………………………………………………………. 83

6. Acknowledgements ………………………………………………………. 85

7. Financial support …………………………………………………………. 86

8. References ………………………………………………………………… 87

5

LIST OF ORIGINAL ARTICLES

Guzman M, Laukkanen A-M, Krupa P, Horáček J, Švec J, Geneid A. Vocal Tract

and Glottal Function During and After Vocal Exercising with Resonance Tube and

Straw. Journal of Voice, 2013;27:305-311.

Guzman M, Laukkanen A-M, Traser L, Geneid A, Richter B, Muñoz D,

Echternach M. The Influence of Water Resistance Therapy on Vocal Fold

Vibration: A High Speed Digital Imaging Study. Logopedics Phoniatrics

Vocology. In Press.

Guzman M, Castro C, Madrid S, Olavarría C, Leiva M, Muñoz D, Jaramillo E,

Laukkanen A-M. Air pressure and contact quotient measures during different semi-

occluded postures in subjects with different voice conditions. Journal of Voice. In

Press.

Guzman M, Castro C, Testart A, Muñoz D, Gerhard J. Laryngeal and Pharyngeal

Activity During Semi-occluded Vocal Tract Postures in Subjects Diagnosed With

Hyperfunctional Dysphonia. Journal of Voice. 2013;27:709-716.

6

AUTHOR´S CONTRIBUTION

Article I

Vocal Tract and Glottal Function During and After Vocal Exercising with

Resonance Tube and Straw.

All investigators participated in the research design of this study. The researcher in

charge of CT samples was Dr. Petr Krupa. Acoustic, EGG and aerodynamic

recordings were made by the author of this dissertation and Dr. Anne-Maria

Laukkanen. Moreover, the author of this dissertation made all measurement from

CT, acoustic, EGG, and aerodynamic samples. Auditory perceptual analysis

procedures were also organized by Marco Guzman. He also wrote the manuscript

and created the figures and tables. Dr. Anne-Maria Laukkanen, Dr. Sc. Jaromir

Horáček, and Dr. Ahmed Genied provided their comments.

Article II

The Influence of Water Resistance Therapy on Vocal Fold Vibration: A High

Speed Digital Imaging Study.

Original research design was made mainly by Marco Guzman and Dr. Anne-Maria

Laukkanen. Thereafter, all researchers made contributions to obtain the final

design. All investigators participated in different ways during experimental

laryngoscopic procedures. HSDI samples were acquired by Dr. Matthias

Echternach and Dr. Ahmed Geneid. High speed material was analyzed by Dr.

Matthias Echternach. Dr. Daniel Muñoz performed statistical analysis of data. The

author of the present dissertation entirely wrote the manuscript and was in charge

7

of figures and tables. Dr. Anne-Maria Laukkanen, Dr. Matthias Echternach, and

Dr. Ahmed Geneid provided their comments.

Article III

Air pressure and contact quotient measures during different semi-occluded postures

in subjects with different voice conditions.

The author of the present dissertation made the research design, participated in all

acoustic, EGG and aerodynamic recordings, performed the analysis of all samples,

and wrote de manuscript. He also was in charge of figures included in the original

article. Dr. Christian Olavarria and Dr. Miguel Leiva performed laryngeal

videostroboscopic procedures to corroborate laryngeal diagnosis of subjects.

Christian Castro, Sofia Madrid, and Elizabeth Jaramillo collaborated in recruitment

of subjects and experimental procedures. Statistical analysis was carried out by Dr.

Daniel Muñoz. Comments and contributions on the manuscript during the writing

process were made by Dr. Anne-Maria Laukkanen.

Article IV

Laryngeal and Pharyngeal Activity During Semi-occluded Vocal Tract Postures in

Subjects Diagnosed With Hyperfunctional Dysphonia.

Research design was made by Marco Guzman, Christian Castro, and Dr. Julia

Gerhard. All laryngoscopic procedures were performed by Dr. Alba Testart. Marco

Guzman wrote the entire manuscript, participated during all laryngoscopic

procedures and was in charge of visual assessment of laryngoscopic samples by

blinded judges. Christian Castro collaborated with recruitment of subjects and

experimental procedures. Statistical analysis was carried out by Dr. Daniel Muñoz.

English correction and comments on the manuscript were made by Dr. Julia

Gerhard and Christian Castro.

8

LIST OF ABBREVIATIONS AND SYMBOLS

ALR Amplitude to length ratio

ClQ Closing quotient

CQ Contact quotient

CT Computerized tomography

EGG Electroglottography

F0 Fundamental frequency

F1 First formant

F2 Second formant

F3 Third formant

F4 Fourth formant

F5 Fifth formant

FFT Fast-Fourier Transform

HNR Harmonic to noise ratio

HSDI High speed digital imaging

Hz Hertz

LTAS Long-term average spectrum

MRI Magnetic resonance imaging

OQ Open quotient

Poral Oral pressure

Psub Subglottic pressure

PTP Phonation threshold pressure

Ptrans Transglottal pressure

SOVTE Semioccluded vocal tract exercises

9

SF Spectral flatness

SPL Sound pressure level

VLP Vertical laryngeal position

10

SUMMARY

Physiologic approach of voice therapy is commonly used by speech-language

pathologists in treating patients with a wide variety of voice disorders. Voice

rehabilitation programs based on physiologic approach are designed to modify the

physiology of the vocal mechanism, including the three subsystems involved in

voice production (breathing, phonation, and resonance) in an integrated or holistic

way. Examples of physiologic voice therapy programs include: Resonant Voice

Therapy, the Accent Method (of Voice Therapy), and Vocal Function Exercises.

One common aspect to physiologic voice therapy programs is that they take

advantage of semioccluded vocal tract exercises (SOVTE). This group of exercises

includes phonation on voiced fricatives (consonants), nasals (consonants), lip and

tongue trills, hand over mouth, and phonation into different tubes with the free end

either freely in the air or submerged into a recipient filled with water. The general

aim of the present dissertation was to determine how the three subsystems involved

in voice production (breathing, phonation, and resonance) are affected by different

SOVTE. If this group of exercises is considered as a physiologic approach for

voice therapy, they should affect simultaneously glottal function, vocal tract

function, and aerodynamic variables during exercising.

Specific aims of this dissertation were: 1) to investigate the vocal tract

modifications and also the acoustic, aerodynamic and EGG characteristics of the

voice when comparing vocal exercising with two different tubes (traditional

Finnish glass tube and a thin stirring straw); 2) to observe the influence of tube

11

phonation into water on vocal fold vibration by using HSDI and phonovibrography;

3) to determine the effect of different artificial lengthening of the vocal tract

(phonation through a tube with the free end in water and in air) on air pressure

variables and vocal fold adduction in different types of subjects; and 4) to observe

and compare the effects of different types of SOVTE on VLP, A-P laryngeal

compression and pharyngeal width in a group of subjects diagnosed with

hyperfunctional dysphonia.

Study I assessed a male subject with classical singing training. CT was carried

out when the subject produced the vowel [a:] at comfortable speaking pitch,

phonating into the resonance tube (the distal end in the air) and when repeating [a:]

after the exercise. A similar procedure was performed with a narrow straw after

fifteen-minute silence. Anatomic distances and areas were measured from CT

midsagittal and transversal images. Acoustic, perceptual, EGG and Poral

parameters were also investigated. Study II evaluated eight subjects via HSDI

while phonating into a silicon tube with the free end submerged into water. Two

test sequences were studied: 1) phonation pre, during and post tube submerged 5

cm into water, with tube phonation lasting for five minutes, and 2) phonation into

the tube submerged 5 cm, 10 cm, and 18 cm into water. Several glottal area

parameters were calculated using phonovibrograms. Study III assessed forty-five

participants representing four vocal conditions: subjects diagnosed with 1) normal

voice and without voice training, 2) normal voice with voice training, 3) muscle

tension dysphonia, and 4) unilateral vocal fold paralysis. Participants phonated into

different kinds of tubes (drinking straw, 5 mm in inner diameter; stirring straw, 2.7

mm in inner diameter; silicon tube, 10 mm in inner diameter) with the free end in

air and in water. Aerodynamic, acoustic and EGG signals were captured

simultaneously. Mean values of the following variables were considered: EGG CQ,

F0, Psub, Poral, and Ptrans. Study IV evaluated twenty subjects with

hyperfunctional dysphonia. All participants were required to produce eight

12

different SOVTE during observation with flexible endoscope: lip trills, hand-over-

mouth, phonation into four different tubes, and tube phonation into water using two

different immersion depths. Each exercise was produced at three loudness levels:

habitual, soft, and loud. A evaluation test with three blinded laryngologists was

performed to determine the VLP, anterior-posterior laryngeal (A-P) compression,

and pharyngeal width. Judges rated the three endoscopic variables using a 5-point

scale. To assess intra and inter-rater agreement, an intraclass correlation

coefficient was used. A multivariate linear regression model considering VLP,

pharyngeal width, and A-P laryngeal compression was conducted. Correlation

analysis between variables was also conducted.

Study I demonstrated that during and after phonation into the tube or straw the

velum closed the nasal passage, the vertical laryngeal position was lower and the

hypopharyngeal area became wider compared to condition pre tube/straw.

Furthermore, the ratio between the inlet of the lower pharynx and the outlet of the

epilaryngeal tube became larger during and after tube/straw phonation. A stronger

spectral prominence in the singer/speaker’s formant cluster region after tube and

straw phonation was showed by acoustic results. Listening test revealed a better

voice quality after both straw and tube compared to condition before exercises.

CQEGG decreased both during the tube and the straw and remained lower after

exercising. Psub showed an increase during straw and remained higher after it.

Larger changes were observed during straw. Study II demonstrated certain trends.

Amplitude to length ratio, harmonic to noise ratio and spectral flatness (derived

from glottal area) decreased for all tube immersion depths, and glottal closing

quotient increased for 10 cm immersion. Contact quotient in turn increased for 18

cm immersion. CQEGG decreased during phonation into the tube in 5 cm depth, and

jitter decreased during and after it. Study III showed that all SOVTE had a

significant effect on Psub, Poral, Ptrans, and CQEGG. Phonation into a 55 cm silicon

tube submerged 10 cm in water and phonation into a stirring straw (the outer end of

13

it in the air) resulted in the highest values for CQ, Psub, and Poral compared to

baseline for all groups of subjects, regardless of the vocal status. Poral and Psub

correlated positively. Study IV indicated that VLP, A-P laryngeal compression and

pharyngeal width changed differently throughout the eight semi-occluded postures.

A narrower aryepiglottic opening, a wider pharynx, and a lower larynx than resting

position were produced by all SOVTE. Tube into the water and narrow tube into

the air demonstrated more prominent changes. Correlation analysis showed that

VLP significantly correlated with pharyngeal width and A-P laryngeal

compression. Additionally, pharyngeal width significantly correlated with A-P

laryngeal compression.

From the present dissertation it is possible to conclude that: 1) vocal exercises

including semiocclusions and artificial lengthening of the vocal tract have a

simultaneous influence on variables related to phonation, resonance, and

aerodynamic aspects. There is a clear interaction between the three subsystems,

each one influencing the two others during voicing; 2) Most changes produced

during SOVTE constitute relevant therapeutic and voice training tools for subjects

with voice disorders and for professional voice users, respectively. These changes

seem to lead to a more economic voice production; 3) Subjects with organic or

behavioral voice disorders, as well as healthy vocally untrained, and healthy trained

subjects behave similarly under similar exercise conditions. 4) Aerodynamic

aspects are mainly affected during SOVTE by increasing air pressure measures

such as Poral and Psub, the latter being a compensation of the former. An increase

in Poral, in turn, is expected since semiocclusions decrease the airflow rate. 5)

Influence of SOVTE in phonation is mainly related to the degree of glottal

adduction during exercising. This is mostly determined by the degree of resistance

to the airflow that each exercise offers and compensations caused by resistance.

High resistance seems to promote a greater vocal fold adduction compared to a

lower resistance exercise. Therefore, higher resistance exercises could be more

14

beneficial for patients with diagnosis such as vocal fold paralysis or presbyphonia.

On the other hand, glottal function of patients with hyperadduction or vocal fatigue

could be improved using exercises that involve lower airflow resistance; 6)

Important therapeutic goals are reached during SOVTE regarding vocal tract

configuration. A lower larynx, a wider pharynx and a higher velum is produced

during this type of exercises compared to vowel phonation. The changes are more

prominent when the airflow resistance is higher. These modifications are linked to

positive changes in acoustic and perceptual output of voice.

15

1 INTRODUCTION

1.1 Approaches of voice therapy

Over the years a variety of approaches (orientations) have emerged for the

treatment of voice disorders. Stemple (2000) classified the voice therapy

approaches into four major categories: hygienic, psychogenic, symptomatic, and

physiologic.

Hygienic orientation of voice therapy is based on two main aspects: 1) many

functional voice disorders are initiated and maintained by behaviors or vocal habits

that damage the laryngeal structures; 2) elimination of harmful and traumatic

behaviors will improve vocal performance. Therefore, hygienic approach of voice

therapy focuses on the identification and elimination of harmful vocal behaviors

followed by the development of proper vocal behaviors (Thomas and Stemple

2007). Common components of hygienic approach of voice therapy include:

hydration of the larynx, voice rest, silent cough, avoidance of screaming and

yelling, control of vocal loading, etc. (Stemple 2000).

Psychogenic approach of voice therapy supposes that underlying emotional or

psychosocial problems may cause voice disorders (Stemple 2000). This approach

of voice therapy focuses on identification and modification of the emotional and

psychosocial problems related to the onset and maintenance of the voice disorder.

According to this approach, when the psychogenic causes are resolved, the voice

disorder is eliminated (Aronson 1990).

16

Symptomatic voice therapy is based on the modification of vocal symptoms.

Voice therapy under this approach focuses on the modification of aberrant vocal

symptoms related to pitch, loudness and voice quality. Symptomatic voice therapy

is based on the belief that modification and correction of phonation, respiratory,

and resonance characteristics produce an improvement in the voice condition

(Thomas and Stemple 2007). Symptomatic approach involves several vocal

exercises to modify aberrant vocal symptoms (Boone 1977). In the recent years

additional facilitating voice exercises have been added to the original list of

symptomatic exercises (Boone and McFarlane 1988; Boone, McFarlane, and Von

Berg 2005). Some of the traditional and commonly used facilitating exercises in

voice therapy are: pushing exercises, humming, chewing exercise, yawn-sigh,

change of loudness, inhalation phonation, digital manipulation/digital pressure,

relaxation, establishing a new pitch, and “placement of the voice” techniques.

On the other hand, physiologic voice therapy is based on the belief that voice

disorders are best treated by modifying the underlying physiology of voice

production (Stemple 2000; Stemple, Graze, and Klaben 2000). Stemple et al.

(2000) suggest that the physiologic approach involves three key components: 1)

improving the balance among the main subsystems involved in voice production:

respiration, phonation, and resonance (vocal tract configuration and sensations

related to “voice placement”); 2) improving the strength, balance, tone, and

stamina of laryngeal muscles; and 3) developing a heathy mucosa covering of the

true vocal folds. Evidence suggests that physiologic methods of voice therapy have

greater scientific support (larger number of studies and higher level of evidence)

than other approaches of voice therapy (Thomas and Stemple 2007). Examples of

physiologic voice therapy programs include: Vocal Function Exercises (Stemple

2000), the Accent Method of Voice Therapy (Kotby 1995), and Resonant Voice

Therapy (Verdolini 1998). Each program approaches the voice condition in a

17

holistic or integrated manner with the aim of altering the overall physiology of

voice production.

1.2 Physiologic voice therapy programs

One common aspect to physiologic voice therapy programs mentioned above is

that all of them take advantage of SOVTE. This group of exercises includes

phonation on voiced fricatives, nasals, close vowels, lip and tongue trills, hand over

mouth, and phonation into different tubes with the free end either freely in the air

or submerged into a recipient filled with water.

Vocal function exercises program is based on series of four steps, all of them

using a semioccluded configuration of the vocal tract. In the first step (warm-up),

the patient produces a sustained vowel [i:] at a predetermined pitch. In the second

and third steps, the patient is required to produce glides up and glides down

respectively using a sustained bilabial consonant [ß:] with an exaggerated inverted

megaphone shape in the vocal tract. In the final step, the patient has to sustain the

consonant [ß:] with different predetermined pitches using the same inverted

megaphone shape that step two and three. Patients are required to produce all

exercises as softly as possible with a forward “placement of the voice”. Stemple et

al. (2000) stated that by improving the strength, endurance and coordination of the

systems involved in voice production (breathing, phonation, and resonance), the

exercises help to rehabilitate the voice.

The Accent Method focuses on the improvement of the respiratory, phonatory,

and articulatory aspects of speech using a holistic manner. This method is also

considered holistic or physiologic due to it simultaneously involves various voice

features such as pitch, loudness and timbre (Bassiouny 1998). Better control of

voice production is considered as the main goal of the Accent Method. Breath

18

support is expected to be affected by respiratory accents during exercises in this

method. The improvement of breath support is believed to reduce excessive

laryngeal muscular effort (Kotby et al. 1991). The Accent Method is based on three

main principles: 1) optimal breath support; 2) rhythmic practice of accentuated

vowels with progressive carryover to connected speech; and 3) rhythmic body

movements (Bassiouny 1998). In practical terms, Accent Method protocol is based

on the training the abdominodiaphragmatic breath while producing rythmic voiced

fricative consonants and close vowels (all considered semioccluded postures of the

vocal tract). Once the respiratory-phonation connection is well established, the

enhanced respiratory-phonation pattern is generalized to connected speech.

Resonant voice therapy, which is another physiologic program of voice

rehabilitation, teaches how to modify the voice production so that the result is a

more resonant voice quality. This program defines resonant voice as voice

production involving vibratory sensations, usually on the anterior region of face

and mouth in the context of easy phonation (Stemple et al 2000). The goal of this

therapy program is to achieve the strongest possible voice with the least effort and

vocal folds impact to minimize the probability of vocal fold injury (Verdolini et al.

1998). Verdolini (1998) suggested that resonant voice is associated with vocal

folds that are barely adducted or barely abducted. This glottic configuration is

expected to produce a relatively strong voice while providing some protection

against vocal fold tissue injury. In the resonant voice therapy program, the patient

moves through several stages within a hierarchy of increasing complexity.

Advancement through the hierarchy demands evidence of skill acquisition at earlier

stages (Roy et al. 2003). The methodology of this approach focuses on the

processing of sensory information. The therapy procedures begin with stretching

and breathing maneuvers followed by basic training gestures (voice exercises)

(Chen et al. 2007). In practical terms, phonatory exercises in the resonant voice

therapy program includes nasal and voiced fricative consonants (considered as

19

semioccluded postures of the vocal tract) while the patient phonates different

phonatory tasks from a sustained tone to connected speech (words, sentences, etc).

All steps involve sensory motor learning principles.

Even though Vocal Function Exercises, Resonant Voice Therapy and Accent

Method are the most known and studied physiologic voice therapy programs, there

are a number of programs or voice exercise sequences, based on the same

physiologic principles, which speech therapists use in different countries around

the globe. Many of those programs also take advantage of SOVTE. One of the

most commonly used semioccluded exercise is the tube phonation, either with the

free end in the air or submerged into the water. The latter is called water resistance

therapy.

Several types of tubes are currently used for voice therapy and training, some of

them are: 1) the traditional Finnish resonance glass tube (26-28 cm in length and 9

mm in inner diameter for adults, 24-25 cm, 8 mm for children); 2) commercial

plastic drinking straws; and 3) plastic stirring straws (narrow straws to stir coffee).

Furthermore, two main versions of water resistance therapy (tube phonation into

water) have been used: 1) Phonation into traditional Finnish resonance glass tube

submerged in water in a bowl (Laukkanen 1992); and 2) Lax Vox technique, which

consists of phonation in a flexible silicone tube (35 cm in length and 9-12mm in

inner diameter) submerged into a bottle filled with water (Sihvo 2006). In the glass

tube method, tubes are kept 1-2 cm or 5-15 cm below the surface of the water

depending on the patient’s needs. With the Lax Vox method, the participant is

instructed to keep the tube 1-7 cm in water for voice therapy (Sihvo 2006).

Additionally, some voice therapists or voice trainers use commercial plastic

drinking straws with the free end in a glass or bottle filled with water as a simple

version of water resistance therapy.

A considerable number of studies have been carried out to disclose the

underlying physics and physiology of tube phonation and other SOVTE. Some of

20

them have explored changes in the vocal fold vibration, others in vocal tract

configuration, and there are also studies focusing on the aerodynamic variables.

1.3 Semi-occluded vocal tract exercises

SOVTE have been widely used in voice therapy, vocal warm up and voice training.

It has been stated that the vocal exercises, such as artificial lengthening of the vocal

tract or narrow anterior constriction of the vocal tract, increase the vocal tract

impedance, specifically the inertive reactance which may favorably influence the

vocal fold vibration (Titze 1998; Story et al. 2000). Increased inertive reactance

changes the glottal flow amplitude and pulse shape (Rothenberg 1986; Story et al.

2000; Titze 2006a; Titze 2006b). Furthermore, the PTP (the Psub required to barely

initiate and sustain phonation) is reduced by increased vocal tract inertance (Titze

1998).

Vocal tract impedance can affect the voice in two ways: 1) through an

acoustic-aerodynamic interaction, and 2) through a mechano-acoustic interaction

(Rothenberg 1986; Story et al. 2000).

Regarding acoustic-aerodynamic interaction, the shape of the glottal flow pulse

is affected by the acoustic pressures in the vocal tract (Rothenberg 1981,

Rothenberg 1986; Titze and Story 1997; Titze and Laukkanen 2007). When

fundamental frequencies are below the first formant, the acoustic pressures in the

supraglottis cause a skewed flow pulse compared to variation of the glottal area.

Thus, the glottal airflow is suppressed at glottal opening and maintained during the

glottal closing phase. This increased skewing of the glottal flow waveform leads to

strengthening of the higher harmonics (less spectral tilt), increase in SPL (Bickley

and Stevens 1987; Laukkanen et al 2008; Titze 2008), and to a more resonant voice

quality, i.e. a brighter and louder sound, which is characterized by enhanced

21

vibratory sensations in the frontal part of face and mouth and easy voice production

(Titze 2004; Titze 2006a; Titze 2006b; 2008) pointed out that, since the skewing of

the glottal airflow signal is one of the determinants of vocal intensity, the source-

filter interaction can be used to increase intensity rather than vibrational amplitude

of the vocal folds, thus avoiding an increase in the vocal fold impact stress.

The second way that voice production can be affected by vocal tract impedance

is the mechano-acoustic interaction of the vocal tract pressures and the vocal folds

(Rothenberg 1986; Titze and Story 1997). When the vocal tract inertance is

increased, it may affects the vocal fold vibration itself. Specifically, the inertive

reactance lowers the PTP (Titze and Story 1997). A low PTP would promote ease

of phonation (decrease in perceived phonatory effort). According to Story et al.

(2000) this occurs when F0 approaches F1. Hence, when phonation is produced at a

frequency or close to F1 may allow for an efficient voice production that could be

associated with lower phonatory effort.

Related to this, Titze (1988; 2008) states that if the optimal condition for

phonation is that of maximum inertive reactance, F0 should coincide with the peak

of the reactance curve which is not located at F1 but occurs at a frequency

somewhat below it. To maintain favorable impedance conditions across pitches and

loudness, a vocalist may want to decrease the epilaryngeal area. When the

narrowed epilarynx is combined with a wide pharynx, the reactance never goes

negative below about 3000 Hz, which means that the acoustic load is inertive for

all possible values of F0 regardless of vowel or consonant (Titze and Story 1997).

In a recent modeling study with excised larynges, Conroy et al. (2014)

evaluated PTP during nine conditions: control, two tube diameters, three tube

lengths, and three levels of flow input. A significantly decreased in PTP was

detected for the longest tube and narrower tubes. These findings corroborate the

earlier statements related to the mechano-acoustic interaction of the vocal tract

pressures and the vocal folds (e.g. Titze and Story 1997).

22

Several studies with subjects phonating into resonance tubes of varying lengths

and diameters have been performed by Laukkanen to explore glottal function

(Laukkanen 1992; Laukkanen et al. 1995a; Laukkanen et al. 1995b; Laukkanen et

al. 1996; Laukkanen et al. 2007; Laukkanen et al. 2008). Changes in several

different variables both during and after tube phonation were reported across these

studies, including glottal waveform shape, glottal efficiency, laryngeal resistance,

and perceived vocal effort among others. According to Laukkanen et al. (1995a),

electromyographic evaluation showed that the average laryngeal muscle activity

during phonation into a tube was the equal or lower compared to phonation during

sustained vowel. These results seem to imply that during increased impedance of

the vocal tract there is a decreased laryngeal muscle activity and it tends to stay

lower in phonation immediately after exercises.

Related to glottal function, Titze et al. (2002) reported lower amplitude and

lower glottal closed time (obtained from EGG signal) during phonation into straws

compared to sustained vowel phonation. Authors suggested that the use of high

subglottal pressures needed in singing are possible with narrow straws, having

minimal vocal folds collision. Bickley and Stevens (1987) reported similar findings

in their study of EGG waveforms produced by a group of participants phonating on

voiced fricatives. Authors reported that for production of voiced fricative

consonants, the open phase of the glottal cycle increased by over 20% relative to

open postures of the vocal tract. Gaskill et al. (2008) studied the effect of a voiced

lip trill on CQEGG in classically trained singers and vocally untrained participants.

Most participants showed a tendency for a reduction in CQEGG during the lip trill,

with a more pronounced change in the untrained participants.

Some earlier studies have also assessed changes in the vocal tract shape during

SOVTE. Vampola et al. (2011) using CT observed the vocal tract shape in a female

subject before, during, and after phonation into a tube. Finite-element models were

made on the basis of CT registrations to study changes in vocal tract input

23

impedance. CT results indicated that the most dominant change during phonation

into the tube was the expansion of the cross-sectional area of the oropharynx and

the oral cavity due to a different tongue position. CT images also revealed that the

velum rose to seal the nasopharyngeal port during tube phonation, and this change

remained after the exercises. Moreover, the total volume of the vocal tract was

considerably larger after phonation into the tube. Calculations made with finite-

element model showed an increased vocal tract input inertance and an increased

radiated acoustic energy after phonation into tube. Laukkanen et al. (2010)

observed similar vocal tract shape modifications during glass tube phonation in a

female subject using magnetic resonance imaging. The velar closure improved and

the midsagittal area of the vocal tract became larger during and after phonation into

a straw. The ratio between the transversal area of the lower pharynx and the

epilarynx area increased both during and after the straw. Moreover, acoustic results

showed an increase in the energy of the speaker’s formant cluster and in the overall

sound pressure level. The frequencies of F2, F4 and F5 decreased, while F3

increased. There was a decrease on distances between F4 and F3 and F5 and F4.

Authors suggested that straw phonation helps establishing a speaker’s formant

cluster, which increases loudness and thus improves vocal economy.

Another effect of SOVTE that has been explored is the modifications produced

in air pressures during and after exercises. Titze et al. (2002) stated that when a

semi-occlusion, such as tube phonation is produced, there is a change in flow

resistance depending on the size of the tube. Thus, the Poral is positive, and this in

turn, would reduce the Ptrans (difference between subglottic and oral pressure),

unless Psub is raised. Findings from an experiment using different types of straws

(different diameters and lengths) evidenced that stirring straws offer more flow

resistance than drinking straws. Data from Psub and Poral demonstrated that the

higher is the flow resistance of the tube, the greater are the subglottic pressures that

subjects would need to generate. The increase in Psub was, in the same study,

24

corroborated in humans. Authors stated that Psub was inversely related to the straw

diameter (Titze et al. 2002). Furthermore, in a study designed to investigate

phonation during varying artificial extension of the vocal tract, Laukkanen et al.

(2007) demonstrated that Psub was higher during phonation into longer tubes.

They suggested that increased Psub could reflect higher vocal effort used to

compensate for increased vocal tract loading. Radolf et al. (2014) found that the

mean Poral increased about 4 times in habitual comfortable phonation and about 9

times in soft phonation, during voice production into a resonance tube submerged

10 cm below the water surface. The Psub also increased, likely due to a

compensation for the increase in supraglottal resistance. In a physical model of

voice production study by Horáček et al. (2014), similar results were reported. Both

Poral and Psub increased with a resonance tube 10 cm in water. Greater changes

were observed for soft phonation.

1.4 Semiocclusions of vocal tract and energy transfer

Voice rehabilitation programs based on SOVTE propose that a resonant manner of

voice production promote a maximal vocal economy (maximum acoustic output

with minimum degree of vocal fold impact stress). Furthermore, According to Titze

(1998) the origin of vibratory sensations in the front part of face and mouth in

resonant voice could be due to the efficiency of the energy conversion process at

the glottis (from aerodynamic to acoustic energy). When this process is efficient,

the vibrations are distributed all over the face and head areas. On the other hand,

when the energy conversion process is poor, the vibrations will remain mostly in

the laryngeal area. Titze (2003) stated that: “resonant voice engage the vocal tract

for maximum transfer of power from the glottis to lips, and ultimately all the way

to the listeners”. In a computer simulation study, Titze and Story (1997) suggested

25

that phonation can be made more efficient through impedance matching between

the voice source and the vocal tract. Impedance matching is related to the

maximum power transfer theorem and it can be accomplished by using

semiocclusions at the lips or a combination of adjustments in vocal fold adduction

and epilaryngeal constriction.

The maximum power theorem in electronic circuits and transmission systems

states that the internal impedance of the source should match the impedance of the

load for maximum power transfer and/or minimize reflections from the load (Titze

2002). The concept of impedance matching was originally developed for electrical

power, but can be applied to any other field where a form of energy (not

necessarily electrical) is transferred between a source and a load. An alternative

application of the maximum power theorem is on the “voice circuit” where the

glottis is the source impedance and vocal tract the load impedance. Titze (2002)

pointed out that: “in order to apply the maximum power transfer principle to voice

production, we must realize that pressure and flow are not as simply related to each

other as voltage and current are in a resistive circuit”. Since the acoustic load is a

complex impedance, the quantitative nature of the maximum power theorem is not

the same, but qualitatively the principle is expected to be preserved: for maximum

acoustic power to be delivered to the vocal tract, the internal glottal impedance

should be of the same order of magnitude as the acoustic load impedance of the

vocal tract. On the other hand for maximum efficiency (acoustic

power/aerodynamic power), the load impedance should be considerably higher

than the glottal source impedance.

In practical terms, impedance of the vocal tract can be calculated when the size

and shape of the vocal tract is known. Moreover, the relation between subglottic air

pressure and flow during phonation or the amount of contact between the vocal

folds may reflect glottal impedance. Titze (2002) hypothesizes that tight adduction

of the vocal folds requires a narrower supraglottal airway, whereas looser

http://en.wikipedia.org/wiki/Signal_reflection

http://en.wikipedia.org/wiki/Electric_power

http://en.wikipedia.org/wiki/Electric_power

26

adduction requires a wider airway to maximize the output power. Using simulation

models to calculate mean flows, peak flows, and oral radiated pressure (sound

pressure that comes from the mouth to the atmosphere) for an impedance ratio

between the vocal tract and the glottis, the author reported that when the impedance

ratio between source and load impedance approaches 1.0, maximum power is

transferred and radiated from the mouth.

1.5 Radiological imaging methods

Two main techniques have been used to obtain images of the vocal tract

configuration in speech and voice research. One technique is the magnetic

resonance imaging and the other is computerized tomography. These methods

allow to acquire a series of image slices, in one or more anatomical planes

(coronal, transversal, and sagittal), through a desired volume of the human body. In

addition, three-dimensional shape information about the vocal tract volume and

associated airways can obtained from these image slices.

The use of both, MRI and CT has several advantages and disadvantages. In the

present dissertation CT was used, therefore, the focus will be on that method. Some

characteristics of CT imaging are: the air-tissue interface is defined more

accurately than MRI, the scanning time required for acquisition of a full image set

is much lower than MRI, and both teeth and bones are precisely captured (Inohara

et al. 2010). The main disadvantage of CT main is the exposure to ionizing

radiation (Story et al. 1998), and for this reason; researchers avoid this technique

for vocal tract analysis.

Imaging techniques have been applied in voice and speech science to explore

normal morphology and development of the vocal tract (Moore 1992; Story et al.

1998; Fitch et al. 1999; Vorperian et al. 2005; Story 2008), to reconstruct three-

27

dimensional vocal tract models (Maeda 1982; Story et al. 1996; Whalen et al. 1999;

Clement et al. 2007; Howard et al. 2009; Inohara et al 2010), to create laryngeal

computer models (Chen 2012), to explore articulatory adjustments of the vocal

tract in speech and voice production (Crary et al. 1996; Story et at. 1998; Demolin

et al. 2003; Narayanan et al. 2004; Takemoto et al. 2006; Vasconcelos et al. 2011;

Miller et al. 2012a; Miller et al. 2012b), for laryngeal imaging (Silverman et al.

1995; Ahmad et al. 2009), comparing vocal tract morphometry in normal and

pathological voices (Yamasaki et al. 2011), voice improvement following surgical

treatment (Galgano et al. 2009), studying vocal tract configuration during singing

(Story et al. 2001; Sundberg et al. 2007; Echternach et al. 2008; Echternach et al.

2010; Echternach et al. 2011), effect of voice therapy and training (Vampola et al.

2011), and in surgical planning (Raappana et al. 2008).

In the present dissertation, CT was used to explore changes of the vocal tract

configuration before and during phonation into different tubes and after the tubes

have been removed. The fact that the air-tissue interface is defined more accurately

in CT than MRI, is the main reason that support the use of CT in instead of MRI in

our study. The exposure to ionizing radiation was carefully controlled by

experimenters not to produce any harmful effect on the participant, who was

informed about potential risks related to CT examination.

CT examination is based in X-ray physical principles. During CT procedure, the

interaction of X-ray photons (transmitted to the patient) produced multiple images.

Two types of photons can be transmitted: 1) primary photons, passing through the

tissue without interacting, or 2) secondary photons, produced from an interaction

with tissues. The secondary photons will in general be deflected from their original

direction and result in scattered radiation (Brenner et al. 2007).

28

1.6 Aerodynamic measurements of phonation

Aerodynamic measurements are a method to perform objective and noninvasive

exploration of vocal function. While there are a number of methods that can be

used to measure these aerodynamic parameters, current clinical and research

instrumentation is typically designed to obtain indirect estimates of average glottal

airflow rates (L/sec) and average Psub (cm H20), along with simultaneous acoustic

measures of F0 and SPL. These two main aerodynamic components (Psub and

transglottal airflow) reveal indirect information about the underlying valving

activity of the larynx (Stemple 2000).

Flow is a term used to describe the movement of a quantity (volume) of gas

through a given area in a unit of time. As such, the rate of flow is also referred to as

the volume velocity, and it can be quantified in liters or milliliters per second or per

minute (Stemple 2000). Based on this definition, the glottal airflow rate will be the

velocity of the air passing through the glottis during phonation (Baken et al. 2000).

Airflow through the larynx is usually measured with a pneumotachometer (air rate

meter). Airflow through the larynx can be estimated at the airway opening because

mass airflow through the larynx is comparable with that at the airway opening.

Thus airflow is an indicator of relative openness of the larynx and the extent to

which it allows air to pass between the trachea and pharynx (Hixon et al. 2008).

Psub is defined as the air pressure below the glottis and it represents the energy

available for creation of the acoustic signal of speech. It should be of appropriate

magnitude and well-regulated for an efficient voice production. Inappropriate

levels of Psub or inadequate pressure regulation can cause abnormal voice intensity

or sudden shifts in F0, SLP or voice quality. Measurement of Psub is a useful tool

for diagnosis and therapeutic procedures in the voice field (Baken et al. 2000).

Several methods to estimate Psub have been developed. However, the most used

method is the estimation of Psub from the Poral during production of voiceless

29

plosives. This method is not invasive. When the vocal folds are abducted and the

oral airway closed for a sufficiently long period during the production of voiceless

plosive, the Poral reaches its maximum. The magnitude of the Poral peak during

plosive production can be considered, therefore, to be a good estimation of Psub.

The typical protocol requires the subject to produce several repetitions of the

syllable /pa/ at a certain rate. Poral is captured by a plastic tube and pressure

transducer. Habitual speaking pitch and loudness are used, and syllables should

have similar stress. Under these conditions, the peaks of Poral are considered to be

equal to the Psub (Baken et al. 2000). Another pressure measure can be calculated

when Psub and Poral are known: the Ptrans, which is defined as the difference

between Psub and Poral. A positive Ptrans is always needed to produce airflow

from the lungs to the lips. This airflow, in turn, will be capable to produce vocal

folds oscillation.

1.7 Electroglottography

EGG is a noninvasive method to assess laryngeal movement, and more specifically,

the vibratory behavior of the vocal folds during voice production. It was first

introduced by Fabre in 1957 (Fabre. 1957). To obtain the EGG signal, two (or

more) electrodes are placed at either side of the thyroid cartilage at the level of the

vocal folds, and a low amperage frequency-modulated (0.3–5MHz) current is

passed between them. Variations in the electrical conductance of the larynx are

produced by variations in the contact area of vocal folds during the glottal cycle.

This results in fluctuations in the current between the two electrodes (Baken et al.

2012). These variations of the electrical admittance are reflected in the EGG

waveform (Baken et al. 2000; Kitzing et al. 2000). Reduction of the glottal space

causes an increased overall conductance between the electrodes. Therefore, the

30

conductance increases when vocal folds make contact, which is interpreted as

glottal closing. Conversely, a decrease in the conductance is interpreted as glottal

opening (Titze. 1990).

Different phases in standard EGG signal are:

• An upward portion with a steep positive slope, due to a fast contacting phase of

the vocal folds beginning in the lower margin up to the upper margin.

• A segment around the maximum peak, which represents the portion of maximal

contact of the vocal folds.

• A less steep downward-sloping portion corresponds to the gradual loss of contact

between the vocal folds. This segment may exhibit a “knee.”

• An open phase represented by a segment of the signal where there are several

oscillations of small amplitude. (Titze, 1990).

Several quantitative parameters can be obtained from the EGG signal, such as

CQ, open quotient (OQ), and F0, CQ being the most used in clinic and researches.

This parameter is defined as the ratio of the duration of the contact phase to the

entire glottal cycle period (Rothenberg et al. 1988; Titze 1990). Earlier

investigations support the possible association between EGG CQ and the degree of

vocal fold impact stress (collision during vocal fold vibration). Verdolini et al.

(1998) conducted a study to observe the possible use of the EGG CQ as a

noninvasive method to estimate vocal fold impact stress. Authors suggested that

EGG CQ strongly correlates with the degree of glottal impact stress. The study was

conducted on excised canine larynges. When impact stress increases (stronger

collision during vibration), vocal folds also tend to stay together longer and hence

CQ increases (Verdolini et al. 1998). Similar results have been obtained in a

computer modelling study by Horacek et al. (2006). Glottal CQ has also been

reported to distinguish some types of phonation, e.g. resonant voice from pressed

31

voice (Verdolini et al. 1998), breathy, normal and pressed voice (Kankare et al.

2012), and modal register and falsetto (Kitzing, 1982).

1.8 High speed digital imaging

Several visualization techniques of vocal fold vibration have been used in the

diagnosis of laryngeal disorders and assessment of normal voice production. In

order to capture aperiodic vibratory patterns of the vocal folds, High speed digital

imaging (HSDI) has emerged as effective method of visualization (Patel et al.

2008).

HSDI system records images of sustained phonatory vocal fold vibration and it

is designed to capture laryngeal motion at a rate of up to 10,000 frames per second,

which is fast enough to study the actual phonatory vibrations of the vocal folds,

allowing for the observation of aperiodic vibration (Patel. 2008). This system is not

dependent on F0 detection for motion extraction, and initiates recording with the

onset of phonation, therefore overcoming many of the disadvantages of

stroboscopy (Wittenberg et al. 2000; Christopher et al. 2001).

HSDI technique has been applied in different kind of studies such as

comparison with other visualization techniques of vocal fold vibration (Wittenberg

et al. 2000; Christopher et al. 2001; Patel 2008), to investigate vocal fold vibratory

characteristics in voice disorders (Lohscheller et al. 2008; Murugappan et al.

2009), to explore the physical mechanisms of vocal fold vibration during normal

phonation (Doellinger et al, 2006; Bonilha et al, 2008; Doellinger, 2009; Freeman,

2012; Ahmad et al, 2012), comparison between normal and disordered voices

(Bonilha et al. 2008; Inwald 2011), to identify new measures and methods to

evaluate the physiology of vocal fold vibration (Yan, 2005; Deliyski, 2005;

Orlikoff et al. 2009; Köster 1999; Zhang 2010; McDonnell et al, 2011; Krenmayr,

32

2012;), to explore physiology of singing voice (Lindestad et al. 2001; Echternach

2010), and to measure the effects of phonation exercises on voice production

(Laukkanen et al. 2007; Granqvist et al. 2014).

A study designed to investigate the clinical value of HSDI compared to

stroboscopy (Patel 2008), provided evidence that HSDI can be used to reveal

physiologic fluctuations in the vibratory pattern of vocal folds related to some

auditory-perceptual features of moderate to severe dysphonia. The data from this

study confirm that analysis of aperiodic signals can be reliably performed with the

HSDI. Moreover, authors suggested that the vibratory features of amplitude,

mucosal wave, phase symmetry, glottal closure, vocal fold edge, vertical plane

difference, phase closure, tissue pliability, and aperiodicity can be judged from

HSDI to obtain proper interpretations of vocal function.

To obtain precise and objective information from HSDI videos, several

techniques have been developed, the most used of them are Videokymography and

Phonovibrography.

Videokymographic imaging shows the vibrations of the vocal folds in a single

image. The technique of videokymography utilizes a special video camera that,

besides registering the images of the whole vocal folds at standard video rates, can

also scan a single line of the laryngeal image at a high-speed rate of about 8,000

images per second. Many successive images from this line are put together to form

the resulting kymographic image, which shows the vibrations of the selected part of

the vocal folds (Svec et al. 2007). The main limitation of videokymography is that

this method is not capable to show the vibration patterns of the entire length of both

vocal folds. A relatively new quantitative image-based method overcome this

difficulty, the phonovibrography (Lohscheller et al, 2007). Phonovibrography is a

fast, robust, and precise image processing strategy to extract information of the

vocal fold movements during phonation (Wittenberg et al. 2000). The method

detects the vocal fold edges and transfers their movement into a static geometric

33

presentation (phonovibrograms), which can be visually analyzed through various

calculations. Several glottal variables can be derived from high HSDI videos

analyzed with phonovibrography (Echternach et al. 2013).

1.9 Laryngeal endoscopy

Endoscopic visualization methods are the most important and common tools to

explore the laryngeal status. Visualization could be performed using a rigid

endoscope through the oral cavity or a flexible endoscope through the nose. These

methods are used either with a continuous or a stroboscopic light source.

Laryngeal videostroboscopy performed with a rigid endoscope is the most

common method to assess laryngeal structures and to diagnose vocal fold

pathologies in clinic. The principle of videostroboscopy is based on the fact that

very rapid events, such as the phonatory opening and closing of the glottis cannot

be followed by the eye. Talbot’s law states that the human eye is not capable to

perceive more than five frames per second due to the fact each image captured by

eye takes 0.2 seconds to reach the retina. In order to produce a visual illusion of

slow-motion during vocal folds vibration, laryngeal videostroboscopy produces

light flashes that are delivered slightly later in the vocal fold cycle than the one

before. Therefore, the phase difference between the vocal fold cycle and the flash

cycle steadily increases. When these illuminated different points of vocal fold

cycles are joined together, a slow-motion version of sampled phonatory cycles is

obtained. Thus stroboscopic flashes recreate cycles at a rate that is very much

lower than the actual fundamental frequency. This apparent much lower vocal fold

movement is possible to be followed by the eye (Baken et al, 2000).

Even though rigid laryngeal videostroboscopy is the most common method to

assess laryngeal structures and to diagnose vocal fold pathologies in clinic,

34

nasofiberoscopy with continuous light source is a useful method that has some

advantages over the rigid laryngoscopy: 1) it allows to explore not only laryngeal

structures, but also nasal passage and pharynx; 2) it is capable to assess all

laryngeal functions (respiration, swallowing, and phonation both in speaking and

singing); 3) gag reflex is unlikely to be produced during the procedure; and 4) the

patient is able to speak or sing almost normally. The fact that the mouth structures

(tongue, jaw, teeth, etc.) are free during this endoscopic procedure, allows

considering this exploratory method a useful tool in research to observe laryngeal

and pharyngeal activity during regular speaking and singing voice tasks. One of the

disadvantages of nasofiberoscopy may be that the nasal passage cannot be fully

closed. This, in turn, may make the subjects to do some compensatory maneuvers

which they would not otherwise do.

A number of investigations have been carried out thought naso laryngeal

fibroscopy to explore pathological voices (Morrison et al. 1983; Rubin et al. 2006,

Stager et al. 2000; Behrman et al. 2003), singing voice physiology (Yanagisawa et

al. 1989; Pershall et al. 1987; Mayerhoff et al. 2014; Guzman et al. 2013, Guzman

et al. 2015), and speaking voice physiology (Guzman et al. In press).

1.10 Aims of the study

The general aim of the present dissertation was to determine how aerodynamic

variables, phonation, and vocal tract configuration are affected by different

SOVTE.

The first study explored aspects related to the three subsystems in one single

subject by multiple measures. Its aim was to investigate the vocal tract

modifications and also the acoustic, aerodynamic and electroglottographic

characteristics of the voice when comparing vocal exercising with two different

35

vocal tract impedances (during phonation into a glass tube and a stirring straw). We

hypothesized that the glottis and/or the supraglottic behavior should adapt

differently to different impedance of the vocal tract.

The second study concentrated only on glottal function (phonation), and it

aimed to observe the influence of tube phonation into water on vocal fold vibration

by using HSDI and phonovibrography. Two questions were attempted to answer:

1) Is there any influence of tube phonation into water on vocal fold vibration? and

2) Does immersion depth affect vocal fold vibration?

The third study focused on aerodynamic variables and glottal functions.

Specifically, the purpose of this investigation was to determine the effect of

different artificial lengthening of the vocal tract (phonation through a tube with the

free end in water and in air) on air pressure variables and vocal fold adduction.

Research question was: Do subjects with different voice conditions change

differently the air pressure and vocal fold adduction variables during different

artificial lengthening of the vocal tract?

The fourth study aimed to observe and compare the effect of different

commonly used SOVTE on vertical laryngeal position (VLP), anterior-posterior

(A-P) laryngeal compression and pharyngeal width in a group of subjects

diagnosed with hyperfunctional dysphonia. Therefore, this study concentrated only

on modification of vocal tract function (affecting resonance aspects of voice

production).

36

2 MATERIALS AND METHODS

Article I. One single volunteer male subject (thirty-four years old) was assessed

through CT, acoustic analysis, EGG, and aerodynamic measures, during and after

tube phonation. Auditory perceptual analysis was also performed. The subject had

seven years of experience using SOVTE as vocal training and warm-up exercises.

No voice or hearing pathology was reported at the time of the experiment.

CT examination

Before CT examination, the subject was informed about the potential health

hazards. During CT examination, the subject was asked to stay in supine position

inside the CT machine, and he was required to produce the following phonatory

tasks: (1) to sustain vowel [a:], (2) to phonate a sustained vowel-like sound into a

glass tube (27 cm in length and 9 mm inner diameter) for fifteen minutes, and

immediately after that, (3) to produce another sustained vowel [a:]. Subject was

radiated only for few seconds during each phonatory task. Habitual loudness level

and speaking pitch was used during all phonatory tasks. No biomechanical

instruction (e.g. open your throat, close your mouth) were given to the subject, but

he was asked to produce a stable sound with a good closure at the lips and to feel as

strong as possible vibratory sensations on the front part of face and mouth. Fifteen

minutes of vocal rest was produced after sustained vowel [a:] recorded after

37

exercising. Then, similar sequence was produced into a plastic stirring straw (2.5

mm inner diameter and 13.7 cm in length), during fifteen minutes. A sustained

vowel [a:] was produced immediately after straw phonation. Each phonatory task

was performed twice.

All CT samples were obtained using a Toshiba – Aquilion CT machine. Voltage

120 kV, Scan options were used as follows: helical CT, Time of the rotation 0.5 s,

Slice thickness 0.5 mm, total number of slices: 510 during CT procedure. The

subject was required to keep the same body posture in the CT machine during all

examinations. A frame was used to keep the head position stable.

Distance (mm) and area (mm2) measurements were obtained from midsagittal

images. Moreover, cross-sectional areas were calculated from the transversal CT

images for all phonatory tasks. All measurements were carried out using the

software vPACS view, version 6.9.5 (Audioscan, Czech Republic). Distance

measures from midsagittal samples (Figure 1) included 1) vertical length of the

vocal tract (which is indicative of the vertical laryngeal position) measured as the

distance between the lowest point of the odontoid process of the Atlas and the

vocal folds following a vertical line, 2) horizontal length of the vocal tract

measured as the distance between the lowest point of Atlas and the narrowest point

between the lips, 3) lip opening measured as the distance between the lower edge

of the upper lip and the upper edge of the lower lip, 4) jaw opening measured as the

distance between the lowermost edge of the jawbone contour and the anterior end

of the hard palate, 5) tongue dorsum height measured as the distance between the

lowermost edge of the jaw bone and the uppermost point of the tongue dorsum, 6)

oropharynx width measured as the distance between the lowest point of the second

vertebra and the most posterior part of the tongue contour (for ensuring the same

angle we used straight line from the anterior uppermost edge of the jawbone

contour to the anterior lowest point of the second vertebra), 7) velum elevation

38

measured as the distance from the posterior upper edge of the hard palate and the

anterior lowest point of the uvula, 8) hypopharynx width measured as the distance

between the lowest point of the pharynx and the internal edge of the epiglottis

following a line from the anterior uppermost edge of the jawbone contour to the

lower point of the pharynx.

Cross-sectional areas from the same midsagittal images (Figure 2) included: 1)

oral cavity (A1) measured from the lips to the velum (up to the line connecting the

lowermost edge of the jawbone contour and a break of declivity on the velum

surface), 2) the pharyngeal region (A2) measured from the line ending A1 down to

the horizontal line connecting the lower edge of the third vertebra and the

lowermost edge of the jawbone contour, and 3) the epilaryngeal region (A3)

measured from the line ending A2, down to the vocal folds.

Figure 1: Distances (mm) measured in CT midsagittal images: 1) vertical length of the vocal tract, 2) horizontal length of the vocal tract, 3) lip opening, 4) jaw opening, 5) tongue dorsum height, 6) oropharynx width, 7) velum elevation, and 8) hypopharynx width.

39

Cross-sectional areas from the transversal CT samples (Figure 3), included: 1)

the inlet to the lower pharynx (Ap) just above the collar of the epiglottis and 2) the

outlet of the epilaryngeal tube (Ae) just below the collar of the epiglottis where the

epilarynx and sinus piriformes form three separate tubes. The ratio between these

two areas was also calculated (Ap/Ae). Areas from transversal CT images were

chosen taking into account the bending of the vocal tract. A line from backbone to

jawbone through the tip of arytenoids in a parallel way to the bottom line of the CT

image was use to obtain Ae. A line just above to the tip of arytenoids, which was

also parallel to the bottom line, was used to obtain Ap.

Figure 2: Areas (mm2) measured from CT midsagittal images: oral cavity (A1),

pharyngeal region (A2), and epilaryngeal region (A3).

Figure 3: Areas (mm2) measured in CT transversal images: the inlet to the lower pharynx (Ap) and the outlet of the epilaryngeal tube (Ae).

40

All distance and area measures were based on previous studies performed with

MRI. (Echternach et al 2010a; Echternach et al 2010b; Echternach et al 2011;

Laukkanen et al 2010).

Acoustic, electroglottographic and aerodynamic recordings

While recording simultaneously the acoustic, EGG and aerodynamic signals, the

subject was required to produce the following sequence of phonatory tasks:

1) To repeat several times (eight times approximately) the syllable [pa:] and to

produce a sustained vowel [a:] following the last [pa:].

2) To phonate a sustained vowel-like sound into a traditional Finnish glass

tube (27cm in length and 9 mm in inner diameter) during five minutes.

3) To repeat again the syllable [pa:] and sustain vowel [a] following the last

[pa:].

4) Fifteen minutes of vocal rest (complete silence).

5) To repeat several times the syllable [pa:] and a sustained vowel [a:]

following the last [pa:].

6) To phonate a sustained vowel-like sound into a plastic stirring straw of 2.5

mm inner diameter and 13.7 cm in length during five minutes.

7) To repeat again the syllable [pa:] and sustained vowel [a] following the last

[pa:].

The same comfortable speaking pitch and loudness level was produced during

the entire sequence. Pitch was auditorily monitored using an electronic keyboard

by one of the experimenters. Sound pressure level was monitored during the series

of consecutive syllables [pa:] by using a sound level meter (American recorder

41

technologies SPL-8810) (American recorder technologies, Inc. Simi Valley, CA)

placed at a distance of 40 cm from the mouth. Slow time response and a Z

frequency weighting were used. Subject was asked to feel vibratory sensations on

the front part of face and mouth during tube phonation. This was subjectively

controlled by the participant.

Acoustic, EGG and aerodynamic signals were recorded in a sound treated room.

A condenser microphone (Behringer ECM 8000) at a distance of 40 cm from the

mouth was used to record the audio signal. The mouth-to-microphone distance of

40 cm was used to make the recordings comparable with those made for several

years in Speech and Voice Research Laboratory, University of Tampere. The

choice of distance has been found to be best for spectral and also perceptual

analyses, since shorter distances will provide the proximity effect and longer

distances will increase the effect of room acoustics (Leino and Laukkanen). Poral

was recorded as an estimate of Psub during the occlusion of the consonant [p:] in

the syllable [pa:] and during shuttering of the outer end of the tube. A pressure

transducer (PT-25, Glottal Enterprises, Syracuse, NY) connected to a plastic tube

was used to capture Poral. The tube was inserted into the corner of the mouth,

extending a few mm behind the lips, without touching the tongue or any other oral

structure. The manometer MSIF2 (Glottal Enterprises, Syracuse, NY) was used.

A two-channel electroglottograph (Glottal Enterprises, Syracuse, NY) was used

to record EGG signal. A 20 Hz high-pass filtering was also used to exclude slow

variations in the signal amplitude, which could be due to articulatory movements.

Before every capture, electrodes were cleaned with a slightly wet tissue, and a thin

layer of conductive gel was applied (Mingograph electrode cream, Siemens-Elema

AB). The quality of the EGG signal was monitored throughout the recordings with

an oscilloscope.

Calibration of audio signal was performed for further sound pressure level

measurements in dB (Z) using a 440 Hz tone. Calibration of air pressure signal was

42

carried out with a Glottal Enterprises calibrator, model MCU-4 (Syracuse, NY).

Pressure and EGG signals were digitized and recorded simultaneously into two

different channels using the Soundswell Signal Workstation (Hitech Development

AB, Stockholm, Sweden). Sampling rate of 16 kHz was used. Audio signal was

recorded at a sampling rate of 48 KHz with 16 bits quantization using a DAT

recorder (Marantz PMD 671. Marantz, Mahwah, NJ).

From aerodynamic samples, the following variables were studied (Figure 4):

1) Psub (estimated from the maximum peak of the Poral during the occlusion

of the consonant [p:] in the syllable [pa:].

2) Psub during manual shuttering of the outer end of the tube.

3) Poral obtained during [a:] in the syllables and non-shuttered phase of tube

phonation.

4) Ptrans calculated from the difference between Psub and Poral.

A Soundswell Signal Workstation (Hitech Development AB, Stockholm,

Sweden) was used to measure the Poral values.

From EGG samples, mean value of CQEGG was obtained. The software

EFxHist, version 1.5 (Mark Huckvale. University college of London, UK) was

Figure 4: Measurement of Psub and Poral during repetition of syllable [pa:] (Left), and during manual

shuttering of the outer end of the tube (Right).

43

used. Analysis was performed from the middle section of the each EGG sample.

EFxHist software defines the contact phase using a criterion level of 50% from the

peak to peak amplitude of the EGG signal (Figure 5).

Acoustic analysis of vowel samples captured before and after tube/straw

phonation was performed using Praat software (version 5.2, Boersma & Weenink,

2008). FFT spectrum (spectral slice and spectrogram) and LTAS were applied. An

approximation of the formant frequencies from F1 to F5 was obtained from FFT

spectrum analysis. The formant frequencies were measured from the strongest

peaks or in the middle of two adjacent equally strong peaks in spectral slices. To

confirm formant values, wide band spectrograms (bandwidth 260 Hz) were used.

There the formant frequencies were located in the middle of the bands with the

strongest intensity. Calculation of frequency distances between F2-F1, F3-F2, F4-

F3, and F5-F4 was also performed.

Several variables were obtained from LTAS analysis (Figure 6):

1) Singer’s/speaker’s formant energy (energy between 2500-4000 Hz),

Figure 5: Calculation of CQ from the EGG waveform using a 50% peak to peak algorithm.

44

2) Singing power ratio (the difference of energy between the highest peak around

0-2 KHz and the highest peak around 2-4 KHz).

3) Total SPL (energy between 50-6000 Hz).

4) F1 energy (spectral energy between 500-800 Hz). A frequency bandwidth of 25

Hz and Hanning window were used for LTAS analysis.

Additionally, an auditory perceptual assessment to evaluate the voice quality of

the samples was carried out by 4 blinded judges (1 man, 3 women). All of them

were speech-language pathologists with at least eight years of experience in voice

clinic and reported normal hearing. Acoustic samples recorded before and after

tube/straw phonation were played in randomized pairs. Raters were asked to judge

which sample in each pair was produced with a better voice quality or if there was

no difference between them. The evaluation was performed in a sound-treated

room using a high quality Audioengine loudspeaker (Audioengine, Kowloon, Hong

Kong).

Article II. Eight volunteers (five males, three females) participated in the second

study. The age range of the participants was 23-45 years. No voice or hearing

problems were reported at the time of the experiment. Diagnosis of normal larynx

Figure 6: Schema including all spectral measures obtained from each acoustic sample.

45

was confirmed through laryngoscopy. The subjects of the study included four

trained singers, one subject with experience in speech training, and three subjects

who had no voice training.

Two test sequences were registered.

The first sequence included production of the following tasks during HSDI

registration:

1) Sustained and stable vowel [i:].

2) A vowel-like sound into a flexible silicone (lax vox-like) tube (45 cm in

length, 2 cm in inner diameter) whose free end was submerged 5 cm into recipient

filled with water. Tube phonation was performed for five minutes.

3) Sustained and stable vowel [i:].

The second sequence consisted of:

1) Tube phonation with the tube submerged 5 cm into the water (taken from

the previous sequence),

2) Tube phonation with the tube submerged 10 cm into the water

3) Tube phonation with the tube submerged 18 cm into the water.

Therefore, a total of 5 samples were obtained from each subject ([i:] pre tube,

phonation during tube submerged 5 cm in water, [i:] post tube, phonation during

tube submerged 10 cm in water, and phonation during tube submerged 18 cm in

water). All phonatory tasks in both sequences were performed at the same

comfortable pitch and vocal loudness that was chosen by the subjects in the first

46

[i:]. One of the experimenters monitored pitch auditorily during voice production,

and used a grand piano for reference.

HSDI registration during all phonatory tasks was performed via a rigid

endoscope attached to a plastic mouth piece. The flexible tube was also attached to

the same mouth piece. To ensure the depths of immersion, three marks were made

on the water container at 5 cm, 10 cm, and 18 cm (Figure 7). Even though the

subjects phonated for 5 seconds during each phonatory task, only one second (from

the mid portion) of phonation was captured by the high speed camera.

Rigid laryngoscopic examinations were performed by two experienced

laryngologists. All participants stayed in standing position during examinations,

and no topical anesthesia was applied. A HRES-Endocam 5562 system (Fa. Wolf,

Knittlingen, Germany) coupled with a 90o rigid endoscope (Fa. Wolf, Knittlingen,

Germany) was used. The high speed camera allows recording up to 4000 frames

per second with a pixel resolution of 256 x 256. A 300-W xenon light source

(Auto-LP 5131, Richard Wolf, Knittlingen, Germany) was used for ilumination. To

prevent tissue overheating during the recordings, the duration of light exposure was

kept at a minimum.

Figure 7: Experiental set-up during HSDI examination.

47

To extract information of the vocal fold movements during phonation,

Phonovibrography was used (Lohscheller et al. 2007). Thousand frames (250 ms)

from each sample of the high speed material were analyzed using the custom made

Phonovibrogram Software Tool (Glottal Analysis Tools v5, Michael Döllinger and

Denis Dubrovsky, Dept. of Phoniatrics and Pedaudiology, Medical School

Erlangen, Germany). Only the mid and stable portions of each 250 ms sample

(excluding the onset and offset) were analyzed. Segmentation of the glottis area

was performed in a single frame and semi-automatically transferred to the other

frames. Secondly, a phonovibrogram was established by construction of glottal

length axis (Lohscheller et al, 2007). Figure 8 explains how phonovibrogram is

established from a high speed sample.

From all high speed samples, the following glottal variables were obtained.

Amplitude to length ratio (ALR): ratio between vibrational amplitude at the

center of the vocal folds and the length of the vocal folds.

Closed quotient (CQ): ratio between closed phase and the entire glottal

period.

Figure 8 (courtesy of Dr. Michael Doellinger): (A) A single image of an endoscopic high-speed recording is shown; the

orientation is given from patients' view. Visible are the left and right vocal fold, the glottis (dark) and the vocal fold edges. (B)

Generation of the Phonovibrogram: The vocal fold edges are detected and the distance between vocal fold edges and glottis axis

are computed. Then, the left vocal fold edge is rotated by 180° around (P) - the posterior commissure. The distances between the

vocal fold edges and the glottis axis are color encoded: brighter colors represent larger distances from the glottis axis. This is

done for all single frames within a movie, yielding a two-dimensional picture representing the vocal fold vibrations (C). Here, a

PVG is given for a human subject with normal phonation (i.e. periodic over time, left-right symmetric) over three cycles.

http://www.ncbi.nlm.nih.gov/pubmed?term=Lohscheller%20J%5BAuthor%5D&cauthor=true&cauthor_uid=18216742

http://www.ncbi.nlm.nih.gov/pubmed?term=Lohscheller%20J%5BAuthor%5D&cauthor=true&cauthor_uid=18216742

48

Closing quotient (ClQ): ratio between closing phase and entire glottal

period.

Fundamental frequency (F0): number of cycles of vocal fold vibration per

second.

Harmonic to noise ratio (HNR): relation between harmonic energy and

noise energy.

Spectral flatness (SF): spectral declination or spectral tilt (frequency range

from 0 to 2.000 Hz). Flatness is maximal for white noise.

Jitter %: frequency perturbation in percentage.

Shimmer %: amplitude perturbation in percentage.

Only descriptive statistics were calculated for the variables, including median

and interquartile range. We analyzed the trend of values for all parameters and

phonatory tasks without using hypothesis testing to avoid lack of statistical power

due to small sample size. All analyses and graphics were performed using Stata

13.1 (StataCorp, College Station, TX).

Article III. Informed consent was obtained from forty-five volunteer subjects for

this study. All participants were asked to undergo rigid videostroboscopy by two

experienced laryngologists, to confirm medical diagnosis. No topical anesthesia

was used during endoscopic procedure. Twelve subjects had the diagnosis of

healthy larynx and voice and no voice training was reported (6 males, 6 females).

Nine subjects were diagnosed with healthy larynx and voice and voice training was

reported (2 males, 7 females). Fourteen subjects were diagnosed with muscle

tension dysphonia (MTD) (5 males, 9 females), and 10 subjects were diagnosed

with unilateral vocal fold paralysis (paramedian vocal fold position) (2 males, 8

females). The average age was 24 (20-33) years for subjects with healthy voice and

without voice training, 27 (21-37) years for subjects with healthy voice with voice

training, 28 (23-35) years for subject with MTD, and 45 (31-56) years for subject

49

with vocal fold paralysis. Participants from all groups were native speakers of

Spanish.

The repetition of syllable [pa:] at a rate from 2.5 to 4 syllables per second, in

habitual, comfortable speaking pitch and loudness, was recorded from all

participants as the baseline condition. Then, a series of phonation samples on five

semi-occluded vocal tract postures in a random order were produced by all

subjects:

1) Drinking straw with the free end in air (5 mm in inner diameter and 25.8 cm in

length).

2) Stirring straw with the free end in air (2.7 mm in inner diameter and 10.7 cm in

length). 3) Silicon tube (Lax Vox-like) with the free end in air (10 mm in inner

diameter and 55 cm in length).

4) Silicon tube (Lax Vox-like) with the free end submerged 3 cm below water

surface.

5) Silicon tube (Lax Vox-like) with the free end submerged 10 cm below the water

surface.

Three repetitions were performed for each phonatory task (including baseline

condition). Therefore, a total of 810 samples were recorded (45 subjects x 6

postures x 3 repetitions). All the exercises were demonstrated to the participants by

a trained speech-language pathologists before data collection. Participants were

instructed to phonate with ease and to feel vibrations in the anterior face and mouth

during all SOVTE. The use of a comfortable speaking pitch and loudness level was

required during all phonatory tasks. Loudness level was perceptually controlled by

experimenters. Pitch was also monitored by the experimenters using an electronic

keyboard for a reference.

50

All samples were recorded digitally at a sampling rate of 22.1 KHz with 16

bits/sample quantization. Acoustic signal was recorded at a constant microphone-

to-mouth distance of 20 cm, using a condenser microphone AKG CK 77 (AKG

Acoustics, Vienna, Austria) integrated into the Phonatory Aerodynamic System

(PAS). Acoustic samples were captured to obtain F0.

Aerodynamic and EGG devices were connected to a Computerized Speech Lab,

Model 4500 (KayPENTAX, Lincoln Park, NJ), which in turn was connected to a

desktop computer running a Real-Time aerodynamic and EGG analysis software

(KayPENTAX, Model 6600, version 3.4, KayPENTAX, Lincoln Park, NJ). EGG

signal was captured with an electroglottograph (KayPentax, model 6103,

KayPENTAX, Lincoln Park, NJ). EGG signal quality was monitored using a real

time oscillogram incorporated in the EGG software.

A Phonatory Aerodynamic System (PAS; KayPentax, model 4500,

KayPENTAX, Lincoln Park, NJ) was used to collect aerodynamic data. Calibration

of the air pressure was performed before data collection. The Poral was captured

with a pressure transducer connected to a thin flexible plastic tube (13 cm in

length). The tube was inserted into the corner of the mouth, extending a few

millimeters behind the lips. Subjects were instructed not to touch the tube with any

oral structure. To avoid air leakage through the nose, a nose clip was used for all

participants during data collection.

A Real-Time aerodynamic and EGG analysis software (model 6600, version 3.4

KayPENTAX, Lincoln Park, NJ) was used to analyze all samples. From the middle

section of each sample, the most stable part was analyzed.

From EGG, aerodynamic and acoustic samples, the following variables were

obtained:

1) CQ from electroglottographic signal.

2) F0 from acoustic signal.

http://www.kaypentax.com/index.php?option=com_product&controller=product&Itemid=3&cid%255b%255d=63&task=pro_details



51

3) Psub estimated from the maximum peak of the Poral during the occlusion

of the consonant [p:] in the syllable [pa:].

4) Psub estimated during manual shuttering of the outer end of the tube.

5) Poral obtained during vowel [a:].

6) Poral obtained during non-shuttered phase in tube phonation.

7) Ptrans obtained by calculating the difference between Psub and Poral.

8) Peak-to-peak values of Poral from samples where bubbling was produced

(Figure 9).

9) Bubbling frequency from the oral pressure signal.

Criterion level of 25% from the peak to peak amplitude of the EGG signal was

used for CQ analysis.

All statistical analyses were performed using Stata® 13.1 (StataCorp, College

Station, TX), p<0.05 was considered to be statistically significant, and all reported

Figure 9: Modulations in oral pressure caused by bubbling of water. The registration is from one participant while phonating

into a silicon tube submerged 10 cm in water.

52

p values were two-sided. Median and interquartile range (IQR) were used to

describe numerical variables. Kruskal-Wallis test was used to compared phonatory

task and vocal status separately. To observe the joint influence of phonatory task

and vocal status in vocal parameters, a generalized multivariable linear model was

performed. Separate subgroup analysis (Silicon tube submerged 3 and 10 cm into

water) for minimal and maximal Poral was also performed using Wilcoxon test.

Finally, linear correlation analysis using Pearson coefficient for overall correlation

was used.

Study IV. Twenty-one subjects participated in the study (average age of them

was 26 years, with a range of 20-28 years). All participants had the diagnosis of

hyperfunctional dysphonia without any vocal fold lesions and reported at least one

year of voice problems. None of them reported previous voice therapy, voice

training or smoking habit. Laryngeal diagnosis was made through flexible

laryngoscopy by a laryngologist with more than twenty years of experience in

voice clinic.

Nasofiberoscopic procedure

Three laryngoscopic variables were evaluated (Figure 10): 1) VLP defined as the

perceived distance between vocal folds and distal part of flexible endoscope, 2)

pharyngeal width defined as the perceived area surrounded by the pharyngeal

walls, and 3) anterior-posterior (A-P) laryngeal compression defined as the

perceived distance between epiglottis and arytenoid cartilages. Laryngeal and

pharyngeal activity was assessed through flexible laryngeal endoscopy (Olympus

ENF type p4). The flexible laryngoscope was connected to a video camera (Sony

DCX Ls1 Sintek) and a Richard Wolf LP 4200 light source. Digitalization of

analogue samples was performed with a Pinnacle Studio HD 10 software. To allow

a full view of the pharynx and larynx, the flexible endoscope was positioned right

53

below the tip of the uvula. This placement was maintained with the laryngologist’s

finger, which fixed the fiberscope against the alar cartilage of the nose. Since

observation of laryngeal height modifications and other laryngeal/pharyngeal

configurations can be affected by movement of the fiberscope, a steady placement

of the fiberscope is crucial. No topical nasal anesthesia was used during

procedures.

During flexible endoscopic examination, participants produced eight different

SOVTE:

1) Phonation into a long-wide straw (6 mm inner diameter and 20 cm in

length) with the free end in air.

2) Phonation into a long-narrow straw (3 mm inner diameter and 20 cm in


3) Phonation into a short-wide straw (6 mm inner diameter and 10 cm in


4) Phonation into a short-narrow straw (3 mm inner diameter and 10 cm in


5) Phonation into a long-wide straw (6 mm inner diameter and 20 cm in

length) submerged 3 cm below the water surface.

6) Phonation into a long-wide straw submerged 10 cm below the water

surface.

Figure 10: Laryngoscopic images showing how laryngoscopic variables were evaluated. Pharyngeal width

(A), A-P laryngeal compression (B), and vertical laryngeal position (C).

54

7) Phonation using the hand-over-mouth technique.

8) Phonation with lip trill.

Plastic commercial drinking and stirring straws were used for tube phonation

both with the free end in air and into the water. Two repetitions were performed for

each phonatory task.

Each SOVTE was produced for 7 seconds and the entire sequence lasted for

approxi mately fifteen minutes. Regular breathing was asked between tasks in order

to avoid the effect of lung volume on the laryngeal height. Participants were

required to phonate each exercise at three loudness levels: soft, moderate, and loud.

Only moderate loudness was used during tube phonation into the water. Habitual

speaking pitch was used to avoid variation in vertical laryngeal position. An

electronic keyboard was used to give and control the pitch by one of the

experimenters during the entire test sequence.

Three blinded laryngologists with more than 4 years of experience in working

with voice patients assessed the VLP, A-P laryngeal compression, and pharyngeal

width in each video sample. To avoid the possible effect of voice quality on the

judges’ ratings, all audio signals were removed from samples. Judges rated the

three endoscopic variables using a 5-point Likert scale. For VLP (1 = very high, 5

= very low), for A-P laryngeal compression (1 = very open, 5 = very narrow), and

for pharyngeal width (1 = very narrow, 5 = very wide). The three judges

participated in a 1-hour training session in videolaryngoscopy examination, to

standardize the rating parameters and rating scales. Two one hour-sessions were

necessary to assess all video samples by each rater. Fifteen percent of the samples

were randomly repeated in order to assess the intra-rater reliability. Judges were

not aware of these repetitions.

Descriptive statistics were calculated for the variables, including mean and

standard deviation. A multivariate linear regression model was used to obtain an

55

intraclass correlation coeffient (ICC) to assess the judges’ concordance (intra and

inter-rater agreement). Then, another multivariate linear regression model was

performed considering VLP, pharyngeal width, and A-P laryngeal compression as

outcomes, and phonatory tasks and loudness as predictive variables. Simple

correlation analysis between vertical laryngeal position, pharyngeal width, and

anterior-posterior laryngeal constriction was also conducted using Spearman’s rho.

One-way ANOVA analysis for testing differences between phonatory task scores,

was used. The analyses were performed using Stata® 12.1 (StataCorp. 2011.

College Station, TX: StataCorp LP). A p-value < 0.05 was considered statistically

significant.

56

3 RESULTS

Article I. Main findings observed for distance measures from midsagittal images

were: 1) Lower VLP for both glass tube and straw, being more prominent during

the later. This lowered laryngeal position remained after tube and straw phonation.

2) Another important change was detected in the velum position, which rose to seal

the nasopharyngeal port during the tube and straw phonation. The higher velum

position remained after tube and straw phonation. 3) A decrease in oropharynx

width was observed when comparing samples before and after both straw and tube

phonation. 4) Hypopharynx became wider during straw phonation compared to

samples before. 5) A decrease during both tube and straw phonations was found for

Jaw opening.

Main outcomes detected in area measurements were: 1) an increase for Ap area

both during tube and straw compared to vowel phonation before exercise. More

prominent changes were observed during straw. 2) The Ae area showed an

increment during straw phonation. 3) The Ae area decreased after both tube and

straw phonation. More prominent changes were observed after straw. 4) The ratio

of areas Ap/Ae increased for all conditions compared to vowel phonation before

them. 5) Oral cavity area (A1) decreased during both tube and straw phonation. 6)

An increase was observed in pharyngeal region (A2) during both tube and straw

compared to vowel phonation. 7) The epilaryngeal region (A3) became larger

during both glass tube and straw phonation, being the change more prominent

during straw.

57

Results from acoustic analysis revealed the following major changes: 1) There

was an increment of the energy in the speaker’s/singer’s formant cluster region

after straw 2) A decrease was observed in SPR when compared samples before and

after both tube and straw. The difference was greater after straw. 3) Regarding

formant frequencies, the major change was a decrease in F1 after both tube and

straw. 4) The rest of formant frequencies also experimented a decrease; however,

it was smaller than the change found in F1. 5) A formant cluster was observed

between F3 and F4. They were closer to each other after exercising with both tube

and straw. This cluster was located in 2588-2930 Hz and 2603-2966 Hz after tube

and straw respectively.

Auditory perceptual assessment of voice samples before and after exercises

revealed that all samples produced after straw were evaluated by the four judges as

better in voice quality than the samples before. The same degree of agreement was

not found for tube phonation.

Aerodynamic and EGG exploration showed: 1) Psub increased during straw

phonation and this increment remained after exercise. 2) During glass tube, this

variable did not demonstrate a clear change, but it became slightly higher after

tube. 3) Poral also increased during both tube and straw, being the change greater

during straw. 4) Ptrans decreased during both tube and straw. 4) There was a

decrease of EGG CQ during both tube and straw. This change remained after tube

and straw.

Article II. Results of this study are presented in two ways: 1) median and

interquartile ranges for both sequences (Tables I and II). Recall that no hypothesis

testing was used to avoid lack of statistical power due to small sample size. 2)

trends of values for all parameters during both sequences. A trend among

participants was considered when similar changes (compared to vowel phonation)

were observed for the majority (five or more) of the subjects.

58

Parameter During 5cm During 10 cm During 18 cm

F0 222.91 ±59.63 235.29 ±81.41 216.38 ±72.35

HNR 14.38 ±2.82 14.14 ±3.30 12.80 ±1.94

Jitter 3.23 ±1.72 4.09 ±0.89 3.71 ±4.18

Shimmer 0.39 ±0.48 0.39 ±0.16 0.49 ±0.30

SF -17.88 ±2.87 -19.55 ±3.58 -17.95 ±5.08

CQ 0.26 ±0.15 0.25 ±0.11 0.18 ±0.13

ALR-R 0.07 ±0.05 0.07 ±0.02 0.08 ±0.05

ALR-L 0.09 ±0.05 0.10 ±0.05 0.12 ±0.03

For the first sequence (phonation pre, during and post tube submerged 5 cm into

water), the following trends were observed during phonation into tube 5 cm in

water: 1) CQ decreased (5 of 8 participants) compared to baseline condition; 2)

HNR decreased (6 participants), 3) SF decreased (five participants); 4) ALR ratio

(right and left vocal folds) decreased (5 and 6 participants, respectively), 5) F0

increased (6 participants); and 6) Jitter decreased (five participants). These results

Parameter Pre 5cm During 5cm Post 5cm

F0 198.06 ±88.97 222.91 ±59.63 216.38 ±109.54

HNR 15.38 ±2.85 14.38 ±2.82 15.69 ±7.84

Jitter 3.73 ±3.91 3.23 ±1.72 2.19 ±1.83

Shimmer 0.29 ±0.21 0.39 ±0.48 0.45 ±0.49

SF -20.26 ±3.75 -17.88 ±2.87 -20.00 ±1.89

CQ 0.22 ±0.05 0.26 ±0.15 0.23 ±0.13

ALR-R 0.08 ±0.04 0.07 ±0.05 0.07 ±0.03

ALR-L 0.11 ±0.05 0.09 ±0.05 0.09 ±0.04

Table I: Medians and interquartile ranges for glottal area parameters pre, during and after tube

phonation submerged 5 cm in water (sequence 1).

Table II Medians and interquartile ranges for glottal area parameters during tube submerged 5

cm, tube submerged 10 cm, and tube submerged 18 cm (sequence 2).

59

suggest that tube phonation with the free end submerged 5 cm into the water is less

tight, softer, a bit more noisy and yet more stable.

Main trends observed after phonation into tube 5 cm in water were: 1) an

increase in CQ (6 participants), ClQ (5 participants), HNR (5 participants), and F0

(5 participants); and 2) a decrease in jitter (5 participants). The findings suggest

that phonation was tighter, and voice was less noisy and more stable.

For the second sequence (tube submerged 5, 10 cm, and 18 cm into water),

mayor changes were: 1) CQ decreased with tube 5 cm in water (5 participants); 2)

CQ increased with tube 18 cm in water (six participants); 3) ClQ increased during

tube 10 cm (6 participants); 4) HNR decreased for all tube depths (6 and 5

participants, respectively); 5) ALR decreased for all tube depths (6 and 5

participants, respectively); 6) Jitter decreased for tube 5 cm in water (5

participants); 7) Jitter increased for tube 10 cm in water (5 participants); 8) F0

increased for tube 5 cm and 10 cm in water; and 9) SF decreased during tube

submerged 5, 10 and 18 cm below the water surface (five participants for each

condition). These findings suggest that a less tight phonation is produced when

tube is submerged 5 cm in water compared to the rest of depths of immersion. The

results for deeper immersion are more difficult to interpret. Decreased ALR, HNR

and SF seem to imply softer phonation (less collision between the vocal folds),

while increased ClQ and CQ suggest the opposite.

Article III. Significant differences were found for the six phonatory conditions

when compared to each other for all variables (p<0.05), except for F0. Moreover,

an increase in Psub, Poral, Ptrans, and CQ was observed for all SOVTE produced

for all vocal conditions when compared to the baseline. It is important to notice that

most variables behaved in average in the same way regardless of the vocal status of

the participants during phonatory tasks.

A strong positive correlation between Psub and Poral (r=0.82; p < 0.001) was

found when taken all conditions together. When each group was studied separately,

60

similar results were evidenced. Correlation was observed between Psub and Poral

for each group (vocal condition). For normal untrained participants: r=0.86; p <

0.001, for normal trained participants: r=0.89; p < 0.001, for subjects with muscle

tension dysphonia: r=0.79; p < 0.001, and for subjects with vocal folds paralysis:

r=0.82; p < 0.001.

Regarding maximum and minimum Poral during bubbling, results showed that

when tube immersion depth was deeper, significantly higher values were observed

for both maximum and minimum Poral. Nevertheless, no significant differences

were found for the average peak-to-peak amplitude of Poral regarding immersion

depth. No significant differences were observed between voice conditions either.

Article IV. A good intra-rater reliability was observed for each judge. Moreover,

the three blinded judges obtained a high agreement (high inter-rater reliability)

(ICC= 0.79 [0.66-0.87], p<0.0001).

When compared score averages for each laryngoscopic variable by phonatory

task, all SOVTE were found to have a significant and differential effect (p <

0.0001). Thus, VLP, A-P laryngeal compression and pharyngeal width changed

differently throughout the eight SOVTE. All semi-occlussions produced a decrease

in VLP compared to the resting position (Figure 11). Phonation with tube into the

water (10 and 3 cm below the surface) and phonation into a long-narrow tube

produced the three lowest VLP. The same three phonatory tasks caused the widest

pharynx and the narrowest A-P laryngeal compression.

Figure 11: Laryngoscopic images for baseline (A), during phonation into a long-wide straw (6 mm inner diameter and 20 cm in length) with the free end in air (B), and during phonation into a long-wide straw (6 mm inner diameter and 20 cm in length) with the free end submerged 10 cm in water.

61

Correlation analysis showed that VLP significantly correlates with pharyngeal

width (rho=0.578;p<0.0001) and A-P laryngeal constriction

(rho=0.3364;p<0.0001). Moreover, pharyngeal width significantly correlates with

A-P laryngeal constriction (rho=0.18;p=0.001)

62

4 DISCUSSION

Most physiologic voice therapy programs (e.g. Vocal Function Exercises, Resonant

Voice Therapy, and Accent Method) are based on SOVTE. They consider that

physiologic approach should affect simultaneously breathing, glottal and vocal

tract functions during exercising. Therefore, the general aim of the present

dissertation was to determine how the three levels involved in voice production are

affected by different SOVTE. Different exploratory methods, subjects with

different voice conditions, and voice conditions were used to accomplish this

purpose.

4.1 Influence of semioccluded exercises on aerodynamic variables

Earlier investigations (Titze et al 2002; Horáček et al 2014; Radolf et al 2014;

Granqvist et al. 2014) have shown changes in air pressure during SOVTE.

However, these studies have included at the most two subjects and all of them have

been vocally trained.

The third study included in the present dissertation sought to investigate the

effect of phonation into tubes in air and submerged in water on air pressure

variables and vocal fold adduction. This study is the first one including a large

number of subjects (n=40) with four different voice conditions. Regarding air

pressure variables, findings showed an increase in Psub and Poral compared to the

63

baseline condition during all SOVTE. In average, phonation with the silicon tube

below the water surface and into the stirring straw in the air produced the highest

values for Psub. Similar results were found in the single case study (Study I) of the

present dissertation. There was an increase of Psub during straw phonation, and

this increment remained after removing the straw. Poral also evidenced an

increment during both tube and straw. Our results from Study I and III are in line to

previous works regarding air pressure related variables. Recently, Radolf et al.

(2014) in single case study reported that Psub also increased during voice

production into a glass resonance tube submerged 10 cm below the water surface.

The mean Poral also increased in both habitual comfortable and soft phonation, the

increment being greater during the latter. Furthermore, Horáček et al. (2014), using

a physical model of voice production, reported that when flow was kept constant,

subglottic pressure was necessary to raise considerably more for phonation into a

stirring straw in air compared to a resonance tube 10 cm in water.

Findings from earlier studies and ours suggest that the increased Psub could be a

way to compensate the increased Poral, which in turn, is caused by the high degree

of airflow resistance offered by certain semiocclusions (e.g. glass tube into the

water and narrow stirring straw in air). In other words, the increased Psub could

reflect higher vocal effort used to compensate for increased vocal tract loading.

This airflow resistance (vocal loading) varies depending on the diameter and length

of the tube in air (Titze et al. 2002; Laukkanen et al. 2007; Horáček et al. 2014;

Radolf et al. 2014; Amarante et al 2015), and the depth of immersion when tube is

placed into the water (Horáček et al. 2014; Radolf et al. 2014; Amarante et al

2015). The strong positive correlation between Poral and Psub (r=0.8298; p <

0.001) found in the third study of the present dissertation, corroborate the

physiologic interaction between these two aerodynamic variables.

An interesting therapeutic aspect could arise from the effect of SOVTE in air

pressure measures. No matter of the voice condition or voice training, an

64

unconscious and slight abdominal and rib cage activation (movement) is produced

when voice is being produced with some semi-occlusions of the vocal tract. Even

though no objective evidence exists yet, clinical observations have been reported

and patients indicate to feel abdominal and rib cage movements during exercises.

They also point out that abdominal and rib cage activation is greater when tube

is submerged in water compared to tube phonation in air. Possibly, this activation is

an abdominal and intercostal muscle reaction to increase Psub, which is necessary

to overcome the increased Poral. Hence, this muscle activation should be

proportional to the degree of Poral, and airflow resistance. Therefore, SOVTE

could be an effective manner to train breath support (breathing technique whose

aim is to reduce excessive laryngeal muscular effort during phonation) in voice

therapy and training. For example, SOVTE could promote that subjects start

recruiting respiratory muscles more during phonation, instead of just overusing

adductor laryngeal muscles for raising vocal intensity.

Another interesting finding from the third study of this thesis is that subject

groups representing all four voice conditions (normal trained, normal untrained,

functional dysphonia and vocal fold paralysis) behaved quantitatively in a similar

manner regarding air pressure variables. Tube in water and stirring straw in air

presented the highest values in both Psub and Poral for all voice conditions. It

seems that these variables are in general more dependent on the degree of airflow

resistance than vocal status of participants.

Ptrans is obviously a variable expected to be modified during SOVTE. Titze et

al. (2002) stated that when a semi-occlusion, such as tube phonation is produced,

there should be a reduction in the Ptrans (which is considered to be the force

driving vocal fold vibration), unless the subglottic pressure is raised. Results from

the first study of this dissertation (single case study) showed that Ptrans decreased

during both tube and straw. Contrary, in the study III, Ptrans was higher than

baseline condition for all exercises. Outcomes from the latter could imply that

65

although both Poral and Psub increased, they do not change proportionally, i.e.

Psub increases relatively more than Poral. A compensatory adjustment in order to

sustain the phonatory airflow during phonation could be a possible explanation.

Radolf et al. (2014) showed similar findings in for both tube and stirring straw

phonation.

Two main aspects differentiate tube phonation when the free end of a certain

tube is placed in air or into the water: 1) the degree of airflow resistance, and 2) the

presence or absence of water bubbling. Clinical observations have suggested that

bubbles produced during phonation in water may cause a sensation of relaxing

massage on the oral, laryngeal and pharyngeal tissues. In the third study of the

present dissertation, bubbling characteristics were analyzed. Results showed that

the average bubbling frequency was 22 Hz (range 12-32 Hz), regardless of tube

immersion depth and the vocal condition of the subjects. Water bubbling is

reflected in oscillation of Poral (e.g. Enflo et al 2013; Granqvist et al 2014). Earlier

works have reported that the bubbling frequency of pulsating Poral is 10-40 Hz

(Radolf et al. 2014; Horáček et al. 2014; Granqvist et al. 2014). Bubbling

frequency has been previously analyzed from the physical point of view. It has

been stated that it depends on several factors, such as air flow, immersion depth,

and the diameter of the orifice of the tube (Davidson and Amick 1956). These

factors are likely responsible for differences found in frequency of bubbling when

compared our results to previous findings.

No significant differences were found for the amplitude of Poral modulation

(mean difference between Max. and Min. Poral) caused by bubbling when

comparing tube immersion depths of 3 cm and 10 cm in water. These outcomes and

Radolf et al. (2014) results, may suggest that larger modulation of the Poral is not

necessarily caused by higher Psub. Therefore, a possible association between the

degree of tube immersion and the degree of subjective massage-like effect (usually

66

reported by patients for 10 cm compared to 3 cm immersion depths in water) is

unlikely when the flow is kept constant.

4.2 Influence of semioccluded exercises on glottal function

Glottal function is likely the most explored aspect during SOVTE. The second

study of the present thesis analyzed vocal fold oscillatory patterns during phonation

into a tube submerged at different depths into water. In general, it was

demonstrated that vocal fold oscillation is affected by tube phonation. Even though

some trends were observed, results suggest that tube phonation with the free end in

water do not produce clear patterns but the effect depends on how a person reacts

to the increased airflow resistance.

CQ obtained from HSDI samples was one of the dependent variables in the

study III. Findings showed that tube phonation submerged 18 cm into water caused

higher values of CQ than the other conditions and baseline for most subjects. On

the other hand, phonation into 5 cm of showed a decrease in CQ compared to

baseline condition. Comparable results were observed in the third study of the

present dissertation. When compared EGG CQ during tube submerged 3 and 10 cm

into water, the latter demonstrated higher values. Our results are concordant with a

recent study by Guzman et al. (2015) Authors compared EGG CQ in eight different

SOVTE in a large number of subject (n=80). Tube submerged 10 cm in water

demonstrated higher values than tube submerged 3 cm in water for both subjects

diagnosed with dysphonic voices and subjects with normal voice. Similarly, Radolf

et al. (2014) compared EGG CQ between phonation into a glass resonance tube

with the outer end in the air, tube submerged 2 cm in water, tube submerged 10 cm

in water, and phonation into a very thin straw. Outcomes showed that CQ obtained

the highest value with tube submerged 10 cm into water. It seems that the degree of

67

flow resistance, specifically the depth of immersion, played an important role in

glottal adduction. A shallower depth tends to lead to a lower CQ, while deeper

submersion tended to produce a higher the CQ. Possibly, when supraglottic load

increases, a higher Psub and a compensatory glottal adduction are produced, no

matter the vocal status of participants. It is important to notice that not only depth

of immersion affect airflow resistances, but also the inner diameter and length of

tube. Longer and narrower tubes offer more resistance to the airflow than shorter

and wider ones due to frictional losses (Titze et al. 2002, Amarante et al. 2016).

Laukkanen et al. (2007) in a HSDI study found higher CQ for longer tubes

compared with shorter ones. Results from the study by Horáček et al. (2014)

showed how tube diameter also affects glottal function.

If the degree of airflow resistance really influences the degree of vocal fold

adduction, perhaps low flow resistance exercises should be used in subjects with

vocal fold hyperadduction, whereas high flow resistance exercises should be used

in patients with low vocal fold adduction (e.g. vocal fold paralysis or

presbyphonia). This is supported by observations of Sovijärvi (1977; 1989), who

has stated that tube submerged 1-2 cm below the water should be used for patients

with hyperfunctional voice disorders, while a depth of 10-15 cm could be more

appropriate for subjects with vocal fold paralysis.

Although, depth of immersion, inner diameter, and length of tubes could affect

the degree of compensatory glottal adduction during exercises, it is also important

to notice that instructions to the subjects is a relevant aspect that should be take into

account during voice therapy or training as well. Usually, an easy and soft

phonation is required from patients during voice therapy when using SOVTE.

It is interesting to note the possible relationship between CQ and air pressure

measures. In our third study, phonatory tasks showing the highest values of CQ

were tube in water and stirring straw in air. These exercises were also found to

present the highest values of Psub and Poral. It seems that when supraglottic load

68

increased, a higher Psub and a compensatory glottal adduction were produced.

Even though most earlier studies have demonstrated that the higher the depth of

immersion, the higher is the CQ values, a recent HSDI investigation showed an

increased OQ with increased water depth (2-6 cm) (Granqvist et al. 2014). Similar

finding were reported in the article I from the present dissertation (i.e. increased

resistance produced a decreased CQ = increased OQ). This discrepancy could be

due to the fact the tube was submerged in shallower depth than depth used in the

other studies. 5 cm, 10 cm and 18 cm were used in our high speed research.

Perhaps, tube phonation with deeper immersion depths (> 6 cm) may thus require

more adduction of the vocal folds.

Verdolini et al. (1998) suggested that CQEGG strongly correlates with the degree

of glottal impact stress. Hence, considering that during high flow resistance

exercises (e.g. tube into the water or phonation into narrow straw) Psub and CQEGG

increased together, an increased Psub could be associated to a higher degree of

vocal fold impact stress. Earlier modeling studies have explored this possible

association (Jiang and Titze 1994; Horáček et al. 2007). In an excised larynx

experiment, Jiang and Titze (1994) demonstrated that peaks of impact pressure

were positively related to adduction of the vocal folds, elongation, and subglottal

pressure. Maximum impact stress was found at the midpoint of the membranous

vocal folds. Furthermore, Horáček et al. (2007), using an aeroelastic model of

voice production showed that both impact stress and acceleration increased with

Psub.

The association between Psub and degree of impact stress has also an important

clinical application that should be taken into account when choosing the right

SOVTE depending on the perceptual, aerodynamic, electroglottographic and self-

reported characteristic of voice.

One of the most frequently used SOVTE in voice clinic is tube or straw

phonation with the free end in air. In our first study (single case study), CQEGG

69

decreased during both tube and straw. This change remained both after tube and

after straw. However, no consistency has been observed in previous studies when

measure CQEGG during tube phonation with the free end in air. (e.g. Laukkanen

1992; Titze et al. 2002; Gaskill et al. 2010; Gaskill et al. 2012; Guzman et al. 2013;

Guzman et al. 2015). It is possible to hypothesize that perhaps vocal training status

is a determinant variable that should be considered when comparing CQ during

tube phonation and vowel phonation. Nevertheless, results from the third study of

the present dissertation demonstrated no clear trend regarding vocal status of

participants (normal voice without voice training, normal voice with voice training,

functional dysphonia, and vocal folds paralysis). Therefore, it seems that glottal

adduction during tube phonation in air is more dependent on the individual

compensatory adjustments.

ALR was another glottal parameter analyzed in our HSDI study. In most cases,

for all immersion depths, ALR evidenced a decrease during phonation into water.

A decrease in Ptrans could be a possible explanation. Since from the biomechanics

point of view a low ALR is associated to a low vocal fold impact stress, the

possibility of vocal folds phonotrauma is expected to decrease when ALR is low.

In a previous HSDI study, Laukkanen et al. (2007) analyzed vocal fold vibration

using the amplitude to dynamic length (A-DL) ratio during tube phonation. An

increased A-DL ratio was found for the longest tube, which according to authors,

may suggest a raised vocal effort (increased Psub). In our HSDI study, a relatively

high ALR was found for some subjects with tube 18 cm in water.

Another glottal variable related to the degree of vocal fold impact stress is the

ClQ. This parameter is specifically related to the abruptness of vocal fold closure,

which is also expected to influence the degree of vocal fold collision. The lower the

ClQ value, the more abrupt the closure should be. Our results showed two trends.

ClQ increased after 5 cm phonation compared to baseline condition and during

phonation in the tube 10 cm in the water. Therefore, in these two cases the closing

70

phase was relatively longer compared to baseline condition. Laukkanen et al.

(2007) observed that ClQ was lower for longer tubes compared to shorter ones.

Since a long tube and/or narrow tube could be similar to a tube submerged deep

under the water surface regarding the degree of airflow resistance, results from our

study could be considered opposite to the outcomes by Laukkanen et al. (2007).

Since mean values were not analyzed in our study and Laukkanen et al. (2007)

study, individual variations due to different compensation strategies could be a

suitable explanation for this discrepancy between these two previous studies.

Spectral flatness is defined as the spectral declination or spectral tilt. Previous

studies using acoustic analysis of voice have reported that SOVTE produced a less

steep spectral slope after exercising (Laukkanen 1992; Guzman et al. 2012;

Guzman et al. 2013a, Guzman et al. 2013b) suggesting that semi-occlusions

produce an increased spectral energy in the higher part of the spectrum. The second

study of the present dissertation (HSDI investigation) showed no clear trends in

samples obtained after tube phonation. Moreover, a decrease of spectral flatness

during tube phonation for all immersion depths compared to baseline was

observed. This implies a steeper spectral slope. The amplitude of higher harmonics

is particularly sensitive to the speed at which the glottal airflow decreases at the

end of the closing phase (Story et al. 2000). A lower SF seems to imply a smoother

collision between the vocal folds (Fant 1960; Gauffin and Sundberg 1989).

Commonly, when a voice sample has a steeper spectral slope, i.e. less harmonic

energy in the higher harmonics, there is also an increment of noise energy. HNR is

the ratio between harmonic and noise energy. This variable was analyzed based on

image analysis in our HSDI study. This parameter demonstrated an increase after

tube phonation submerged into 5 cm of water compared to pre tube samples. Thus,

more harmonic energy compared to noise was observed. Earlier studies, based on

acoustic signal analysis (Paes et al. 2013; Guzman et al. 2011; Guzman et al. 2012)

71

have also reported a decreased noise energy after SOVTE in subjects diagnosed

with functional dysphonia.

Perturbation measures, specifically Jitter and shimmer have also been assessed

before and after semioccluded vocal tract exercises. In a study performed with

school teachers diagnosed with slight dysphonia, both jitter and shimmer were

found to be significantly lower after vocal exercises with stirring straw (Guzman et

al. 2011). Moreover, a decrease in jitter and shimmer in normal-voiced participants

after vocal warm-up using semiocclusions was reported by Barrichelo-Lindström et

al. (2007). Results from our high speed study showed an interesting trend, jitter

decreased during and after phonation into a tube submerged 5 cm in water for most

participants. These results suggest that phonation into a tube in 5 cm water and

after it has a stabilizing effect on glottal function.

4.3 Influence of semioccluded exercises on vocal tract

Two studies from the present dissertation addressed the influence of SOVTE on

vocal tract configuration. The first study used CT, while the fourth study explored

vocal tract changes through flexible transnasal laryngeal endoscopy. Main findings

from both investigations are in general concordant.

VLP presented one of the major changes observed during semiocclusions. In our

CT study, the subject demonstrated a lower LVP during both phonation into the

traditional Finnish glass tube and stirring straw, the change being more prominent

during the latter. Similar results were found in our flexible laryngoscopy study, all

SOVTE produced a lower VLP compared to the resting position. Phonation with

tube into the water (10 and 3 cm below the surface) and phonation into a long-

narrow tube (a kind of a long stirring straw) produced the three lowest VLP. In

both studies, the lowest VLP was found during exercises with the greatest airflow

72

resistance (narrow straws and tube into water). Two earlier dual-channel

electroglottograph investigations have reported similar results regarding VLP.

Laukkanen et al. (1999) found a lower VLP compared to the resting position during

occlusive consonants (also considered SOVTE). Moreover, Wistbacka et al. (2015)

analyzed VLP during tube phonation with the free end in air and in water. EGG

results showed that VLP was lower during tube in water, while it increased during

phonation with the tube end in air. These findings confirm the fact that exercises

with higher airflow resistance tend to lower the larynx.

Although, several studies have reported a lower VLP during SOVTE compared

to resting position, the opposite effect of tube phonation on this dependent variable

has also been demonstrated (Laukkanen et al. 1996; Laukkanen et al. 2012;

Vampola et al.) Furthermore, two magnetic resonance imaging studies reported no

changes on the VLP during phonation into a resonance tube and during voiced

plosive consonants (Laukkanen 2010; Laukkanen, 2012). Additionally, Guzman et

al. (In press), in a recent CT research performed in subjects with voice disorders,

assessed VLP among other vocal tract features during phonation in commercial

drinking and stirring straws. Even though larynx tended to be lower during both

straws (the change being more prominent for the stirring straw), no significant

differences were found compared to vowel phonation.

As it comes to different results concerning VLP, it is worth mentioning some

possible underlying factors like individual differences. Differences in the subjects

background education (e.g. singing vs speaking training) and also the exploratory

method used (e.g. dual-channel EGG and imaging techniques) could be reasonable

explanations for this discrepancy. Regarding exploratory methods, Laukkanen et al.

(1999) in a study aimed to investigate the behavior of multi-channel EGG in VLP

using videofluorography as a reference method, reported that results from EGG

may be affected by e.g. anterior-posterior movements of the larynx, and even

73

opposite results can be obtained with EGG than what really happens based on

videofluoroscopic findings.

It is commonly agree in voice clinic that a low VLP is desirable during voice

exercises, since it has been linked to a heathy and relaxed way to produce voice.

Therefore, semi-occlusions and lengthening of the vocal tract might have an

important therapeutic effect if they really produce a laryngeal lowering. In Finland,

a lowering of larynx is an important goal during tube phonation in water (water

resistance therapy). Sovijärvi et al. (1989) stated that the positive outcomes of the

resonance tube method are due to the efficient lowering of the larynx during

exercising. It is important to notice that a lowered larynx during speaking is not a

goal since it would cause an unnatural sound.

The lower VLP found in the first study of the present dissertation (CT study), is

concordant to acoustic results obtained from FFT analysis. A decrease in the

frequency of the first five formants was observed. In a previous study performed

with MRI, similar results were observed in a male subject who lowered the larynx

after vocal warm-up with spoken exercises (Laukkanen 2012). The acoustic theory

of speech states that the formant frequencies depend on the length of the vocal tract

and the cross-sectional shape of the vocal tract as a function of its length (Fant

1970; Kent 1993). Vocal tract length determines the average location of formant

frequencies. In this regard, as length increases, the value of the formant frequency

will decrease. Since a lower larynx increases the vocal tract length, all formant

frequencies are expected to be lower. Lowering of formant frequencies, especially

the first formant, could contribute to a greater source-filter interaction determined

by the approximation of F1 to F0. Story et al. (2000) stated that phonating at a

frequency at or near the first formant may allow for an efficient voice production

that could possibly be associated with lower effort.

Originally, phonation in narrow tube with the free end in air or a resonance tube

in water was recommended in clinic for patients with hypernasality since such

74

exercises were supposed to rise the velum (Gundermann 1977). Currently, velum

position is another vocal tract feature being explored during SOVTE. Three

previous studies have demonstrated that velum rose to seal the nasopharyngeal port

during the tube phonation and remained elevated after exercising (Laukkanen et al.

2010; Vampola et al. 2011; Laukkanen et al. 2012). Our CT study showed exactly

the same changes during and after phonation into both glass tube and stirring straw.

In another CT study, Guzman et al (in press), demonstrated similar results in

subjects with functional dysphonia during voice production in drinking and stirring

straws. However changes were significant only during stirring straw, possible due

to a higher degree of flow resistance.

Energy transfer is expected to increase with a proper velar closure by decreasing

the damping caused by the nasal tract. This, in turn, should produce an increase in

the total SPL (Laukkanen et al, 2010). However, acoustic data from the first study

of the present dissertation did not demonstrate a greater total SPL after tube/straw

phonation. Since the participant was asked to keep the same loudness level during

all phonatory tasks, the lack of change in total SPL could be expected. The spectral

energy of F1 region did not change during the sequence neither. Since energy of F1

supports most of total SPL, this is also expected.

An open and relaxed throat is another common goal of voice therapists and

vocal coaches, in patients with excessive muscle tension and speech and singing

training. Hence, a wide pharyngeal area during voice exercises should be desirable

in voice therapy. Increment of pharyngeal area during all SOVTE was another

change observed in the fourth study from the present dissertation. Phonation with

tube into the water (10 and 3 cm below the surface) and phonation into a long-

narrow tube produced the most prominent changes. Earlier studies have

demonstrated similar outcomes on pharyngeal configuration (Laukkanen et al.

2010; Vampola et al. 2011; Laukkanen 2012). Moreover, in our CT study, several

changes were observed during both glass resonance tube and narrow straw

75

phonation regarding pharyngeal configuration. The lower pharynx area, the middle

pharyngeal region, and anterior-posterior width of the hypopharynx increased

during exercising compared to vowel phonation prior to the exercises. All of these

changes were larger during stirring straw than glass tube phonation. Vampola et al.

(2011) in a CT investigation showed that the most dominant change in the vocal

tract during phonation into a glass tube was caused by expansion of the cross-

sectional area of the oropharynx. In another CT study performed in patients with

voice disorders, a significant increment was found in hypopahrynx width during

two types of plastic straws (drinking and stirring straws). Increase of the inlet to the

lower pharynx and the epilaryngeal region was also observed during both straws,

but no significant differences were reported (Guzman et al. In press). Additionally,

correlations were found between inlet to the lower pharynx and epilaryngeal region

for both type of straws, hypopharyngeal width and inlet to the lower pharynx for

both straws, and hypopharyngeal width and epilaryngeal region for drinking straw

(Guzman et al. in press).

The ratio between the inlet to the lower pharynx (Ap) and the outlet of the

epilaryngeal tube (Ae) was also explored in the first study of the present

dissertation (Study I). Sundberg (1974) stated that the epilaryngeal tube should act

as a separate resonator (i.e. acoustically unlinked from the rest of the vocal tract)

when the cross-sectional area in the pharynx (Ap) is at least six times wider than

that of the laryngeal tube opening (Ae). When this relation is produced, the singer’s

formant cluster (a prominent spectrum envelope peak near 3 kHz associated with

the "ringing" voice quality) is likely to be produced. Acoustically, singing formant

is produce by a clustering of third, fourth and fifth formants. Findings from our

study I, demonstrated a clear tendency that the ratio Ap/Ae increases during and

after both tube and straw phonation. The greatest Ap/Ae ratio was observed after

straw phonation (5.5) which is close to the value suggested by Sundberg (1974)

(6.0). Related to this, results from the study IV (flexible laryngoscopy) showed an

76

aryepiglottic narrowing (A-P laryngeal compression) during all SOVTE. This

modification was more prominent during phonation with a tube submerged under

the water (3 and 10 cm) and during phonation with a long-narrow tube, as occurred

for the VLP and pharyngeal width. Similar findings have been obtained by

Peltokoski et al. (2016) in a nasoendoscopy study. Epilaryngeal region narrowed

more when the resonance tube was in water compared to resonance tube in air. In

addition, two interesting correlations were found in the study IV of the present

dissertation: 1) between pharyngeal width and A-P laryngeal compression, which

means that when the pharynx widened, there was also a narrowing of the

aryepiglottic sphincter. 2) Between VLP and A-P laryngeal compression, which

indicates that when the larynx is lower, there was also a narrowing of the

aryepiglottic sphincter. Regarding the latter correlation, Sundberg (1994) has

suggested that a low VLP is a way to obtain an aryepiglottic narrowing.

Results from acoustic analysis in the first study of this dissertation (CT study)

showed three interesting findings that could be related to results found regarding

Ap/Ae ratio: 1) after straw phonation, the largest increase in the energy of the

speaker’s/singer’s formant cluster region was found, 2) the singing power ratio

(SPR) demonstrated the largest decrease after straw phonation, which implies a less

steep spectral slope between the low harmonics and the harmonics located near to 3

kHz (singing formant region), 3) there was a clear formant cluster of F3 and F4

after exercising with both a glass tube and a straw, which was located in 2500-3000

Hz for vowel [a:] after exercising. In line with our acoustic findings, two earlier

case studies have also showed an increment of the spectral energy in the

singer’s/speaker’s formant region (Laukkanen et al. 2010; Laukkenen et al. 2012)

after tube phonation. Additionally, two studies whose aim was to compare the

effect on spectral energy distribution of SOVTE and open vowel exercising,

showed an increased spectral energy in the higher part of the spectrum (2000-5000

Hz) after SOVTE compared to voice spectrum after vocalizations with open vocal

77

tract (Guzman et al. 2013a; Guzman et al. 2013b). Considering formant clustering

after exercising with SOVTE, a significant decrease in the formant frequency

distance after straw phonation was also found by Laukkanen et al (2010). Such

formant clustering may have contributed to the changes mentioned above (lower

SPR, and higher SPL between 2500-4000 Hz).

These acoustic data suggest a warm-up effect on voice spectrum after straw

phonation, which could contribute to a more economic voice production. Vocal

economy is defined as the ratio between voice output (decibels) and intraglottal

impact stress (Kilopascal) (Verdolini et al. 1998; Berry et al. 2001). Since an

increase of acoustic energy in the singer’s/speaker’s formant cluster region

contribute to the perceived vocal loudness level, a lower SPR and a higher SPL

between 2500-4000 Hz may help to project vocal sound without increasing

laryngeal effort (recall that CQ decreased after straw phonation).

Another possible explanation of the energy increment observed in the singer’s

formant region in our first study could be an increased aerodynamic-acoustic

interaction of the vocal tract. This increment may be explained by an increased

input impedance of the vocal tract due to the higher Ap/Ae ratio. This increment

may be explained by an increased input impedance of the vocal tract due to the

higher Ap/Ae ratio. Increased input impedance (Especially positive reactance at the

fundamental frequency region) produces a faster cessation of the glottal flow and

consequently a less tilting spectrum (Rothenberg 1986; Fant 1987, Story et al.

2000). According to the results by Vampola et al. (2011) the changes found after

tube phonation resulted in an increase of total SPL, attributable to an increased

reactance of the vocal tract after the exercise, which in turn was caused by an

increment of Ap/Ae ratio. Specifically, the subject increased the vocal tract input

reactance and make the vocal tract more inertive by expanding mainly the

oropharynx rather than by reducing the epilaryngeal tube.

78

Moreover, an increased subglottic pressure after straw phonation could also

explain a less steep spectral tilt and a higher SPL in the singer’s/speaker’s formant

region. Nordenberg and Sundberg (2004) demonstrated that intensity is not linearly

correlated to the spectral contour. Energy of all frequencies of the spectrum does

not increase proportionally when there is an increase in total sound level. The gain

in dB in the region of lower frequencies is lower than in region of high frequencies

when total sound pressure level is increased (White 1998, Nordenberg et al. 2004).

Vocal tract changes observed in studies I and IV such as wider pharynx, lower

larynx and raised velum should be expected to increase total vocal tract volume. In

fact, two previous works have reported this modification during tube phonation.

Vampola et al. (2011) revealed that the total volume increased 38.5% after

phonation into the tube when compared to vowel phonation before tube. According

to the authors the increase in volume was mostly due to transversal expansion of

the vocal tract. Similar outcomes were found by Guzman et al. (In press). The total

volume of the vocal tract was larger during both drinking straw and stirring straw

compared to baseline, but it was significantly different only during use of the

stirring straw. The main cause of this increase during stirring straw phonation is

likely the same reason as explained by Vampola et al. (2011) study, an increase in

transversal dimension of the vocal tract. Data by Guzman et al. (In press) showed a

strong correlation of vocal tract total volume with the lower pharynx (A3) and the

inlet to the lower pharynx (Ap) during straw exercises, suggesting an association

between volume of the vocal tract and cross sectional areas. Even though changes

in vertical length of vocal tract (i.e. lowering of the larynx and rising of the velum)

may contribute to a larger total vocal tract volume during tube phonation, it seems

that the main cause are the changes observed in cross sectional areas.

Increased total volume caused by all vocal tract changes described above is

possibly due to the increased Poral during tube phonation. The increment on this

parameter during SOVTE may have mechanically pushed the larynx down, the

79

pharyngeal walls laterally, and velum up. However, it is also possible that

participants consciously closed the velum in order to get all the flow through the

tube. Therefore, velum closure could also be made by active muscle functioning

and not necessarily something that happens mechanically. Moreover, similar

laryngeal muscle activation could cause the laryngeal lowering during tube

phonation.

The first option mention above (increased volume is cause by increment in

Poral) is supported by earlier studies showing that during occlusion at the lip and or

artificial lengthening of the vocal tract, Poral raised compared to open vowel

production (Titze et al. 2001; Amarante et al. 2016; Radolf et al. 2014; Maxfield et

al. 2015). A recent investigation aimed to assess the static back pressure (Poral)

and airflow for different tubes commonly used for voice therapy, reported that

changes in the diameter of straws affect Poral considerably more compared with

the same amount of relative change in length (Amarante et al. 2016). Furthermore,

Maxfield et al. (2015) created a rank ordering by measuring the intraoral pressure

produced by thirteen SOVTE commonly used in voice therapy. The highest values

of Poral were found in straw submerged in water, raspberries, and stirring straw

with the free end in air. These findings may explain the vocal tract differences

found in our studies I and IV when compared different type of SOVTE. In the CT

study, both Finnish glass tube and the thin stirring straw produce the same vocal

tract changes, but all modifications were more prominent during the latter (it is

narrower than the glass tube). In the flexible laryngeal endoscopy study, similar

results were reported. Although, all exercises produced a wider pharynx, and lower

larynx, more prominent changes were found during tube submerged into water (10

and 3 cm) and during narrow-long tube (type of stirring long straw). In addition,

an interesting finding from our study IV may be linked to the impact of Poral on

vocal tract changes. All dependent variables demonstrated the greatest degree of

change when phonating with loud voice (Recall that three loudness levels were

80

requires to participants for all exercises). This result could be related to the higher

degree of Psub produced during artificial lengthening and occlusions of the vocal

tract when vocal intensity is higher. It seems that louder phonation caused by a

higher Psub is a factor to modify vocal tract shape. In this regard, loudness level

has also demonstrated in earlier works to be a factor influencing the degree of

vocal tract modifications in professional voice users (Yanagisawa et al. 1989;

Mayerhoff et al. 2014; Guzman et al. 2013, Guzman et al. in press). Data by

Mayerhoff et al. (2014) showed that both the degree of medial and A-P

compression were greater during loud phonation. These results were observed

during sustained vowel and connected singing productions by classically trained

singers. Guzman et al. (2013) found similar results. Authors reported that high

loudness produced the highest degree of laryngeal A-P compression, laryngeal

medial compression, and pharyngeal compression. Yanagisawa et al. (1989) found

greater A-P laryngeal compression when subjects performed loud phonation and

during the three loudest voice qualities: belting, twang, and opera.

4.4 Possible sources of error

Some possible sources of error from the studies included in the present dissertation

could be mentioned. The first study (CT investigation) only included one single

subject. Moreover, this subject was vocally trained and with several years of

experience using tubes and other SOVTE. However, most CT, acoustic and air

pressures measures findings are in line to earlier research and recent investigations

as well.

Due to small sample size, in the HSDI study (Study II) we only were able to

analyze the trend of values for all parameters and phonatory tasks without using

hypothesis testing to avoid lack of statistical power. Moreover individual variations

81

caused some difficulties in the interpretation of results. Additionally, in this study

rigid laryngeal endoscopy was used. This could have affected voice production

since tongue position during this exploratory method is not the natural position

during regular speaking. Further research should include a larger sample size,

statistical analysis and use flexible laryngeal endoscopy.

Even though our third study (air pressure and contact quotient measures study)

included four groups of subjects representing four different voice conditions, it did

not consider participants diagnosed with phonotraumatic organic dysphonia (e.g.

vocal nodules, polyps, cysts, etc.). Likely, organic phonotraumatic lesions could

cause different type of aerodynamic and glottal compensations during tube

phonation.

Additionally, this study did not include glottal airflow measures. This variable is

important since it is related to both glottal resistance and subglottic pressure.

One important aspect to be considered in the fourth study of the present

dissertation if that non-objective measures were used. Flexible laryngeal endoscopy

allows a full view of pharynx and larynx during examination, however, assessment

is dependent on the rater’s experience. Moreover, the distance between the distal

part of the fiber and laryngeal/pharyngeal structures could have been modified

simply by the movement of the experimenter’s hand. To decrease the possible

effect of these two sources of errors, we included otolaryngologists with more than

4 years of experience in voice disorders as judges. All of them had a laryngology

fellow certification. Furthermore, inter and intra-rater reliability analysis was

included. Moreover, the placement of flexible endoscope was fixed by securing the

fiberscope against the alar cartilage of the nose with the laryngologist’s finger.

82

4.5 Future research

Even though several studies have been performed regarding semioccluded vocal

tract exercises, there are still some issues that need to be addressed in further

investigations. One important aspect that should be considered is the possible effect

of semiocclusions on breathing function. To date an important number of studies

have reported changes on air pressure measures (e.g. subglottic pressure) that

somehow reflect some aspects of breathing function. Nevertheless, variables such

as degree of thoracic and abdominal excursions during and after SOVTE,

respiratory volumes, and respiratory muscle activity should be investigated.

Additionally, most studies on tube phonation have explored the effect during or

immediately after exercises. However, to the best of our knowledge only three

investigations have sought to know the efficacy of tube phonation protocols in

subjects with voice disorders after several sessions of voice therapy. All of them

have included a small sample size. In a recent randomized controlled trial study,

Kapsner-Smith et al. (2015) found significant improvement in Voice Handicap

Index scores compared to the control condition (no-treatment group) after a

therapeutic program based on flow-resistant tube exercises. Moreover, significant

improvement in roughness (from the CAPE-V scale) was found relative to control

group. Additionally, two longitudinal studies have been designed using phonation

into tubes submerged in water (water resistance therapy) in subjects with

behavioral dysphonia, (Tapani 1992; Simberg et al, 2006) and one study was

conducted to explore the effect of drinking straw phonation in air and the bilabial

consonant /ß:/ in a group of acting students diagnosed with muscle tension

dysphonia (Guzman et al. 2012). More studies including a larger sample size with

in different diagnosis and populations should be conducted in future research.

83

5 CONCLUSIONS

1. Overall, vocal exercises including semi-occlusions and artificial lengthening of

the vocal tract have a simultaneous effect on phonation, vocal tract configuration

and aerodynamic variables.

2. Most changes suggest that SOVTE are good tools for subjects with voice

disorders and professional voice users. These changes seem to lead to a more

economic voice production.

3. Subjects with different voice conditions (e.g. with voice disorders, healthy non-

vocally trained, and healthy trained subjects) behave quantitatively and

qualitatively in a similar manner under similar exercise’s conditions (i.e. flow

resistance exercises).

4. The degree of flow resistance should be considered when choosing treatment

exercises for patients with different types of diagnoses (e.g. hyperfunctional or

hypofuntional dysphonia). High degree of flow resistance exercises (e.g. stirring

straws and deep immersion when tube is submerged into the water) could be more

beneficial for patients with vocal fold paralysis or presbyphonia. On the other hand,

glottal function of patients with hyperaduction or vocal fatigue could be improved

using exercises with low degree of flow resistance (e.g. traditional glass tube and

shallow depths of immersion).

84

5. Aerodynamic aspects are mainly affected during SOVTE by increasing air

pressure measures such as Poral and Psub, the latter being a compensation of the

former. An increase in Poral, in turn, is caused by the airflow resistance offered by

semi-occlusions.

6. Influence of SOVTE in phonation is mainly related to the degree of vocal fold

collision and glottal adduction during exercising. This is mostly determined by the

degree of resistance to the airflow that each exercise offers. High resistance seems

to promote a greater vocal fold collision compared to a lower resistance exercise.

Therefore, higher resistance exercises could be more beneficial for patients with

diagnosis such as vocal fold paralysis or presbyphonia. On the other hand, glottal

function of patients with hyperaduction or vocal fatigue could be improved using

exercises involving lower airflow resistance.

7. Important therapeutic and training goals can be reached during SOVTE

regarding vocal tract configuration. A lower larynx, a wider pharynx and a higher

velum were produced during this type of exercises compared to vowel phonation,

the change being more prominent when the airflow resistance is higher. These

modifications are linked to positive changes in acoustic and perceptual output of

voice.

85

6 ACKNOWLEDGMENTS

First of all, I would like to deeply thank Dr. Anne-Maria Laukkanen, my PhD

advisor professor. Dr. Laukkanen helped me plan this thesis, revise its proposal,

and monitor the whole process. Her continuous help and support were invaluable to

this thesis and my career as a voice scientist. Her outstanding knowledge,

generosity, and patient were very important to carried out all studies and writing

process. She also motivated me to continue going forward in the science of voice

field.

I am also extremely grateful to Dr. Matthias Echternach. Thanks him, his voice

lab team, and his dedicated work and knowledge, the HSDI study was possible.

My sincere thanks go also to all scientists that collaborated in all steps in the CT

study. Thanks to Dr. Jan Švec, Dr. Jaromir Horáček, and Dr. Petr Krupa.

Special thanks to my friend and colleague Dr. Ahmed Geneid. We share so

many experiences together during the CT and HSDI studies in Czech Republic and

Germany respectively. He is really a so talented professional.

I also express my sincere gratitude to Dr. Bernhard Richter and Dr. Louisa

Traser for all their collaboration during the HSDI study in Germany.

I would like to also thank all the Chilean co-authors I had the pleasure to work

with. Without their work, this thesis would not have been possible. Thanks for their

support, understanding and cooperation. Many thanks Dr. Cristian Olavarria, Dr.

Miguel Leiva, Cristian Castro, Sofia Madrid, Dr. Daniel Muñoz, Dr. Alba Testart,

and Elizabeth Jaramillo.

86

7 FINANCIAL SUPPORT

The study I was supported by the Academy of Finland, grant number 128095. In

the Czech Republic, it was supported by the project GACR No. P101/12/1306 and

the European Social Fund project no. CZ.1.07/2.3.00/20.0057.

The study II was supported by the Deutsche Forschungsgemeinschaft, grant Ec

409/1-1 and Ri 1050/4-1 in Germany. In Finland, it was supported by the Finnish

Academy of Sciences (grant number 1128095).

Studies III and IV received no financial support.

87

9 REFERENCES

Ahmad M, Dargaud J, Morin A, Cotton F. Dynamic MRI of Larynx and Vocal

Fold Vibrations in Normal Phonation. Journal of Voice. 2009: 23, 235-239.

Ahmad K, Yan Y, and Bless D. Vocal Fold Vibratory Characteristics in Normal

Female Speakers From High-Speed Digital Imaging. Journal of Voice. 2012: 26,

239-253.

Amarante Andrade P, Wistbacka G, Larsson H, Södersten M, Hammarberg B,

Simberg S, Švec JG, Granqvist S. The Flow and Pressure Relationships in

Different Tubes Commonly Used for Semi-occluded Vocal Tract Exercises.

Journal of Voice. 2016: 30, 36-41.

Aronson AE. Clinical Voice Disorders: An interdisciplinary approach. 3er Ed.

Thieme. New York, NY. 1990.

Baken RJ and Orlikoff RF. Clinical Measurements of Speech and Voice. Thomson

Delmar Leraning. Clifton Park, NY. 2000.

Barrichelo-Lindström V, & Behlau M. Perceptual Identification and Acoustic

Measures of the Resonant Voice Based on “Lessac’s Y-Buzz”- A Preliminary

Study With Actors. Journal of Voice. 2007: 21, 46-53.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Amarante%20Andrade%20P%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=Wistbacka%20G%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=Larsson%20H%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=S%C3%B6dersten%20M%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=Hammarberg%20B%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=Simberg%20S%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=%C5%A0vec%20JG%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

http://www.ncbi.nlm.nih.gov/pubmed/?term=Granqvist%20S%5BAuthor%5D&cauthor=true&cauthor_uid=25873546

88

Bassiouny S. Efficacy of the accent method of voice therapy. Folia Phoniatrica et

Logopedica. 1998: 50, 146-164.

Behrman A, Dahl L, Abramson A, Schutte H. Anterior-Posterior and Medial

Compression of the Supraglottis: Signs of Nonorganic Dysphonia or Normal

Postures? Journal of Voice 2003: 17, 403-410.

Berry DA, Verdolini K, Montequin DW, Hess MM, Chan RW, Titze IR. A

quantitative output-cost ratio in voice production. Journal of Speech Language and

Hearing Research. 2001: 44, 29-37.

Bickley CA, Stevens KN. Effects of a vocal tract constriction on the glottal source:

data from voiced consonants. In: Baer T, Sasaki C, Harris K (Eds). Laryngeal

Function in Phonation and Respiration. College-Hill Press, Little, Brown and

Company. Boston, Mass. 1987, 239-254.

Bonilha HS and Deliyski DD. Period and Glottal Width Irregularities in Vocally

Normal Speakers. Journal of Voice. 2008: 22, 699-708.

Boone DR. The Voice and Voice Therapy. Prentice Hall. Englewood Cliffs, NJ.

1977.

Boone DR. McFarlane SC. The Voice and Voice Therapy. 4th ed. Prentice Hall.

Englewood Cliffs, NJ. 1988.

Boone DR, McFarlane SC, Von Berg S. The Voice and Voice Therapy. 7th ed.

Allyn and Bacon. Boston. 2005.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Bassiouny%20S%5BAuthor%5D&cauthor=true&cauthor_uid=9691529

http://www.ncbi.nlm.nih.gov/pubmed/9691529


89

Brenner DJ, Hall EJ. Computed tomography-an increasing source of radiation

exposure. New England Journal of Medicine. 2007: 357, 2277-2284.

Chen T, Chodara A, Sprecher A, Fang F, Song W, Tao C, and Jiang J. A New

Method of Reconstructing the Human Laryngeal Architecture Using Micro-MRI.

Journal of Voice. 2012: 26, 555-562.

Chen SH, Hsiao T-Y, Hsiao L-C, Chung Y-M, Chiang S-C. Outcome of resonant

voice therapy for female teachers with voice disorders: Perceptual, physiological,

acoustic, aerodynamic, and functional measurements. Journal of Voice. 2007: 21,

415-425.

Clement P, Hans S, Hartl DM, Maeda S, Vaissiere J, and Brasnu D. Vocal Tract

Area Function for Vowels Using Three-Dimensional Magnetic Resonance

Imaging. A preliminary Study. Journal of Voice. 2007: 21, 522-530.

Conroy ER, Hennick TM, Awan SN, Hoffman MR, Smith BL, Jiang JJ.. Effect of

variations to a simulated system of straw phonation therapy on aerodynamic

parameters using excised canine larynges. Journal of Voice. 2014: 28, 1-6.

Crary MA, Kotzur IM, Gauger J, Gorham M, and Burton S. Dynamic Magnetic

resonance Imaging in the Study of Vocal Tract Configuration. Journal of Voice.

1996: 10, 378-388.

Davidson L, Amick EH . Formation of gas bubbles at horizontal orifices.

American Institute of Chemical Engineers Journal. 1956: 2, 337-342 .

http://www.ncbi.nlm.nih.gov/pubmed/?term=Conroy%20ER%5BAuthor%5D&cauthor=true&cauthor_uid=24286626

http://www.ncbi.nlm.nih.gov/pubmed/?term=Hennick%20TM%5BAuthor%5D&cauthor=true&cauthor_uid=24286626

http://www.ncbi.nlm.nih.gov/pubmed/?term=Awan%20SN%5BAuthor%5D&cauthor=true&cauthor_uid=24286626

http://www.ncbi.nlm.nih.gov/pubmed/?term=Hoffman%20MR%5BAuthor%5D&cauthor=true&cauthor_uid=24286626

http://www.ncbi.nlm.nih.gov/pubmed/?term=Smith%20BL%5BAuthor%5D&cauthor=true&cauthor_uid=24286626

http://www.ncbi.nlm.nih.gov/pubmed/?term=Jiang%20JJ%5BAuthor%5D&cauthor=true&cauthor_uid=24286626


90

Deliyski D. Endoscope Motion Compensation for Laryngeal High-Speed

Videoendoscopy. Journal of Voice. 2005: 9, 485-496.

Demolin D, Delvaux V, Metens T, Soquet A. Determination of Velum Opening for

French Nasal Vowels by Magnetic Resonance Imaging. Journal of Voice. 2003: 17,

454-467.

Doellinger M, Lohscheller J, McWhorter A, Kunduk M. Variability of Normal

Vocal Fold Dynamics for Different Vocal Loading in One Healthy Subject

Investigated by Phonovibrograms. Journal of Voice. 2009: 23,175-181.

Doellinger M, Berry D. Visualization and Quantification of the Medial Surface

Dynamics of an Excised Human Vocal Fold During Phonation. Journal of Voice.

2006: 20, 401-413.

Echternach M, Dippold S, Sundberg J, Arndt S, Zander M, and Richter B. High-

Speed Imaging and Electroglottography Measurements of the Open Quotient in

Untrained Male Voices’ Register Transitions. Journal of Voice. 2010: 24, 644-650.

Echternach M, Döllinger M, Sundberg J, Traser L, Richter B. Vocal fold vibrations

at high soprano fundamental frequencies. Journal of the Acoustical Society of

America. 2013: 133, 82-87.

Echternach M, Sundberg J, Arndt S, Breyer T, Markl M, Schumacher M, Richter

B. Vocal tract and register changes analysed by real time MRI in male professional

singers-a pilot study. Logopedics Phoniatrics Vocology. 2008: 33, 67-73.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Echternach%20M%5BAuthor%5D&cauthor=true&cauthor_uid=23363198

http://www.ncbi.nlm.nih.gov/pubmed/?term=D%C3%B6llinger%20M%5BAuthor%5D&cauthor=true&cauthor_uid=23363198

http://www.ncbi.nlm.nih.gov/pubmed/?term=Sundberg%20J%5BAuthor%5D&cauthor=true&cauthor_uid=23363198

http://www.ncbi.nlm.nih.gov/pubmed/?term=Traser%20L%5BAuthor%5D&cauthor=true&cauthor_uid=23363198

http://www.ncbi.nlm.nih.gov/pubmed/?term=Richter%20B%5BAuthor%5D&cauthor=true&cauthor_uid=23363198



91

Echternach M, Sundberg J, Arndt S, Markl M, Schumacher M, and Richter B.

Vocal Tract in Female Registers-A Dynamic Real-Time MRI Study. Journal of

Voice. 2010a: 24, 133-139.

Echternach M, Sundberg J, Markl M, Richter B. Professional opera tenors’vocal

tract configurations in registers. Folia Phoniatrica et Logopaedica. 2010b; 62:278-

287.

Echternach M, Traser L, Markl M, and Richter B. Vocal Tract Configurations in

Male Alto Register Functions. Journal of Voice. 2011: 25, 670-677.

Enflo L, Sundberg J, Romedahl C, & McAllister A. Effects on vocal fold collision

and phonation threshold pressure of resonance tube phonation with tube end in

water. Journal of Speech Language and Hearing Research. 2013: 56, 1530-1538.

Fabre P. Un prodede electrique percutane d’inscription de l’accolement glottique

au cours de la phonation: glottographie de haute frequence. Premiers resultats.

Bulletin de L’Academie Nationale de Medecine. 1957:1, 66-69.

Fant G. Acoustic theory of speech production. Mouton press. The Hague. 1960.

Fant G. Acoustic Theory of Speech Production. With Calculations Based on X-ray

Studies of Russian Articulations. 2nd ed. The Hague. Mouton. 1970.

Fant G, & Lin Q. Glottal source-vocal tract acoustic interaction. Speech

Transmission Laboratory – Quarterly Progress and Status Report. 1987: 1, 13-27.

http://www.ncbi.nlm.nih.gov/pubmed?term=Enflo%20L%5BAuthor%5D&cauthor=true&cauthor_uid=23838993

http://www.ncbi.nlm.nih.gov/pubmed?term=Sundberg%20J%5BAuthor%5D&cauthor=true&cauthor_uid=23838993

http://www.ncbi.nlm.nih.gov/pubmed?term=Romedahl%20C%5BAuthor%5D&cauthor=true&cauthor_uid=23838993

http://www.ncbi.nlm.nih.gov/pubmed?term=McAllister%20A%5BAuthor%5D&cauthor=true&cauthor_uid=23838993


92

Fitch WT, Giedd J. Morphology and development of the human vocal tract: A

study using magnetic resonance imaging. Journal of the Acoustical Society of

America. 1999: 106, 1511-1522.

Freeman E, Woo P, Saxman J, and Murry T. A Comparison of Sung and Spoken

Phonation Onset Gestures Using High-Speed Digital Imaging. Journal of Voice.

2012: 26, 226-238.

Galgano J, Peck K, Branski R, Bogomolny D, Mener D, Ho M, Holodny A, and

Kraus D. Correlation between Functional MRI And Voice Improvement Following

Type I Thyroplasty in Unilateral Vocal Fold Paralysis-A Case Study. Journal of

Voice. 2009: 23, 639-645.

Gaskill C, Erickson M. The effect of a voiced lip trill on estimated glottal closed

quotient. Journal of Voice. 2008: 22, 634-643.

Gauffin J, Sundberg J. Spectral correlates of glottal voice source waveform

characteristics. Journal of Speech Language and Hearing Research. 1989: 32, 556-

65.

Granqvist S, Simberg S, Hertegård S, Holmqvist S, Larsson H, Lindestad PA,

Södersten M, Hammarberg B. Resonance tube phonation in water: High-speed

imaging, electroglottographic and oral pressure observations of vocal fold

vibrations - a pilot study. Logopedics Phoniatrics Vocology. 2014: 28, 1-9.

Gundermann H. Die Behandlung der gestörten Sprechstimme. Fischer. Stuttgart.

1977.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Granqvist%20S%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=Simberg%20S%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=Herteg%C3%A5rd%20S%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=Holmqvist%20S%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=Larsson%20H%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=Lindestad%20PA%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=S%C3%B6dersten%20M%5BAuthor%5D&cauthor=true&cauthor_uid=24865620

http://www.ncbi.nlm.nih.gov/pubmed/?term=Hammarberg%20B%5BAuthor%5D&cauthor=true&cauthor_uid=24865620


93

Guzman M, Angulo M, Muñoz D, and Mayerhoff R. Effect on long-term average

spectrum of pop singers’ vocal warm-up with vocal function exercises.

International Journal of Speech Language Pathology 2013a: 15, 127-135.

Guzman M, Barros M, Espinoza F, Herrera A, Parra D, Muñoz D, Lloyd A.

Laryngoscopic, Acoustic, Perceptual, and Functional Assessment of Voice in Rock

Singers. Folia Phoniatrica et Logopaedica. 2013: 65, 248-256.

Guzman M, Callejas C, Castro C, García-Campo P, Lavanderos D, Valladares M,

Muñoz, D, & Carmona C. Therapeutic effect of semi-occluded vocal tract exercises

in patients with type I Muscle tension dysphonia. Revista Logopedia Foniatría y

Audiología. 2012: 32, 139-146.

Guzmán M, Higueras D, Fincheira C, Muñoz D, Guajardo C. Immediate effect of a

vocal exercises sequence with resonance tubes. Revista CEFAC. 2011: 14, 471-

480. What is the abbreviation CEFAC?

Guzman M, Higueras D, Finchiera C, Muñoz D, Guajardo C, Dowdall J.

Immediate acoustic effect of straw phonation exercises in subjects with dysphonic

voices. Logopedics Phoniatrics Vocology. 2013b: 38, 35-45.

Guzman M, Lanas A, Olavarria C, Azocar MJ, Muñoz D, Madrid S, Monsalve S,

Martinez F, Vargas S, Cortez P, Mayerhoff RM. Laryngoscopic and spectral

analysis of laryngeal and pharyngeal configuration in non-classical singing styles.

Journal of Voice. 2015: 29, 130.e21-8.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Guzman%20M%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Lanas%20A%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Olavarria%20C%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Azocar%20MJ%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Mu%C3%B1oz%20D%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Madrid%20S%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Monsalve%20S%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Martinez%20F%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Vargas%20S%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Cortez%20P%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Mayerhoff%20RM%5BAuthor%5D&cauthor=true&cauthor_uid=25179779

http://www.ncbi.nlm.nih.gov/pubmed/?term=Laryngoscopic+and+spectral+analysis+of+laryngeal+and+pharyngeal+configuration+in+non-classical+singing+styles

94

Guzman M, Ortega A, Olavarría C, Muñoz D, Cortez P, Azocar MJ, Cayuleo D,

Quintana F, Silva C. Comparison of supraglottic activity and spectral slope

between theater actors and vocally untrained subjects. Journal of Voice. In press.

Hampala V, Garcia M, Švec JG, Scherer RC, Herbst CT. Relationship Between the

Electroglottographic Signal and Vocal Fold Contact Area. Journal of Voice. In

press.

Hixon T, Weismer G, Hoit J. Preclinical Speech Science. Plural Publishing. San

Diego, CA. 2008.

Horácek J, Laukkanen AM, Sidlof P, Murphy P, Svec JG. Comparison of

acceleration and impact stress as possible loading factors in phonation: a computer

modeling study. Folia Phoniatrica et Logopaedica. 2009: 61, 137-145.

Horáček J, Radolf V, Bula V, Laukkanen, A-M. Air-pressure, vocal folds vibration

and acoustic characteristics of phonation during vocal exercising. Part 2:

Measurement on a physical model. Engineering Mechanics, 2014: 21, 193-200.

Horáček J, Laukkanen A-M, Švec JG. Closed quotient as an estimate of impact

stress:A computer modelling study. Proceedings of the 7th International

Conference Advances in Quantitative Laryngology, Voice and Speech Research,

Groningen 6-7 October 2006.

Howard D, Tyrrell A, Murphy D, Cooper C, Mullen J. Bio-Inspired Evolutionary

Oral Tract Shape Modeling for Physical Modeling Vocal Synthesis. Journal of

Voice. 2009: 23, 11-20.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Hampala%20V%5BAuthor%5D&cauthor=true&cauthor_uid=26256493

http://www.ncbi.nlm.nih.gov/pubmed/?term=Garcia%20M%5BAuthor%5D&cauthor=true&cauthor_uid=26256493

http://www.ncbi.nlm.nih.gov/pubmed/?term=%C5%A0vec%20JG%5BAuthor%5D&cauthor=true&cauthor_uid=26256493

http://www.ncbi.nlm.nih.gov/pubmed/?term=Scherer%20RC%5BAuthor%5D&cauthor=true&cauthor_uid=26256493

http://www.ncbi.nlm.nih.gov/pubmed/?term=Herbst%20CT%5BAuthor%5D&cauthor=true&cauthor_uid=26256493


http://www.ncbi.nlm.nih.gov/pubmed/?term=Hor%C3%A1cek%20J%5BAuthor%5D&cauthor=true&cauthor_uid=19571548

http://www.ncbi.nlm.nih.gov/pubmed/?term=Laukkanen%20AM%5BAuthor%5D&cauthor=true&cauthor_uid=19571548

http://www.ncbi.nlm.nih.gov/pubmed/?term=Sidlof%20P%5BAuthor%5D&cauthor=true&cauthor_uid=19571548

http://www.ncbi.nlm.nih.gov/pubmed/?term=Murphy%20P%5BAuthor%5D&cauthor=true&cauthor_uid=19571548

http://www.ncbi.nlm.nih.gov/pubmed/?term=Svec%20JG%5BAuthor%5D&cauthor=true&cauthor_uid=19571548


95

Inohara K, Sumita Y, Ohbayashi N, Ino S, Kurabayashi T, Ifukube T, Taniguchi H.

Standardization of Thresholding for Binary Conversion of Vocal Tract Modeling in

Computed Tomography. Journal of Voice. 2010: 24, 503-509.

Inwald E, Dollinger M, Schuster M, Eysholdt U, and Bohr C. Multiparametric

Analysis of Vocal Fold Vibrations in Healthy and Disordered Voices in High-

Speed Imaging. Journal of Voice. 2011: 25, 576-590.

Jiang JJ, Titze IR. Measurement of vocal fold intraglottal pressure and impact

stress. Journal of Voice. 1994: 8, 132-144.

Kankare E, Laukkanen AM, Ilomäki I, Miettinen A, Pylkkänen T.

Electroglottographic contact quotient in different phonation types using different

amplitude threshold levels. Logopedics Phoniatrics Vocology. 2012: 37, 127-132.

Kapsner-Smith MR, Hunter EJ, Kirkham K, Cox K, Titze IR. A Randomized

Controlled Trial of Two Semi-Occluded Vocal Tract Voice Therapy Protocols.

Journal of Speech Language and Hearing Research. 2015: 58, 535-549.

Kent R. Vocal tract acoustics. Journal of Voice.1993: 7, 797-117.

Kitzing P. Photo- and electroglottographical recording of the laryngeal vibratory

pattern during different registers. Folia Phoniatrica et Logopaedica. 1982: 34, 234-

241.

Kitzing P. Electroglottography. In: Ferlito A (Ed). Diseases of the Larynx. Arnold.

New York. 2000, 127-138.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Jiang%20JJ%5BAuthor%5D&cauthor=true&cauthor_uid=8061769

http://www.ncbi.nlm.nih.gov/pubmed/?term=Titze%20IR%5BAuthor%5D&cauthor=true&cauthor_uid=8061769


http://www.ncbi.nlm.nih.gov/pubmed/?term=Kankare%20E%5BAuthor%5D&cauthor=true&cauthor_uid=22432606

http://www.ncbi.nlm.nih.gov/pubmed/?term=Laukkanen%20AM%5BAuthor%5D&cauthor=true&cauthor_uid=22432606

http://www.ncbi.nlm.nih.gov/pubmed/?term=Ilom%C3%A4ki%20I%5BAuthor%5D&cauthor=true&cauthor_uid=22432606

http://www.ncbi.nlm.nih.gov/pubmed/?term=Miettinen%20A%5BAuthor%5D&cauthor=true&cauthor_uid=22432606

http://www.ncbi.nlm.nih.gov/pubmed/?term=Pylkk%C3%A4nen%20T%5BAuthor%5D&cauthor=true&cauthor_uid=22432606


96

Krausert CR, Olszewski AE, Taylor LN, McMurray JS, Dailey SH, Jiang JJ.

Mucosal Wave Measurement and Visualization Techniques. Journal of Voice.

2001: 25, 395-405.

Krenmayr A, Wöllner T, Supper N, Zorowka P. Visualizing Phase Relations of the

Vocal Folds by Means of High-Speed Videoendoscopy. Journal of Voice. 2012:

26, 471-479.

Kotby N. The accent method of voice therapy. Singular Publishing. San Diego,

CA. 1995.

Köster O, Marx B, Gemmar P, Hess MM, Künzel HJ. Qualitative and Quantitative

Analysis of Voice Onset by means of a Multidimensional Voice Analysis System

(MVAS) Using High-Speed Imaging. Journal of Voice.1999: 13, 355-374.

Kotby M, El-Sady S, Basiouny S, Abou-Rass Y, Hegazi M. Efficacy of the accent

method of voice therapy. Journal of Voice. 1991: 4, 316-320.

Laukkanen A-M. About the so called ‘‘resonance tubes’’ used in Finnish voice

training practice. Scandinavian Journal of Logopedics and Phoniatrics. 1992:

17,151-161.

Laukkanen A-M, Horacek J, Krupa P, Svec JG. The effect of phonation into a

straw on the vocal tract adjustments and formant frequencies. A preliminary MRI

study on a single subject completed with acoustic results. Biomedical Signal

Processing and Control. 2010: 7, 50-57.

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Krenmayr%20A%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22W%C3%B6llner%20T%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Supper%20N%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Zorowka%20P%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22K%C3%B6ster%20O%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Marx%20B%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Gemmar%20P%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Hess%20MM%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22K%C3%BCnzel%20HJ%22%5BAuthor%5D

javascript:void(0);

javascript:void(0);

javascript:void(0);

javascript:void(0);

javascript:void(0);

97

Laukkanen AM, Horáček J, Havlík R. Case-study magnetic resonance imaging and

acoustic investigation of the effects of vocal warm-up on two voice professionals.

Logopedics Phoniatrics Vocology. 2012: 37, 75-82.

Laukkanen A-M, Lindholm P, Vilkman E. Phonation into a tube as a voice training

method: acoustic and physiologic observations. Folia Phoniatrica et Logopaedica.

1995a: 47, 331-338.

Laukkanen A-M, Lindholm P, Vilkman E. On the effects of various vocal training

methods on glottal resistance and efficiency. Folia Phoniatrica et Logopaedica.

1995b: 47, 324-330.

Laukkanen A-M, Lindholm P, Vilkman E, et al. A physiological and acoustic study

on voiced bilabial fricative /ß:/ as a vocal exercise. Journal of Voice. 1996: 10, 67-

77.

Laukkanen A-M, Pulakka H, Alku P, Vilkman E, Hertegård S, Lindestad PA,

Larsson H, Granqvist S. High-speed registration of phonation-related glottal area

variation during artificial lengthening of the vocal tract. Logopedics Phoniatrics

Vocology. 2007: 32, 157-164.

Laukkanen AM, Takalo R, Vilkman E, Nummenranta J, Lipponen T. Simultaneous

videofluorographic and dual-channel electroglottographic registration of the

vertical laryngeal position in various phonatory tasks. Journal of Voice. 1999: 13,

60-71.

http://www.ncbi.nlm.nih.gov/pubmed?term=Laukkanen%20AM%5BAuthor%5D&cauthor=true&cauthor_uid=22394011

http://www.ncbi.nlm.nih.gov/pubmed?term=Hor%C3%A1%C4%8Dek%20J%5BAuthor%5D&cauthor=true&cauthor_uid=22394011

http://www.ncbi.nlm.nih.gov/pubmed?term=Havl%C3%ADk%20R%5BAuthor%5D&cauthor=true&cauthor_uid=22394011


http://www.ncbi.nlm.nih.gov/pubmed?term=%22Laukkanen%20AM%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Pulakka%20H%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Alku%20P%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Vilkman%20E%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Herteg%C3%A5rd%20S%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Lindestad%20PA%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Larsson%20H%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Granqvist%20S%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=Laukkanen%20AM%5BAuthor%5D&cauthor=true&cauthor_uid=10223676

http://www.ncbi.nlm.nih.gov/pubmed?term=Takalo%20R%5BAuthor%5D&cauthor=true&cauthor_uid=10223676

http://www.ncbi.nlm.nih.gov/pubmed?term=Vilkman%20E%5BAuthor%5D&cauthor=true&cauthor_uid=10223676

http://www.ncbi.nlm.nih.gov/pubmed?term=Nummenranta%20J%5BAuthor%5D&cauthor=true&cauthor_uid=10223676

http://www.ncbi.nlm.nih.gov/pubmed?term=Lipponen%20T%5BAuthor%5D&cauthor=true&cauthor_uid=10223676


98

Laukkanen A-M. Titze I, Hoffman H, Finnegan E. Effects of a Semi-occluded

Vocal Tract on Laryngeal Muscle Activity and Glottal Adduction in a Single

Female Subject. Folia Phoniatrica et Logopaedica. 2008: 60, 298-311.

Lawrence V. Laryngological observations on belting. Journal of Research in

Singing 1979: 2, 26-28.

Leino T, Laukkanen A-M (1993). Äänitysetäisyyden vaikutus puheäänen

keskiarvospektriin. (Effects of microphone distance on the long-term-average

spectrum of speech) (Abstract in English). In: Iivonen A, Aulanko R, toim.

Papers from the 17th

meeting of Finnish Phoneticians, Helsinki 1992.

Publications of the Department of Phonetics, University of Helsinki, Helsinki

1993, pp. 117-129.

Lindestad P, Södersten M, Merker B, Granqvist S. Voice Source Characteristics in

Mongolian “Throat Singing” Studied with High-Speed Imaging Technique,

Acoustic Spectra, and Inverse Filtering. Journal of Voice.2001: 15, 78-85.

Lohscheller J, Doellinger M, McWhorter A, Kunduk M. Preliminary Study on the

Quantitative Analysis of Vocal Loading Effects on Vocal Fold Dynamics Using

Phonovibrograms. Annals of Otology Rhinology Laryngology. 2008: 117, 484-

493.

Lohscheller J, Eysholdt U, Toy H, Doellinger M. Phonovibrography: mapping

high-speed movies of vocal fold vibrations into 2D-diagrams for visualizing and

analyzing the underlying laryngeal dynamics. Institute of Electrical and Electronic

Transaction on Medical Imaging. 2007.


99

Maeda S. A digital simulation of the vocal-tract system. Speech

Communication.1982: 1, 199-229.

Mayerhoff RM, Guzman M, Jackson-Menaldi C, Munoz D, Dowdall J, Maki A,

Johns MM 3rd, Smith LJ, Rubin AD. Analysis of supraglottic activity during

vocalization in healthy singers. Laryngoscope. 2014: 124, 504-509.

McDonnell M, Sundberg J, Westerlund J, Lindestad P, and Larsson H,Vocal Fold

Vibration and Phonation Start in Aspirated, Unaspirated, and Staccato Onset.

Journal of Voice. 2011: 25, 526-531.

Miller N, Gregory J, Semple S, Aspden R, Stollery P, and Gilbert F. The Effects of

Humming and Pitch on Craniofacial and Craniocervical Morphology Measured

Using MRI. Journal of Voice. 2012a: 26, 90-101.

Miller M, Gregory J, Semple S, Aspden R, Stollery P, and Gilbert F. Relationships

Between Vocal Structures, the Airway, and Craniocervical Posture Investigated

Using Magnetic Resonance Imaging. Journal of Voice. 2012b: 26, 102-109.

Moore CA. The correspondence of vocal tract resonance with volumes obtained

from magnetic resonance imaging. Journal of Speech Hearing Research. 1992: 35,

1009-1023.

Morrison MD, Rammage LA, Belisle GM, Pullan CB, Nichol H. Muscular tension

dysphonia. Journal of Otolaryngology 1983: 12, 302-306.

http://www.ncbi.nlm.nih.gov/pubmed?term=Mayerhoff%20RM%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Guzman%20M%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Jackson-Menaldi%20C%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Munoz%20D%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Dowdall%20J%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Maki%20A%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Johns%20MM%203rd%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Smith%20LJ%5BAuthor%5D&cauthor=true&cauthor_uid=23877891

http://www.ncbi.nlm.nih.gov/pubmed?term=Rubin%20AD%5BAuthor%5D&cauthor=true&cauthor_uid=23877891


100

Murugappan S, Khosla S, Casper K, Oren L, Gutmark E. Flow Fields and

Acoustics in a Unilateral Scarred Vocal Fold Model. Annals of Otology Rhinology

Laryngology. 2009: 118, 44-50.

Narayanan S, Nayak K, Lee S, Sethy A, Byrd D. An approach to real-time

magnetic resonance imaging for speech production. Journal of the Acoustical

Society of America. 2004: 15, 1771-1776.

Nordenberg M, Sundberg J. Effect on LTAS of vocal loudness variation.

Logopedics Phoniatrics Vocology. 2004: 29, 183-191.

Orlikoff R, Deliyski D, Baken R, and Watson B. Validation of a Glottographic

Measure of Vocal Attack. Journal of Voice. 2009: 23, 164-168.

Paes SM, Zambon F, Yamasaki R, Simberg S, Behlau M. Immediate effects of the

finnish resonance tube method on behavioral dysphonia. Journal of Voice. 2013:

27, 717-722.

Patel R, Dailey S, Bless D. Comparison of High-Speed Digital Imaging With

Stroboscopy for Laryngeal Imaging of Glottal Disorders. Annals of Otology

Rhinology Laryngology. 2008;7:413-424.

Peltokoski J, Tyrmi J, Kankare E, Ilomäki I, Laukkanen A-M, Geneid A.

Resonaattoriputki ilmassa ja vedessä (Resonance tube in air and in water). Puhe ja

kieli. 2015 :35, 145-159.

Pershall K, Boone S. Supraglottic Contribution to Voice Quality. Journal of Voice

1987: 1, 186-190.



http://www.ncbi.nlm.nih.gov/pubmed?term=Paes%20SM%5BAuthor%5D&cauthor=true&cauthor_uid=24119641

http://www.ncbi.nlm.nih.gov/pubmed?term=Zambon%20F%5BAuthor%5D&cauthor=true&cauthor_uid=24119641

http://www.ncbi.nlm.nih.gov/pubmed?term=Yamasaki%20R%5BAuthor%5D&cauthor=true&cauthor_uid=24119641

http://www.ncbi.nlm.nih.gov/pubmed?term=Simberg%20S%5BAuthor%5D&cauthor=true&cauthor_uid=24119641

http://www.ncbi.nlm.nih.gov/pubmed?term=Behlau%20M%5BAuthor%5D&cauthor=true&cauthor_uid=24119641




101

Raappana A, Koivukangas J, Pirilä T. 3D modeling-based surgical planning in

transsphenoidal pituitary surgery-preliminary results. Acta Otolaryngologica.

2008;128:1011-1018.

Radolf V, Laukkanen A-M, Horáček J, & Liu D. Air-pressure, vocal fold vibration

and acoustic characteristics of phonation during vocal exercising. Part 1:

Measurement in vivo. Engineering Mechanics. 2014: 21, 53-59.

Rothenberg M. Acoustic interaction between the glottal source and the vocal tract.

In: Stevens KN, Hirano M (Eds). Vocal Fold Physiology (pp. 305-328). Tokyo:

University of Tokyo Press; 1981.

Rothenberg M. Cosí fan tutte and what it means or Nonlinear Source-Tract acoustic

interaction in the soprano voice and some implications for the definition of vocal

efficiency. In Vocal fold physiology: Laryngeal function in Phonation and

Respiration. Ed by Baer T, Sasaki C & Harris K.S. San Diego: College-Hill Press;

1986.

Rothenberg M, Mahshie J. Monitoring vocal fold abduction through vocal fold

contact area. Journal of Speech Hearing Research. 1988: 31, 338-351.

Roy N, Weinrich B, Gray SD, Tanner K, Stemple JC, Sapienza CM. Three

treatments for teachers with voice disorders: A randomized clinical trial. Journal of

Speech, Language, and Hearing Research. 2003: 46, 670-688.

102

Rubin A, Praneetvatakul V, Gherson S, Moyer C, Sataloff R. Laryngeal

hyperfunction during whispering: Reality or myth? Journal of voice. 2006: 20, 121-

127.

Sihvo M. Terve ääni, äänen hoidon A B C. [In Finnish] Healthy voice. The A B C

for voice care. Kirjapaja. Helsinki. 2006.

Silverman PM, Zeiberg AS, Sessions RB, Troost TR, Zeman RK.

Threedimensional imaging of the hypopharynx and larynx by means of helical

(spiral) computed tomography. Comparison of radiological and otolaryngological

evaluation. Annals of Otology Rhinology Laryngology. 1995: 104, 425-431.

Simberg S, & Laine A. The resonance tube method in voice therapy: Description

and practical Implementations. Logopedics Phoniatrics Vocology. 2007: 32, 165-

170.

Simberg S, Sala E, Tuomainen J, & Sellman J. The effectiveness of group therapy

for students with mild voice disorders: a controlled clinical trial. Journal of Voice.

2006: 20, 97-109.

Sovijärvi A. Eräitä huomioita funktionaalisen dysfonianhoidosta [Some

observations of the treatment of functional dysphonia]. Finnish Society for

Phoneticians and Logopedists. 1977, 19-22.

Sovijärvi A, Häyrinen R, Orden-Pannila M, & Syvänen M. Aänifysiologisten

kuntoutusharjoitusten ohjeita. [Instructions for voice exercises]. Publications of

Suomen Puheopisto. Helsinki. 1989.

http://www.ncbi.nlm.nih.gov/pubmed?term=Simberg%20S%5BAuthor%5D&cauthor=true&cauthor_uid=15963687

http://www.ncbi.nlm.nih.gov/pubmed?term=Sala%20E%5BAuthor%5D&cauthor=true&cauthor_uid=15963687

http://www.ncbi.nlm.nih.gov/pubmed?term=Tuomainen%20J%5BAuthor%5D&cauthor=true&cauthor_uid=15963687

http://www.ncbi.nlm.nih.gov/pubmed?term=Sellman%20J%5BAuthor%5D&cauthor=true&cauthor_uid=15963687

103

Stager S, Bielamowicz S, Regnell J, Gupta A, Brakmeier J. Supraglottic activity:

Evidence of hyperfunction or laryngeal articulation? Journal of Speech Hearing

Research. 2000: 43, 229-238.

Stemple JC. Voice therapy: Clinical studies. Singular Publishing Group. San

Diego, CA. 2000.

Stemple J, Graze L, Klaben B. Clinical voice pathology: theory and management.

3rd ed. Singular Publishing Group. San Diego, CA. 2000.

Story BH. Comparison of magnetic resonance imaging-based vocal tract area

functions obtained from the same speaker in 1994 and 2002. Journal of the

Acoustical Society of America. 2008: 123, 327-335.

Story B, Laukkanen A-M, Titze I. Acoustic impedance of an artificially lengthened

and constricted vocal tract. Journal of Voice. 2000: 14, 455-469.

Story BH, Titze IR, Hoffman EA. Vocal tract area functions from magnetic

resonance imaging. Journal of the Acoustical Society of America. 1996: 100, 537-

554.

Story BH, Titze IR, Hoffman EA. Vocal tract area functions for an adult female

speaker based on volumetric imaging. Journal of the Acoustical Society of

America. 1998: 104, 471-487.

Story BH, Titze IR, Hoffman EA. The relationship of vocal tract shape to three

voice qualities. Journal of the Acoustical Society of America. 2001: 109, 1651-

1667.

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Story%20BH%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Titze%20IR%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Hoffman%20EA%22%5BAuthor%5D


104

Sundberg J. Articulatory interpretation of the singing formant. Journal of the


Sundberg J, Birch P, Gumoes B, Stavad H, Prytz S, and Karle A. Experimental

Findings on the Nasal Tract Resonator in Singing. Journal of Voice.2007: 21, 127-

137.

Svec JG, Sram F, Schutte HK. Videokymography in voice disorders: what to look

for? Annals of Otology Rhinology Laryngology. 2007: 116, 172-180.

Takemoto H, Adachi S, Kitamura T, Mokhtari P, and Honda K. Acoustic roles of

the laryngeal cavity in vocal tract resonance. Journal of the Acoustical Society of

America.2006: 120, 2228-2238.

Tapani M. Resonaattoriputki toiminnallisen a a ihairion hoitmenetelma na. Seitsema n

naispotilaan Seurantatutukimus [Resonance tube as a therapy method for a

functional voice disorder. A follow-up study of seven female patients] (in Finnish).

Helsinki, Finland: University of Helsinki, 1992.

Thomas L, Stemple J. Voice therapy: Does science support the art? Communicative

Disorders review. 2007: 1, 49-77.

Titze I. The physics of small-amplitude oscillation of the vocal folds. Journal of

The Acoustical Society of America. 1988: 83, 1536-1552.

Titze I. Interpretation of electroglottographic signal. Journal of Voice. 1990: 4, 1-9.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Svec%20JG%5BAuthor%5D&cauthor=true&cauthor_uid=17419520

http://www.ncbi.nlm.nih.gov/pubmed/?term=Sram%20F%5BAuthor%5D&cauthor=true&cauthor_uid=17419520

http://www.ncbi.nlm.nih.gov/pubmed/?term=Schutte%20HK%5BAuthor%5D&cauthor=true&cauthor_uid=17419520


105

Titze IR. Mechanical stress in phonation. Journal of Voice. 1994:8, 99-105.

Titze I. Voice research: the wide pharynx. Journal of Singing. 1998: 55, 27-28.

Titze I. Regulating glottal airflow in phonation: Application of the maximum

power transfer theorem to a low dimensional phonation model. Journal of the


Titze IR. A theoretical study of F0-F1 interaction with application to resonant

speaking and singing voice. Journal of Voice. 2003: 18, 292-298.

Titze IR. Theory of glottal airflow and source-filter interaction in speaking and

singing. Acta Acustica-Acustica. 2004: 90, 641-648.

Titze IR. Theoretical analysis of maximum flow declination rate versus maximum

area declination rate in phonation. Journal of Speech Language and Hearing

Research. 2006a: 49, 439-447.

Titze I. Voice training and therapy with a semi-occluded vocal tract: rationale and

scientific underpinnings. Journal of Speech Language and Hearing Research.

2006b: 49, 448-459.

Titze IR. Nonlinear source-filter coupling in phonation: theory. Journal of the


Titze IR, Tobias Riede, Peter Popolo. Nonlinear source–filter coupling in

phonation: Vocal exercises. Journal of the Acoustical Society of America. 2008:

123, 1902-1915.

http://www.ncbi.nlm.nih.gov/pubmed/?term=Titze%20IR%5BAuthor%5D&cauthor=true&cauthor_uid=8061776


http://www.ncbi.nlm.nih.gov/pubmed/?term=Titze%20I%5Bauth%5D

106

Titze I, Finnegan E, Laukkanen A, Jaiswal S. Raising lung pressure and pitch in

vocal warm-ups: the use of flow-resistant straws. Journal of Singing. 2002: 58,

329-338.

Titze I, Laukkanen A. Can vocal economy in phonation be increased with an

artificially lengthened vocal tract? A computer modeling study. Logopedics

Phoniatrics Vocology. 2007: 32, 147-156.

Titze I, Story B. Acoustic interactions of the voice source with the lower vocal

tract. Journal of the Acoustical Society of America. 1997: 101, 2234-2243.

Vampola T, Laukkanen A-M, Horáček J, Švec JG. Vocal tract changes caused by

phonation into a tube: A case study using computer tomography and finite-element

modeling. Journal of the Acoustical Society of America. 2011: 129, 310-315.

Vasconcelos M, Ventura S, Freitas D, and Tavares J. Towards the Automatic Study

of the Vocal Tract From Magnetic Resonance Images. Journal of Voice. 2011: 25,

732-742.

Verdolini K. Resonant voice therapy. In Verdolini K (Ed.). National Center for

Voice and Speech’s Guide to Vocology. National Center for Voice and Speech.

Iowa City, IA.1998.

Verdolini K, Chan R, Titze I, Hess I, Bierhals W. Correspondence of

Electroglottographic Closed Quotient to Vocal Fold Impact Stress in Excised

Canine Larynges. Journal of Voice. 1998: 12, 415-423.

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Titze%20IR%22%5BAuthor%5D

http://www.ncbi.nlm.nih.gov/pubmed?term=%22Laukkanen%20AM%22%5BAuthor%5D

javascript:AL_get(this,%20'jour',%20'Logoped%20Phoniatr%20Vocol.');

javascript:AL_get(this,%20'jour',%20'Logoped%20Phoniatr%20Vocol.');


107

Verdolini K, Druker D, Palmer P, Samawi H. Laryngeal adduction in resonant

voice. Journal of Voice. 1998: 12, 315-327.

Vorperian HK, Kent RD, Lindstrom MJ, Kalina CM, Gentry LR, Yandell BS.

Development of vocal tract length during early childhood: a magnetic resonance

imaging study. Journal of the Acoustical Society of America. 2005: 117, 338-350.

Whalen DJ, Min Kang A, Magen HS, Fulbright RK, Gore JC. Predicting

midsagittal pharynx shape from tongue position during vowel production. Journal

of Speech Language and Hearing Research. 1999: 42, 592-603.

White P. A study of the effects of vocal loudness intensity variation on children’s

voices using long-term average spectrum analysis. Logopedics Phoniatrics

Vocology. 1998: 23, 111-120.

Wittenberg T, Tigges M, Mergell P, and Eysholdt U. Functional Imaging of Vocal

Fold Vibration: Digital Multislice High-Speed Kymography. Journal of Voice.

2000: 14, 422-442.

Yamasaki R, Behlau M, Campone O, and Yamashita H. MRI Anatomical and

Morphological Differences in the Vocal Tract Between Dysphonic and Normal

Adult Women. Journal of Voice. 2011: 25, 743-750.

Yanagisawa E, Estill J, Kmucha S, Leder S. The Contribution of Aryepiglottic

Constriction to "Ringing" Voice Quality A Videolaryngoscopic Study with

Acoustic Analysis. Journal of Voice. 1989:3, 342-350.

108

Yan Y, Ahmad K, Kunduk M, and Bless D. Analysis of Vocal-fold Vibrations

from High-Speed Laryngeal Images Using a Hilbert Transform-Based

Methodology. Journal of Voice.2005: 19, 161-175.

Zhang Y, Bieging E, Tsui H, and Jiang J. Efficient and Effective Extraction of

Vocal Fold Vibratory Patterns from High-Speed Digital Imaging. Journal of Voice.

2010: 24, 21-29.

Vocal Tract and Glottal Function During and After

Vocal Exercising With Resonance Tube and Straw

*,†Marco Guzman, †Anne-Maria Laukkanen, ‡Petr Krupa, §Jaromir Hor�a�cek, kJan G. �Svec,and {,#Ahmed Geneid, *Santiago, Chile, yTampere and {Helsinki, Finland, zBrno, xPrague, and kOlomouc, Czech Republic, and #Ismailia,Egypt

Summary: Objective. The present study aimed to investigate the vocal tract and glottal function during and after

AccepThis s

Czech RJ.G.S.) aFrom

Santiagoof Tampemomechics, FacuClinic, HNose andAddre

tion ScieUniversitE-mail: gJourna0892-1� 201http://d

phonation into a tube and a stirring straw.Methods. A male classically trained singer was assessed. Computerized tomography (CT) was performed when thesubject produced [a:] at comfortable speaking pitch, phonated into the resonance tube and when repeating [a:] after theexercise. Similar procedure was performed with a narrow straw after 15 minutes silence. Anatomic distances and areameasures were obtained from CTmidsagittal and transversal images. Acoustic, perceptual, electroglottographic (EGG),and subglottic pressure measures were also obtained.Results. During and after phonation into the tube or straw, the velum closed the nasal passage better, the larynx po-sition lowered, and hypopharynx area widened. Moreover, the ratio between the inlet of the lower pharynx and the outletof the epilaryngeal tube became larger during and after tube/straw phonation. Acoustic results revealed a stronger spec-tral prominence in the singer/speaker’s formant cluster region after exercising. Listening test demonstrated better voicequality after straw/tube than before. Contact quotient derived from EGG decreased during both tube and straw and re-mained lower after exercising. Subglottic pressure increased during straw and remained somewhat higher after it.Conclusion. CT and acoustic results indicated that vocal exercises with increased vocal tract impedance lead to in-creased vocal efficiency and economy. One of the major changes was the more prominent singer’s/speaker’s formantcluster. Vocal tract and glottal modifications were more prominent during and after straw exercising compared withtube phonation.Key Words: Vocal exercises–Semi-occlusions–Resonance tube–Vocal tract impedance–Computerized tomography–Electroglottography–Subglottic pressure–Singer’s/speaker’s formant cluster.

INTRODUCTION

Semi-occluded vocal tract setting has been extensively used byspeech pathologists and voice trainers as therapeutic and train-ing exercises, respectively. Various types of tubes have beenused to perform these voice exercises. One of them is the tradi-tional Finnish glass tube, ‘‘resonance tube,’’ 26–28 cm in lengthand 8–9 mm in inner diameter.1,2 A more accessible option iscommercial plastic drinking straws. Moreover, Titze3,4 hasproposed the use of stirring straws, shorter and thinner plasticstraws, which are commonly used to stir coffee in the UnitedStates. Resonance tube phonation into the water (waterresistance therapy) also has a long tradition as a therapeutictool.1,5

Several benefits have been attributed to vocal exercises in-volving tube phonation or other semi-occluded vocal tract pos-tures, such as y-buzz,6 tongue trill, lip trill, and voiced bilabial

ted for publication February 19, 2013.tudy was supported by the Academy of Finland, grant number 128095. In theepublic, it was supported by the project GACR No. P101/12/1306 (J.H. andnd the European Social Fund project no. CZ.1.07/2.3.00/20.0057 (J.G.S.).the *School of Communication Science and Disorders, University of Chile,, Chile; ySpeech and Voice Research Laboratory, School of Education, Universityre, Tampere, Finland; zSurGal Clinic, Brno, Czech Republic; xInstitute of Ther-anics, Academy of Sciences, Prague, Czech Republic; kDepartment of Biophys-lty of Sciences, Palacky University, Olomouc, Czech Republic; {Phoniatricelsinki University Hospital, Helsinki, Finland; and the #Department of Ear,Throat, Suez Canal University, Ismailia, Egypt.

ss correspondence and reprint requests to Marco Guzman, School of Communica-nce and Disorders, Speech and Voice Research Laboratory, School of Education,y of Chile, University of Tampere, Avenida Independencia 1029, Santiago, [email protected] of Voice, Vol. 27, No. 4, pp. 523.e19-523.e34997/$36.003 The Voice Foundationx.doi.org/10.1016/j.jvoice.2013.02.007

fricative [ß:]. Some of these benefits are an increase in vocaltract impedance, specifically resulting in changes in the inertivereactance,7–11 which may be favorable to voice production bydecreasing phonation threshold pressure10 and by increasingskewing of the glottal flow waveform (faster cessation of theflow).9,10 The vocal tract impedance can affect the voicesource function in two ways (1) through an acoustic-aerodynamic interaction and (2) through a mechano-acousticinteraction.7,12

In the former, the shape of the glottal flow pulse is affected bythe acoustic pressures in the vocal tract.7,8 When fundamentalfrequencies (F0s) are below the first formant, the skewing ofthe flow pulse is increased by supraglottic acoustic pressurescompared with the glottal area, so that the airflow issuppressed at glottal opening and maintained during theglottal closing phase. This increased skewing of the glottalflow waveform leads to strengthening of the higher harmonics(less spectral tilt), increase in sound pressure level(SPL),9,13,14 and to a more resonant voice quality, that is,vibratory sensations in the frontal part of face, alveolar ridge,and head area with easy voice production.15 This is reflectedin a brighter and louder sound. Titze and Sundberg16 pointedout that because the skewing of the glottal airflow signal isone of the determinants of vocal intensity, the source-filter in-teraction can be used to increase intensity rather than vibra-tional amplitude, thus avoiding an increase in the vocal foldimpact stress.14

The second way that glottal source function can be affectedby vocal tract impedance is the mechano-acoustic interactionof the vocal tract pressures and the vocal folds.7,8,17,18 This

Delta:1_given name

Delta:1_given name

Delta:1_surname

Delta:1_given name

Delta:1_given name

Delta:1_surname

Delta:1_given name

Delta:1_given name

Delta:1_surname

Delta:1_given name

Delta:1_given name

Delta:1_surname

mailto:[email protected]

http://dx.doi.org/10.1016/j.jvoice.2013.02.007

Journal of Voice, Vol. 27, No. 4, 2013523.e20

occurs when the increased vocal tract inertance affects the vocalfold vibration itself (through affecting the pressure above andinside the glottis). Specifically, the inertive reactance lowersthe phonation threshold pressure (the subglottal pressurerequired to barely initiate and sustain phonation).10 A low pho-nation threshold pressure would produce an ease of phonation(decrease in perceived phonatory effort).

Not only an anterior semi-occlusion or lengthening of the vo-cal tract can produce this favorable condition but also a narrow-ing in the lower vocal tract. Titze and Story10 reported thatwhen the epilaryngeal tube is quite narrow, making the inputimpedance to the vocal tract comparable to the glottal imped-ance, there is an important source-filter interaction. The iner-tance (the positive component of the reactance) of the vocaltract facilitates vocal fold vibration by lowering the oscillationthreshold pressure.When the narrowed epilarygeal tube is com-bined with a wide pharynx, two relevant effects occur: (1) thevocal tract is inertive for a wide range of F0 values and (2) a for-mant cluster between F3 and F5 is produced (singer’s orspeaker’s formant), which is desirable to increase the vocalloudness without an increase in vocal effort.

An increased supraglottal pressure and consequently an ele-vation of the intraglottal pressure have also been reported duringsemi-occlusions.3,17,19 When the vocal tract constriction is tightor the tube added to the vocal tract is narrow or long, it increasesintraglottal air pressurewhich tends to separate the vocal folds ifnot compensated by raising subglottic pressure and adduction.Separation of the vocal folds (decreased adduction), in turn,would explain a decreased contact quotient (CQ).3,19 Severalstudies have reported a change in the relative closed timeof the glottis on an electroglottographic (EGG) signal (eg,CQ) when semi-occlusion is compared with vowel phona-tion.2,3,20–26 Only some of them have shown a decreased CQ.

Different methods have been used to obtain data of theeffects of these vocal exercises. Electroglottography,3,20–26

acoustics,2,23,25,27–33 modeling, and simulation studies7,11,19 arethe most dominant. In addition, some studies have been carriedout using aerodynamic,3,23,27,34,35 electromyographic,34,35

endoscopic,27,32 and radiological37–39 measurements. Most ofthem have examined the changes of vocal folds parametersduring and/or after semi-occluded exercises. Nevertheless, fewresearches have reported outcomes related to vocal tract configu-ration changes (shape modification) as an effect of the semi-occlusion or artificial lengthening of the vocal tract.37–39

In this regard, in a recent investigation conducted by Laukka-nen et al,37 one female participant was assessed with magneticresonance imaging (MRI) and acoustic analysis before and afterstraw phonation. Midsagittal area of the vocal tract increasedand the velar closure improved during and after straw exercis-ing. Furthermore, the ratio of the transversal area of the lowerpharynx over that of the epilarynx increased both during and af-ter the straw. Acoustic changes showed a higher total SPL andalso more energy in the speaker’s formant cluster. The distancesbetween F4 and F3 and F5 and F4 demonstrated a decrease. Au-thors suggested that the use of straw vocal exercises helps toproduce a speaker’s formant cluster, which increases loudnessand thus improves vocal economy. Similar MRI and acoustic

findings were observed in another study designed to observechanges in voice production after warm-up for two professionalvoice users. One of the participants used semi-occlusions ([ß:],[m:], and closed vowels [y:] and [u:]).38

Vampola et al39 examined the vocal tract shape in a femalesubject before, during, and after phonation into a tube usingcomputerized tomography (CT) and finite element models(FEMs) to study changes in vocal tract input impedance. Re-sults indicated that the phonation into a tube causes changesin the vocal tract, which remain also when the tube is removed.Authors observed tightened velopharyngeal closure and en-larged cross-sectional areas of the oropharyngeal and oral cav-ities during and after the tube phonation when phonating onvowel [a:]. FEM calculations revealed an increased input iner-tance (inertive or positive reactance) of the vocal tract, espe-cially in the frequency range from 2500 to 4000 kHz, andincreased acoustic energy radiated out of the vocal tract afterthe tube phonation.To date, no studies have been addressed to assess the possible

different training effects produced by two different types of ar-tificial lengthening on the vocal tract setting. Few earlier studieshave reported differences in glottal source changes when usingtwo or more different semi-occlusions.2,3,7,19,27,34 The presentstudy therefore aimed at investigating the vocal tractmodifications and also the acoustic, aerodynamic, and EGGcharacteristics of the voice when comparing vocal exercisingwith two different vocal tract impedances (during phonationinto a glass tube and a stirring straw). We hypothesize thatthe glottis and/or the supraglottic behavior should adaptdifferently to different load impedances of the vocal tract.

METHOD

CT scanning

CT was carried out in Surgical Clinic, Department of Imagingand Radiology in Brno, Czech Republic. The CT imageswere acquired using a Toshiba-Aquilion CT machine. The CTimaging parameters used to provide images of the vocal tractwere voltage 120 kV, scan option: helical CT, time of the rota-tion 0.5 seconds, slice thickness 0.5 mm, and total number ofslices: 510. The examination was performed for one male clas-sically trained singer (34 years) who has 7 years of experienceusing tube phonation and other semi-occlusions as vocal train-ing and warm-up exercises. The subject did not report anyknown voice or hearing pathology at the time of the experiment.In supine position inside the CT machine, the subject was askedto produce the following phonatory tasks: (1) to sustain vowel[a:], (2) to phonate a sustained vowel-like sound into a glasstube (27 cm in length and 9 mm inner diameter) for 15 minutes,and, immediately after that, (3) to produce another sustainedvowel [a:]. All phonations were carried out at habitual loudnesslevel and speaking pitch. The participant was required to pro-duce a stable sound with a good closure at the lips and to feelas strong as possible vibratory sensations on the alveolar ridge,face, and head areas during tube phonation. After 15 minutes ofcomplete silence (vocal rest), phonation into a plastic coffeestraw (2.5 mm inner diameter and 13.7 cm in length) was

Marco Guzman, et al Resonance Tube and Straw Phonation 523.e21

performed for 15 minutes. Immediately after that, the partici-pant was asked to produce another sustained vowel [a:]. Thisstraw was chosen because it was used in a previous study.1

The subject was scanned two times while producing each pho-natory task. He was asked to adopt a relaxed posture in the CTscanner and exactly the same body and head position was keptduring the entire CT procedure. The head position was mechan-ically fixed in a frame during all experiments. The participantwas a volunteer and he was informed about the potential healthhazards of the CT examination.

CT image analysis

Ten CT midsagittal images (five phonatory tasks3 two repeti-tions) were chosen to perform a series of distance measure-ments (mm). Anatomic distances (Figure 1) of interestincluded (1) vertical length of the vocal tract (which is indica-tive of the vertical laryngeal position [VLP]) measured as thedistance between the lowest point of the odontoid process ofthe Atlas and the vocal folds following a vertical line, (2) hor-izontal length of the vocal tract measured as the distance be-tween the lowest point of Atlas and the narrowest pointbetween the lips, (3) lip opening measured as the distance be-tween the lower edge of the upper lip and the upper edge ofthe lower lip, (4) jaw opening measured as the distance betweenthe lowermost edge of the jawbone contour and the anterior endof the hard palate, (5) tongue dorsum height measured as thedistance between the lowermost edge of the jawbone and theuppermost point of the tongue dorsum, (6) oropharynx widthmeasured as the distance between the lowest point of the secondvertebra and the most posterior part of the tongue contour (forensuring the same angle, we used straight line from the anterioruppermost edge of the jawbone contour to the anterior lowestpoint of the second vertebra), (7) velum elevation measured

FIGURE 1. Distances (mm) measured in CTmidsagittal images: (1)

vertical length of the vocal tract, (2) horizontal length of the vocal tract,

(3) lip opening, (4) jaw opening, (5) tongue dorsum height, (6) oro-

pharynx width, (7) velum elevation, and (8) hypopharynx width.

as the distance from the posterior upper edge of the hard palateand the anterior lowest point of the uvula, and (8) hypopharynxwidth measured as the distance between the lowest point of thepharynx and the internal edge of the epiglottis following a linefrom the anterior uppermost edge of the jawbone contour to thelower point of the pharynx. All CT measurements were per-formed using the software vPACS view, Version 6.9.5 (Audio-scan, Prague, Czech Republic).

Moreover, three cross-sectional areas (mm2) were measuredfrom the same 10 midsagittal CT images (Figure 2). The areaswere (1) oral cavity (A1) measured from the lips to the velum(up to the line connecting the lowermost edge of the jawbonecontour and a break of declivity on the velum surface), (2)the pharyngeal region (A2) measured from the line ending A1down to the horizontal line connecting the lower edge of thethird vertebra and the lowermost edge of the jawbone contour,and (3) the epilaryngeal region (A3) measured from the lineending A2 down to the vocal folds.

Additionally, from the transversal CT images, two areas(mm2) were measured37,38 (Figure 3): (1) the inlet to the lowerpharynx (Ap) just above the collar of the epiglottis and (2) theoutlet of the epilaryngeal tube (Ae) just below the collar of theepiglottis where the epilarynx and sinus piriformis form threeseparate tubes. The ratio between these two areas was also cal-culated (Ap/Ae). Areas from transversal CT images were cho-sen taken into account the bending of the vocal tract. A linefrom backbone to jawbone through the tip of arytenoids in a par-allel way to the bottom line of the CT image was use to obtainAe. A line just above the tip of arytenoids, which was also par-allel to the bottom line, was used to obtain Ap.

Recordings

The recordings were performed separately from the CT mea-surements in a sound-treated room. Three signals were

FIGURE 2. Areas (mm2) measured in CT midsagittal images: oral

cavity (A1), pharyngeal region (A2), and epilaryngeal region (A3).

FIGURE 3. Areas (mm2) measured in CT transversal images: the inlet to the lower pharynx (Ap) and the outlet of the epilaryngeal tube (Ae).


digitized: audio, subglottic (oral) pressure, and EGG signal.The audio signal was recorded by a condenser microphone(Behringer ECM 8000; Behringer, Behringer City, China) ata distance of 40 cm from the mouth. An estimate of the subglot-tic pressure was recorded from the oral pressure during the oc-clusion of the consonant [p:] in the syllable [pa:] and byshuttering the outer end of the tube. This pressure was capturedwith a pressure transducer connected to a thin plastic and flex-ible tube. The tube was inserted into the corner of the mouth,extending a few millimeters behind the lips, without touchingthe tongue or any other oral structure. The manometer MSIF2(Glottal Enterprises, Syracuse, NY) was used.

The EGG signalwas recordedwith a two-channel electroglot-tograph (EG2; Glottal Enterprises) using a 20 Hz high-pass fil-tering (to exclude slow variations in the signal amplitude, whichcould be due to articulatory movements). The electrodes werecleaned with a slightly wet tissue, and a thin layer of conductivegel was applied (Mingograph electrode cream; Siemens-ElemaAB,Munich, Germany). They were positioned near the laminaeof the thyroid cartilage and secured with a velcro strip, whichwas wrapped around the participant’s neck as tightly as possibleto prevent any movement of electrodes throughout the data col-lection. The quality of the EGG signal was monitored through-out the recordings with an oscilloscope.

Audio signal was calibrated for further SPL measurements indB (Z) using a 440 Hz tone. Air pressure signal was calibratedusing a Glottal Enterprises calibrator, model MCU-4. Pressureand EGG signals were digitized and recorded simultaneouslyinto different channels at a sampling rate of 16 kHz using theSoundswell Signal Workstation (Hitech Development AB,Stockholm, Sweden). Audio signal was recorded at a samplingrate of 48 kHz with 16 bits quantization using a DAT recorder(Marantz PMD 671; Marantz, Mahwah, NJ).

Phonatory tasks

EGG and subglottic pressure signals were recorded during therepetition of the syllable [pa:] at habitual loudness and comfort-able pitch before performing the vocal exercises. After this se-quence, the subject phonated a sustained vowel-like sound intothe glass tube (27 cm in length and 9 mm in inner diameter) asa vocal exercise for 5 minutes using his habitual pitch and loud-

ness. He was asked to produce a stable sound with a good clo-sure at the lips and to feel perceptible vibratory sensations onthe alveolar ridge, face, and head areas during phonation. Im-mediately after tube phonation exercises, the participant pro-duced again the same series of consecutive syllable [pa:] andsustained vowel [a] following the last [pa:], using the samepitch and loudness level to assess the eventual effect of tubephonation after the tube exercising. Because subglottic pressureis one of the important factors that affect SPL, one of the exper-imenters monitored the SPL during the series of consecutive[pa:] syllable with a sound level meter (American recorder tech-nologies SPL-8810) (American Recorder Technologies, Inc.,Simi Valley, CA) positioned at a distance of 40 cm from themouth. Z frequency weighting and slow time response wereused for monitoring SPL. The SPL choice was made by the sub-ject in pre-exercising recording (at comfortable loudness).Then, these pre-exercising made free choices became the tar-gets for postexercising samples. An electronic keyboard wasused to give and control the pitch, which was monitored audi-torily by the participant and one of the experimenters. The re-cording captured during this consecutive syllable [pa:] wassaved as the sample after tube phonation. After 15 minutes ofcomplete silence (vocal rest), exactly the same entire procedurewas performed with a plastic stirring straw of 2.5 mm inner di-ameter and 13.7 cm in length.

Acoustic, air pressure, and EGG data analyses

To compare the samples recorded before and after tube andstraw phonations, acoustic measures were made using Praatsoftware (Version 5.2; Boersma and Weenink, 2008, Universityof Amsterdam, Amsterdam, The Netherlands). Long-term aver-age spectrum (LTAS) and Fast Fourier Transformation (FFT)spectrum (spectral slice and spectrogram) were used. FromLTAS analysis, the following variables were assessed: totalSPL (energy between 50 and 6000 Hz), F1 energy (spectral en-ergy between 500 and 800 Hz), singer/speaker’s formant en-ergy (energy between 2500 and 4000 Hz), and singing powerratio (SPR) (the difference of energy between the highestpeak around 0–2 kHz and the highest peak around 2–4 kHz).A frequency bandwidth of 25 Hz and Hanning window wasused for LTAS analysis. FFT spectrum was performed to obtain


an approximation of the formant frequencies from F1 to F5. Theformant frequencies were measured from the strongest peaks orin the middle of the two adjacent equally strong peaks in spec-tral slices (average from each sample).Wide band spectrograms(bandwidth 260 Hz) were used for a comparison. There the for-mant frequencies were located in the middle of the bands withthe strongest intensity. Frequency distances between formantsF2-F1, F3-F2, F4-F3, and F5-F4 were then calculated.

Every sample captured before, during, and after exercisinginto both tube and straw were analyzed to obtain the averagesubglottic pressure (estimated from the maximum peak of theoral pressure during the occlusion of the consonant [p:] in thesyllable [pa:] and during manual shuttering the outer end ofthe tube), oral pressure (obtained during nonshuttered phase),and the glottal CQ. Transglottal pressure was also calculatedfrom the difference between subglottic and oral pressures. ASoundswell Signal Workstation (Hitech Development AB)was used to calculate the oral pressure values. EGG CQ (the ra-tio of the duration of the ‘‘contact phase’’ to the entire glottalperiod) was obtained with the software EFxHist, Version 1.5(Mark Huckvale, University College of London, UK) fromthe middle section of the each EGG sample. EFxHist softwaredefines the contact phase using a criterion level of 50% from thepeak-to-peak amplitude of the EGG signal.

Auditory-perceptual assessment

To evaluate the voice quality of the samples, a perceptual anal-ysis was conducted with four listeners (one man and threewomen). This group of blinded judges consisted of speech-language pathologists with more than 8 years of experience invoice training and rehabilitation. Samples recorded beforeand after tube/straw phonation were played in randomizedpairs. Raters were required to judge which sample in eachpair was produced with a better voice quality or if there wasno difference between them. Listeners could replay each sam-ple as many times as they wanted before making their decision

FIGURE 4. Mean of the two repeated scans for anatomic distances (mm)

the CT measurements performed before, during, and after tube and straw ph

straw; AS, after straw.

and moving on to the next sample. The evaluation was per-formed in a sound-treated room using a laptop computer anda high-quality Audioengine loudspeaker (Audioengine, Kow-loon, Hong Kong). The listeners were located at approximately2 m from the loudspeaker. All the listeners reported normalhearing.

RESULTS

CT distance measurements

Mean of the two repeated scans for the anatomic distances(mm) calculated from the midsagittal images of the vocal tractobtained from the CT measurements performed before, during,and after tube/straw phonations are presented in Figure 4.Changes were observed in the vertical length (which is indica-tive of the VLP). It increased during tube phonation (8%) andeven more during straw phonation (21%) (Figures 5 and 6).This lowered laryngeal position remained after tube and strawphonations. The most evident change in laryngeal heightoccurred when comparing phonations before and during strawphonation (19 mm of difference, 21%). Because during thetube and straw phonations the horizontal length is determinedby the length of the tube and straw, respectively, this distancewas only measured before and after. Small changes wereobserved in the horizontal length (less than 2% in allsamples). An important modification was observed in thevelum position, which rose to seal the nasopharyngeal portduring the tube and straw phonations (Figures 5 and 6). Inboth the tube and straw phonations, the uvula rose 7 mm(35%). Although after tube and straw phonations, the velumwas not as high as during them, the higher uvula positionremained as compared with the position before tube and strawphonations. When comparing oropharynx width before and afterstraw phonations, there was about 37% of decrease from 8.7 to5.65 mm, respectively. This difference was also present after thetube phonation (25%), but it was not as clear as compared with

calculated from the midsagittal images of the vocal tract obtained from

onations. BT, before tube; DT, during tube; AT, after tube; DS, during

FIGURE 5. Midsagittal images of the vocal tract. Before tube (left), during tube (middle), and after tube phonation (right). The twomost dominant

changes are the higher velum and the lower laryngeal position in both during and after tube phonations.


the sample after the straw phonation.Hypopharynx becamewiderduring straw phonation (38%). During and after tube phonation,negligible changes were found in the hypopharynx width. Jawopening also showed a decrease during both tube and strawphonations (15% and 13%, respectively). However, this changewas probably a direct consequence of the straw and tubepresence between lips. Jaw opening did not change substantiallyafter tube/straw phonation. Small changes were revealed intongue dorsum height for all sample types (less than 7%),except for the sample during straw (13%). No clear differenceswere demonstrated in lip opening throughout the sequence.Because during tube and straw phonations the lip opening isdetermined by the diameter of the tube and straw, respectively,this distance was only measured before and after. Additionally,midsagittal images also show a more frontal tongue position inboth during and after tube/straw phonation. This change wasmore evident during that after exercising (Figures 5 and 6).

The mean differences between phonation before comparedwith phonation during and/or after tube/straw were greaterthan the differences between repetitions for vertical length ofvocal tract, velum elevation, pharyngeal width, jaw opening,and tongue dorsum height (Table 1). Therefore, one may sug-gest that the reported variations between before, during, and af-ter tube/straw phonations are true differences.

CT area measurements

Mean of the two repeated scans for the cross-sectional areas(mm2) measured from the midsagittal and transversal imagesof the vocal tract obtained from the CT scannings before, dur-ing, and after tube/straw phonations are presented in Figure 7.

FIGURE 6. Midsagittal images of the vocal tract. Before straw (left), durin

most evident changes are the higher velum and the lower laryngeal position in

is much wider during straw phonation. The straw is not visible due to the fact

of the CT scan possibilities.

Ap area increased both during tube and straw phonations com-pared with vowel phonation before them. This increment wasclearly much larger during straw (91%) than glass tube (9%)phonation (Figure 8). Negligible changes were observedwhen comparing Ap between vowel phonations before and aftertube and straw phonations (0.2% and 3.5%, respectively). TheAe area became larger during straw phonation (65%)(Figure 9), but no clear changes were observed comparing thephonations before and during tube phonation (2%). The samearea showed a decrease after tube (16%) and even more afterstraw phonation (23%) (Figure 10). The ratio of areas Ap/Aemeasured from the transversal CT images showed a clear ten-dency: during tube, the ratio increased by 7.9%; after tube, by19.4%; during straw, by 16%; and after straw, by 35%. Achange was observed in oral cavity area (A1), which decreasedduring both tube and straw phonations (37% and 29%, respec-tively) Figures 5 and 6). Although the oral cavity remainedsmaller after tube and straw phonations compared with thevowel production before them, this change (1% and 9% fortube and straw, respectively) was not as substantial as themodification during them. Because the presence of tube andstraw phonations might affect the oral cavity area, thechanges during exercising should be taken with caution. Thepharyngeal region (A2) showed the same increase (15%)during tube and straw phonations as compared with vowelproduction before exercising, but it decreased after tube andstraw phonations compared with phonation before them (20%and 27%, respectively). The epilaryngeal region (A3) becamelarger during both glass tube (13%) and straw (73%)phonations, the change being clearly much larger during

g straw (middle), and after straw (right). Likewise as in Figure 2, the two

both during and after straw phonations. Additionally, the hypopharynx

that it is made of a very thin walled and a very soft material, which is out

TABLE 1.

Results of Acoustic Analysis of Vowel [a:] Pre- and Posttraining With Tube and Straw for LTAS and FFT

Acoustic

Parameters

Before

Tube and

Straw

1 (dB)

Before

Tube and

Straw

2 (dB)

Difference

Before 1

and 2

After

Tube

1 (dB)

After

Tube

2 (dB)

Difference

After

Tubes

1 and 2

After

Straw

1 (dB)

After

Straw

2 (dB)

Difference

After

Straws

1 and 2

Mean

Before

(dB)

Mean

After

Tube

(dB)

Mean

After

Straw

(dB)

Difference

Mean

Before/

After Tube

Difference

Mean

Before/

After Straw

SPL at 40 cm 75.8 75.9 0.1 76.1 75.9 0.2 76.9 75.3 1.6 75.9 76.0 76.1 0.1 0.2

Energy SF

cluster

27.5 27.1 0.4 27.6 26.6 1.0 30.3 29.4 0.9 27.9 27.1 29.8 0.8 1.9

SPR 19.6 19.7 0.1 17.7 16.7 1.0 14.4 16.0 1.6 19.6 17.2 15.2 2.4 4.4

F1 energy 49.8 50.0 0.2 48.4 47.3 1.1 49.1 48.9 0.2 49.9 47.9 49.0 2.0 0.9

Formant

Frequencies

Before

Tube and

Straw

1 (Hz)

Before

Tube and

Straw

2 (Hz)

Difference

Before 1

and 2

After

Tube

1 (Hz)

After

Tube

2 (Hz)

Difference

After

Tubes

1 and2

After

Straw

1 (Hz)

After

Straw

2 (Hz)

Difference

After

Straws

1 and 2

Mean

Before

(Hz)

Mean

After

Tube

(Hz)

Mean

After

Straw

(Hz)

Difference

Before/

After

Tube

Difference

Before/

After

Straw

F1 707 697 10 624 614 10 588 624 36 702 619 606 83.0 96.0

F2 1143 1060 83 1008 1007 1 1010 1008 2 1102 1008 1009 94.0 93.0

F3 2691 2681 10 2587 2589 2 2608 2598 10 2686 2588 2603 98.0 83.0

F4 3200 3197 3 2940 2920 20 2962 2961 1 3199 2930 2966 269.0 233.0

F5 3761 3636 125 3361 3346 15 3332 3355 23 3699 3354 3344

Formant

Distances

Before

Tube and

Straw

1 (Hz)

Before

Tube and

Straw

2 (Hz)

Difference

Before

1 and 2

After

Tube

1 (Hz)

After

Tube

2 (Hz)

Difference

After

Tubes

1 and 2

After

Straw

1 (Hz)

After

Straw

2 (Hz)

Difference

Straws

1 and 2

Mean

Before

(Hz)

Mean

After

Tube

(Hz)

Mean

After

Straw

(Hz)

Difference

Before/

After

Tube

Difference

Before/

After

Straw

F2-F1 436 363 73 384 393 9 422 384 38 399.5 388.5 403 11 4

F3-F2 1548 1621 73 1579 1582 3 1598 1590 8 1584.5 1580.5 1594 4 10.5

F4-F3 509 516 7 353 331 22 354 363 9 512.5 342 358.5 170 154

F5-F4 561 439 22 421 426 5 370 394 33 500 423.5 382 77 118

F5-F3 1070 955 115 774 757 17 724 757 33 1012.5 765.5 740.5 247 272

Differences

Between

Formant

Distances

Before/After

Tube 1 (Hz

and %)

Before/After

Tube 2 (Hz

and %)

Difference

Before/After

Tubes

1 and 2

Before/After

Straw 1 (Hz

and %)

Before/After

Straw 2 (Hz

and %)

Difference

Before/After

Straws 1 and

2

Mean

Before/After

Tube (Hz

and %)

Mean

Before/After

Straw (Hz

and %)

F2-F1 52 (11.9) �30 (8.3) 22 14 (3.2) �21 (5.8) 7 11 (11.9) �3.5 (4.5)

F3-F2 �31 (2.0) 39 (2.4) 8 �50 (3.2) 31 (1.9) 19 4 (2.2) �9.5 (2.6)

F4-F3 156 (36.6) 185 (35.9) 29 155 (30.5) 153 (29.7) 2 170.5 (36.25) 154 (30.1)

F5-F4 140 (24.9) 13 (2.9) 127 140 (24.9) 13 (2.9) 127 76.5 (13.9) 76.5 (13.9)

F5-F3 296 (27.7) 198 (20.7) 128 346 (32.3) 198 (20.7) 148 247 (24.2) 272 (26.5)

MarcoGuzm

an,etal

ResonanceTubeandStra

wPhonatio

n523.e25

FIGURE 7. Mean of the two repeated scans for the cross-sectional areas (mm2) calculated from the midsagittal and transversal images of the vocal

tract obtained from the CTmeasurements performed before, during, and after tube and straw phonations. For definitions of the areas, Ae, Ap, A1, A2,

and A3, recall Figures 2 and 3. BT, before tube; DT, during tube; AT, after tube; DS, during straw; AS, after straw.


straw. Epilaryngeal region demonstrated small changes aftertube (2.4%) and after straw (5%) (Figures 5 and 6). The meandifferences between phonation before compared withphonation during and after tube/straw were greater than thedifferences between repetitions for most of the areameasurements. More detailed numerical information aboutCT distances and area measurements are presented in theAppendix.

Acoustic analysis results

Table 1 shows the results of acoustic analysis of vowel [a:] pre-and posttraining with tube and straw for LTAS and FFT. Someof the LTAS measures (SPL and F1 energy) did not change sub-stantially after tube and straw phonations as compared withphonation before training (less than 1.6 dB; 4%). The majorchange was observed in SPR, which decreased after both tube(2.5 dB; 12%) and straw (4.4 dB; 22%) phonations, the differ-ence being larger after straw. The energy in the speaker/singer’sformant cluster region increased in average 2.5 dB after straw(Figure 11). The mean differences between phonation beforecompared with phonation after tube/straw phonation weregreater than the differences between repetitions for SPR and

FIGURE 8. Transversal images of the vocal tract. Before tube/straw (left),

tube and straw phonations compared with vowel phonation. This increme

phonation.

the energy in the speaker/singer’s formant cluster region(Table 1). Therefore, one may suggest that the reported varia-tions between before and after tube/straw phonations are truedifferences.Formant frequencies values from F1 to F5, distances between

F2-F1, F3-F2, F4-F3, F5-F4, and F5-F3 before and after train-ing with tube and straw, and the differences in hertz and per-centage of the formant frequency distances are also showedin Table 1. The most evident change related to formant frequen-cies in the different phonations is found in F1 values. It de-creased after both tube and straw by 12% and 14%,respectively. Changes in the rest of the other formant frequen-cies measured were smaller than the change found in F1 (8%for F2, 4% for F3, 8% for F4, and 9% for F5). It is interestingto note that all formant frequencies of vowel [a:] became lowerafter both tube and straw training. From Table 1, it is possible topoint out that the clearest change was found in F4-F3 distancedifference. F3 and F4 were closer to each other after exercisingwith both tube and straw. The difference was in average 170 Hz(36.2%) and 154 Hz (30.1%) after tube and straw, respectively.This formant cluster between F3 and F4 was located in 2588–2930 Hz and 2603–2966 Hz after tube and straw, respectively.

during tube (middle), and during straw (right). Ap increased during both

nt was clearly larger during straw phonation than during glass tube

FIGURE 9. Transversal images of the vocal tract. Before straw (left) and during straw (right). The Ae area became larger during straw phonation.


The average F0 measured throughout the sequence was 113 Hzwith a range of 111–115 Hz.

Auditory-perceptual analysis

Table 2 displays the results of the listening test. Data show thatthere is a high degree of agreement among listeners for strawphonation samples. All the samples produced after straw exer-cises were evaluated by the four judges as better in voice qualitythan the samples before. Assessment for tube phonation did notshow the same degree of agreement found in straw phonationrecordings.

Air pressures and CQ

Subglottic pressure, oral pressure, transglottal pressure, and CQvalues before, during, and after tube and straw phonations arepresented in Table 3. There was an increase of subglottic pres-sure during straw phonation (18%), and this increment re-mained after removing the straw (7%). Subglottic pressurevalue did not demonstrate a significant change during tube pho-nation and became higher after tube phonation (6%). There wasalso an increase of oral pressure during both tube and straw pho-nations. Transglottal pressure decreased during both tube andstraw phonations. Oral pressure increased 17 times more duringa straw than a tube phonation, and transglottal pressure de-creased about two times more during a straw phonation.

Glottal CQ decreased during both tube and straw phonations4.5% and 20%, respectively. This change remained both aftertube (8.3%) and after straw phonations (9.2%). Figure 12 showsrepresentative EGG waveforms from the samples before, dur-ing, and after straw phonation.

FIGURE 10. Transversal images of the vocal tract. Before tube/straw (lef

after tube phonation and even more after straw phonation.

DISCUSSION

One of the major differences observed between samples beforeand both during tube and during straw phonations is the in-crease in the vertical length of the vocal tract, which reflectsa lower VLP and also a larger overall vocal tract length. Thischange remained during vowel production after tube and strawphonations. Findings from the acoustic analysis are concordantwith the CT results. FFT showed a decrease in the frequency ofthe first five formants. According to the acoustic theory ofspeech, the formant frequencies depend on the length of the vo-cal tract and the cross-sectional shape of the vocal tract asa function of its length.40,41 Vocal tract length determines theaverage location of formant frequencies. In this regard, aslength increases, the value of the formant frequency willdecrease. In a previous study performed with MRI, similarresults were observed in a male subject who lowered thelarynx after vocal warm-up with spoken exercises.38 On theother hand, in two MRI studies performed with a female partic-ipant,37,38 no changes were reported in the VLP neither afterstraw phonation nor after other semi-occluded vocal tract exer-cises ([ß:], [m:], [y:], and [u:]). It is important to point out thatboth the male subject from the present study and the male sub-ject from the earlier MRI investigation38 have classical singingvoice technique, whereas the female participant has a long ex-perience in speaking voice training. Because the classical sing-ing is produced with a ‘‘covered’’ sound which implies a longervocal tract due to a lower VLP and lip protrusion,42,43 it ispossible to question whether the lowered VLP demonstratedby our subject is due to the effect of tube phonation or not.

t), after tube (middle), and after straw (right). Ae area became smaller

FIGURE 11. Top: comparison between LTAS of vowel [a:] before

(thin line) and after (thick line) tube phonation. Bottom: comparison

between LTAS of vowel [a:] before (thin line) and after (thick line)

straw phonation. The energy of the singer/speaker’s formant region in-

creased after exercising, especially after straw phonation.


An important aspect that may be regarded is that the tube orstraw may make the perception of vibratory sensationsstronger and incite the subjects to change the vocal tract andglottal setting in such a way that it is prone to intensify andpertain these sensations. Therefore, the adjustments could bemodified according to the previously learned voice patterns,which are regulated by the voice use type (speech or singingor different styles of singing). According to Titze, the originof vibratory sensations in resonant voice could be due to theefficiency on the energy conversion process at the glottis(from aerodynamic to acoustic energy). When this process isefficient, the vibrations are distributed all over the face andhead areas. On the other hand, when the energy conversionprocess is poor, the vibrations will remain mostly in thelaryngeal area.15

TABLE 2.

Auditory-Perceptual Assessment of Voice Quality in Vowel Pro

Sample Before Tube After Tube No Differenc

Sample 1 1 2 1

Sample 2 2 2 0

Total 3 4 1

The numbers indicate the amount of listeners regarding which sample in each p

between them. Listeners n¼ 3 in total.

In earlier investigations performed with other assessmentmethods, there is evidence for both the effects, laryngeal lower-ing and laryngeal raising compared with the resting laryngealposition. Some studies have evidenced a lower VLP duringsemi-occluded tasks,40,44 whereas others have reported theopposite.23,34 Because people diagnosed with hyperfunctionaldysphonia commonly present a high VLP due to theabnormal contraction of suprahyoid muscles,46 semi-occlusions and lengthening of the vocal tract might have animportant therapeutic effect if they really produce a laryngeallowering. In this regard, according to Sovij€arvi et al,47 the pos-itive outcomes of the resonance tube method are due to the ef-ficient lowering of the larynx and the firming of the vibration ofthe vocal folds. The author pointed out that the length of thetube should be chosen according to the lowering of the larynxduring the tube phonation.1–48

Later also Simberg and Laine5 stated that to choose the tubethat best enhances the lowering of the larynx during the phona-tion is an important factor. Both Sovij€arvi and Simberg andLaine used tubes submerged into the water (water resistancetherapy). It is important to mention that tube phonation hasnot only been used in cases of hyperfunction as mentionedbut also in patients with hypofunctional dysphonia.5,47 As itcomes to the amount of laryngeal lowering, Sovij€arvi49 re-marked that it may just be some millimeters or rather avoidanceof raising the larynx, so the aim of the method was not to causea similar laryngeal lowering as in classical singing.The velum position was another important vocal tract modi-

fication in the present study. It rose to seal the nasopharyngealport during and after the tube and straw phonations. Our resultsare concordant with previous investigations carried out withMRI and CT.37–39 A suitable explanation for this changecould be the increased oral pressure produced during semi-occluded vocal tract postures.3,19 The oral pressure measuredin the present study during both tube and straw phonationswas higher than during vowel production.An improved energy transfer by decreasing the damping

caused by the nasal tract and hence an increase in the totalSPL may be expected with a proper velar closure.37 Neverthe-less, our findings did not show an SPL increment after tube/straw phonation. This lack of change is probably because theparticipant was required to produce the same loudness levelduring vowel production before and after exercising to avoidspectral changes merely caused by intensity increments. Re-lated to this, the energy of F1 did not show important changes

duction Before and After Tube and Straw Phonations

e Before Straw After Straw No Difference

0 4 0

0 4 0

0 8 0

air was produced with a better voice quality or if there was no difference

TABLE 3.

Subglottic Pressure, Oral Pressure, Transglottal Pressure, and CQ Before, During, and After Tube and Straw Phonations

CQ (%) Psub (cm Water) Poral (cm Water) Ptrans (cm Water)

Before tube and straw 1 58.5 9.6 0.2 9.4

Before tube and straw 2 56.2 10.1 0.2 9.9

During tube 1 52.6 6.1 0.6 5.5

During tube 2 57.4 7.8 0.7 7.1

After tube 1 53.3 9.4 0.2 9.2

After tube 2 52.2 11.2 0.2 11.0

During straw 1 42.2 10.9 5.6 5.3

During straw 2 49.9 12.2 8.5 3.7

After straw 1 51.8 10.0 0.2 9.8

After straw 2 52.8 10.9 0.2 10.7

Mean before tube and straw 57.6 9.8 0.2 9.6

Mean during tube 55,0 6.9 0.6 6.3

Mean after tube 52.8 10.3 0.2 10.1

Mean during straw 46.1 11.6 7.0 4.6

Mean after straw 52.3 10.5 0.2 10.3


during the sequence either. This is not surprising because thespectral energy of F1 region supports most of the overall SPL.

Because velum elevation has been found, through MRI ex-amination, to be a common modification in classical singersduring singing,50–53 it could be argued that the higher velumposition in the present study may be due to the classical vocalbackground of our participant. Nevertheless, previous ra-diological studies have demonstrated the same modificationwhen using artificial vocal tract lengthening with a tube insubjects who have an extensive experience teaching speakingvoice technique.37,38 Therefore, phonation into a narrow tubemay be a useful tool for the treatment of hypernasality aspreviously reported for using a narrow tube in air ora resonance tube in water.4,54

CT transversal images demonstrated a clear tendency that theratio Ap/Ae increases during and after both the tube and strawphonations. Ap/Ae has been suggested to be an important factor

FIGURE 12. Representative EGG waveforms from the samples be-

fore (top), during (middle), and after (bottom) straw phonation. Hori-

zontal axis is time and vertical is impedance (increasing downward).

for the singer’s formant cluster production (a prominent spec-trum envelope peak near 3 kHz associated with the ‘‘ringing’’voice quality). Sundberg55 suggested that when the cross-sectional area in the pharynx is at least six times wider thanthat of the laryngeal tube opening, the epilaryngeal tube isacoustically unlinked from the rest of the vocal tract acting asa separate resonator. Therefore, an extra formant would beadded to the vocal tract transfer function. If additionally the si-nus of Morgagni is wide, this extra formant would be tuned be-tween the third and fourth formants. Furthermore, if the sinuspiriformis are wide, the fifth formant would lower down toabout 3 kHz.55 These three things can be reached by loweringof the larynx.

In the present study, the greatest Ap/Ae ratio was observedafter straw phonation (5.5), which is close to the value sug-gested by Sundberg (6).55 Interestingly, acoustic results of thepresent study showed the largest increase in the energy of thespeaker/singer’s formant cluster region after straw. Addition-ally, the SPR showed the largest decrease after straw phonationas well. Recall that SPR is a spectral measurement created orig-inally for quantifying the singer’s formant.56 Therefore, if thesinger/speaker’s formant region demonstrates more energy af-ter exercising, it is expected that SPR value will be lower.Thus, it is possible to state that the outcomes obtained fromCTare in line with acoustic analysis results. These data suggesta clear immediate effect on spectral characteristics after strawphonation, specifically an increased spectral prominence inthe singer’s/speaker’s formant region and also a change in thespectral slope declination (ie, less steep slope). Values of SPRafter straw represent an increased energy in the higher har-monics of the spectrum compared with the lower ones. The in-crease of SPL in the singer/speaker’s formant region has alsobeen reported after tube phonation in earlier works.37–39

Furthermore, two studies30,31 whose aim was to compare theeffect on spectral energy distribution of semi-occluded vocalexercises and open vowel exercising demonstrated that sus-tained vowel production after phonation into stirring straws


and vocal function exercises (voice program which involvessemi-occlusions)57 produced more spectral energy increase inthe higher part of the spectrum (2000–5000 Hz) than vocaliza-tions with open vocal tract setting. The acoustic differences be-tween tube and straw phonations found in the present study(more increase of SPL in the singer/speaker’s formant afterstraw than tube) seem to be in line with the results of the listen-ing test. All samples after straw phonation were rated as repre-senting a better voice quality by all the listeners. Assessment fortube phonation did not show the same degree of agreement asfound in straw phonation recordings.

As mentioned above, the singer/speaker’s formant, in generalterms, can mainly be explained as a resonatory phenomenonarising from a clustering of third, fourth, and fifth formants. Ac-cording to our findings, there was a clear cluster of F3 and F4after exercising with both a glass tube and a straw. This formantcluster of F3 and F4 was located in 2500–3000 Hz for vowel[a:] after exercising. F5 also showed a decrease, and the fre-quency difference between F5 and F4 was smaller after tube/straw phonation as compared with vowel production beforethem. However, this change was not as clear as the change inF4-F3. A significant decrease in the formant frequency distanceafter straw phonation was also found by Laukkanen et al.37 Thisformant cluster may have contributed to the changes mentionedabove (lower SPR and higher SPL between 2500 and 4000 Hz).

According to Sundberg,55 the larynx lowering, typical ofmale classical singing, seems to be a way to obtain a high ratiobetween the cross-sectional area of the low pharynx and the ep-ilaryngeal tube opening and drawing the higher formants F3-F5closer to each other. Therefore, the lowered VLP observed dur-ing and after both tube and straw phonations in the present studymight be the physiological cause of the strengthening of thesinger’s formant cluster range showed by our participant. How-ever, if a low VLT is desirable to produce a singer’s formant,how can it be explained that without a laryngeal lowering it isalso possible to observe a formant cluster in vocally trainedspeakers37,38 There are some differences between the singer’sand speaker’s formant cluster. The former is usually locatedbetween 2 and 3 kHz, whereas the later is produced between3 and 4 kHz.55,58 The lower frequency of this peak in singersmay be related to lowering of the larynx.55 On the other hand,the VLP change does not seem to be the main cause of thespeaker’s formant. Previous studies suggested that a speaker’sformant could be obtained through a slight narrowing of the ep-ilaryngeal region, widening of the back of the mouth cavity, andnarrowing of the front part of it.58,59 The modeling results byLeino et al58 suggest that these two modifications may be re-sponsible for the peak at 3.5 kHz showed in male actors. In ad-dition, there is empirical evidence that vocal tract changes otherthan laryngeal lowering are able to produce a spectral promi-nence and a brilliant singing voice quality.60,61

Regarding the increase of SPL in the singer/speaker formantregion, it is important to point out that not only a formant clus-tering could be the explanation for this change but it could alsobe explained by increased input impedance of the vocal tract,for example, due to the higher Ap/Ae ratio. Higher input imped-ance produces a faster cessation of the glottal flow and conse-

quently a less tilting spectrum and a more resonant voicequality9 (as found in the present study after straw) and easyvoice production.34 Therefore, both a resonance strategy (for-mant cluster, possibly only due to changes in the vocal tractlength and thus formant frequencies) and increasedaerodynamic-acoustic interaction of the vocal tract might con-tribute to a louder voice without increasing the vocal effort.These resonance and acoustic-aerodynamic changes are desir-able effects for both vocal warm-up and voice training. Achange in the lower vocal tract may simultaneously cause theseboth.10 In addition, an increase of subglottic pressure after strawphonation (as found in the present study) would also be a possi-ble explanation for somewhat higher SPL in the singer/speaker’s formant region. An SPL rise of 1 dB is typically asso-ciated with a 1.5 dB difference near 3000 Hz in the LTAS.62

However, in the present study, SPL was controlled to be thesame before and after exercising, and there was an increase ofjust 0–1.1 dB between the samples before and after exercising,whereas the energy of the singer’s formant cluster increased upto 2.8 dB after the straw.The CQ values decreased during both tube and straw phona-

tions and remained lower after them as compared with vowelphonation before them. This change was larger for straw thantube phonation. Several studies have reported a change inCQ when semi-occlusion is compared with vowel phona-tion.2,3,20–26 However, only some of them have demonstrateda decrease. There is no clear pattern regarding the effect ofsemi-occlusion or lengthening of the vocal tract on CQEGG.From the theoretical point of view, an important effect of strawphonation, voiced bilabial fricative and also tube phonationwhen the tube is submerged into water is the resistance againstairflow. This will produce an increased supraglottal pressure,which consequently will cause a decrease of the transglottalair pressure (the force driving vocal fold vibration) as foundin our data during both tube and straw and an elevation of theintraglottal pressure if not compensated by increasing subglot-tic pressure and adduction. Increased intraglottal air pressure, inturn, would produce a lower CQ.3,17,19 Titze19 stated thata semi-occluded setting (especially during phonation intoa thin straw) may produce a slight separation of the vocal foldsand hence a smaller CQ. If this really occurs, it is possible toassume that occlusions of the vocal tract might be beneficialto decrease the vocal fold impact stress. When the vocal foldsimpact stress increases (due to more intense vocal folds colli-sion), the CQ tends to rise and vice versa.63 However, it is nota strictly linear phenomenon.Another possible explanation for the decreased CQ during

and after tube/straw phonation could be the low VLP. Earlierstudies have shown that a low VLP has also been correlatedto low glottal CQ.64,65 Therefore, it seems to be that lowerVLP is associated with less glottal adduction. This phonatoryeffect might be due to an abductive component of trachealpull when the larynx is low.During straw phonation, not only the lowest CQ but also the

highest subglottic pressure across the three stages of the se-quence (before, during, and after) was found. Increased sub-glottic pressure has also been reported in a previous study


carried out with stirring straw phonation.3 Earlier research hasshown that impact stress increases with lung pressure, that is,the impact stress is higher at higher lung pressure values.66,67

However, in the present study, a high subglottic pressure wasobserved seemingly without increment of impact stress. Thisassumption is based on the fact that low CQ was obtainedduring straw. What drives the vocal folds is the transglottalpressure, that is, the difference between the subglottal and thesupraglottal pressure. As mentioned above, a downstreamocclusion increases the supraglottal air pressure and reducesthe transglottal air pressure, which in turn reduces theadductive force. Occlusion of the vocal tract makes itpossible to lower the transglottal pressure (which is desirable)even if the subglottal pressure is high. In this regard, Titzeet al3 mentioned that with narrow straws, high subglottal pres-sure can be reached without a high risk of vocal fold impactstress.

Interesting modifications were also observed in the pharyn-geal region. First, the Ap area increased during both strawand tube phonations compared with vowel phonation beforethem (Figure 8). Second, and related to the first one, anterior-posterior width of the hypopharynx became wider during strawphonation (Figure 6). Moreover, the pharyngeal region area(A2) showed an increase during both tube and straw phonationsas compared with vowel production before exercising(Figures 5 and 6), and finally, the epilaryngeal region area(A3) became larger during both glass tube and strawphonations (Figures 5 and 6). All these modifications werelarger during straw than tube phonation. An importanttherapeutic application could arise from these observations:the widening of the pharyngeal area in people with vocalhyperfunction. Vocal hyperfunction is considered as excessiveand/or imbalanced use of muscular forces, which ischaracterized by excessive laryngeal and extralaryngealtension.68–70 One of the common features treated by voicetherapists in patients diagnosed with excessive muscle tensionis the relaxation of the laryngo-pharyngeal muscles and widen-ing of the vocal tract, especially in the pharyngeal area. The re-laxation of this area is also an important goal of classicalsinging and speaking voice pedagogy. Different exercises topromote an open throat sensation have been one of the mostused tools to produce freedom or lack of excessive tension inthe area of the throat and a better tone quality in both normaland pathologic voices.71–73 From our finding, it is possible tostate that straw phonation might be an option to produce thesame effect. Results from other CT and MRI studies supportour findings related to the modification of the pharyngeal areawith tube or straw exercises37–39 The increased oral pressureduring straw may push the pharyngeal tissues and maybe italso then has the possibility to aid muscle relaxation in thepharyngeal area.

Different from earlier studies,37–39 the present investigationdid not show neither an increase in jaw opening nor anincrease in the lip opening after tube/straw phonation. Thepreviously cited reports37–39 were obtained from a femalesubject who has mainly speech training experience, whereasour participant owns a classical singing voice background.

This fact could be a suitable explanation for the discrepancyin the results. Because jaw opening and lip opening seem tobe directly related in classical singing technique,50,52,53,74 thelack of changes in both together (jaw and lip opening) in thepresent study appears reasonable. Nevertheless, this relation(the fact that lip and jaw may go together) should not betaken as a rule for all cases. In earlier studies using tubephonation, it was possible to see lowering of the jaw, whichwas not directly related to lip opening.37 A similar movementrelationship has been also observed between jaw opening andtongue dorsum height. They vary independently.52,74

Likewise, our outcomes showed negligible changes fortongue dorsum height after both tube and straw phonations.Moreover, no important changes were observed after tube andstraw phonations in A1 area (oral cavity). These results differfrom previous MRI and CT studies where oral cavity wasfound larger after exercises with straw and tube,respectively.37,38 Last, but not least, midsagittal images alsoshowed a more frontal tongue position during and after tube/straw phonations, which is in line with earlier studies.37,39

This would imply that the pharyngeal constriction wasreduced; however, this implication did not occur in all CTsamples in the present study. The change in the tongueconfiguration could also have affected some vowel formants,especially F2.75 A more frontal tongue position is supposedto raise F2; however, this acoustic change was not observedin our data.

CONCLUSION

Our results suggest that both straw and tube phonations mayhave vocal training and vocal warm-up effects, one of the majorbeing an increased spectral prominence in the singer/speaker’sformant cluster region. This may be explained by an increase ofthe ratio between the cross-sectional area of the low pharynxand the area of the epilaryngeal tube opening. The lower laryn-geal position observed in the CT images could be a reasonableexplanation for this finding. Moreover, listening test demon-strated a better voice quality after straw/tube than before exer-cising, hence, this finding seem to be in line with the acousticand anatomic results. Therapeutic effects could also arisefrom our data such as a better velar closure (as treatment for hy-pernasality), a widening of the pharyngeal region (as treatmentfor patients with laryngeal and extralaryngeal muscle hyperten-sion), and possibly a laryngeal lowering (as an exercise for sub-jects with high larynx position due to vocal hyperfunction).Some of these therapeutic effects have been describedearlier.47,48,54,76–78 Because the changes in the vocal tract,CQEGG, subglottic pressure, acoustic parameters, and auditory-perceptual findings were greater during and after exercisingwith the straw than with the tube, it is reasonable to statethat more prominent changes are obtained when the vocaltract input impedance during exercises is higher. Consideringall the observed effects in the present study, it would be pos-sible to state that vocal exercises with increased vocal tractimpedance assist in increasing vocal efficiency and economy(more loudness without an increase of vocal loading due to


increased vocal fold collision). In the present study, only im-mediate results after exercising were obtained, long-term ef-fects still remain to be studied.

REFERENCES1. Sovij€arvi A. Nya metoder vid behandlingen av r€ostrubbningar.Nordisk Tid-

skrift for Tale og Stemme. 1969;3:121–131. [in Finnish].

2. Laukkanen A-M. About the so called ‘‘resonance tubes’’ used in Finnish

voice training practice. Scand J Logoped Phoniatr. 1992;17:151–161.

3. Titze I, Finnegan E, Laukkanen A, Jaiswal S. Raising lung pressure and

pitch in vocal warm-ups: the use of flow-resistant straws. J Sing. 2002;

58:329–338.

4. Titze I. How to use the flow resistant straws. J Sing. 2002;58:429–430.

5. Simberg S, Laine A. The resonance tube method in voice therapy: Descrip-

tion and practical Implementations. Logoped Phoniatr Vocol. 2007;32:

165–170.

6. Munro M, Leino T, Wissing D. Lessac’s y-buzz as a pedagogical tool in

the teaching of the projection of an actor’s voice. S Afr J Ling. 1996;14:

25–36.

7. Story B, Laukkanen A-M, Titze I. Acoustic impedance of an artificially

lengthened and constricted vocal tract. J Voice. 2000;14:455–469.

8. Titze I. The physics of small-amplitude oscillation of the vocal folds. J

Acoust Soc Am. 1988;83:1536–1552.

9. Rothenberg M. Acoustic interaction between the glottal source and the vo-

cal tract. In: Stevens KN, Hirano M, eds. Vocal Fold Physiology. Tokyo, Ja-

pan: University of Tokyo Press; 1981:305–328.

10. Titze I, Story B. Acoustic interactions of the voice source with the lower

vocal tract. J Acoust Soc Am. 1997;101:2234–2243.

11. Titze I, Laukkanen A. Can vocal economy in phonation be increased with

an artificially lengthened vocal tract? A computer modeling study. Logoped

Phoniatr Vocol. 2007;32:147–156.

12. Fant G, Lin Q. Glottal source-vocal tract acoustic interaction. STL-QPSR.

1987;1:13–27.

13. Rothenberg M. Source-tract acoustic interaction and voice quality. In:

Lawrence VL, ed. Transcripts of the 12th Symposium Care of Professional

Voice, Part L. New York, NY: The Voice Foundation; 1983:25–31.

14. Titze IR. Nonlinear source-filter coupling in phonation: theory. J Acoust

Soc Am. 2008;123:2733–2749.

15. Titze IR. Acoustic interpretation of resonant voice. J Voice. 2001;15:

519–528.

16. Titze IR, Sundberg J. Vocal intensity in speakers and singers. J Acoust Soc

Am. 1992;91:2936–2946.

17. Bickley CA, Stevens KN. Effects of a vocal tract constriction on the glot-

tal source: experimental and modeling studies. J Phonetics. 1986;14:

373–382.

18. Rothenberg M. Acoustic reinforcement of vocal fold vibratory behavior in

singing, vocal physiology: voice production, mechanisms and functions. In:

Fujimura O, ed. New York, NY: Raven Press; 1988:379–389.

19. Titze I. Voice training and therapy with a semi-occluded vocal tract: ratio-

nale and scientific underpinnings. J Speech Lang Hear Res. 2006;49:

448–459.

20. Gaskill C, Erickson M. The effect of a voiced lip trill on estimated glottal

closed quotient. J Voice. 2008;22:634–643.

21. Gaskill CS, EricksonML. The effect of an artificially lengthened vocal tract

on estimated glottal contact quotient in untrained male voices. J Voice.

2010;24:57–71.

22. Gaskill C, Quinney D. The effect of resonance tubes on glottal contact quo-

tient with and without task instruction: a comparison of trained and un-

trained voices. J Voice. 2012;26:e79–e93.

23. Laukkanen A-M, Lindholm P, Vilkman E, et al. A physiological and acous-

tic study on voiced bilabial fricative /ß:/ as a vocal exercise. J Voice. 1996;

10:67–77.

24. Cordeiro GF, Montagnoli AN, Nemr NK, Menezes MH, Tsuji DH. Com-

parative analysis of the closed quotient for lip and tongue trills in relation

to the sustained vowel /ε/. J Voice. 2012;26:17–22.

25. Hamdan AL, Nassar J, Al Zaghal Z, El-Khoury E, Bsat M, Tabri D. Glottal

contact quotient in Mediterranean tongue trill. J Voice. 2012;26:669.

e11–669.e15.

26. Guzman M, Rubin A, Mu~noz D, Jackson-Menaldi C. Changes in glottal

contact quotient during resonance tube phonation and phonation with vi-

brato. J Voice. 2013;27:305–311.

27. Laukkanen AM, Pulakka H, Alku P, et al. High-speed registration of

phonation-related glottal area variation during artificial lengthening of

the vocal tract. Logoped Phoniatr Vocol. 2007;32:157–164.

28. Barrichelo VM, Behlau M. Perceptual identification and acoustic measures

of the resonant voice based on "Lessac’s Y-Buzz", a preliminary study with

actors. J Voice. 2007;21:46–53.

29. Sampaio M, Oliveira G, Behlau M. Investigation of immediate effects of

two semi-ocluded vocal tract exercises. Pro Fono. 2008;20:261–266.

30. Angulo M, Mu~noz D, Mayerhoff R. Effect on long-term average spectrum

of pop singers’ vocal warm-up with vocal function exercises. Int J Speech

Lang Pathol. [in press].

31. Higueras D, Finchiera C, Mu~noz D, Guajardo C, Dowdall J. Immediate

acoustic effect of straw phonation exercises in subjects with dysphonic voi-

ces. Logoped Phoniatr Vocol. [in press].

32. GuzmanM, Callejas C, Castro C, et al. Therapeutic effect of semi-occluded

vocal tract exercises in patients with type I Muscle tension dysphonia. Re-

vista Logopedia Foniatr�ıa y Audiolog�ıa. 2012;32:139–146. [in Spanish].

33. Guzm�anM, Higueras D, Fincheira C, Mu~noz D, Guajardo C. Immediate ef-

fect of a vocal exercises sequence with resonant tubes. Revista CEFAC

2011.2012;14:471–480.

34. Laukkanen A, Titze I, Hoffman H, Finnegan E. Effects of a semi-occluded

vocal tract on laryngeal muscle activity and glottal adduction in a single fe-

male subject. Folia Phoniatr Logop. 2008;60:298–311.

35. Laukkanen AM, Lindholm P, Vilkman E. Vocal exercising and speaking re-

lated changes in glottal resistance. A pilot study. Logoped Phoniatr Vocol.

1998;23:85–92.

36. Laukkanen A-M, Lindholm P, Vilkman E. Phonation into a tube as a voice

training method: acoustic and physiologic observations. Folia Phoniatr

Logop. 1995;47:331–338.

37. Laukkanen A-M, Hor�a�cek J, Krupa P, �Svec JG. The effect of phonation into

a straw on the vocal tract adjustments and formant frequencies. A prelim-

inary MRI study on a single subject completed with acoustic results. Bi-

omed Signal Process Contr. 2010;7:50–57.

38. Laukkanen AM, Hor�a�cek J, Havl�ık R. Case-study magnetic resonance im-

aging and acoustic investigation of the effects of vocal warm-up on two

voice professionals. Logoped Phoniatr Vocol. 2012;37:75–82.

39. Vampola T, Laukkanen A-M, Hor�a�cek J, �Svec JG. Vocal tract changes

caused by phonation into a tube: a case study using computer tomography

and finite-element modeling. J Acoust Soc Am. 2011;129:310–315.

40. Fant G. Acoustic Theory of Speech Production. With Calculations Based on

X-ray Studies of Russian Articulations. 2nd ed. The Hague, The Nether-

lands: Mouton; 1970.

41. Kent R. Vocal tract acoustics. J Voice. 1993;7:97–117.

42. Hertegard S, Gauffin J, Sundberg J. Open and covered singing as studied by

means of fiberoptics, inverse filtering, and spectral analysis. J Voice. 1990;

4:220–230.

43. Miller DG, Schutte HK. Toward a definition of male ‘head’ register, passag-

gio and ‘cover’ in western operatic singing.Folia Phoniatr Logop. 1994;46:

157–170.

44. LaukkanenAM, Takalo R, Vilkman E, Nummenranta J, Lipponen T. Simul-

taneous videofluorographic and dual-channel electroglottographic registra-

tion of the vertical laryngeal position in various phonatory tasks. J Voice.

1999;13:60–71.

45. Elliott N, Sundberg J, Gramming R. Physiological aspects of a vocal exer-

cise. Speech Transm Lab Progr Status Rep. 1992;2-3:79–87.

46. Aronson AE. Clinical Voice Disorders. 3rd ed. New York, NY: Thieme

Stratton; 1990.

47. Sovij€arvi A, H€ayrinen R, Orden-Pannila M, Syv€anen M. A€anifysiologistenkuntoutusharjoitusten ohjeita [Instructions for voice exercises]. Helsinki,

Finland: Publications of Suomen Puheopisto; 1989.

48. Sovij€arvi A. Die Bestimmung der Stimmkategorien mittels Resonanzr€oh-

ren. Int Kongr Phon Wiss. 1965;5:532–535. [in German].


49. Sovij€arvi A. Adnifysiologiasta ja artikulaatiotekniikasta [On Voice Physi-

ology and Articulatory Technique] [in Finnish]. Helsinki, Finland: Depart-

ment of Phonetics, University of Helsinki; 1966.

50. Echternach M, Sundberg J, Arndt S, Breyer T, Markl M, Schumacher M,

Richter B. Vocal tract and register changes analysed by real-time MRI in

male professional singers—a pilot study. Logoped Phoniatr Vocol. 2008;

33:67–73.

51. Echternach M, Sundberg J, Markl M, Richter B. Professional opera tenors’

vocal tract configurations in registers. Folia Phoniatr Logop. 2010;62:

278–287.

52. Echternach M, Sundberg J, Arndt S, Markl M, Schumacher M, Richter B.

Vocal tract in female registers—a dynamic real-time MRI study. J Voice.

2010;24:133–139.

53. Echternach M, Traser L, Markl M, Richter B. Vocal tract configurations in

male alto register functions. J Voice. 2011;25:670–677.

54. Gundermann H. Die Behandlung der gest€orten Sprechstimme. Stuttgart,

Germany: Fischer; 1977.

55. Sundberg J. Articulatory interpretation of the singing formant. J Acoust Soc

Am. 1974;55:838–844.

56. Omori K, Kacker A, Carroll L, RileyW, Blaugrund S. Singing power ratio:

quantitative evaluation of singing voice quality. J Voice. 1996;10:228–235.

57. Stemple J. Voice Therapy: Clinical Studies. San Diego, CA: Singular Pub-

lishing Group; 2000.

58. Leino T, Laukkanen AM, Radolf V. Formation of the actor’s/speaker’s for-

mant: a study applying spectrum analysis and computer modeling. J Voice.

2011;25:150–158.

59. Laukkanen A-M, Radolf V, Hor�a�cek J, Leino T. Estimation of the origin of

a speaker’s and singer’s formant cluster using an optimization of 1D vocal

tract model. In: Juan Ignacio Godino Llorente, Pedro Gomez Vilda, Ruben

Fraile (eds.)Workshop AVFA’09: 3rd Advanced Voice Function Assess-

ment International Workshop, 18th–20th May 2009, Madrid, Spain.

Madrid:Cost 2103 Universidad Politecnica de Madrid, 2009; pp. 1–4.

60. Yanagisawa E, Estill J, Kmucha S, Leder S. The contribution of aryepiglot-

tic constriction to "ringing" voice quality—a videolaryngoscopic study

with acoustic analysis. J Voice. 1989;3:342–350.

61. Hanayama E, Camargo Z, Tsuji D, Pinho S. Metallic voice: physiological

and acoustic features. J Voice. 2009;23:62–70.

62. Nordenberg M, Sundberg J. Effect on LTAS of vocal loudness variation.

Logoped Phoniatr Vocol. 2004;29:183–191.

63. Verdolini K, Chan R, Titze IR, Hess M, Bierhals W. Correspondence of

electroglottographic closed quotient to vocal fold impact stress in excised

canine larynges. J Voice. 1998;4:415–423.

64. Iwarsson J, Sundberg J. Effects of lung volume on vertical larynx position

during phonation. J Voice. 1998;12:159–165.

65. Iwarsson J, Thomasson M, Sundberg J. Effects of lung volume on the glot-

tal voice source. J Voice. 1998;12:424–433.

66. Jiang J, Titze I. Measurement of vocal fold intraglottal stress and impact

stress. J Voice. 1994;8:132–144.

67. Hor�acek J, Laukkanen AM, Sidlof P. Estimation of impact stress using an

aeroelastic model of voice production. Logoped Phoniatr Vocol. 2007;32:

185–192.

68. Morrison MD, Rammage LA, Belisle GM, Pullan CB, Nichol H. Muscular

tension dysphonia. J Otolaryngol. 1983;12:302–306.

69. Roy N, Ford CN, Bless DM. Muscle tension dysphonia and spas-

modic dysphonia: the role of manual laryngeal tension reduction in

diagnosis and management. Ann Otol Rhinol Laryngol. 1996;105:

851–856.

70. Dworkin JP, Meleca RJ, Abkarian GG. Muscle tension dysphonia. Curr

Opin Otolaryngol Head Neck Surg. 2000;8:169–173.

71. Boone D. The Voice and Voice Therapy. New York, NY: Pearson; 2010.

72. Mitchell HF, Kenny DT, Ryan M, Davis PJ. Defining open throat through

content analysis of experts’ pedagogical practices. Logoped Phoniatr Vo-

col. 2003;28:167–180.

73. Mitchell HF, Kenny DT. The effects of open throat technique on long term

average spectra (LTAS) of female classical voices. Logoped Phoniatr Vo-

col. 2004;29:99–118.

74. Sundberg J. Articulatory configuration and pitch in a classically trained so-

prano singer. J Voice. 2009;23:546–551.

75. Fant G. Speech Sounds and Features. Cambridge, MA: MIT Press;

1973.

76. Bickley CA, Stevens KN. Effects of a vocal tract constriction on the glottal

source: data from voiced consonants. In: Baer T, Sasaki C, Harris K, eds.

Laryngeal Function in Phonation and Respiration. Boston, MA: College-

Hill Press Little Brown and Company; 1987:239–254.

77. SpießG.MethodischeBehandlungder nervrsenAphonie undeiniger anderer

Stimmstrmngen. Arch Laryngol Rhinol. 1899;9:368–376. [in German].

78. Habermann G. Funktionelle Stimmstrrungen und ihre Behandlung. Arch

Otorhinolaryngol. 1980;227:171–345. [in German].

Measurements

Before

Tube 1

Before

Tube 2

During

Tube 1

During

Tube 2

After

Tube 1

After

Tube 2

During

Straw 1

During

Straw 2

After

Straw 1

After

Straw 2

Mean

Before

Mean

During

Tube

Mean

After

Tube

Mean

During

Straw

Mean

After

Straw

Distance

(mm)

Vertical length 88 87.9 93 95 95 93 106 107 95 96 87.95 94 94 106.5 95.5

Horizontal length 91.9 90.6 90.9 90.8 91.9 91.6 91.25 90.85 91.75

Lip opening 9.7 9.5 9.5 9.5 9.5 9.5 9.6 9.5 9.5

Jaw opening 76.1 83.1 65.9 68.5 75.8 76.6 68.7 69 78.7 79.5 79.6 67.2 76.2 68.85 79.1

Tongue dorsum height 64.7 64.2 61.8 62.1 65.5 66 57.3 55.5 68.4 68.9 64.45 61.95 65.75 56.4 68.65

Oropharynx width 8.8 8.6 8 8.1 7 6.3 6 6.5 5.2 6.1 8.7 8.05 6.65 6.25 5.65

Velum elevation 21.6 19.5 13.9 13.4 16.8 17.4 13.7 13.4 15.8 16.6 20.55 13.55 17.1 13.55 16.2

Hypopharynx width 18.7 18.2 19.5 19.5 19 18.7 26.1 25.8 19.5 18.7 18.5 19.5 18.9 26 19.1

Area (mm2)

Outlet of the

epilaryngeal tube (Ae)

114 109 110 117 95 91 187 181 84 87 111.5 113.5 93 184 85.5

Inlet to the lower

pharynx (Ap)

460 444 495 497 498 404 881 849 471 465 452 496 451 865 468

Oral cavity (A1) 1577 1489 958 964 1478 1353 1083 1087 1367 1393 1533 961 1415.5 1085 1380

Pharyngeal region (A2) 429 444 522 489 377 313 416 594 312 317 436.5 505.5 345 505 314.5

Epilaryngeal region (A3) 547 533 608 616 551 555 933 938 547 587 540 612 553 935.5 567

APPENDIX

CT Distances (mm) and Cross-Sectional Areas (mm2)

JournalofVoice,Vol.27,No.4,2013

523.e34

ORIGINAL ARTICLE

The influence of water resistance therapy on vocal fold vibration: a high-speeddigital imaging study

Marco Guzmana,b, Anne-Maria Laukkanenc, Louisa Traserd, Ahmed Geneide, Bernhard Richterd, Daniel Mu~nozf

and Matthias Echternachd

aDepartment of Communication Sciences and Disorders, University of Chile, Santiago, Chile; bDepartment of Otolaryngology, Las CondesClinic, Santiago, Chile; cSpeech and Voice Research Laboratory, School of Education, University of Tampere, Tampere, Finland; dInstitute ofMusicians’ Medicine, Freiburg University Medical Center, Freiburg, Germany; eDepartment of Otorhinolaryngology and Phoniatrics—Head andNeck Surgery, Helsinki University Central Hospital, University of Helsinki, Helsinki, Finland; fDepartment of Otolaryngology, University of Chile,Santiago, Chile

ABSTRACTPurpose: This study investigated the influence of tube phonation into water on vocal fold vibration.Method: Eight participants were analyzed via high-speed digital imaging while phonating into a silicontube with the free end submerged into water. Two test sequences were studied: (1) phonation pre, dur-ing, and post tube submerged 5 cm into water; and (2) phonation into tube submerged 5 cm, 10 cm,and 18 cm into water. Several glottal area parameters were calculated using phonovibrograms.Results: The results showed individual differences. However, certain trends were possible to identifybased on similar results found for the majority of participants. Amplitude-to-length ratio, harmonic-to-noise ratio, and spectral flatness (derived from glottal area) decreased for all tube immersion depths,while glottal closing quotient increased for 10 cm immersion and contact quotient for 18 cm immersion.Closed quotient decreased during phonation into the tube at 5 cm depth, and jitter decreased duringand after it.Conclusion: Results suggest that the depth of tube submersion appears to have an effect on phon-ation. Shallow immersion seems to promote smoother and more stable phonation, while deeper immer-sion may involve increased respiratory and glottal effort to compensate for the increased supraglottalresistance. This disparity, which is dependent upon the degree of flow resistance, should be consideredwhen choosing treatment exercises for patients with various diagnoses, namely hyperfunctional orhypofunctional dysphonia.

ARTICLE HISTORYReceived 25 May 2015Revised 6 May 2016Accepted 23 June 2016Published online 2 August2016

KEYWORDSHigh-speed digital imaging;phonovibrograms; semi-occlusion; tube phonation;voice therapy

Introduction

Water resistance therapy includes phonation of a sustainedvowel sound into a tube with the distal end submerged inwater. The therapeutic process consists of several stepsoccurring during sessions throughout a period of weeks. Atthe beginning of water resistance therapy, the patient uses alimited pitch range for the first week(s) of training (1).Gradually, the patient starts to use a more varied intonationsuch as glides and simple intervals in a glissando mode. Thepatient is asked to keep the phonation stable and to follow anormal and comfortable breathing pattern in all exercises.Optimal body posture and breath control are also importantaspects in water resistance therapy (1).

Two main versions of water resistance therapy have beenused: (1) Phonation into a traditional Finnish resonance tube(made of glass, 24–28 cm in length with an 8–9 mm innerdiameter) that is submerged in a bowl of water (2); and (2)The Lax Vox technique, which involves phonation into aflexible silicone tube (35 cm in length with an inner diameterof 9–12 mm) that is submerged into a water-filled bottle (3).In both versions of water resistance therapy, the tubes are

kept approximately 1 mm between the teeth, with the lipsrounded so that no air leaks from the mouth (1). In the glasstube method, tubes are kept 1–2 cm or 5–15 cm below thesurface of the water, depending on the patient’s needs. Withthe Lax Vox method, the participant is instructed to keep thetube 1–7 cm in water for voice therapy (3). In both techni-ques, the patient is asked to feel vibratory sensations in theanterior areas of the facial tissues. According to Titze (4),vibration sensations on these areas are related to the effi-ciency of the energy conversion process at the glottis.

Although water resistance therapy requires regular prac-tice for at least several weeks, most relevant studies havebeen performed using tube phonation for a short period oftime (seconds or minutes). Some of these studies have beendesigned to investigate changes in the glottal function oraerodynamic measures of phonation (5,6), while others havelooked for changes in the vocal tract configuration (7).

The possible effect of resonance tube phonation in wateron phonation threshold pressure and collision thresholdpressure (CTP) was studied by Enflo et al. (6). CTP wasfound to be higher and the voice quality was perceived to be

CONTACT Marco Guzman [email protected] Avenida Independencia 1027, Santiago 8380000, Chile� 2016 Informa UK Limited, trading as Taylor & Francis Group

LOGOPEDICS PHONIATRICS VOCOLOGY, 2016http://dx.doi.org/10.1080/14015439.2016.1207097

better after the exercise. The perceptual changes were moreprominent in singers who did not practice singing daily andin singers who had less experience in singing in general (6).

According to the recent in vivo measurements by Radolfet al. (8) the mean oral pressure increased about 4 times inhabitual comfortable phonation and about 9 times in softphonation, when the subject was phonating into a resonancetube inserted 10 cm in water. The subglottic pressure doubledin normal phonation and quadrupled in soft phonation, thusthe subject compensated for the increase in supralaryngealresistance. Fundamental frequency (F0) decreased 11–15 Hz.Comparable results were obtained by Hor�acek et al. (5) usinga physical model of voice production. Flow rate was set con-stant in modeling. Oral pressure increased 10 times with aresonance tube 10 cm in water, and subglottic pressure hadto be increased 1.4 times to keep the flow rate constant.Flow resistance increased 1.5 times. F0 remained constant.Larger changes were observed for soft phonation, and F0decreased 38 Hz (19%).

Bubbles produced during phonation through a tube inwater generate a pulsating oral pressure at a frequency of15–40 Hz (5,8). Therefore, phonation into water may cause asensation of massage on the laryngeal and pharyngeal tissues.A massage-like effect with reduction of muscle hypertensioncould be desirable in patients with voice disorders, especiallyin subjects diagnosed with laryngeal and pharyngeal hyper-functionality. It has been hypothesized that a massage-likeeffect during phonation into water could also increase bloodflow in the vocal folds. This assumption is supported by evi-dence from the field of sports medicine indicating that mas-sage increases blood flow in muscles and skin (9–11). Othersemi-occluded exercises such as tongue and lip trills mayhave a comparable effect due to the oscillation of oral pres-sure produced by tongue and lip vibration (12,13). Eventhough the massage-like sensation of bubbling via tube phon-ation into water has been clinically reported, to date thereare no data supporting the hypothesis of a massage-likeeffect on the vocal folds or vocal tract.

Enflo et al. (6) stated that during phonation in the reson-ance tube in water the water bubbles generate oscillations oforal pressure, EGG, and audio signals. Specifically, oscillationof values of oral pressure modifies the transglottal pressure(which drives the vocal folds) and this, in turn, produceschanges in the EGG signal amplitude. The results of the invivo study by Radolf et al. (8) showed about 2–4 timeshigher peak-to-peak variation in oral pressure with the tubeimmersed 10 cm in water, compared to phonation on [u:](larger increase for soft phonation). Contact quotient, meas-ured from the EGG signal, increased 11% in habitual phon-ation. According to the modeling experiments by Hor�aceket al. (5) phonation at conversational loudness resulted in 2.2times larger peak-to-peak pressure variation and 1.6 timeslarger glottal amplitude variation with the tube submerged10 cm in water, compared to phonation on [u:]. Thus, themodulation of oral pressure during bubbling (tube phonationinto water) seems to have an effect on vocal fold oscillationand possibly on vocal fold tissues. A recent high-speed imag-ing study, designed to investigate tube phonation, reportedmodulations of vocal fold vibration and EGG signal due to

back pressure when the tube was held in water. Increasedmean value of open quotient with increasing water depthwas also reported (14).

Earlier investigations support the association betweenEGG contact quotient and the degree of vocal fold impactstress. When impact stress increases (stronger collisionbetween the vocal folds during vibration), the vocal foldsalso tend to stay together for longer intervals, thus increasingthe value of contact quotient or decreasing the value of openquotient (15).

Guzman et al. (7) evaluated via flexible laryngoscopy theeffect of eight different semi-occluded vocal tract postures onvertical laryngeal position (VLP), pharyngeal constriction,and laryngeal compression in subjects diagnosed with hyper-functional dysphonia. All semi-occluded techniques produceda lower VLP, narrower aryepiglottic opening, and a widerpharynx than in a resting position. More prominent changeswere obtained with tube phonation into water 3 cm and10 cm deep compared to the other semi-occluded exercisesthat did not involve water. Sovij€arvi et al. (16) assumed thatthe positive outcomes of phonation into a resonance tube inwater are due to the efficient lowering of the larynx and animproved vocal fold closure. Sovij€arvi stated that the lengthof the tube and depth into the water should be chosen sothat a clear lowering of the larynx would occur during theexercise. The goal after the exercise, however, is normal voic-ing with a neutral (not lowered) larynx (17–19).

Water resistance exercising precipitates changes in self-assessment of voice. Paes et al. (20) studied the immediateeffects of it on teachers with behavioral dysphonia. TheFinnish resonance tube immersed 2 cm into water was used.Significantly greater phonatory comfort and improved per-ceptual voice quality after the exercises were reported by thesubjects. Less spectral noise in the acoustic signal and lowerfundamental frequency were also observed (20).

Only two longitudinal studies have been conducted usingthe water resistance therapy. Positive results have beenobtained in the treatment of behavioral dysphonia (21,22).Perceptual assessment and results from a questionnaire onthe occurrence of vocal symptoms revealed significantchanges in the treatment group compared with the controlgroup (21).

Several visualization techniques of vocal fold vibrationhave been used in the diagnosis of laryngeal disorders and inthe investigation of normal voice production. High-speedimaging (HSI) has emerged as an effective method of visual-ization during the last decades (22). Commercial HSI systemsrecord images of sustained vocal fold vibration at the rate of2,000 to 10,000 frames per second, which is fast enough tocapture the actual phonatory vibrations of the vocal folds(23,24). For research, however, it was shown that using flex-ible endoscopy high-speed recordings are possible with aframe rate up to 20,000 frames per second, which allowsmore detailed analysis (24). To obtain precise and objectiveinformation from HSI videos, several methods have beendeveloped. Phonovibrography is a fast, robust, and preciseimage-processing strategy to extract information of thevocal fold movements during phonation (25). The methoddetects the vocal fold edges and transfers their movement

2 M. GUZMAN ET AL.

into a static geometric presentation (phonovibrograms).Several glottal variables can be calculated from phonovibro-grams (26).

The present study aimed to observe the influence of tubephonation into water on vocal fold vibration by using high-speed digital imaging and phonovibrography. Specifically, weattempted to answer two questions: (1) Is there any influenceof tube phonation into water on vocal fold vibration; and (2)Does immersion depth affect vocal fold vibration? Wehypothesize that: (1) phonation into water causes modulationin glottal variables; (2) the modulations imply a decrease inthe average values of closed quotient, amplitude-to-lengthratio, perturbation, noise-to-harmonic ratio, and spectral flat-ness; (3) the average parameter change observed duringwater resistance exercising remains in vowel phonation after-wards; (4) the effect is stronger when the immersion isdeeper than 5 cm; and (5) deeper than 10 cm immersion inwater would promote more glottal compensation forincreased airflow resistance (more adducted focal folds andincrease of closed quotient), and therefore higher impactstress than shallower immersion.

Methods

Participants and phonatory tasks

Eight volunteer participants (five male and three female)were analyzed using high-speed digital imaging. All partici-pants reported normal voice and hearing at the time of theexperiment. The health of the larynx was ascertained throughlaryngoscopy. Four participants were trained singers; oneparticipant reported experience in speech training, and threehad no voice training. The age range of the participants was23–45 years.

Two test conditions were studied:

1. The sequence for the first condition was: phonation pre,during, and post tube submerged 5 cm into water. Inthis sequence, subjects were asked to produce a sus-tained and stable vowel [i:] before and after phonationinto the tube. Each individual phonation sample wasproduced for approximately 5 seconds. Successivesequences of tube phonation were produced for5 minutes to gain a possible training effect. A flexiblesilicone (Lax Vox-like) tube (45 cm in length, 2 cm ininner diameter) was used. Participants were asked tophonate at a comfortable pitch and vocal loudness in allthree tasks.

2. The sequence for the second condition consisted of three5-second tube phonation samples with the tube sub-merged 5 cm (taken from the previous sequence), 10 cm,and 18 cm into water. Participants were asked to use thesame comfortable pitch produced in the first sequencefor these trials.

Samples of the vocal fold vibration were obtained pre,during, and post tube phonation via a rigid endoscopeattached to a plastic mouth piece. The flexible tube was alsoattached to the same mouth piece (Figure 1). The mouth

piece was maintained between the rounded lips, so that noair would leak from the mouth; the free end of the tube waskept submerged in the water as an extension of the vocaltract. Three marks were made on the water container at5 cm, 10 cm, and 18 cm, to help in maintaining the appropri-ate depth of immersion. A grand piano was used to give thepitch, which was auditorily monitored by one of the authors(M.G.). Only one recording per phonatory task was per-formed for each participant. A total of five samples wereobtained from each participant (pre tube, phonation duringtube submerged 5 cm, post tube, phonation during tube sub-merged 10 cm, and phonation during tube submerged18 cm). In each sample, participants phonated for approxi-mately 5 seconds. However, only 1 second of phonation wascaptured by the high-speed camera. The captured sampleswere obtained from the mid-portion of each phonatory task.

Instrumentation

Data collection was performed in a room typically used forclinical laryngoscopic assessment of voice. Laryngeal endo-scopic procedure was performed using a HRES-Endocam5562 system (Fa. Wolf, Knittlingen, Germany) coupled witha 90� rigid endoscope (Fa. Wolf, Knittlingen, Germany). Thissystem also includes software for digital storing of therecordings. The high-speed camera allows recording of 4,000frames per second with a pixel resolution of 256� 256. Forillumination, a 300-W xenon light source (Auto-LP 5131,Richard Wolf, Knittlingen, Germany) was used. To preventtissue overheating during the recordings, the duration of lightexposure was kept at a minimum. No calibration of theimages could be performed. Laryngoscopic procedures wereperformed by a laryngologist and a phoniatrician, both co-authors of this study (M.E. and A.G.). Participants wererequired to stay in standing position during examinations,and no topical anesthesia was used.

Image processing

Phonovibrography was used to extract information of thevocal fold movements during phonation. For each sample,1,000 frames (250 ms) of the high speed material were ana-lyzed using the custom-made Phonovibrogram Software Tool(Glottal Analysis Tools v5, Michael D€ollinger and Denis

Figure 1. General view of the experiment setting during high-speed registration.

LOGOPEDICS PHONIATRICS VOCOLOGY 3

Dubrovsky, Department of Phoniatrics and Pedaudiology,Medical School Erlangen, Germany). The excerpts for theanalyses were chosen from the part of the samples wherephonation was stable, excluding the onset and offset of voice.There was no randomization in the analyses. The glottis areawas first segmented in a single frame and semi-automaticallytransferred to the other frames. In the next step the glottallength axis was constructed, and a phonovibrogram wasestablished as described before (27).

Glottal area variables

The following glottal variables were derived from all high-speed registrations as described in the literature (28). Theseparameters were chosen due to the fact that most of themhave been used in earlier studies. Moreover, these parametersdo not require calibration of the images from pixels intoarea.

� Amplitude-to-length ratio (A-LR): ratio between vibra-tional amplitude at the center of the vocal folds and thelength of the vocal folds.

� Closed quotient (CQ): ratio between closed phase and theentire glottal period.

� Closing quotient (ClQ): ratio between closing phase andentire glottal period.

� Fundamental frequency (F0): number of cycles of vocalfold vibration per second.

� Harmonic-to-noise ratio (HNR): relation between har-monic energy and noise energy.

� Spectral flatness (SF): spectral declination or spectral tilt(frequency range from 0 to 2,000 Hz). Flatness is maximalfor white noise.

� Jitter%: frequency perturbation in percentage.� Shimmer%: amplitude perturbation in percentage.

It is worth noting that, even though HNR, SF, jitter, andshimmer are often calculated from acoustic signals, in thepresent study these parameters were calculated from glottalarea variation.

Only descriptive statistics were calculated for the variables,including median and interquartile range. We analyzed thetrend of values for all parameters and phonatory tasks with-out using hypothesis testing to avoid lack of statistical powerdue to small sample size. All analyses and graphics wereperformed using Stata 13.1 (StataCorp, College Station,TX, USA).

Results

Tables 1 and 2 show median and interquartile ranges forsequence 1 and 2, respectively. Moreover, results are pre-sented below in terms of the observed trends among partici-pants, i.e. when similar changes (compared to vowelphonation) were observed for the majority (five or more) ofthe participants. Figures 2 and 3 summarize the trends forsequence 1 and 2, respectively.

First sequence: phonation pre, during, and post tubesubmerged 5 cm into water

During phonation into a tube submerged in water to a depthof 5 cm, CQ decreased (5 of 8 participants) compared tobaseline condition, HNR decreased (6 participants), spectralflatness decreased (5 participants), A-LR (right and left vocalfolds) decreased (5 and 6 participants, respectively), F0increased (6 participants), and jitter decreased (5 partici-pants). Thus phonation seemed to be less tight, softer, a bitmore noisy, and yet more stable. After phonation into tube5 cm in water there was an increase in CQ (6 participants),ClQ (5 participants), HNR (5 participants), and F0 (5 partici-pants) and a decrease in jitter (5 participants). Thus, it seemsthat phonation was tighter, and the voice was less noisy andmore stable.

Second sequence: tube submerged 5 cm, 10 cm, and18 cm into water

CQ decreased with the tube 5 cm in water (5 participants),increased during phonation into a tube submerged in waterto a depth of 18 cm (6 participants), ClQ increased duringphonation into a tube submerged in water to a depth of10 cm (6 participants), HNR and A-LR decreased for all tubedepths (6 and 5 participants, respectively), jitter decreasedfor the tube 5 cm in water (5 participants) and increased forthe tube 10 cm in water (5 participants), and F0 increasedfor the tube 5 cm and 10 cm in water. SF decreased duringtube submerged 5 cm, 10 cm, and 18 cm below the water sur-face (5 participants for each condition). These findings seemto imply less tight phonation for the tube 5 cm in water,while the results for deeper immersion are more difficult tointerpret. Decreased A-LR, HNR, and SF seem to implysofter phonation (less collision between the vocal folds),while increased ClQ and CQ suggest the opposite.

Table 1. Medians and interquartile ranges for glottal area parameters pre,during, and after tube phonation submerged 5 cm in water (sequence 1).

Parameter Pre 5 cm During 5 cm Post 5 cm

F0 198 ± 89 222 ± 60 216 ± 110HNR 15 ± 3 14 ± 3 16 ± 8Jitter 4 ± 4 3 ± 2 2 ± 2Shimmer 0.3 ± 0.2 0.4 ± 0.5 0.4 ± 0.5SF –20 ± 4 –17 ± 3 –20 ± 2CQ 0.2 ± 0.05 0.3 ± 0.15 0.2 ± 0.13AL-R (right) 0.08 ± 0.04 0.07 ± 0.05 0.07 ± 0.03AL-R (left) 0.1 ± 0.05 0.09 ± 0.05 0.09 ± 0.04

Table 2. Medians and interquartile ranges for glottal area parameters duringtube submerged 5 cm, tube submerged 10 cm, and tube submerged 18 cm(sequence 2).

Parameter During 5 cm During 10 cm During 18 cm

F0 222 ± 60 235 ± 81 216 ± 72.4HNR 14 ± 3 14 ± 3 13 ± 1.9Jitter 3 ± 2 4 ± 0.9 4 ± 4.2Shimmer 0.4 ± 0.5 0.4 ± 0.2 0.5 ± 0.3SF –17 ± 3 –19 ± 3.7 –18 ± 5.08CQ 0.3 ± 0.2 0.3 ± 0.11 0.2 ± 0.13A-LR (right) 0.07 ± 0.05 0.07 ± 0.02 0.08 ± 0.05A-LR (left) 0.09 ± 0.05 0.1 ± 0.05 0.1 ± 0.03

4 M. GUZMAN ET AL.

Discussion

This study analyzed vocal fold oscillatory patterns duringphonation into a tube submerged at different depths intowater. In general, it was demonstrated that vocal fold oscilla-tion is affected by tube phonation. This has been notedbefore when phonation through a tube in the air has beenstudied, e.g. using air pressure and electroglottographic regis-tration (29), by calculating the electroglottographic contactquotient (30,31), and by applying high-speed filming (32).

In line with earlier results (31,32), our results showed thatparticipants behaved differently and thus no statistically sig-nificant differences were observed in any parameters studied.This suggests that water resistance exercises do not giveautomatically certain results but the effect depends on how aperson reacts to the increased airflow resistance. Certaintrends were observed, though.

A-LR was in most cases lower during phonation intowater, for all immersion depths. A decrease in A-LR couldbe expected during semi-occlusions because of the decreasein transglottal pressure (the driving force for phonation). Alow A-LR is expected to imply a low impact stress on vocalfolds. Therefore, a phonotraumatic reaction would beunlikely. To the best of our knowledge, only one earlier studyused a similar measure, the amplitude-to-dynamic length(A-DL) ratio, in investigating the effects of tube phonation.An increased A-DL ratio was found for the longest tube,

which, according to the authors, may suggest a raised vocaleffort (increased subglottic pressure) (32). In the presentstudy, a relatively high A-LR was found for some subjectswith the tube 18 cm in water.

To date, no clear patterns have been evidenced regardingcontact quotient from electroglottography (CQEGG) signalduring and after semi-occlusions (30,31,33–39). Some studieshave found a decrease in CQEGG during and after semi-occluded exercises, while others have reported an increase.An interesting trend was detected for CQ (closed quotientobtained from high-speed samples) in the present study.Tube phonation into 18 cm of water showed higher values ofCQ than the other conditions and baseline. Phonation into5 cm of water in turn demonstrated a decrease in CQ com-pared to baseline condition. It is possible to speculate thatthe effect of vocal exercises on CQ depends on the degree offlow resistance (e.g. depth of immersion) and on the subjects’reactions to it (38). From our findings, it seems that a shal-lower depth tends to lead to a decrease in CQ, while adeeper submersion tends to increase the glottal CQ.Similarly, Laukkanen et al. (32) in a high-speed imagingstudy found higher CQ for longer tubes compared withshorter ones. Results from the study by Hor�acek et al. (5)showed how tube diameter also affects glottal function. Sincedepth of immersion, and tube length and inner diameter allaffect the amount of flow resistance, it seems plausible toexpect that they also affect CQ. An increase of CQ could

Figure 2. Trends for sequence 1 (phonation pre, during, and post tube submerged 5 cm into water).


result from increased adduction, or it could be a passive con-sequence of increased subglottic pressure (5). It is most likelythat in speech production these variables co-vary and theglottis reacts reflexively to the information from various sen-sory receptors. As mentioned before, the importance of CQduring tube phonation is due to the fact that there is a rela-tionship between CQ and vocal fold impact stress (15).

Two trends were observed for ClQ. It increased after 5 cmphonation compared to baseline condition and during phon-ation in the tube 10 cm in the water. In other words, theclosing phase seemed to be relatively longer compared tobaseline condition. Different changes were observed byLaukkanen et al. (32). ClQ values were smaller for longertubes. As mentioned before, a long and/or narrow tube couldbe equivalent to a tube submerged deep under the water sur-face because of the high flow resistance. Since a low ClQ isrelated to the abruptness of vocal fold closure, one mayexpect that high flow resistance exercises (i.e. long and/ornarrow tubes or deep immersion) cause a higher vocal foldimpact stress compared to shallower depth of immersion.However, this was not observed in our data.

Harmonic-to-noise ratio was also examined in the presentstudy, based on image analysis. This parameter demonstratedan increase after tube phonation submerged into 5 cm ofwater compared to pre tube samples. Thus, more harmonic

energy compared to noise was observed. Decreased noiseafter semi-occlusion exercises has been reported in previousworks, based on acoustic signal analysis (20,42,43). Thischange has been observed after a few minutes of exercises(20), and after several weeks of treatment (43). However, thesubjects of these studies were patients with hyperfunctionaldysphonia.

Related to harmonic energy, spectral flatness decreased intube phonation for all immersion depths compared to base-line. Therefore, it is possible to state that our data show asteeper spectral slope for all conditions compared to baseline.Recall that SF in the present study was derived from glottalarea variation. No trend was observed in the present study insamples obtained after tube phonation. Earlier studies onacoustic spectra have shown a less steep spectral slope afterphonation with different semi-occluded vocal tract exercises(2,39,44,45), suggesting that semi-occlusions produce anincreased spectral energy in the higher part of the spectrum.The amplitude of higher harmonics is particularly sensitiveto the speed at which the glottal airflow decreases at the endof the closing phase (46). A lower SF during phonationthrough a tube in water seems to imply a smoother collisionbetween the vocal folds (47).

The effect of the depth of immersion on perturbationmeasures was also analyzed. There was an interesting trend

Figure 3. Trends for sequence 2 (phonation during 5 cm, 10 cm, and 18 cm into water).

6 M. GUZMAN ET AL.

that jitter decreased during and after phonation into a tubesubmerged 5 cm in water for most participants. A number ofinvestigations have used acoustic measures to observe theeffect of semi-occluded exercises. However, only few of themhave utilized perturbation measures. Jitter and shimmer wererecently assessed before and after stirring straw phonation ina group of school teachers with slight dysphonia (42). Bothparameters were found to be significantly lower after per-forming vocal exercises. Furthermore, Barrichelo-Lindstr€omet al. (48) reported a decrease of jitter and shimmer in nor-mal-voiced participants after vocal warm-up using Y-buzz,another semi-occluded exercise. It is possible to speculatethat lower perturbation values may reflect better sensory-motor control (49) or greater activity in laryngeal muscles ifone increases subglottic pressure and adduction during andafter tube phonation (50–53). A negative correlation hasbeen found between perturbation measures and F0 and SPL.Earlier studies have suggested that a higher F0 and SPLwould imply higher muscle activity, and this, in turn, wouldreduce perturbation of F0 and amplitude (49–52).

Earlier investigations have reported that tube phonationand other semi-occluded exercises may also modify F0. Ourdata showed higher values for most conditions compared tobaseline. Most previous studies regarding semi-occludedvocal tract exercises have reported a decrease in F0 duringexercising (2,20,29,40–43). Others have indicated an increase(13), no change (5), or no clear trend. According to two pre-vious studies, F0 shifts depended on the length of the tubeand depth of immersion. Laukkanen et al. (32) found lowerF0 values for longer tubes compared to shorter ones.Furthermore, Hor�acek et al. (5) observed (for soft phonation)that F0 was lower with a resonance tube in water and with athin straw. Therefore, it seems that F0 is related to thedegree of airflow resistance.

It is important to mention that F0 results could dependon the instructions given in the recording process. In severalstudies, participants have been instructed to keep the samepitch before and after exercising in order to make the resultscomparable (as it was required in the present study).Moreover, considering that all earlier studies have demon-strated changes of only a few Hz, these data should be takenwith caution because it likely has no clinical impact and itcan be only a physical modification. Additionally, no F0changes have clearly remained after exercising.

It seems that choosing the most appropriate semi-occlu-sion (e.g. length/diameter of the tube and depth of immer-sion) for each subject would promote the best vocal function.The results suggest that there is a tendency for smoother col-lision between the vocal folds and increased stability inphonation through a tube in water with a smaller immersiondepth. A decrease in SF, HNR, and A-LR observed in mostcases in tube phonation for all immersion depths suggests asmoother collision between the vocal folds. The increase inCQ for deep immersion depth, however, suggests that thepossibility for increased vocal loading cannot be excludedwhen deep immersion depth is used. The desired vocal goalalways depends on what the starting-point is. Our resultssuggest that a shallow immersion may be suited for patientswith hyperfunctional voice disorders, while deep immersion

may be a good option for patients with hypofunctional voicedisorder. Thus, our results are in line with the common clin-ical tradition (see e.g. Simberg and Laine) (1). Even thoughtechnical characteristics of semi-occlusions (e.g. length orinner diameter of tube) are important, instructions and guid-ance on how to perform the exercise are also crucial whenthese exercises are used in voice rehabilitation and training.

As mentioned above, it has been found that CQ correlateswith impact stress. However, during phonation into a tube inwater the bubbling of water affects the vocal fold vibration.Therefore, variation in CQ values can be also present.Further studies are needed to know how well a mean CQand other mean values of glottal area variables characterizevocal fold vibration and impact stress during water resistancetherapy.

Conclusion

Even though subjects showed individual variation for mostglottal area parameters, data from the present study suggestthat tube phonation into water causes changes in glottal vari-ables. In most cases the amplitude-to-length ratio, harmonic-to-noise ratio, and spectral flatness (derived from glottalarea) decreased, which seems to imply a smoother collisionbetween the vocal folds. However, the effect does not neces-sarily remain in vowel phonation after tube phonation.Findings obtained from shallow immersion suggest a favor-able mechanical and physiological interaction in terms ofdecreased closed quotient and reduced jitter. Deeper immer-sion, instead, showed higher closed quotient. This may implyan increased respiratory and glottal effort (higher subglotticpressure and more adducted vocal folds) to compensate forthe strongly increased supraglottal resistance. Thus, theimpact stress during phonation through a tube into watermay increase during deep immersion. The results suggestthat this disparity of the effects, which is dependent uponthe degree of flow resistance, should be considered whenchoosing treatment exercises for patients with different typesof diagnoses, namely hyperfunctional or hypofunctional dys-phonia. Deeper immersion could be more beneficial forpatients with vocal fold paralysis or presbyphonia. On theother hand, glottal function of patients with hyperadductionor vocal fatigue could be improved using shallower depths.

Acknowledgements

Special thanks to Denis Dubrovsky and Michael D€ollinger for providingthe phonovibrogram software.

Disclosure statement

The authors report no conflicts of interest. The authors alone areresponsible for the content and writing of the paper.

Funding

In Germany, this study was supported by the DeutscheForschungsgemeinschaft [grant Ec 409/1-1 and Ri 1050/4-1]. InFinland, it was supported by the Finnish Academy of Sciences [grantnumber 1128095].


References

1. Simberg S, Laine A. The resonance tube method in voice therapy:description and practical Implementations. Logoped PhoniatrVocol 2007; 32: 165–70.

2. Laukkanen A-M. About the so called ‘resonance tubes’ used inFinnish voice training practice. Scandinavian Journal ofLogopedics & Phoniatrics 1992; 17: 151–61.

3. Sihvo M. Terve €a€ani, €a€anen hoidon A B C [Healthy voice. TheA B C for voice care]. Helsinki: Kirjapaja; 2006. [In Finnish]

4. Titze IR. Acoustic interpretation of resonant voice. J Voice 2001;15: 519–28.

5. Hor�acek J, Radolf V, Bula V, Laukkanen A-M. Air-pressure, vocalfolds vibration and acoustic characteristics of phonation duringvocal exercising. Part 2: measurement on a physical model.Engineering Mechanics 2014; 21: 193–200.

6. Enflo L, Sundberg J, Romedahl C, McAllister A. Effects on vocalfold collision and phonation threshold pressure of resonance tubephonation with tube end in water. J Speech Lang Hear Res 2013;56: 1530–8.

7. Guzman M, Castro C, Testart A, Mu~noz D, Gerhard J. Laryngealand pharyngeal activity during semioccluded vocal tract posturesin subjects diagnosed with hyperfunctional dysphonia. J Voice2013; 27: 709–16.

8. Radolf V, Laukkanen A-M, Hor�acek J, Liu D. Air-pressure, vocalfold vibration and acoustic characteristics of phonation duringvocal exercising. Part 1: measurement in vivo. EngineeringMechanics 2014; 21: 53–9.

9. Mori H, Ohsawa H, Tanaka TH, Taniwaki E, Leisman G, Nishijo K.Effect of massage on blood flow and muscle fatigue followingisometric lumbar exercise. Med Sci Monit 2004; 10: CR173–8.

10. Hinds T, McEwan I, Perkes J, Dawson E, Ball D, George K.Effects of massage on limb and skin blood flow after quadricepsexercise. Med Sci Sports Exerc 2014; 36: 1308–13.

11. Weerapong P, Hume PA, Kolt GS. The mechanisms of massageand effects on performance, muscle recovery and injury preven-tion. Sports Med 2005; 35: 235–56.

12. Miller DG, Schutte HK. Effects of downstream occlusions onpressures near the glottis in singing. In: Gauffin J, HammarbergB, editors. Vocal fold physiology. Acoustic, perceptual andphysiological aspects of voice mechanisms. San Diego (CA):Singular Publishing; 1991. p. 91–8.

13. Schwarz K, Cielo CA. Modificac~oes lar�ıngeas e vocais produzidaspela t�ecnica de vibrac~ao sonorizada de l�ıngua. Pr�o-Fono 2009; 21:161–6.

14. Granqvist S, Simberg S, Hertegård S, Holmqvist S, Larsson H,Lindestad PA, et al. Resonance tube phonation in water: high-speed imaging, electroglottographic and oral pressure observationsof vocal fold vibrations—a pilot study. Logoped Phoniatr Vocol2014; 28: 1–9.

15. Verdolini K, Chan R, Titze I, Hess I, Bierhals W. Correspondenceof electroglottographic closed quotient to vocal fold impact stressin excised canine larynges. J Voice 1998; 12: 415–23.

16. Sovij€arvi A, H€ayrinen R, Orden-Pannila M, Syv€anen M.A€anifysiologisten kuntoutusharjoitusten ohjeita [Instructions forvoice exercises]. Helsinki: Publications of Suomen Puheopisto; 1989.

17. Sovij€arvi A. Die Bestimmung der Stimmkategorien mittelsResonanzr€ohren. In: Int Kongr Phon Wiss 1965. p. 532–5.

18. Sovij€arvi A. €A€anifysiologiasta ja artikulaatiotekniikasta [On voicephysiology and articulatory technique]. Helsinki, Finland:Department of Phonetics, University of Helsinki press; 1966. [inFinnish]

19. Sovij€arvi A. Nya metoder vid behandlingen av r€ostrubbningar.Nordisk Tidskrift for Tale og Stemme 1969; 3: 121–31.

20. Paes SM, Zambon F, Yamasaki R, Simberg S, Behlau M.Immediate effects of the finnish resonance tube method onbehavioral dysphonia. J Voice 2013; 27: 717–22.

21. Simberg S, Sala E, Tuomainen J, Sellman J. The effectiveness ofgroup therapy for students with mild voice disorders: a controlledclinical trial. J Voice 2006; 20: 97–109.

22. Tapani M. Resonaattoriputki toiminnallisen €a€aih€airionhoitmenetelm€an€a. Seitsem€an naispotilaan Seurantatutukimus[Resonance tube as a therapy method for a functional voicedisorder. A follow-up study of seven female patients]. Helsinki,Finland: University of Helsinki; 1992. [in Finnish]

23. Patel R, Dailey S, Bless D. Comparison of high-speed digitalimaging with stroboscopy for laryngeal imaging of glottal disor-ders. Ann Otol Rhinol Laryngol 2008; 7: 413–24.

24. Krausert CR, Olszewski AE, Taylor LN, McMurray JS, Dailey SH,Jiang JJ. Mucosal wave measurement and visualization techniques.J Voice 2001; 25: 395–405.

25. Wittenberg T, Tigges M, Mergell P, Eysholdt U. Functional imag-ing of vocal fold vibration: digital multislice high-speed kymogra-phy. J Voice 2000; 14: 422–42.

26. Echternach M, D€ollinger M, Sundberg J, Traser L, Richter B.Vocal fold vibrations at high soprano fundamental frequencies.J Acoust Soc Am 2013; 133: 82–7.

27. Lohscheller J, Toy H, Rosanowski F, Eysholdt U, Dollinger M.Clinically evaluated procedure for the reconstruction of vocal foldvibrations from endoscopic digital high-speed videos. Med ImageAnal 2007; 11: 400–13.

28. Inwald EC, D€ollinger M, Schuster M, Eysholdt U, Bohr C.Multiparametric analysis of vocal fold vibrations in healthy anddisordered voices in high-speed imaging. J Voice 2011; 25:576–90.

29. Titze IR, Finnegan EM, Laukkanen A-M, Jaiswal S. Raising lungpressure and pitch in vocal warm-ups: the use of flow-resistantstraws. Journal of Singing 2002; 58: 329–38.

30. Gaskill CS, Erickson ML. The effect of an artificially lengthenedvocal tract on estimated glottal contact quotient in untrainedmale voices. J Voice 2010; 24: 57–71.

31. Gaskill C, Quinney D. The effect of resonance tubes on glottalcontact quotient with and without task instruction: a comparisonof trained and untrained voices. J Voice 2012; 26e: 79–93.

32. Laukkanen A-M, Pulakka H, Alku P, Vilkman E, Hertegård S,Lindestad PA, et al. High-speed registration of phonation-relatedglottal area variation during artificial lengthening of the vocaltract. Logoped Phoniatr Vocol 2007; 32: 157–64.

33. Laukkanen A-M, Lindholm P, Vilkman E, Haataja K, Alk P.A physiological and acoustic study on voiced bilabial fricative/b:/as a vocal exercise. J Voice 1996; 10: 67–77.

34. Gaskill C, Erickson M. The effect of a voiced lip trill on estimatedglottal closed quotient. J Voice 2008; 22: 634–43.

35. Cordeiro GF, Montagnoli AN, Nemr NK, Menezes MH,Tsuji DH. Comparative analysis of the closed quotient for lip andtongue trills in relation to the sustained vowel/e/. J Voice 2012;26: e17–22.

36. Hamdan AL, Nassar J, Al Zaghal Z, El-Khoury E, Bsat M, TabriD. Glottal contact quotient in mediterranean tongue trill. J Voice2012; 26: 669.e11–15.

37. Guzman M, Rubin A, Mu~noz D, Jackson-Menaldi C. Changes inglottal contact quotient during resonance tube phonation andphonation with vibrato. J Voice 2013; 27: 305–11.

38. Guzman M, Laukkanen A-M, Krupa P, Hor�acek J, �Svec J, GeneidA. Vocal tract and glottal function during and after vocal exercis-ing with resonance tube and straw. J Voice 2013; 27: 523.e19–34.

39. Guzman M, Calvache C, Romero L, Mu~noz D, Olavarria C,Madrid S, et al. Do different semi-occluded voice exercises affectvocal fold adduction differently in subjects diagnosed with hyper-functional dysphonia? Folia Phoniatr Logop 2015; 67: 68–75.

40. Titze I. Theoretical analysis of maximum flow declination rateversus maximum area declination rate in phonation. J SpeechLang Hear Res 2006; 49: 439–47.

41. Laukkanen A, Titze I, Hoffman H, Finnegan E. Effects of a semi-occluded vocal tract on laryngeal muscle activity and glottaladduction in a single female subject. Folia Phoniatr Logop 2008;60: 298–311.

42. Guzm�an M, Higueras D, Fincheira C, Mu~noz D, Guajardo C.Immediate effect of a vocal exercises sequence with resonanttubes. Revista CEFAC 2011; 14: 471–80.

8 M. GUZMAN ET AL.

43. Guzman M, Callejas C, Castro C, Garc�ıa-Campo P,Lavanderos D, Valladares M, et al. Therapeutic effect of semi-occluded vocal tract exercises in patients with type I muscletension dysphonia. Revista Logopedia Foniatr�ıa y Audiolog�ıa2012; 32: 139–46.

44. Guzman M, Angulo M, Mu~noz D, Mayerhoff R. Effect onlong-term average spectrum of pop singers’ vocal warm-up withvocal function exercises. Int J Speech Lang Pathol 2013; 15:127–35.

45. Guzman M, Higueras D, Finchiera C, Mu~noz D, Guajardo C,Dowdall J. Immediate acoustic effect of straw phonation exercisesin subjects with dysphonic voices. Logoped Phoniatr Vocol 2013;38: 35–45.

46. Story B, Laukkanen A-M, Titze I. Acoustic impedance of anartificially lengthened and constricted vocal tract. J Voice 2000;14: 455–69.

47. Fant G. Acoustic theory of speech production. The Hague:Mouton Press; 1960.

48. Barrichelo-Lindstr€om V, Behlau M. Perceptual identification andacoustic measures of the resonant voice based on ‘‘Lessac’s Y-Buzz’’– a preliminary study with actors. J Voice 2007; 21: 46–53.

49. Titze IR. Principles of voice production. Englewood Cliffs, NJ:Prentice-Hall; 1994.

50. Orlikoff RF, Baken RJ. Consideration of the relationship betweenthe fundamental frequency of phonation and vocal jitter. FoliaPhoniatr (Basel) 1990; 42: 31–40.

51. Orlikoff RF, Kahane JC. Influence of mean sound pressure levelon jitter and shimmer measures. J Voice 1991; 5: 113–19.

52. Laukkanen A-M, Ilom€aki I, Lepp€anen K, Vilkman E. Acousticmeasures and self-reports of vocal fatigue in female teachers.J Voice 2008; 22: 283–9.

53. Lepp€anen K, Laukkanen A-M, Ilom€aki I, Vilkman E. A compari-son of the effects of voice massage and voice hygiene lecture onself-reported vocal well-being and acoustic and perceptual speechparameters in female teachers. Folia Phoniatr Logop 2009; 61:227–38.


Air Pressure and Contact Quotient Measures During

Different Semioccluded Postures in Subjects With

Different Voice Conditions

*,†Marco Guzmán, ‡Christian Castro, *Sofia Madrid, §Christian Olavarria, §Miguel Leiva, ¶Daniel Muñoz,

‡Elizabeth Jaramillo, and **Anne-Maria Laukkanen, *,†,§, ¶Santiago, Chile; ‡Valparaiso, Chile; and **Tampere, Finland

Summary: Objective. The purpose of this study was to investigate the effect of phonation into tubes in air andtubes submerged in water on air pressure variables and vocal fold adduction in subjects with different voice conditions.Methods. Forty-five participants representing four vocal conditions were included: (1) subjects diagnosed with normalvoice and without voice training, (2) subjects with normal voice with voice training, (3) subjects with muscle tensiondysphonia, and (4) subjects with unilateral vocal fold paralysis. Participants phonated into different kinds of tubes (drink-ing straw, 5 mm in inner diameter; stirring straw, 2.7 mm in inner diameter; silicon tube, 10 mm in inner diameter)with the free end in air and in water. Aerodynamic, acoustic, and electroglottographic signals were captured simulta-neously. Mean values of the following variables were considered: glottal contact quotient (CQ) measured byelectroglottograph, fundamental frequency, subglottic pressure (Psub), oral pressure (Poral), and transglottal pressure.Results. All exercises had a significant effect on Psub, Poral, transglottal pressure, and CQ (P < 0.05). Phonation intoa 55-cm silicon tube submerged 10 cm in water and phonation into a stirring straw resulted in the highest values forCQ, Psub, and Poral compared with baseline (repetition of syllable [pa:]) for all vocal status. Poral and Psub corre-lated positively.Conclusion. During semioccluded exercises, most variables behaved in a similar way (same trend with a quite largeindividual variation) regardless of the vocal status of the participants.Key Words: Tube phonation–Vocal semiocclusion exercises–Water resistance therapy–Functional dysphonia–Vocalfold paralysis.

INTRODUCTION

Voice exercises that raise the airflow resistance of the vocal tractcompared with open vowels are widely used in voice therapyand training. They are commonly called semioccluded vocal tractexercises (SOVTEs). These exercises include phonation on voicedfricatives, nasals, lip and tongue trills, or during sudden cover-ing of the mouth by hand or finger kazoo and phonation intodifferent tubes with the outer end either freely in the air or im-mersed in water. In voice training of subjects with normal voice,semiocclusion exercises have been used to improve voice qualityand increase loudness in an effortless way. Especially, phona-tion into tubes with the outer end immersed in water has beenused in voice therapy both to reduce hypertension and, in con-trast, to increase adductory activity in patients with hypofunctionalvoice production.1–5

From the physical point of view, resistance is defined as theimpediment to flow. Mathematically, resistance is a ratio betweenmean pressure and flow. At a given potential (difference of pres-sure), a lower flow will be produced when there is a high

resistance. Therefore, flow is inversely proportional to resis-tance and proportional to potential.6 Resistance is higher whena constriction in the vocal tract is narrower. Longer and nar-rower tubes offer more resistance to the airflow than shorter andwider ones because of frictional losses.7,8 Moreover, when a tubeis submerged in water, an extra resistance is added due to hy-drostatic pressure, which is dependent on the depth of immersionand also on the angle of immersion in water.8–10

Titze et al7 measured the resistance of different kinds of tubesand straws that are commercially available and thus easy to obtain.From the results, it can be seen that a so-called resonance tube(7.5 mm in inner diameter, 30 cm in length)11 offers very littleflow resistance, whereas a narrow straw (2.0 mm in inner di-ameter, 11.5 cm in length) in the United States used to stir coffee,offered 56 times more resistance than a resonance tube and 37times more than a drinking straw (6.0 mm in inner diameter,19.5 cm in length), at a range of flow which is typical for or-dinary speech production. In a recent paper, Amarante et al8

studied the oral pressure (Poral; so-called back pressure) to flowrelation for 10 different tubes that are commonly used for voiceexercises. These tubes were studied with the outer end free inthe air. Additionally, the resonance tube and a silicone lax voxtube (10 mm in inner diameter, 35 cm in length) were studiedwith the outer end immersed 1–7 cm in water. As expected, theresults showed that a change in tube diameter affects back pres-sure remarkably more than a change in tube length. An increasein flow increased back pressure, especially for thinner straws.When a wider tube is immersed in water, the Poral needs to over-come the pressure generated by the water depth before flow canstart. Otherwise, for a tube in water, the back pressure seems

Accepted for publication September 17, 2015.From the *School of Communication Sciences, University of Chile, Santiago, Chile; †De-

partment of Otolaryngology, Voice Center, Las Condes Clinic, Santiago, Chile; ‡Schoolof Communication Sciences, University of Valparaiso, Valparaiso, Chile; §Department ofOtolaryngology, Voice Center, University of Chile Hospital, Santiago, Chile; ¶Depart-ment of Otolaryngology, University of Chile, Santiago, Chile; and the **School of Education,Speech and Voice Research Laboratory, University of Tampere, Tampere, Finland.

Address correspondence and reprint requests to Marco Guzmán, School of CommunicationSciences, University of Chile, Avenida Independencia 1027, Santiago, Chile. E-mail address:[email protected]

Journal of Voice, Vol. ■■, No. ■■, pp. ■■-■■0892-1997© 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.http://dx.doi.org/10.1016/j.jvoice.2015.09.010

ARTICLE IN PRESS


to be mainly determined by the immersion depth and not muchby changes in flow.8

Titze et al7 stated that when airflow resistance is increased atthe lips, the intraoral (supraglottal) pressure is positive (com-pared with atmospheric pressure), and this in turn would reducethe transglottal pressure (Ptrans; difference between subglotticand Poral) while the person is phonating, unless the subglotticpressure (Psub) is raised.7 The higher the resistance a tube offers,the more Psub a subject needs to generate to start and sustainphonation.7 The effect of different airflow resistances of the vocaltract on air pressure and on vocal fold vibration has been studiedwith physical model of voice production.10 According to theresults, when flow was kept constant, Psub was necessary to raiseconsiderably more for phonation into a stirring straw in air com-pared with a resonance tube 10 cm in water. Larger changes inPsub were observed for soft phonation compared with habitualloudness of speech. Peak-to-peak variation in Poral was largestfor phonation into the tube in water.10

Several studies have investigated voice production of humansubjects during and immediately after SOVTEs.7,9,12–14 Maxfieldet al12 created a rank ordering by measuring the intraoral pres-sure produced by 13 semiocclusions commonly used in voicetherapy. The highest values of Poral were evidenced in strawsubmerged in water, lip trills, and stirring straw with the freeend in the air. According to the results by Radolf et al9 ob-tained from one trained female speaker, the Poral increased inphonation into a resonance tube and a stirring straw most whenthe resonance tube was 10 cm in water. During voice produc-tion into a resonance tube submerged 10 cm below the watersurface, the increase in mean Poral (compared with vowel pho-nation) was about four times in habitual comfortable phonationand about nine times in soft phonation. This illustrates the factthat Poral is affected by flow and thus reflects not only supra-glottic load but also Psub and adduction. Results by Radolf etal9 showed that Psub tended to increase relatively more thanPoral, and thus Ptrans was higher in the SOVTEs comparedwith vowel phonation. Contact quotient (CQ) measured byelectroglottograph (EGG) increased. In tube 10 cm in water,Ptrans decreased and CQ increased, which seems to imply in-creased adduction as compensation. A trained male singer inthe study by Guzman et al13 in turn, showed decreased Ptransand CQ both for phonation into a resonance tube and for pho-nation into a stirring straw in the air. Titze et al7 estimated Psuband registered EGG signal for two trained singers (one classi-cally trained amateur tenor and one female professional pop-jazz singer) phonating at four pitches into seven different strawsin the air. The results showed individual differences in the be-havior of the subjects when phonating into tubes. Similarly,Gaskill and Erickson reported different changes in CQEGG forvocally untrained subjects phonating into a resonance tube inthe air.14

Phonation into a tube in water results in bubbling of the water.Water bubbles produce a varying back pressure, which in turnmodulates the Poral and gives an effect that has been de-scribed as a massage-like effect.15,16 The Poral modulations causerhythmic modulations in vocal fold contact and glottal area andalso on vertical position of the larynx.9,15,16

Changes in aerodynamic variables and vocal fold vibrationhave also been observed immediately after SOVTEs. In somestudies, the changes followed those observed during theexercises,13 whereas in others, no consistent pattern was foundwithin participants or across exercises.17

To date, to the best of our knowledge, all studies measuringvoice production in humans during semiocclusions have beencarried out with vocally normal participants and a small samplesize. There is no evidence of whether subjects with different vocalskills and vocal fold status differ from each other in their reac-tions toward altered aerodynamic conditions of phonation. Thepurpose of this study was to investigate the effect of differentartificial lengthening of the vocal tract (phonation through a tubewith the free end in water and in air) on air pressure variablesand vocal fold adduction. Research question was: Do subjectswith different voice conditions change differently the air pres-sure and vocal fold adduction variables during different artificiallengthening?

Based on previous data and clinical observations, we hypoth-esize that semiocclusion with narrow straws and with tube withthe free end in water results in more prominent increment (1)in air pressures and (2) in CQ compared with other exercises,and that (3) changes in pressure and CQ are affected by voicecondition. Specifically, we hypothesize that trained subjects areable to compensate for changes in supraglottic load and thus keepPtrans and CQ the same, which would keep phonation stable andunaltered. Vocally untrained participants and subjects with vocalfold paralysis may try to compensate more for changes in su-praglottic load and thus Ptrans and CQ would increase. Patientswith muscle tension dysphonia (MTD) may tend to “give in”for increased supraglottic load, and thus decreased Ptrans andCQ are hypothesized.

METHODS

Participants

Forty-five participants were included in this study. They werechosen to represent four groups: (1) subjects diagnosed withnormal voice and without voice training (n = 12; six males, sixfemales), (2) subjects with normal voice and with voice train-ing (n = 9; two males, seven females), (3) subjects diagnosedwith MTD (n = 14; five males, nine females), and (4) subjectsdiagnosed with unilateral vocal fold paralysis (n = 10; two males,eight females). Vocal fold position in all participants with vocalfold paralysis was paramedian (ie, the glottal closure was notcomplete). The average age was 24 (20–33) years for subjectswith normal voice and without voice training, 27 (21–37) yearsfor subjects with normal voice with voice training, 28 (23–35)years for subjects with MTD, and 45 (31–56) years for sub-jects with vocal fold paralysis. Participants from all groups werenative speakers of Spanish. This study was reviewed and ap-proved by the University of Chile, Faculty of Medicine ReviewBoard. Informed consent was obtained from all participants.

Phonatory tasks

Prior to aerodynamic and EGG assessment, all participants wereasked to undergo rigid videostroboscopy to confirm medical

ARTICLE IN PRESS2 Journal of Voice, Vol. ■■, No. ■■, 2015

diagnosis. Laryngoscopic examinations were performed by twoexperienced laryngologists at the University of Chile Hospital.No topical anesthesia was used during endoscopic procedure.

Firstly, baseline recording was made for all participants. It con-sisted of repetition of syllable [pa:] at a rate from 2.5 to foursyllables per second, in habitual, comfortable speaking pitch andloudness. Thereafter, participants were asked to select and producea series of five semioccluded vocal tract postures in a randomorder: (1) drinking straw with the free end in the air (5 mm ininner diameter and 25.8 cm in length), (2) stirring straw withthe free end in the air (2.7 mm in inner diameter and 10.7 cmin length), (3) silicon tube with the free end in the air (10 mmin inner diameter and 55 cm in length), (4) silicon tube with thefree end submerged 3 cm below water surface, and (5) silicontube with the free end submerged 10 cm below the water surface.During the different postures, the subjects were phonating. Theywere asked to maintain the same pitch as they used in the base-line samples. An electronic keyboard was used to cue the pitch,which was also auditorily monitored by the experimenters. Loud-ness level was perceptually controlled by experimenters. Whenperforming the semiocclusion exercises, the participants wereinstructed to phonate with ease voice and to sense vibrations inthe anterior face and the mouth. Before data collection, all theexercises were demonstrated to the participants by trained speech-language pathologists. Three repetitions were performed for eachphonatory task. This resulted in 810 samples in total (45 sub-jects × six postures × three repetitions). Each entire recordingsession took approximately 30 minutes.

Equipment and data collection

Data were collected in the Voice Research Laboratory at Univer-sity of Chile. Aerodynamic and EGG signals were capturedsimultaneously during all phonatory tasks. Aerodynamic data werecollected with a Phonatory Aerodynamic System (PAS; model

4500, KayPENTAX, Lincoln Park, NJ). EGG data were ob-tained with an EGG (model 6103, KayPENTAX). Bothaerodynamic and EGG systems were connected to an interface(Computerized Speech Lab, Model 4500, KayPENTAX), whichin turn was connected to a desktop computer running a Real-Time EGG Analysis (Model 6600, version 3.4, KayPENTAX). Toobtain fundamental frequency (F0), acoustic signal was recordedat a constant microphone-to-mouth distance of 20 cm, using a con-denser microphone AKG (AKG Acoustics, Vienna, Austria)integrated into the PAS. All samples were recorded digitally at asampling rate of 22.1 KHz with 16 bits/sample quantization.

At the beginning of the examination, participants were askedto sit comfortably upright on a chair. Two EGG electrodes wereattached over the thyroid cartilage by means of a lightweightelastic band, which was comfortably wrapped around the pa-rticipant’s neck as tightly as possible to prevent any movementof the electrodes throughout the data collection. Quality of EGGsignal was monitored from the real-time oscillogram incorpo-rated in the EGG software. To obtain optimal electricalconductivity, the electrodes were cleaned with a slightly wet tissue,and a thin layer of conductive gel was applied (Spectra 360 elec-trode gel, Parker Laboratories, New Jersey).

The Poral was captured with a pressure transducer (incorpo-rated into the PAS) connected to a thin flexible plastic tube (13 cmin length). The tube was inserted into the corner of the mouth,extending a few millimeters behind the lips, without touchingthe tongue or any other oral structure. Calibration of the air pres-sure was performed according to the manufacturer’s instructions.A nose clip was used for all participants during data collectionto avoid air leakage through the nose. Oral airflow leakage wascontrolled by one of the experimenters by placing the hand nearthe participant’s lips during each phonatory task. If some leakagewas suspected, the task was repeated with a firmer closure atthe lips. Figure 1 shows the experimental settings.

FIGURE 1. Experimental settings using tube in air (a) and tube into the water (b).

ARTICLE IN PRESSMarco Guzman et al Air Pressure and Contact Quotient Measures 3

The EGG, acoustic, and pressure signals were analyzed toobtain the mean values for glottal CQ, F0, Psub (estimated fromthe maximum peak of the Poral during the occlusion of the con-sonant [p:] in the syllable [pa:] and during manual shutteringof the outer end of the tube), Poral (obtained during vowel [a:]in the syllable and during nonshuttered phase in tube phona-tion), and Ptrans (difference between subglottic and Poral).

Because during phonation into water, bubbling of water causesmodulations in Poral,15,16 maximum and minimum Porals weremeasured from samples where bubbling was produced (silicontube submerged 3 and 10 cm below water). Peak-to-peak Poralwas manually obtained (Figure 2). In addition, bubbling fre-quency was measured from the pressure signal. All samples wereanalyzed with Real-Time EGG Analysis (model 6600, version3.4, KayPENTAX). Only the most stable part from the middlesection of each sample was analyzed. Criterion level of 25% fromthe peak-to-peak amplitude of the EGG signal was used for CQanalysis.

Statistical analysis

Numerical variables were described by median and interquartilerange, and compared by phonatory task and vocal status sepa-rately using Kruskal-Wallis test. A generalized multivariable linearmodel to observe the joint influence of phonatory task and vocalstatus in vocal parameters was fitted. Separate subgroup anal-ysis (silicon tube submerged 3 and 10 cm into water) for minimaland maximal Poral was also performed using Wilcoxon test.Finally, linear correlation analysis using Pearson coefficient foroverall correlation was used. All analyses were performed usingStata 13.1 (StataCorp, College Station, TX), P < 0.05 was con-sidered to be statistically significant, and all reported P valueswere two sided.

RESULTS

Table 1 shows the comparison between score averages (ob-tained from Kruskal-Wallis test) by phonatory task for eachvariable. P values indicate that the six phonatory conditions dif-fered significantly from each other for all variables (P < 0.05),except for F0. Figures 3–6 show the distribution of values for

FIGURE 2. Modulations in oral pressure caused by bubbling of water.The registration is from one participant while phonating into a silicontube submerged 10 cm in water.

FIGURE 3. Distribution of values for all groups separately in all semioccluded postures for contact quotient.


CQ, Psub, Poral, and Ptrans for all groups and in all semioccludedpostures.

Results from the multivariate linear regression model includ-ing F0, CQ, Psub, Poral, and Ptrans as outcomes and phonatorytask as predictive variable are shown in Tables 2–5. Each tableincludes data from each voice condition separately. In general,all semioccluded postures produced an increase in Psub, Poral,Ptrans, and CQ compared with the baseline for all vocal con-ditions. In addition, during semiocclusion exercises, most variablesbehaved in average in the same way regardless of the vocal statusof the participants.

Correlation analysis (considering all voice conditions togeth-er) demonstrated a strong linear relation between Psub and Poral(r = 0.82; P < 0.001). Similar results were obtained when eachgroup was studied separately. The only correlation found for allgroups was between Psub and Poral. Correlation for normal un-trained participants: normal untrained participants: r = 0.86;

P < 0.001, for normal trained participants: r = 0.89; P < 0.001,for subjects with MTD: r = 0.79; P < 0.001, and for subjects withvocal folds paralysis: r = 0.82; P < 0.001.

Tables 6 and 7 display values of maximum and minimum Poralduring phonation into the tube in water. As can be expected, bothmaximum and minimum Poral were significantly higher whentube immersion depth was deeper (Table 6). However, the averagepeak-to-peak amplitude of Poral did not differ significantly withimmersion depth. No significant differences were observedbetween voice conditions either (Table 7).

DISCUSSION

All semioccluded postures produced an increase in Psub and Poralcompared with the baseline condition (syllable production). Inaverage, phonation with the silicon tube below the water surfaceproduced the highest values for Poral (Table 1). The same pho-natory tasks also showed the highest values for Psub. Poral and

TABLE 1.

Median and Interquartile Range of Variables in Phonatory Tasks

F0 (Hz)

Men Women CQ (%)Psub

(cm H2O)Poral

(cm H2O)Ptrans

(cm H2O)

Baseline 128.27 (17.50) 215.13 (32.45) 54.14 (10.20) 7.28 (2.53) 0.22 (0.14) 6.90 (2.42)Drinking straw 130.37 (9.14) 208.38 (23.46) 56.43 (9.66) 10.62 (4.73) 2.84 (2.54) 7.20 (3.11)Stirring straw 127.63 (7.94) 160.55 (90.49) 57.50 (8.56) 14.47 (5.57) 6.75 (4.44) 6.58 (2.93)Silicon tube in air 129.45 (18.20) 215.97 (50.60) 54.61 (13.26) 10.32 (4.33) 0.55 (0.82) 9.38 (3.85)Silicon tube in water (3 cm) 134.91 (19.12) 222.04 (62.15) 54.99 (13.40) 12.05 (3.60) 32.94 (7.67) 8.31 (3.56)Silicon tube in water (10 cm) 143.91 (42.08) 207.64 (50.03) 59.44 (11.09) 17.82 (3.42) 38.03 (7.01) 9.07 (3.49)P value 0.27622 0.30380 0.01519 0.0001 0.0001 0.0001

Significance of differences studied with Kruskal-Wallis test.

FIGURE 4. Distribution of values for all groups separately in all semioccluded postures for subglottic pressure.


Psub correlated strongly (r = 0.8298; P < 0.001). The stirring strawitself is known to offer a higher flow resistance than a (reso-nance) tube submerged 10 cm in water.7,8,10 However, in the presentstudy, the highest Poral and Psub were measured for the tubesubmerged 10 cm in water. A similar observation has been pre-sented by Radolf et al9 (for one subject). Such a result may bedue to a leakage at the lips (overblowing) when phonating into

the stirring straw. Another reason could be that people tend toreduce airflow during phonation into a straw, as phonation mayfeel uncomfortably hard when Poral increases too much.

In the present study, multivariate linear regression modelshowed that subject groups representing all four voice condi-tions (normal trained, normal untrained, MTD, and vocal foldparalysis) behaved similarly regarding air pressure variables. Tube

FIGURE 5. Distribution of values for all groups separately in all semioccluded postures for oral pressure.

FIGURE 6. Distribution of values for all groups separately in all semioccluded postures for transglottal pressure.


TABLE 2.

Mean Parameter Values, Beta Coefficients, and Standard Errors From Multivariate Linear Regression Model for Normal Untrained Participants

F0 (Hz) CQ (%) Psub (cm H2O) Poral (cm H2O) Ptrans (cm H2O)

Gender 42.74(−2.97; 88.47)

— — — —

Task (ref. baseline)Drinking straw 119.27*** (49.03; 190.91) 53.80*** (40.5; 67.06) 9.43*** (7.15; 11.53) 2.9** (.86; 4.93) 6.48*** (3.43; 9.53)Stirring straw 88.97* (8.36; 169.58) 55.38*** (42.11; 68.64) 12.52*** (10.33; 14.71) 5.86*** (3.82; 7.90) 6.57*** (3.52; 9.62)Silicon tube in air 133.17*** (69.06; 197.27) 49.19*** (35.92; 62.45) 9.01*** (6.81; 11.19) 0.48 (−1.55; 2.52) 7.80*** (5.74; 9.85)Silicon tube in water (3 cm) 121.40*** (52.90; 189.91) 48.82*** (35.56; 62.09) 12.40*** (10.21; 14.59) 30.67*** (28.63; 32.71) 8.34*** (6.29; 10.39)Silicon tube in water (10 cm) 131.23*** (64.72; 197.73) 55.74*** (42.47; 69.00) 18.5*** (16.31; 20.68) 37.28*** (35.24; 39.32) 9.71*** (7.66; 11.76)

*P < 0.05, **P < 0.01, ***P < 0.001.

TABLE 3.

Mean Parameter Values, Beta Coefficients, and Standard Errors From Multivariate Linear Regression Model for Normal Trained Participants


Gender −10.98 (−96.70; 74.73) — — — —Task (ref. baseline)Drinking straw 162.16** (48.94; 275.39) 54.61*** (38.32; 70.90) 9.71*** (7.29; 12.12) 3.3*** (1.61; 4.98) 6.23*** (3.59; 8.87)Stirring straw 137.57 (−13.01; 288.14) 58.24*** (41.95; 74.53) 13.28*** (10.87; 15.70) 7.14*** (5.46; 8.83) 5.99*** (3.35; 8.63)Silicon tube in air 178.24** (65.19; 291.64) 55.94*** (39.65; 72.23) 9.58*** (7.17; 12.01) 1.01 (−.67; 2.68) 8.46*** (5.82; 11.10)Silicon tube in water (3 cm) 178.78*** (84.35; 273.21) 49.81*** (33.52; 66.10) 10.67*** (8.25; 13.09) 30.22*** (28.53; 31.90) 6.56*** (4.53; 8.59)Silicon tube in water (10 cm) 176.44*** (82.01; 270.87) 57.53*** (41.24; 73.82) 16.75*** (14.33; 19.17) 37.15*** (35.47; 38.83) 7.77*** (5.75; 9.80)

*P < 0.05, **P < 0.01, ***P < 0.001.

AR

TIC

LE

INP

RE

SS

Marco

Gu

zman

etal

Air

Pressu

rean

dC

on

tactQ

uo

tient

Measu

res7

TABLE 4.

Mean Parameter Values, Beta Coefficients, and Standard Errors From Multivariate Linear Regression Model for Participants With Muscle Tension Dysphonia


Gender −28.80 (−74.62; 16.85) — — — —Task (ref. baseline)Drinking straw 182.25*** (121.24; 243.26) 57.18*** (43.76; 70.61) 10.55*** (8.25; 12.84) 2.96*** (35.47; 38.83) 7.63*** (5.12; 10.15)Stirring straw 181.29*** (94.26; 268.31) 57.64*** (43.71; 71.57) 14.71*** (12.33; 17.10) 7.52*** (5.99; 9.05) 6.67*** (4.51; 9.73)Silicon tube in air 191.88*** (130.87; 252.89) 56.28*** (42.85; 69.70) 11.01*** (8.70; 13.29) 0.63 (−.84; 2.11) 10.34*** (7.83; 12.85)Silicon tube in water (3 cm) 187.54*** (126.53; 248.56) 54.84*** (41.46; 68.32) 13.51*** (11.22; 15.81) 34.17*** (32.69; 35.64) 9.82*** (7.69; 11.95)Silicon tube in water (10 cm) 189.89*** (131.54; 247.80) 58.15*** (44.73; 71.58) 18.51*** (16.21; 20.80) 39.21*** (36.73; 39.68) 10.70*** (8.57; 12.83)

*P < 0.05, **P < 0.01, ***P < 0.001.

TABLE 5.

Mean Parameter Values, Beta Coefficients, and Standard Errors From Multivariate Linear Regression Model for Participants With Vocal Fold Paralysis


Gender 42.22 (−45.23; 129.68) — — — —Task (ref. baseline)Drinking straw 212.69* (43.44; 381.93) 60.72*** (42.28; 79.16) 13.89*** (10.27; 17.51) 5.48* (1.20; 9.76) 8.19*** (4.08; 12.31)Stirring straw 168.73* (35.27; 302.26) 62.67*** (42.96; 82.39) 19.07*** (14.75; 23.40) 10.19** (4.67; 15.72) 6.03*** (3.06; 13.68)Silicon tube in air 167.01** (68.21; 265.80) 56.08*** (37.64; 74.52) 11.73*** (7.92; 15.54) 0.9 (−3.61; 5.41) 10.89*** (6.56; 15.23)Silicon tube in water (3 cm) 218.04*** (138.55; 297.52) 62.42*** (45.03; 79.80) 14.49*** (10.87; 18.10) 35.49*** (31.21; 39.77) 9.79*** (6.47; 13.10)Silicon tube in water (10 cm) 175.20*** (100.02; 250.38) 67.44*** (50.94; 83.93) 19.85*** (16.23; 23.47) 43.91*** (39.63; 48.19) 9.08*** (5.77; 12.40)

*P < 0.05, **P < 0.01, ***P < 0.001

AR

TIC

LE

INP

RE

SS

8Jo

urn

alo

fVo

ice,Vo

l.■■

,N

o.■

■,

2015

in water and stirring straw in air presented the highest valuesin both Psub and Psupra for all voice conditions. It seems thatthese variables are in general more dependent on the degree ofairflow resistance than vocal status of participants.

Earlier investigations have shown changes in vocal tract con-figuration during SOVTEs.13,18–20 These changes may result fromchanges in Poral. Guzman et al13 reported that during glass tubeand stirring straw phonation, the hypopharyngeal area widened,the larynx position lowered, and the velum closed the nasalpassage better compared with open vowel phonation. Betterclosure of the nasal passage may contribute to increase in Poral.All changes were more prominent during stirring straw in theair (high degree of airflow resistance) than in Finnish glass tubein the air (low resistance). In another study, where eight differ-ent semioccluded exercises were explored, all of them produceda lower vertical laryngeal position, narrower aryepiglottic opening,and a wider pharynx than resting position.20 More prominentchanges were obtained with a tube in the water (10 and 3 cm)and narrow tube in the air.

In the present study, Ptrans was higher than baseline condi-tion and significantly different throughout all semioccludedpostures (except for silicon tube in the air) (Tables 1–5). Thisimplies that although both Poral and Psub increase, they do notchange proportionally. Psub increases relatively more than Poral.This seems to be due to a compensatory adjustment to sustainthe phonatory airflow during phonation. The same trend was seenin Radolf et al for both tube and stirring straw phonation.9 Re-garding voice condition, no clear pattern was observed in ourdata for Ptrans in the multivariate linear regression model. It isworth noticing, however, that distribution of Ptrans was very widefor participants with vocal fold paralysis (Figure 5). Ptrans wentdown to zero, suggesting that the subjects had difficulties in pho-nating into the highly resistive straw.

In the present study, CQ was overall significantly greater forall semioccluded postures than for the baseline (syllable

production). Phonatory tasks that showed in average the highestvalues of CQ were tube 10 cm in water and stirring straw in air(Table 1 and Figure 3). Recall that these exercises were also foundto present the highest values of Psub and Poral. It seems thatwhen supraglottic load increased, a higher Psub and a compen-satory glottal adduction were produced, and that the changes wererelated to the degree of airflow resistance that the semiocclusionsoffered. Similar results have been obtained by Radolf et al9 andGuzman et al21 for CQEGG and by Guzman et al22 for CQ derivedfrom high-speed registration, even though the effect in the latterwas not statistically significant. Opposite results have been foundby Granqvist et al15 who in their high-speed study reported (fortwo subjects) increased open quotient with increased water depth(2–6 cm). Guzman et al studied depths of 5 cm, 10 cm, and18 cm.22 Phonation with deeper immersion depths (>6 cm) maythus require more adduction of the vocal folds.

If CQ is really affected by airflow resistance as also shownpreviously,15,22 it should be considered that patients with low vocalfold adduction (eg, vocal fold paralysis or presbyphonia) couldbe treated using high-flow resistance exercises. Instead, lowerflow resistance should be applied in subjects with vocal foldhyperadduction. In this regard, clinical recommendations wereproposed by Sovijärvi: Tube submerged 1–2 cm below the watershould be used for patients with hyperfunctional voice disor-ders, whereas tube submerged in a depth of 10–15 cm could bemore appropriate for subjects with vocal fold paralysis.2,23 Theserecommendations have been followed in clinical practice of waterresistance therapy in Finland.1

During tube phonation into water, the water bubbling fre-quency in our data showed an average of 22 Hz, with a rangeof 12–32 Hz, regardless of tube immersion depth and thevocal condition of the subjects. The amplitude of Poral modu-lation (mean difference between max. and min. Poral) causedby bubbling did not show significant differences either whencomparing tube immersion depths of 3 cm and 10 cm in water.

TABLE 6.

Median and Interquartile Range of Maximum, Minimum, and Difference

Between Maximum and Minimum of Poral Considering Depth of Tube Im-

mersion in Water (Wilcoxon Test)

Silicon Tube inWater (3 cm)

Silicon Tube inWater (10 cm) P Value

Max. Poral (cm H2O) 5.58 (1.43) 10.67 (2.47) 0.0001Min. Poral (cm H2O) 2.01 (1.20) 6.69 (1.40) 0.0001Diff. Poral (cm H2O) 3.64 (1.32) 3.86 (1.19) 0.2066

TABLE 7.

Median and Interquartile Range of Maximum and Minimum Poral for Tube in Water Considering Different Groups (Both

Immersion Depths Together)

NormalUntrained

NormalTrained Dysphonia Paralysis P Value

Max. Poral (cm H2O) 8.32 (5.75) 7.17 (5.18) 7.08 (4.34) 10.47 (6.60) 0.2509Min. Poral (cm H2O) 4.46 (5.46) 3.74 (5.15) 3.05 (4.05) 6.42 (4.75) 0.1121


Based on the results by Radolf et al9 and our results, this maysuggest that a higher Psub does not necessarily imply a largermodulation of the Poral. Therefore, a stronger massage-likeeffect (usually reported by patients for 10 cm compared with3 cm immersion depths in water) most likely cannot be associ-ated with the degree of tube immersion when the flow is keptconstant.

Our study has certain limitations. Because only 45 partici-pants were included in the present study and they were dividedin four different groups, it was not possible to perform vari-ance analysis, and it was not possible to consider division bygender or type of voice training (speaking or singing voice). More-over, more types of commonly used tubes could have beenincluded such as the traditional Finnish glass tube. Further in-vestigations should assess whether changes in air pressures andCQ remain over a longer period of time (long-term effect). Inaddition, patients with phonotraumatic vocal fold lesions shouldalso be included in future studies. Finally, airflow measure-ments should also be considered.

CONCLUSION

During semiocclusion exercises, most variables behaved in averagein the same way regardless of the vocal status of the partici-pants. This implies the importance of proper instructions in voicetherapy, as the expectations of therapy outcome naturally are dif-ferent for patients with different types of voice disorders.

REFERENCES1. Simberg S, Laine A. The resonance tube method in voice therapy: description

and practical implementations. Logoped Phoniatr Vocol. 2007;32:165–170.2. Sovijärvi A, Häyrinen R, Orden-Pannila M, et al. Aänifysiologisten

kuntoutusharjoitusten ohjeita [Instructions for voice exercises]. Helsinki:Publications of Suomen Puheopisto; 1989.

3. Sovijärvi A. Die Bestimmung der Stimmkategorien mittels Resonanzröhren.In: Int Kongr Phon Wiss, 532–535, 1965.

4. Sovijärvi A. Äänifysiologiasta ja artikulaatiotekniikasta [On Voice Physiologyand Articulatory Technique]. Helsinki, Finland: Department of Phonetics,University of Helsinki Press; 1966.

5. Sovijärvi A. Nya metoder vid behandlingen av röstrubbningar. Tale ogStemme. 1969;3:121–131.

6. Baken R, Orlikoff R. Clinical Measurements of Speech and Voice. CliftonPark: Thomson; 2000.

7. Titze I, Finnegan E, Laukkanen A, et al. Raising lung pressure and pitchin vocal warm-ups: the use of flow-resistant straws. J Singing. 2002;58:329–338.

8. Amarante P, Wistbacka G, Larsson H, et al. The flow and pressurerelationships in different tubes commonly used for semi-occluded vocal tractexercises. J Voice. 2015;doi:10.1016/j.jvoice.2015.02.004; [Epub ahead ofprint].

9. Radolf V, Laukkanen A-M, Horácek J, et al. Air-pressure, vocal foldvibration and acoustic characteristics of phonation during vocal exercising.Part 1: measurement in vivo. Eng Mech. 2014;21:53–59.

10. Horácek J, Radolf V, Bula V, et al. Air-pressure, vocal folds vibration andacoustic characteristics of phonation during vocal exercising. Part 2:measurement on a physical model. Eng Mech. 2014;21:193–200.

11. Laukkanen A-M. About the so called “resonance tubes” used in Finnishvoice training practice. Logoped Phoniatr Vocol. 1992;17:151–161.

12. Maxfield L, Titze I, Hunter E, et al. Intraoral pressures produced bythirteen semi-occluded vocal tract gestures. Logoped Phoniatr Vocol.2015;40:84–90.

13. Guzman M, Laukkanen A-M, Krupa P, et al. Vocal tract and glottal functionduring and after vocal exercising with resonance tube and straw. J Voice.2013;27:305–311.

14. Gaskill CS, Erickson ML. The effect of an artificially lengthened vocal tracton estimated glottal contact quotient in untrained male voices. J Voice.2010;24:57–71.

15. Granqvist S, Simberg S, Hertegård S, et al. Resonance tube phonation inwater: high-speed imaging, electroglottographic and oral pressureobservations of vocal fold vibrations—a pilot study. Logoped Phoniatr Vocol.2014;28:1–9.

16. Enflo L, Sundberg J, Romedahl C, et al. Effects on vocal fold collision andphonation threshold pressure of resonance tube phonation with tube end inwater. J Speech Lang Hear Res. 2013;56:1530–1538.

17. Dargin TC, Searl J. Semi-occluded vocal tract exercises: aerodynamicand electroglottographic measurements in singers. J Voice. 2015;29:155–164.

18. Vampola T, Laukkanen A-M, Horácek J, et al. Vocal tract changes causedby phonation into a tube: a case study using computer tomography andfinite-element modeling. J Acoust Soc Am. 2011;129:310–315.

19. Laukkanen A-M, Horácek J, Krupa P, et al. The effect of phonation into astraw on the vocal tract adjustments and formant frequencies. A preliminaryMRI study on a single subject completed with acoustic results. Biomed SignalProcess Control. 2010;7:50–57.

20. Guzman M, Castro C, Testart A, et al. Laryngeal and pharyngeal activityduring semioccluded vocal tract postures in subjects diagnosed withhyperfunctional dysphonia. J Voice. 2013;27:709–716.

21. Guzman M, Calvache C, Romero L, et al. Do different semi-occluded voiceexercises affect vocal fold adduction differently in subjects diagnosed withhyperfunctional dysphonia? Folia Phoniatr Logop. 2015;67:68–75.

22. Guzman M, Laukkanen A-M, Traser L, et al. Effect of Water ResistanceTherapy on Vocal Fold Vibration: A High Speed Registration Study. TheVoice Foundation’s 42st Annual Symposium: Care of the Professional Voice,Philadelphia, 2014.

23. Sovijärvi A. Eräitä huomioita funktionaalisen dysfonianhoidosta [Someobservations of the treatment of functional dysphonia]. Finnish PhoniaSociety for Phoneticians and Logopedists. 1977;19–22.


http://refhub.elsevier.com/S0892-1997(15)00210-6/sr0010




































































Laryngeal and Pharyngeal Activity During

Semioccluded Vocal Tract Postures in Subjects

Diagnosed With Hyperfunctional Dysphonia

*MarcoGuzman, †Christian Castro, ‡Alba Testart, §DanielMu~noz, and kJulia Gerhard, *xSantiago, yValparaiso, zVi~na delMar, Chile, and kMiami, Florida

Summary: High vertical laryngeal position (VLP), pharyngeal constriction, and laryngeal compression are common

AccepFrom tySchoolzSchoolxDepartmand the kAddre

tion ScieguzmanvJourna0892-1� 201http://d

features associated with hyperfunctional voice disorders. The present study aimed to observe the effect on these vari-ables of different semioccluded vocal tract postures in 20 subjects diagnosed with hyperfunctional dysphonia. Duringobservation with flexible endoscope, each participant was asked to produce eight different semioccluded exercises: liptrills, hand-over-mouth technique, phonation into four different tubes, and tube phonation into water using two differentdepth levels. Participants were required to produce each exercise at three loudness levels: habitual, soft, and loud. Todetermine the VLP, anterior-to-posterior (A-P) compression, and pharyngeal width, a human evaluation test with threeblinded laryngologists was conducted. Judges rated the three endoscopic variables using a five-point Likert scale. Anintraclass correlation coefficient to assess intrarater and interrater agreement was performed. A multivariate linear re-gression model considering VLP, pharyngeal width, and A-P laryngeal compression as outcomes and phonatory tasksand intensity levels as predictive variables were carried out. Correlation analysis between variables was also conducted.Results indicate that all variables differ significantly. Therefore, VLP, A-P constriction, and pharyngeal width changeddifferently throughout the eight semioccluded postures. All semioccluded techniques produced a lower VLP, narroweraryepiglottic opening, and a wider pharynx than resting position. More prominent changes were obtained with a tubeinto the water and narrow tube into the air. VLP significantly correlated with pharyngeal width and A-P laryngeal com-pression. Moreover, pharyngeal width significantly correlated with A-P laryngeal compression.KeyWords: Semiocclusion–Vocal tract–Voice therapy–Hyperfunction–Dysphonia–Aryepiglottic narrowing–Verticallaryngeal position–Pharyngeal width.

INTRODUCTION

It is generally agreed among clinicians and voice scientists thatthe vertical laryngeal position (VLP) is an important aspect ofvoice production in both normal and pathological voices.1,2 Itseems that several factors affect the VLP, such as phoneticfeatures,3,4 lung volume,5 voice technique,6,7 pitch control,6 re-spiratory technique,8 and vocal loudness.9

A high laryngeal position is commonly associatedwith voicesthat have a strong component of muscle tension, especially inpatients diagnosed with hyperfunctional voice disorders. Com-monly, the abnormally high tension in extrinsic laryngeal mus-cles may cause a high position of the larynx.10–12 Therefore,a lowering of the elevated larynx is usually an important goalin clinical voice therapy and classical singing pedagogy.1,13–19

Several vocal exercises have been reported as usefultherapeutic and training tools to lower the larynx. The yawn-sigh technique is one of the most popular among voice patholo-gists and voice teachers.14 Other exercises are the prolongedconsonant /b:/,15 soft and aspirate vocal onset,16 and laryngealmanipulation.17–19

ted for publication May 17, 2013.he *School of Communication Sciences, University of Chile, Santiago, Chile;of Communication Sciences, University of Valparaiso, Valparaiso, Chile;of Communication Sciences, Universidad del Mar, Vi~na del Mar, Chile;ent of Network Management, Barros Luco-Trudeau Hospital, Santiago, Chile;Department of Otolaryngology, University of Miami, Miami, Florida.ss correspondence and reprint requests to Marco Guzman, School of Communica-nces, University of Chile, Avenida Independencia 1027, Santiago, Chile. E-mail:[email protected] of Voice, Vol. 27, No. 6, pp. 709-716997/$36.003 The Voice Foundationx.doi.org/10.1016/j.jvoice.2013.05.007

VLP has both acoustic and physiological implications. Anupward movement of the larynx from its resting positionshortens vocal tract length,which raises all formant frequencies;this, in turn, produces a brighter vocal quality.20,21 The lowposition of the larynx produces the opposite acoustic effect.The VLP also has important effects on the biomechanicalproperties of the vocal folds. A high VLP stiffens the vocalfold tissues, therefore increasing fundamental frequency andpotentially changing the folds’ vibratory pattern. Furthermore,high VLP usually facilitates a tight vocal fold adductionas part of the valving laryngeal function for airwayprotection.20,21 Moreover, Titze22 reported that vocal folds arelikely to be thicker when the larynx is lowered. Thus, the coverof vocal folds loosens and themedial surfacesmake a better glot-tal closure. When this occurs, a greater maximum flow declina-tion rate is produced, which contributes to the increased vocalintensity without additional vocal effort.

Another common feature treated by voice therapists inpatients diagnosed with muscle tension is the relaxation andopening of the pharyngeal area. This is also an important goalof singing pedagogy. Exercises to produce an open throathave been one of the most used tools to produce freedom orlack of tension in the area of the throat, resulting in a lack ofconstriction and a better voice quality in both normal and path-ological voices.13,23,24 Most teachers include the use of theopen throat technique as an important feature in singingtraining, especially in classical singing. The purpose of thesetypes of exercises is described by voice trainers to be a wayof maximizing pharyngeal space and/or achieving abductionof the ventricular folds.13 Titze25 as well as Titze and Story26

Delta:1_given name

Delta:1_surname

Delta:1_given name

Delta:1_surname

Delta:1_given name


http://dx.doi.org/10.1016/j.jvoice.2013.05.007

Journal of Voice, Vol. 27, No. 6, 2013710

described a ‘‘wide pharynx’’ as an acoustic enhancement to thefirst formant and to the overall sound. An open throat produc-tion has perceptually been described as a rounded, free, effort-less, and warm sound.27

Supraglottic activity refers to the movements and configura-tions of structures above the vocal folds. There are two typesof supraglottic activity: (1) anterior-to-posterior (A-P) laryngealcompression (aryepiglottic narrowing), which occurs when thearytenoid cartilages approximate the petiole of the epiglottisand (2) medial constriction, which refers to adduction of thefalse vocal folds.28,29 Supraglottic activity has been commonlyclassified as a sign of nonorganic hyperfunctional dysphonia byclinicians.30 In addition, for many years, the development ofseveral benign lesions on the vocal fold surface has beenassumed to be related to hyperfunctional behavior or phono-trauma.31On the other hand, some studies show that supraglotticactivity could be present in subjects with normal voice.28,29,32

In fact, both A-P and medial compression have been foundto be normal and even desirable laryngeal behaviors insinging7,33–36 and speaking among professional voice users.37

The present study aimed to observe and compare the effect ofeight semioccluded vocal tract postures on VLP, A-P laryngealcompression, and pharyngeal width in a group of subjects diag-nosed with hyperfunctional dysphonia.

METHODS

Participants

This study was approved by the research ethics committee at theSchool of Communication Disorders of the University of Val-paraiso, Chile. Informed consent was obtained from 28 adultsubjects (19 women and 9 men). The average age of this subjectset was 26 years, with a range of 20–28 years old. Inclusion cri-teria for this study included (1) no previous voice therapy orvoice training and (2) diagnosis of hyperfunctional dysphoniawithout any vocal fold lesions. Individuals with a history ofsmoking were excluded from this study. Although 28 subjectswere recruited, seven of them did not meet the inclusion crite-ria. Therefore, only 21 were included in the analysis. Partici-pants were asked to undergo flexible laryngoscopy tocorroborate laryngeal diagnosis and to perform the phonatorytasks. The initial diagnosis was made by a laryngologist withmore than 20 years of experience in voice disorders. All partic-ipants reported at least 1 year of voice problems.

Laryngoscopic procedure

At the beginning of the examination, participants were asked tosit upright in a comfortable chair. Assessment of the laryngealand pharyngeal activity was carried out through endoscopic ex-amination with a flexible fiberoptic endoscope (Olympus ENFtype p4; Olympus, Center Valley, PA) connected to a video cam-era (Sony DCX-LS1 Sintek; Sony Corporation, New York, NY)and a Richard Wolf LP 4200 light source (Richard Wolf Medi-cal Instrument Corporation, Vernon Hills, IL). Analog imageswere digitalized with Pinnacle Studio HD 10 software (CorelCorporation, Fremont, CA), and views were monitored ona color television monitor (Sony SSM-20L120). All examina-

tions were performed without topical nasal anesthesia. The flex-ible endoscope was placed directly below the tip of the uvula,allowing a full view of the pharynx and larynx. This placementwas fixed by securing the fiberscope against the alar cartilage ofthe nose with the laryngologist’s finger. A steady placement ofthe fiberscope is crucial because observation of laryngeal heightadjustments and other laryngeal configurations can be affectedby movement of the fiberscope. For the purposes of this study,three aspects were observed during laryngoscopic procedure:VLP, pharyngeal width, and A-P laryngeal compression.

Phonatory tasks

During the observation with the flexible endoscope, each partic-ipant was required to produce four different semioccluded vocaltract exercises at habitual speaking pitch: lip trill, hand-over-mouth technique, tube phonation into air, and tube phonationinto water. All subjects performed these phonatory tasks twice.Participants were asked to perform tasks at their habitual speak-ing pitch to avoid variation (the effect of pitch on) in VLPs. Par-ticipants were asked to produce each exercise at three loudnesslevels: soft, moderate, and loud. Tube into water exercises wereperformed at only moderate loudness. The tube phonation taskswere performed using four different types of plastic commer-cial drinking (wide) and stirring (narrow) straws. Each partici-pant was instructed to hold the straw with one hand, straight outfrom the mouth. The straw was maintained a few millimetersbetween the rounded lips, so that no air would leak from themouth; the free end was kept either in the air or submerged un-der the water as an extension of the vocal tract. Careful controlof pitch throughout the entire sequence was performed. Anelectronic keyboard was used to give and control the pitch,which was monitored auditorily. Pitch control is relevant be-cause it may influence both laryngeal and pharyngeal activities.Each phonatory task was produced for a minimum of 7 seconds,and subjects were required to breathe normally between tasks toavoid the effect of lung volume on the laryngeal height. Beforeeach semioccluded task, participants returned to a phonatoryresting position to obtain baseline measures.Each complete assessment session was accomplished in ap-

proximately 15 minutes with the following protocol:

1. Phonation into a long-wide tube (6 mm of inner diameterand 20 cm in length).

2. Phonation into a long-narrow tube (3 mm of inner diam-eter and 20 cm in length).

3. Phonation into a short-wide tube (6 mm of inner diameterand 10 cm in length).

4. Phonation into a short-narrow tube (3 mm of inner diam-eter and 10 cm in length).

5. Phonation into a long-wide tube submerged 3 cm belowthe water surface.

6. Phonation into a long-wide tube submerged 10 cm belowthe water surface.

7. Phonation using the hand-over-mouth technique.8. Phonation with lip trill.

The order of the tasks in the protocol was not randomized.

TABLE 1.

Intrarater Reliability Analysis

Judge

ICC (95% Confidence

Interval) P Value

1 0.71 (0.61–0.86) <0.0001

2 0.78 (0.67–0.88) <0.0001

3 0.65 (0.54–0.79) <0.0001

Marco Guzman, et al Laryngeal and Pharyngeal Activity During Semioccluded Exercises 711

Visual evaluation

To determine the VLP, A-P compression, and pharyngeal width,we conducted a human evaluation test with three blinded judges(2 men, 1 woman; mean age of 46 years with a range of 42–50 years). All judges were laryngologists with more than 4 yearsof experience in voice disorders. All audio signals were re-moved from video samples before performing the assessmentto avoid the possible effect of voice quality on the judges’ rat-ings. To standardize the rating parameters and rating scales, thethree judges participated in a 1-hour training session in video-laryngoscopic examinations. Video samples from each subjectwere played to the judges, and they were instructed to rate thethree endoscopic variables using a five-point Likert scale; forVLP (1¼ very high and 5¼ very low), A-P laryngeal compres-sion (1 ¼ very opened and 5 ¼ very narrow), and pharyngealwidth (1 ¼ very narrow and 5 ¼ very wide). The evaluationwas performed in a quiet room. Ratings were completed intwo sessions by all the raters. Each session lasted no morethan 1 hour. All raters reported normal or corrected-to-normalvision. Fifteen percent of the samples were randomly repeatedto assess the intrarater reliability. Judges were not aware ofthese repetitions.

Statistical analysis

Descriptive statistics were calculated for the variables, includ-ing mean and standard deviation. A multivariate linear regres-sion model was used to obtain an intraclass correlationcoefficient (ICC) to assess the judges’ reliability (intraraterand interrater agreement) controlled by phonatory task and vo-cal loudness. Then, another multivariate linear regressionmodel considering VLP, pharyngeal width, and A-P laryngealcompression as outcomes as well as phonatory tasks and its in-tensity as predictive variables (and its interactions if exist) wasperformed. Simple correlation analysis using Spearman rho be-tween VLP, pharyngeal width, and A-P laryngeal constrictionwas also conducted. One-way analysis of variance for test dif-ferences between phonatory task scores was used. The analysiswas performed using Stata 12.1 (StataCorp LP, College Station,TX) software. An alpha of .05 was used for the statisticalprocedures.

RESULTS

Table 1 shows the results from the intrarater reliability analysis.A good intrarater concordance was demonstrated for eachjudge. Moreover, the three blinded judges obtained a highagreement (interrater reliability) (ICC ¼ 0.79 [0.66–0.87],P < 0.0001).

Table 2 and Figure 1 display the comparison between scoreaverages by phonatory task for each variable (outcome).P values indicate that all variables were found to have a signif-icant effect, and all of them differ significantly from each other(P < 0.0001). Therefore, VLP, A-P laryngeal compression, andpharyngeal width changed differently throughout the eightsemioccluded postures.

Results from the multivariate linear regression model includ-ing VLP, pharyngeal width, and A-P laryngeal constriction as

outcomes as well as phonatory task and its intensity as predic-tive variables (and its interactions if they exist) are shown inTable 3.

VLP significantly correlates with pharyngeal width(rho ¼ 0.578; P < 0.0001) and A-P laryngeal constriction(rho¼ 0.3364; P < 0.0001). Furthermore, pharyngeal width sig-nificantly correlates with A-P laryngeal constriction(rho ¼ 0.18; P ¼ 0.001).

DISCUSSION

The present study aimed to observe the effect of eight semioc-cluded vocal tract postures on VLP, A-P laryngeal compression,and pharyngeal width in a group of subjects diagnosed with hy-perfunctional dysphonia. This is the first study designed to com-pare the effect of a large number of semioccluded vocalexercises and different loudness levels on pharyngeal and laryn-geal activities. Result revealed that the effect on these variablesis statistically significant throughout all phonatory tasks.

All semioccluded postures produced a decrease in VLP com-pared with the resting position. Phonation with tube into thewater (10 and 3 cm below the surface) and phonation intoa long-narrow tube produced the three lowest VLPs. Interest-ingly, the same three phonatory tasks caused thewidest pharynxand the narrowest A-P laryngeal compression. In fact, the cor-relation analysis demonstrated a high correlation between allthese dependent variables.

Sovij€arvi et al38 stated that one of the most relevant effects oftube phonation is the lowering of the larynx. The degree of thiseffect would be related to the length of the tube.39–41 The sameauthor pointed out that the goal is not necessarily to reach a verylow larynx but to avoid a high VLP, especially in subjects withhyperfunctional laryngeal activity.42

Earlier investigations have demonstrated similar effects oftube phonation on VLP. In a computerized tomography study,Guzman et al43 reported lowering of the larynx during bothglass resonance tube and stirring straw phonation. This changeremained during vowel production after tube and straw. In a -videofluorographic and dual-channel electroglottographic reg-istration, Laukkanen et al44 found a lower VLP comparedwith the resting position during other semioccluded vocal tractpostures. The opposite effect of tube phonation on VLP has alsobeen demonstrated.45,46 Furthermore, two recent magneticresonance imaging studies reported no changes on the VLPduring phonation into a resonance tube and during voicedplosive consonants.47,48

It is important to highlight that no previous studies have re-ported the effect of semioccluded postures on VLP in subjects

TABLE2.

ComparisonBetw

eenScore

AveragesbyPhonatory

TaskforEachVariable

Phonatory

Task(M

ean,Standard

Deviation)

Variable

Long-W

ide

Tube

Long-N

arrow

Tube

Short-W

ide

Tube

Short-N

arrow

Tube

TubeInto

the

Water(3

cm)

TubeInto

the

Water(10cm)

HandOver

Mouth

Lip

Trill

PValue

VLP

3.96(0.67)

4.38(0.65)

3.71(0.55)

4.17(0.58)

4.28(0.56)

4.80(0.40)

3.73(0.62)

3.47(0.64)

<0.0001

Pharyngealwidth

3.63(0.65)

4.04(0.79)

3.34(0.57)

3.88(0.59)

4.19(0.51)

4.66(0.57)

3.66(0.67)

3.19(0.56)

<0.0001

A-P

laryngealcompression

3.25(0.67)

3.52(0.64)

3.23(0.55)

3.49(0.73)

3.80(0.74)

4.04(0.92)

3.19(0.73)

3.38(0.55)

<0.0001


diagnosed with nonorganic hyperfunctional dysphonia. Ahigh VLP is commonly presented in patients with this vocalcondition.10–12 Because VLP significantly decreased in ourparticipants, it can be concluded that semioccluded postures,especially tube submerged into the water and phonation intolong-narrow tube (stirring straw), are suitable therapeutic toolsfor people with high VLP because of laryngeal muscle tension.Muscle tension may be reflected in the pharyngeal activity as

well, more specifically in pharyngeal narrowing. Findings fromthe present study revealed that all phonatory tasks produceda widening of the pharynx during exercising. Earlier studieshave demonstrated similar outcomes on pharyngeal configura-tion.47–49 Several changes were observed during both glassresonance tube and narrow straw phonation in a recentstudy.43 The lower pharynx area, middle pharyngeal region,and A-P width of the hypopharynx increased during exercisingcompared with vowel phonation before the exercises. All thesechanges were larger during straw than tube phonation. More-over, in a computed tomography and finite-element modelinginvestigation, the most dominant change in the vocal tract dur-ing phonation into the tube was caused by expansion of thecross-sectional area of the oropharynx.49 An increase in thearea of the junction between the pyriform sinuses and epilar-yngeal tube was also observed. Hence, lengthening of the vocaltract may also have a positive effect on pharyngeal configura-tion in patients with hyperfunction. It may also be used as a vo-cal training exercise in normal individuals.Two possible explanations could be reasonable for the effect

on VLP and pharyngeal width of phonation with a tube sub-merged under the water and a long-narrow tube (stirring straw).First, oral pressure increases during these types of exer-cises.43,50 In a previous study, acoustic impedance and meansupraglottal resistance were raised by phonating into differenttubes (different in length and inner diameter) in the air andsubmerged under the water. The results showed that the oralpressure is higher when phonating into narrow straws (stirringstraws) than wide straws (drinking straws) and even higherwhen straws are submerged under the water. Furthermore, thedeeper the straw is under the water the higher the oralpressure.50 These findings are in line with the results reportedby Titze.51 Guzman et al43 also reported increased oral pressureduring resonance tube and stirring straw phonation, beinggreater in the latter. Interestingly, in the present study, the low-est VLP and widest pharynx were observed during the most re-sistive phonatory tasks: phonation into a straw submergedunder the water (3 and 10 cm) and phonation into a long-narrow tube (stirring straw). The other semioccluded posturesalso produced the same effect but with a lower degree. The in-creased oral pressure during semioccluded exercises may havedirectly pushed the larynx down and the pharyngeal walls later-ally. Therefore, the first possible explanation would be a me-chanical effect. The second explanation of the effect on VLPand pharyngeal widening of semiocclusions is the muscle relax-ation. The pushing effect accomplished by oral pressure mayproduce a relaxation of the laryngeal and pharyngeal muscula-ture, and this in turn may have produced a lowering of the lar-ynx and widening of the pharynx.


Another interesting result from the present study was the ar-yepiglottic narrowing (A-P compression) found during all pho-natory tasks. As occurred for the VLP and pharyngeal width,the A-P compression was also more prominent during phona-tion with a tube submerged under the water (3 and 10 cm)and during phonation with a long-narrow tube. Aryepigotticnarrowing or A-P laryngeal compression has been describedas both a sign of laryngeal hyperfunction28,29 and also asa good and desirable feature in voice performers.7,33–36

Sundberg7 reported that the aryepiglottic narrowing contributesto the formation of the singer’s formant, a cluster of the third,fourth, and fifth formants. This acoustic characteristic, typicalin trained singers, may help singers obtain a louder and brightervoice quality because of a high concentration of acoustic energyaround 3 kHz. These findings were later confirmed by Titze andStory.26

It not surprising that in the present study, VLP and A-P laryn-geal compression were correlated. In this regard, Sundberg7 hassuggested that aryepiglottic narrowing can be reached by low-ering of the larynx. According to the author, a low VLP is a wayto obtain a high ratio between the cross-sectional area of the lowpharynx and the epilaryngeal tube opening, which is the neces-sary setting to produce the singer’s formant cluster. Neverthe-less, this is not the only way to reach the high ratio. Earlierinvestigations have demonstrated that a spectral prominencenear 3 kHz could also be obtained by other vocal tractstrategies.37,47,48

A high correlation between pharyngeal width and A-P laryn-geal compression was demonstrated in the present study aswell. This means that when the pharynx widened, there wasalso a narrowing of the aryepiglottic sphincter. A greater ratioof inlet to the pharynx over the outlet of the epilarynx tubehas previously been reported using magnetic resonance imag-ing and computerized tomography examinations when using ar-tificial lengthening of the vocal tract.43,47,48 This increased ratiowould help to the formation of the singer/speaker’s formant.7,25

FIGURE 1. Comparison between score aver

Related to this, in the three previous cited studies wherea greater ratio was obtained,43,47,48 a more evident spectralprominence around 2.5 kHz (singer’s formant) was alsoreported. Therefore, it is feasible to assume that vocal tractsemiocclusions may contribute to a high concentration ofspectral energy at the singer’s formant region because of anincrement of the ratio between pharyngeal and epilaryngealtube openings.

According to Titze and Story,26 when a narrowed epilarynx(produced by an aryepiglottic narrowing) is combined witha wide pharynx, the acoustic load of the vocal tract is inertivefor all possible values of fundamental frequencies. This pro-duces strong interactions between the source and the filter. Spe-cifically, the inertance of the vocal tract facilitates vocal foldvibration by lowering the oscillation threshold pressure. Thiseffect may also be important for people with hyperfunctionaldysphonia.

In the present study, all dependent variables demonstrated thegreatest degree of change (P < 0.001) when phonating with loudvoice. No significant changes were observed during soft andmoderate loudness for pharyngeal width and A-P laryngealcompression. For VLP, all loudness levels caused a significantlylower larynx. However, as mentioned previously, the greatestchange was seen during loud voice production. This is an inter-esting result that could be related to the degree of oral pressureproduced during artificial lengthening and occlusions of the vo-cal tract. Therefore, this independent variable should be consid-ered when using these types of exercises to modify thelaryngeal and pharyngeal activities. In an earlier study, Yanagi-sawa et al34 found similar results with regard to the effect ofloudness on supraglottic activity. The authors reported thatmore aryepiglottic narrowing was obtained when loudness levelincreased across different singing voice qualities.

Because the order of the tasks in the protocol was not ran-domized, one could suspect that in fact only the first task hadan effect, which then persisted across the other tasks. However,

ages by phonatory task for each variable.

TABLE 3.

Multivariate Linear Regression Model Considering VLP, Pharyngeal Width, and A-P Laryngeal Constriction as Outcomes; and Phonatory Task and Its Intensity

as Predictive Variables

VLP Pharyngeal Width A-P Laryngeal Constriction

Variables

Coefficients (95%

Confidence Interval) t P Value Variables

Coefficients (95%

Confidence Interval) t P Value Variables

Coefficients (95%

Confidence

Interval) t P Value

Intensity

Soft 0.21 (0.11, 0.28) 3.11 0.007 Soft 0.11 (�0.05, 0.18) 1.27 0.091 (NS) Soft 0.10 (�0.08, 0.17) 1.13 0.238 (NS)

Habitual 0.23 (0.09, 0.38) 3.18 0.002 Habitual 0.14 (�0.01, 0.29) 1.84 0.067 (NS) Habitual 0.11 (�0.04, 0.28) 1.42 0.157 (NS)

Loud 0.38 (0.24, 0.53) 5.20 <0.001 Loud 0.41 (0.25, 0.56) 5.30 <0.001 Loud 0.30 (0.13, 0.46) 3.60 <0.001

Phonatory task

Long-wide

tube

0.37 (0.19, 0.55) 3.60 <0.001 Long-wide

tube

0.57 (0.25, 0.90) 3.69 <0.001 Long-wide

tube

0.78 (0.45, 1.11) 4.62 <0.001

Long-narrow

tube

0.41 (0.20, 0.62) 3.90 <0.001 Long-narrow

tube

0.41 (0.19, 0.62) 3.75 <0.001 Long-narrow

tube

0.26 (0.03, 0.50) 2.27 0.023

Short-wide

tube

�0.25 (�0.20, �0.04) �2.40 0.017 Short-wide

tube

�0.28 (�0.50, �0.06) �2.60 0.010 Short-wide

tube

�0.01 (�0.24, 0.21) �0.13 0.894 (NS)

Short-narrow

tube

0.20 (�0.001, �0.41) 1.95 0.052 Short-narrow

tube

0.25 (0.03, 0.47) 2.31 0.022 Short-narrow

tube

0.23 (0.004, 0.47) 2.01 0.045

Tube into the

water

(3 cm)

0.28 (�0.017, �0.59) 1.85 0.065 (NS) Tube into the

water

(3 cm)

0.59 (0.27, 0.91) 3.69 <0.001 Tube into the

water

(3 cm)

0.57 (0.23, 0.91) 3.30 0.001

Tube into the

water

(10 cm)

0.81 (0.50, 1.11) 5.21 <0.001 Tube into the

water

(10 cm)

1.07 (0.75, 1.39) 6.63 <0.001 Tube into the

water

(10 cm)

0.81 (0.47, 1.15) 4.67 <0.001

Hand over

mouth

�0.23 (�0.44, �0.03) �2.25 0.025 Hand over

mouth

0.03 (�0.18, 0.24) 0.29 0.773 (NS) Hand over

mouth

�0.06 (�0.29, 0.16) �0.54 0.593 (NS)

Lip trill �0.49 (�0.70, �0.28) �4.65 <0.001 Lip trill �0.44 (�0.66, �0.22) �4.04 <0.001 Lip trill 0.12 (�0.10, 0.36) 1.07 0.285 (NS)

Abbreviation: NS, not significant.

JournalofVoice,Vol.27,No.6,2013

714


this is unlikely because of the degree of the effected changethroughout the sequence in all participants. For instance, thetask that showed the most prominent effect in all variables(tube submerged 10 cm into the water) was performed sixthin the sequence. Additionally, lip trills, which were carriedout at the end of the sequence, showed the lowest degree ofchange in two of the three independent variables.

The present study may have some limitations. First, all theparticipants were diagnosed with hyperfunctional dysphonia,and none of the subjects with hypofunctional dysphonia wereincluded. A different effect could be likely between these twogroups. Second, only the short-term effect was assessed. Possi-ble long-term changes remain unknown. Finally, quantitativeimaging may be able to provide more accurate information re-garding the VLP, pharyngeal width, and A-P compression dur-ing semioccluded postures.

CONCLUSION

VLP, A-P laryngeal compression, and pharyngeal width can bemodified by semioccluded vocal tract exercises in subjects di-agnosed with nonorganic hyperfunctional dysphonia. A low lar-ynx, narrow aryepiglottic opening, and wide pharynx may bereached by using these types of exercises. Phonation intoa tube submerged under the water and a stirring straw producemore prominent changes than the other examined semioccludedpostures. Loud voice productions also demonstrated a greaterdegree of change than soft and moderate loudness levels. Twopossible explanations arise for these findings: an increase inoral pressure (mechanical effect) and/or a relaxation of the la-ryngeal and pharyngeal musculature because of semiocclu-sions. The observed effect is only short term, and theretention of the effect remains unknown.

REFERENCES1. Aronson A. Clinical Voice Disorders. An Interdisciplinary Approach. New

York, NY: Thieme Medical Publishers; 1985.

2. Colton R, Casper J. Understanding Voice Problems. A Physiological Per-

spective for Diagnosis and Treatment. Baltimore, MD: Lippincott Williams

& Wilkins; 1990.

3. Ewan WG, Krones R. Measuring larynx movement using the thryroumbr-

ometer. J Phonetics. 1974;2:327–335.

4. Riordan CJ. Control of vocal-tract length in speech. J Acoust Soc Am. 1977;

62:998–1002.

5. Iwarsson J, Sundberg J. Effects of lung volume on vertical larynx position

during phonation. J Voice. 1998;12:159–165.

6. Shipp T, Izdebski K. Vocal frequency and vertical larynx positioning by

singers and nonsingers. J Acoust Soc Am. 1975;58:1104–1106.

7. Sundberg J. Articulatory interpretations of the singing formant. J Acoust

Soc Am. 1974;55:838–844.

8. Iwarsson J. Effects of inhalatory abdominal wall movement on vertical la-

ryngeal position during phonation. J Voice. 2001;15:384–394.

9. Shipp T. Effects of vocal frequency and effort on vertical laryngeal position.

J Res Singing. 1984;6:1–5.

10. Angsuwarangsee T, Morrison M. Extrinsic laryngeal muscular tension in

patients with voice disorders. J Voice. 2002;16:333–343.

11. Rubin JS, Blake E, Mathieson L. Musculoskeletal patterns in patients with

voice disorders. J Voice. 2007;21:477–484.

12. Lowell SY, Kelley RT, Colton RH, Smith PB, Portnoy JE. Position of the

hyoid and larynx in people with muscle tension dysphonia. Laryngoscope.

2012;122:370–377.

13. Boone DR. The Voice and Voice Therapy. Englewood Cliffs, NJ: Prentice

Hall; 1977.

14. Boone DR, McFarlane SC. A critical view of the yawn-sigh as a voice ther-

apy technique. J Voice. 1993;7:75–80.

15. Elliot N, Sundberg J, Gramming P. Physiological aspects of a vocal exer-

cise. J Voice. 1997;11:171–177.

16. Casper JK, Colton RH, Woo P, Brewer D. Physiological characteristics of

selected voice therapy techniques: a preliminary research note. J Voice.

1992;1:131–141.

17. Van Lierde KM, De Bodt M, Dhaeseleer E, Wuyts F, Claeys S. The treat-

ment of muscle tension dysphonia: a comparison of two treatment tech-

niques by means of an objective multiparameter approach. J Voice. 2010;

24:294–301.

18. Mathieson L, Hirani SP, Epstein R, Baken RJ, Wood G, Rubin JS. Laryn-

geal manual therapy: a preliminary study to examine its treatment effects

in the management of muscle tension dysphonia. J Voice. 2009;23:

353–366.

19. Roy N, Leeper HA. Effects of the manual laryngeal musculoskeletal ten-

sion reduction technique as a treatment for functional voice disorders: per-

ceptual and acoustic measures. J Voice. 1993;7:242–249.

20. Shipp T. Vertical laryngeal position: research findings and their relationship

for singers. J Voice. 1987;1:217–219.

21. Sundberg J, Nordstr€om P-E. Raised and lowered larynx—the effect on

vowel formant frequencies. STL-QPSR. 1976;17:35–39.

22. Titze IR. Raised versus lowered larynx singing. NATS J. 1993;50:37.

23. Mitchell HF, Kenny DT, Ryan M, Davis PJ. Defining open throat through

content analysis of experts’ pedagogical practices. Logoped Phoniatr

Vocol. 2003;28:167–180.

24. Mitchell HF, Kenny DT. The effects of open throat technique on long term

average spectra (LTAS) of female classical voices. Logoped Phoniatr

Vocol. 2004;29:99–118.

25. Titze I. Voice research: the wide pharynx. J Singing. 1998;55:27–28.

26. Titze IR, Story BH. Acoustic interactions of the voice source with the lower

vocal tract. J Acoust Soc Am. 1997;101:2234–2243.

27. Ware C. Basics of Vocal Pedagogy: The Foundations and Process of Sing-

ing. New York, NY: McGraw-Hill; 1998.

28. Stager SV, Bielamowicz S, Gupta A, Marullo S, Regnell JR, Barkmeier J.

Quantification of static and dynamic supraglottic activity. J Speech Lang

Hear Res. 2001;44:1245–1256.

29. Stager S, Bielamowicz S, Regnell J, Gupta A, Brakmeier J. Supraglottic ac-

tivity: evidence of hyperfunction or laryngeal articulation? J Speech Lang

Hear Res. 2000;43:229–238.

30. Behrman A, Dahl L, Abramson A, Schutte H. Anterior-posterior and me-

dial compression of the supraglottis: signs of nonorganic dysphonia or nor-

mal postures? J Voice. 2003;17:403–410.

31. Hillman RE, Gress C, Hargrave J, Walsh M, Bunting G. The efficacy of

speech-language pathology intervention: voice disorders. Semin Speech

Lang. 1990;11:297–310.

32. Sama A, Carding PN, Price S, Kelly P, Wilson JA. The clinical features of

functional dysphonia. Laryngoscope. 2001;111:458–463.

33. Lawrence V. Laryngological observations on belting. J Res Singing. 1979;

2:26–28.

34. Yanagisawa E, Estill J, Kmucha S, Leder S. The contribution of aryepiglot-

tic constriction to ‘‘ringing’’ voice quality—a videolaryngoscopic study

with acoustic analysis. J Voice. 1989;3:342–350.

35. Pershall KE, Boone DR. Supraglottic contribution to voice quality. J Voice.

1987;1:186–190.

36. Hanayama E, Camargo Z, Tsuji D, Pinho S. Metallic voice: physiological

and acoustic features. J Voice. 2009;23:62–70.

37. Leino T, Laukkanen AM, Radolf V. Formation of the actor’s/speaker’s for-

mant: a study applying spectrum analysis and computer modeling. J Voice.

2011;25:150–158.

38. Sovij€arvi A, H€ayrinen R, Orden-Pannila M, Syv€anen M. A€anifysiologisten

Kuntoutusharjoitusten Ohjeita. [Instructions for voice exercises]. Helsinki,

Finland: Publications of Suomen Puheopisto; 1989.

39. Sovij€arvi A. Nya metoder vid behandlingen av r€ostrubbningar.Nordisk Tid-

skrift for Tale og Stemme. 1969;3:121–131.

http://refhub.elsevier.com/S0892-1997(13)00098-2/sref1







































































































40. Simberg S, Laine A. The resonance tube method in voice therapy: descrip-

tion and practical implementations. Logoped Phoniatr Vocol. 2007;32:

165–170.

41. Sovij€arvi A. Die Bestimmung der Stimmkategorien mittels Resonanzr€oh-

ren. Int Kongr Phon Wiss. 1965;8:532–535.

42. Sovij€arvi A. Adnifysiologiasta ja artikulaatiotekniikasta. [On Voice Physi-

ology and Articulatory Technique] [in Finnish]. Helsinki, Finland: Depart-

ment of Phonetics, University of Helsinki; 1966.

43. Guzman M, Laukkanen A-M, Krupa P, Hor�a�cek J, �Svec J, Geneid A. Vocaltract and glottal function during and after vocal exercising with resonance

tube and straw. J Voice. 2013;27:523.

44. LaukkanenAM, Takalo R, Vilkman E, Nummenranta J, Lipponen T. Simul-

taneous videofluorographic and dual-channel electroglottographic registra-

tion of the vertical laryngeal position in various phonatory tasks. J Voice.

1999;13:60–71.

45. Laukkanen A-M, Lindholm P, Vilkman E, Haataja K, Alku P. A physiolog-

ical and acoustic study on voiced bilabial fricative /ß:/ as a vocal exercise.

J Voice. 1996;10:67–77.

46. Laukkanen A, Titze I, Hoffman H, Finnegan E. Effects of a semi-occluded

vocal tract on laryngeal muscle activity and glottal adduction in a single fe-

male subject. Folia Phoniatr Logop. 2008;60:298–311.

47. Laukkanen A-M, Hor�a�cek J, Krupa P, �Svec JG. The effect of phonation into

a straw on the vocal tract adjustments and formant frequencies. A prelim-

inary MRI study on a single subject completed with acoustic results.

Biomed Signal Process Control. 2010;7:50–57.

48. Laukkanen AM, Hor�a�cek J, Havl�ık R. Case-study magnetic resonance im-

aging and acoustic investigation of the effects of vocal warm-up on two

voice professionals. Logoped Phoniatr Vocol. 2012;37:75–82.

49. Vampola T, Laukkanen A-M, Hor�a�cek J, �Svec JG. Vocal tract changes

caused by phonation into a tube: a case study using computer tomography

and finite-element modeling. J Acoust Soc Am. 2011;129:310–315.

50. Hor�a�cek J, RadolF V, Bula V, Vesel�y J, Laukkanen A-M. Experimental in-

vestigation of air pressure and acoustic characteristic of human voice. Part

1: measurement in vivo. Proceedings of the 18th International Conference

on Engineering Mechanics; 2012:403–417.

51. Titze I. How to use the flow resistant straws. J Singing. 2002;58:429–430.













































Semioccluded Vocal Tract Exercises - TUNI

Documents

Transcript of Semioccluded Vocal Tract Exercises - TUNI