Basic Acoustics + Digital Signal Processing September 11, 2014.

Post on 23-Dec-2015


Basic Acoustics + Digital Signal Processing

September 11, 2014

Road Map!

• For today:

• Part 1: Go through a review of the basics of (analog) acoustics.

• Part 2: Converting sound from analog to digital format.

• Any questions so far?

Part 1: An Acoustic Dichotomy

• Acoustically speaking, there are two basic kinds of sounds:

1. Periodic

• = an acoustic pattern which repeats over time

• The “period” is the length of time it takes for the pattern to repeat

• Periodic speech sounds = voiced segments + trills

2. Aperiodic

• Continuous acoustic energy which does not exhibit a repeating pattern

• Aperiodic speech sounds = fricatives

The Third Wheel

• There are also acoustic transients.

• = aperiodic speech sounds which are not continuous

• i.e., they are usually very brief

• Transient speech sounds:

• stop release bursts

• clicks

• also (potentially) individual pulses in a trill

• Let’s look at the acoustic properties of each type of sound in turn…

[Figure labels: “Pin”, “Fad”]

Acoustics: Basics

• How is a periodic sound transmitted through the air?

• Consider a bilabial trill:

What does sound look like?

• Air consists of floating air molecules

• Normally, the molecules are suspended and evenly spaced apart from each other

• What happens when we push on one molecule?

What does sound look like?

• The force knocks that molecule against its neighbor

• The neighbor, in turn, gets knocked against its neighbor

• The first molecule bounces back past its initial rest position

[Figure label: initial rest position]

What does sound look like?

• The initial force gets transferred on down the line

[Figure labels: rest position #1, rest position #2]

• The first two molecules swing back to meet up with each other again, in between their initial rest positions

• Think: bucket brigade

Compression Wave

• A wave of force travels down the line of molecules

• Ultimately: individual molecules vibrate back and forth, around an equilibrium point

• The transfer of force sets up what is called a compression wave.

• What gets “compressed” is the space between molecules

• Check out what happens when we blow something up!

Compression Wave

[Figure labels: area of high pressure (compression); area of low pressure (rarefaction)]

• Compression waves consist of alternating areas of high and low pressure

Pressure Level Meters

• Microphones

• Have diaphragms, which move back and forth with air pressure variations

• Pressure variations are converted into electrical voltage

• Ears

• Eardrums move back and forth with pressure variations

• Amplified by components of middle ear

• Eventually converted into neurochemical signals

• We experience fluctuations in air pressure as sound

Measuring Sound

• What if we set up a pressure level meter at one point in the wave?

[Figure: pressure level meter placed at one point in the wave; axis: time]

Sine Waves

• The reading on the pressure level meter will fluctuate between high and low pressure values

• In the simplest case, the variations in pressure level will look like a sine wave.

[Figure: sine wave; x-axis: time, y-axis: pressure]

Other Basic Sinewave concepts

• Sinewaves are periodic; i.e., they recur over time.

• The period is the amount of time it takes for the pattern to repeat itself.

• A cycle is one repetition of the acoustic pattern.

• The frequency is the number of times, within a given timeframe, that the pattern repeats itself.

• Frequency = 1 / period

• usually measured in cycles per second, or Hertz

• The peak amplitude is the maximum amount of vertical displacement in the wave

• = maximum (or minimum) amount of pressure
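The relationship between period, frequency, and peak amplitude can be sketched in a few lines of Python. This is only an illustration: the function names, the 10 ms period, and the 8000 Hz sampling rate are assumptions made for the example, not values from the lecture.

```python
import math

def frequency_from_period(period_s):
    """Frequency in Hz is the reciprocal of the period in seconds."""
    return 1.0 / period_s

def sine_wave(freq_hz, peak_amplitude, duration_s, sample_rate_hz):
    """Return the pressure values of a sinewave, one value per sample."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [peak_amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

period_s = 0.01                             # assumed: pattern repeats every 10 ms
freq_hz = frequency_from_period(period_s)   # 100 cycles per second (Hz)

# Two cycles of the wave; 8000 samples per second is an assumed rate.
wave = sine_wave(freq_hz, peak_amplitude=1.0, duration_s=0.02, sample_rate_hz=8000)

samples_per_cycle = int(round(period_s * 8000))   # 80 samples in one period
peak = max(wave)                                  # close to the peak amplitude
```

One cycle later (80 samples on), the wave has the same value again, which is what "periodic" means here.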

Waveforms

• A waveform plots air pressure on the y axis against time on the x axis.

Phase Shift

• Even if two sinewaves have the same period and amplitude, they may differ in phase.

• Phase essentially describes where in the sinewave cycle the wave begins.

• This doesn’t affect the way that we hear the waveform.

• Check out: sine waves vs. cosine waves!
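As a small check on the sine vs. cosine comparison, the sketch below shifts a sinewave by a quarter of a cycle (π/2 radians) and compares it with a cosine wave of the same frequency. The function name, the 100 Hz frequency, and the time points are assumptions chosen for illustration.

```python
import math

def sinewave_value(freq_hz, t_s, phase_rad=0.0):
    """Value of a unit-amplitude sinewave at time t_s, with a starting phase."""
    return math.sin(2 * math.pi * freq_hz * t_s + phase_rad)

freq_hz = 100                                  # assumed frequency
times_s = [n / 8000 for n in range(80)]        # one 10 ms cycle, assumed time grid

# A sinewave that starts a quarter cycle "ahead" ...
shifted_sine = [sinewave_value(freq_hz, t, phase_rad=math.pi / 2) for t in times_s]
# ... lines up with a cosine wave of the same period and amplitude.
cosine = [math.cos(2 * math.pi * freq_hz * t) for t in times_s]

max_difference = max(abs(a - b) for a, b in zip(shifted_sine, cosine))
```

The two lists agree up to floating-point rounding, so the only difference between a sine wave and a cosine wave is where in the cycle each one begins.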

Complex Waves

• It is possible to combine more than one sinewave together into a complex wave.

• At any given time, each wave will have some amplitude value.

• A1(t1) := Amplitude value of sinewave 1 at time 1

• A2(t1) := Amplitude value of sinewave 2 at time 1

• The amplitude value of the complex wave is the sum of these values.

• Ac(t1) = A1(t1) + A2(t1)
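The point-by-point addition can be sketched as follows. The two component waves (a high-amplitude 100 Hz wave and a low-amplitude 1000 Hz wave) and the 8000 Hz sampling rate are assumed values picked for the example.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed sampling rate for the illustration

def sinewave(freq_hz, peak_amplitude, n_samples):
    """Amplitude values of a sinewave at successive sample times."""
    return [peak_amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ)
            for n in range(n_samples)]

def add_waves(wave_1, wave_2):
    """Ac(t) = A1(t) + A2(t), computed separately at every time point."""
    return [a1 + a2 for a1, a2 in zip(wave_1, wave_2)]

wave_1 = sinewave(freq_hz=100, peak_amplitude=1.0, n_samples=80)    # high amplitude, low frequency
wave_2 = sinewave(freq_hz=1000, peak_amplitude=0.25, n_samples=80)  # low amplitude, high frequency
complex_wave = add_waves(wave_1, wave_2)
```

Each value in `complex_wave` is simply the sum of the two component values at the same time point; nothing else is involved in building a complex wave.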

Complex Wave Example

• Take waveform 1:

• high amplitude

• low frequency

• Add waveform 2:

• low amplitude

• high frequency

• The sum is this complex waveform:

[Figure: waveform 1 + waveform 2 = complex waveform]

A Real-Life Example

• 480 Hz tone

• 620 Hz tone

• the combo = ?

Spectra

• One way to represent complex waves is with waveforms:

• y-axis: air pressure

• x-axis: time

• Another way to represent a complex wave is with a power spectrum (or spectrum, for short).

• Remember, each sinewave has two parameters:

• amplitude

• frequency

• A power spectrum shows:

• amplitude on the y-axis

• frequency on the x-axis

One Way to Look At It

• Combining 100 Hz and 1000 Hz sinewaves results in the following complex waveform:

[Figure: complex waveform; x-axis: time, y-axis: amplitude]

The Other Way

• The same combination of 100 Hz and 1000 Hz sinewaves results in the following power spectrum:

[Figure: power spectrum; x-axis: frequency, y-axis: amplitude]
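One way to see where such a spectrum comes from is to compute a discrete Fourier transform (DFT) of the combined wave. The sketch below uses a deliberately naive DFT from the standard library only; the function name, the component amplitudes (1.0 and 0.5), the 4000 Hz sampling rate, and the 400-sample length are all assumptions for the illustration. It reports amplitude per frequency, matching the axes described above, rather than squared power.

```python
import cmath
import math

def amplitude_spectrum(samples, sample_rate_hz):
    """Naive DFT: (frequency_hz, amplitude) pairs from 0 Hz to half the sampling rate.

    Assumes an even number of samples.
    """
    n = len(samples)
    pairs = []
    for k in range(n // 2 + 1):
        total = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        # Scale so that a sinewave with peak amplitude A is reported as A.
        scale = 1 / n if k in (0, n // 2) else 2 / n
        pairs.append((k * sample_rate_hz / n, abs(total) * scale))
    return pairs

SAMPLE_RATE_HZ = 4000   # assumed
N_SAMPLES = 400         # 0.1 s of signal, so frequency bins are 10 Hz apart

# 100 Hz component (amplitude 1.0) plus 1000 Hz component (amplitude 0.5).
samples = [1.0 * math.sin(2 * math.pi * 100 * i / SAMPLE_RATE_HZ)
           + 0.5 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE_HZ)
           for i in range(N_SAMPLES)]

spectrum = amplitude_spectrum(samples, SAMPLE_RATE_HZ)
# Keep only the bins with non-negligible amplitude.
peaks = [(freq, amp) for freq, amp in spectrum if amp > 0.01]
```

Under these assumptions `peaks` contains just two entries, one at 100 Hz and one at 1000 Hz, with the amplitudes of the two components. A real analysis would normally use a fast Fourier transform rather than this slow direct sum.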

The Third Way

• A spectrogram shows how the spectrum of a complex sound changes over time.

[Figure: spectrogram; x-axis: time, y-axis: frequency, with bands marked at 100 Hz and 1000 Hz]

• Intensity (related to amplitude) is represented by shading in the z-dimension.

Fundamental Frequency

• One last point about periodic sounds:

• Every periodic complex wave has a fundamental frequency (F0).

• = the frequency at which the complex wave pattern repeats itself.

• This frequency happens to be the greatest common divisor (GCD) of the frequencies of the component waves.

• Example: the greatest common divisor of 100 and 1000 is 100. (boring!)

• GCD of 480 and 620 Hz is 20.

• GCD of 600 and 800 Hz is 200, etc.
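The three examples above can be checked with the standard library's `math.gcd`. The helper name `fundamental_frequency` is made up for this sketch, and it assumes the component frequencies are whole numbers of Hz.

```python
from functools import reduce
from math import gcd

def fundamental_frequency(component_freqs_hz):
    """F0 as the greatest common divisor of integer component frequencies (in Hz)."""
    return reduce(gcd, component_freqs_hz)

f0_boring = fundamental_frequency([100, 1000])   # 100
f0_tones = fundamental_frequency([480, 620])     # 20
f0_other = fundamental_frequency([600, 800])     # 200

# In one period of the 20 Hz fundamental (1/20 s), each component
# completes a whole number of cycles, so the combined pattern repeats.
cycles_480 = 480 // f0_tones   # 24 cycles
cycles_620 = 620 // f0_tones   # 31 cycles
```

The whole-number cycle counts are why the GCD works: 1/20 of a second is the shortest stretch of time after which both the 480 Hz and the 620 Hz wave are back where they started.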

Aperiodic sounds

• Not all sounds are periodic

• Aperiodic sounds are noisy

• Their pressure values vary randomly over time

“white noise”

• Interestingly:

• White noise sounds the same, no matter how fast or slow you play it.

Fricatives

• Fricatives are aperiodic speech sounds

[s]

[f]

Aperiodic Spectra

• The power spectrum of white noise has component frequencies of random amplitude across the board:

Aperiodic Spectrogram

• In an aperiodic sound, the values of the component frequencies also change randomly over time.

Transients

• A transient is:

• “a sudden pressure fluctuation that is not sustained or repeated over time.”

• An ideal transient waveform:

A Transient Spectrum

• An ideal transient spectrum is perfectly flat:
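The flatness of an ideal transient's spectrum can be checked numerically. The sketch below models the ideal transient as a single non-zero sample (a unit impulse) and computes its spectrum with a naive DFT; the function name, the 64-sample length, and the position of the impulse are assumptions for the illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of each frequency component, via a naive DFT."""
    n = len(samples)
    return [abs(sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n)))
            for k in range(n)]

# An idealized transient: silence, one sudden pressure spike, silence again.
impulse = [0.0] * 64
impulse[10] = 1.0

magnitudes = dft_magnitudes(impulse)
flatness_error = max(abs(m - 1.0) for m in magnitudes)
```

Every frequency component comes out with the same magnitude (1.0, up to rounding), which is what "perfectly flat" means here.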

As a matter of fact

• Note: white noise and a pure transient are idealizations

• We can create them electronically…

• But they are not found in pure form in nature.

• Transient-like natural sounds include:

• Hand clapping

• Finger snapping

• Drum beats

• Tongue clicking

Click Waveform

some periodic reverberation

initial impulse

Click Spectrum

• Reverberation emphasizes some frequencies more than others

Click Spectrogram

some periodic reverberation

initial impulse

Part 2: Analog and Digital

• In “reality”, sound is analog.

• variations in air pressure are continuous

• = it has an amplitude value at all points in time.

• and there are an infinite number of possible air pressure values.

• Back in the bad old days, acoustic phonetics was strictly an analog endeavor.

[Image: analog clock]

Part 2: Analog and Digital

• In the good new days, we can represent sound digitally in a computer.

• In a computer, sounds must be discrete.

• everything = 1 or 0

[Image: digital clock]

• Computers represent sounds as sequences of discrete pressure values at separate points in time.

• Finite number of pressure values.

• Finite number of points in time.

Analog-to-Digital Conversion

• Recording sounds onto a computer requires an analog-to-digital conversion (A-to-D)

• When computers record sound, they need to digitize analog readings in two dimensions:

X: Time (this is called sampling)

Y: Amplitude (this is called quantization)

[Figure labels: sampling, quantization]
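The two steps of A-to-D conversion can be sketched separately. In this illustration the "analog" sound is stood in for by a Python function of continuous time; the function names, the 100 Hz test tone, the 8000 Hz sampling rate, and the 16-bit depth are all assumptions for the example, not details from the lecture.

```python
import math

def sample(signal, duration_s, sample_rate_hz):
    """Sampling (X axis): read the signal at evenly spaced points in time."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [signal(n / sample_rate_hz) for n in range(n_samples)]

def quantize(values, n_bits):
    """Quantization (Y axis): round each value in [-1, 1] to one of a
    finite number of integer levels."""
    max_level = 2 ** (n_bits - 1) - 1    # 32767 for 16 bits
    return [round(v * max_level) for v in values]

def analog_signal(t_s):
    """Stand-in for a continuous pressure wave: a 100 Hz sinewave."""
    return math.sin(2 * math.pi * 100 * t_s)

samples = sample(analog_signal, duration_s=0.01, sample_rate_hz=8000)  # finite points in time
digital = quantize(samples, n_bits=16)                                 # finite pressure values
```

After both steps, the sound is a list of 80 integers: a finite number of points in time, each holding one of a finite number of pressure values.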

Sampling Example

[Figure: sample points (o) plotted along a waveform; x-axis: nominal time (ticks 0 to 100), y-axis: amplitude (ticks -10000, 0, 10000)]

Thanks to Chilin Shih for making these materials available.

Sampling Rate

• Sampling rate = frequency at which samples are taken.

• What’s a good sampling rate for speech?

• Typical options include:

• 22050 Hz, 44100 Hz, 48000 Hz

• sometimes even 96000 Hz and 192000 Hz

• Higher sampling rate preserves sound quality.

• Lower sampling rate saves disk space.

• (which is no longer much of an issue)

• Young, healthy human ears are sensitive to sounds from 20 Hz to 20,000 Hz

One Consideration

• The Nyquist Frequency

• = highest frequency component that can be captured with a given sampling rate

• = one-half the sampling rate

Problematic Example:

• 100 Hz sound

• 100 Hz sampling rate

[Figure: sample points 1, 2, 3]

Harry Nyquist (1889-1976)

Nyquist’s Implication

• An adequate sampling rate has to be…

• at least twice the highest frequency component in the signal that you’d like to capture.

• 100 Hz sound

• 200 Hz sampling rate

[Figure: sample points 1, 2, 3, 4, 5, 6]
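The contrast between the two examples (a 100 Hz sound sampled at 100 Hz and at 200 Hz) can be reproduced numerically. The function name and the starting phase are assumptions: the wave is started at its peak so that the 200 Hz samples land on peaks and troughs, and a different starting phase would give different sample values.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, phase_rad=math.pi / 2):
    """Sample a unit-amplitude sinewave; by default it starts at its peak."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase_rad)
            for n in range(n_samples)]

# 100 Hz sound, 100 Hz sampling rate: one sample per cycle.
at_100_hz = [round(v, 6) for v in sample_sine(100, 100, 3)]

# 100 Hz sound, 200 Hz sampling rate: two samples per cycle.
at_200_hz = [round(v, 6) for v in sample_sine(100, 200, 6)]
```

At a 100 Hz sampling rate every sample falls at the same point in the cycle, so the recorded values never change and the oscillation is lost. At 200 Hz the samples alternate between high and low values, so the 100 Hz oscillation is at least visible in the data.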

Sampling Rate Demo

• Speech should be sampled at 44100 Hz or higher

• (although there is little frequency information in speech above 10,000 Hz)

• 44100 Hz

• 22050 Hz

• 11025 Hz (watch out for [s])

• 8000 Hz

• 5000 Hz
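To relate the demo rates back to the Nyquist frequency, the sketch below computes the highest frequency each rate can capture, plus the minimum rate needed for a given frequency. The function and variable names are made up for the illustration.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency component a sampling rate can capture: half the rate."""
    return sample_rate_hz / 2

def minimum_sampling_rate(highest_freq_hz):
    """Lowest sampling rate that still captures a given frequency component."""
    return 2 * highest_freq_hz

demo_rates_hz = [44100, 22050, 11025, 8000, 5000]
nyquist_by_rate = {rate: nyquist_frequency(rate) for rate in demo_rates_hz}

# Speech carries little information above 10,000 Hz, which needs at least:
rate_for_speech_hz = minimum_sampling_rate(10000)   # 20000 Hz
```

At 11025 Hz, nothing above 5512.5 Hz survives, which is a plausible reason [s] is the sound to watch out for at that rate, assuming much of its energy lies above that cutoff.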