Fundamentals of Digital Audio


The Central Problem

Sound waves consist of air pressure changes.

This is what we see in an oscilloscope view: changes in air pressure over time.


Waves in nature, including sound waves, are continuous:

Between any two points on the curve, no matter how close together they are, there are an infinite number of points


Analog audio (vinyl, tape, analog synths, etc.) involves the creation or imitation of a continuous wave.

Computers cannot represent continuity (or infinity).

Computers can only deal with discrete values.

Digital technology is based on converting continuous values to discrete values.


Digital Conversion

The instantaneous amplitude of a continuous wave is measured (sampled) regularly. The measurement values, samples, may be stored in a digital system.


[ 0.9925, 0.9945, 0.9961, 0.9975, 0.9986, 0.9993, 0.9998, 1.0, 0.9998, 0.9993, 0.9986, 0.9975, 0.9961, 0.9945, 0.9925 ]
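The measuring step can be sketched in a few lines of Python. This is only an illustration: the function name `sample_wave`, its parameters, and the choice of a 1 Hz sine measured 360 times per second are assumptions made for the example, not something stated in the slides. With those choices each sample advances the wave by one degree, so the values around the peak should come out close to the list above (differing at most in the last digit).

```python
import math

def sample_wave(frequency_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Measure a sine wave's instantaneous amplitude at regular intervals."""
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz  # time of the n-th measurement, in seconds
        samples.append(amplitude * math.sin(2 * math.pi * frequency_hz * t))
    return samples

# Hypothetical settings: a 1 Hz sine measured 360 times per second.
samples = sample_wave(frequency_hz=1.0, sample_rate_hz=360.0, num_samples=360)

# The fifteen samples centred on the peak of the wave, rounded for display.
peak_region = [round(s, 4) for s in samples[83:98]]
```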


Digital Audio

Digital representation of audio is analogous to cinema representation of motion.

We know that “moving pictures” are not really moving; cinema is simply a series of pictures of motion, sampled and projected fast enough that the effect is that of apparent motion.

With digital audio, if a sound is sampled often enough, the effect is apparent continuity when the samples are played back.


Con:
– It is, at best, only an approximation of the wave

Pros:
– Significantly lower background noise levels
– Sounds are more reliably stored and duplicated
– Sounds are easier to manipulate

Rather than worry about how to change the shape of a wave, engineers need only perform appropriate numerical operations. For example, changing the volume level of a digital audio file is simply a matter of multiplication: each sample value is multiplied by a value that raises or lowers it by a certain percentage.
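The volume example amounts to one multiplication per sample. A minimal Python sketch, in which the function name `change_volume` and the sample values are made up for illustration:

```python
def change_volume(samples, gain):
    """Scale every sample by the same factor (gain < 1 lowers the level)."""
    return [s * gain for s in samples]

# Halve the level of three illustrative samples.
quieter = change_volume([0.5, -0.25, 1.0], 0.5)
```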


The theory behind digital representation has existed since the 1920s.

It wasn’t until the 1950s that technology caught up to the theory, and it was possible to implement digital audio.


Bell Labs produced the first digital audio synthesis in the 1950s.

For computer synthesis, a series of samples was calculated and stored in a wavetable.

The wavetable described, in connect-the-dots fashion, the shape of a wave (i.e., its timbre).

Reading through the wavetable at different rates (advancing n samples at a time, where n is the sampling increment) allowed different pitches to be created.

Audio was produced by feeding the samples that were to be audified through a digital to analog converter (DAC).
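The wavetable idea can be sketched as follows. The table size of 512, the sine shape, and the helper names `read_wavetable` and `pitch_hz` are assumptions chosen for the example; the sketch also restricts itself to whole-number increments, whereas practical systems generally need fractional ones.

```python
import math

TABLE_SIZE = 512  # hypothetical table length

# One cycle of a sine wave; any single-cycle shape could be stored instead.
wavetable = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def read_wavetable(table, increment, num_samples):
    """Step through the table `increment` entries at a time, wrapping around."""
    out = []
    index = 0
    for _ in range(num_samples):
        out.append(table[index % len(table)])
        index += increment
    return out

def pitch_hz(increment, table_size, sample_rate_hz):
    """Frequency heard when one stored cycle is read at the given increment."""
    return increment * sample_rate_hz / table_size

low = read_wavetable(wavetable, 1, 8)   # every entry
high = read_wavetable(wavetable, 2, 8)  # every second entry: one octave higher
```

Doubling the increment doubles the number of cycles read per second, which is why the same stored shape can be played at different pitches.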


Contemporary computer sound cards often contain a set of wavetable sounds.

The function is the same: a library of samples describing different waveforms.

They are triggered by MIDI commands. (These will be covered fully in a few weeks.) For example, a given note number will translate to the table being read at a certain sampling increment to produce the desired pitch.
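As a rough sketch of that translation, the code below assumes the common equal-tempered convention in which MIDI note 69 is the A at 440 Hz; the slides do not specify a tuning, and the function names, table size, and sampling rate are likewise illustrative. The resulting increment is usually fractional, so a real sound card would have to round or interpolate between table entries.

```python
def midi_note_to_hz(note_number, a4_hz=440.0):
    """Equal-tempered pitch, assuming MIDI note 69 = A at 440 Hz."""
    return a4_hz * 2 ** ((note_number - 69) / 12)

def sampling_increment(note_number, table_size, sample_rate_hz):
    """Table entries to advance per output sample to sound the given note."""
    return midi_note_to_hz(note_number) * table_size / sample_rate_hz

# Hypothetical setup: a 512-entry table played back at 44,100 samples per second.
increment_a4 = sampling_increment(69, 512, 44100)
increment_a5 = sampling_increment(81, 512, 44100)  # one octave higher
```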


Digital recording became possible in the 1970s.

Voltage input from a microphone is fed to an analog to digital converter (ADC), which stores the signal as a series of samples.

The samples can then be sent through a DAC for playback.


Thus, the ADC produces a “dehydrated” version of the audio.

The DAC then “rehydrates” the audio for playback.

(Gareth Loy, Musimathics v. 2)


Characteristics of Digital Audio

With digital audio, we are concerned with two measurements:
– Sampling rate
– Quantization

With these measurements, we can describe how well a digitized audio file represents the analog original.


Sampling Rate

The sampling rate tells us how often an audio signal is sampled: the number of samples per second.

The more often an audio signal is sampled, the better it is represented in discrete form:


Of course, this staircase-shaped wave needs to be smoothed.

This process will be covered during the discussion on filtering.


So we want to sample an audio wave every so often. The question is: how “often” is “often enough”?

Harry Nyquist of Bell Labs addressed this question in a 1925 paper concerning telegraph signals.


Given that a wave will be smoothed by a subsequent filtering process, it is sufficient to sample both its peak and its trough.


Thus, we have the sampling theorem (also called the Nyquist theorem):

To represent digitally a signal containing frequency components up to X Hz, it is necessary to use a sampling rate of at least 2X samples per second.

Conversely, the maximum frequency contained in a signal sampled at a rate of SR is SR/2 Hz.

The frequency SR/2 is also termed the Nyquist frequency.


In theory, since the maximum audible frequency is 20 kHz, a sampling rate of 40 kHz would be sufficient to re-create a signal containing all audible frequencies.
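The two directions of the sampling theorem are simple enough to write down directly. In this sketch the function names are illustrative, and the CD rate of 44.1 kHz is used only as a familiar example value:

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency representable at the given sampling rate (SR/2)."""
    return sample_rate_hz / 2

def minimum_sample_rate(max_frequency_hz):
    """Lowest sampling rate that can represent components up to max_frequency_hz (2X)."""
    return 2 * max_frequency_hz

rate_for_hearing = minimum_sample_rate(20_000)  # 40,000 samples per second
cd_nyquist = nyquist_frequency(44_100)          # 22,050 Hz
```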

For most frequencies, we will oversample (the audio frequency is below the Nyquist frequency).


If the audio frequency is precisely the Nyquist frequency, our critically sampled signal runs the risk of missing peaks and troughs: the samples may land on the peaks and troughs, or they may land between them.

This problem is also addressed by filtering.

More serious is the problem of undersampling a frequency greater than the Nyquist frequency.

Audio signal at 30 kHz, sampled at 40 kHz

RESULT: The frequency is misrepresented at 10 kHz, at reverse phase.

Misrepresented frequencies are termed aliases.


In general, if a frequency, F, sampled at a sampling rate of SR, exceeds the Nyquist frequency (but is below SR), that frequency will alias to a frequency of:

-(SR - F)

The minus sign indicates that the frequency is in opposite phase.
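The 30 kHz example can be checked numerically. The sketch below (function and variable names are illustrative) computes the alias with the formula above and then confirms that, at the sample instants, a 30 kHz sine and a -10 kHz sine produce the same values, so the digital system cannot tell them apart. It only handles frequencies between the Nyquist frequency and the sampling rate.

```python
import math

def alias_frequency(f_hz, sample_rate_hz):
    """Alias of a frequency lying between the Nyquist frequency and SR."""
    nyquist = sample_rate_hz / 2
    if not nyquist < f_hz < sample_rate_hz:
        raise ValueError("this sketch only covers Nyquist < F < SR")
    return -(sample_rate_hz - f_hz)

SR = 40_000
F = 30_000
alias = alias_frequency(F, SR)  # -(40,000 - 30,000) = -10,000 Hz

# Sample both sine waves at the same instants n / SR.
original = [math.sin(2 * math.pi * F * n / SR) for n in range(16)]
aliased = [math.sin(2 * math.pi * alias * n / SR) for n in range(16)]

samples_match = all(abs(a - b) < 1e-9 for a, b in zip(original, aliased))
```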

It is useful to illustrate sampled frequencies on a polar diagram, with 0 Hz at 3:00 and the Nyquist frequency at 9:00:

[Figure: polar diagram with 0 Hz at 3:00 and the Nyquist frequency at 9:00; a frequency f is marked on the upper half and -f on the lower half]

The upper half of the circle represents frequencies from 0 Hz to the Nyquist frequency

The lower half of the circle represents negative frequencies from 0 Hz to the Nyquist frequency (there is no distinction in a digital audio system between ±NF)

Any audio frequency above the Nyquist frequency will alias to a frequency shown on the bottom half of the circle, a negative frequency between 0 Hz and the Nyquist frequency.

Frequencies above the Nyquist frequency do not exist in a digital audio system


In the recording process, filters are used to remove all frequencies above the Nyquist frequency before the audio signal is sampled.

This step is critical since aliases cannot be removed later.

Provided these frequencies are not in the sampled signal, the signal may be sampled and later reconverted to audio with no loss of frequency information.


The sampling rate for audio CDs is 44.1 kHz. The origin of this rate lies in video formats.

When digital audio recording began, audio tape was not capable of handling the density of digital signals.

The first digital masters were stored on video as a pseudo video signal, in which binary values of 1 and 0 were stored as video levels of black and white.


Video is drawn left to right, starting from the top of the screen and moving down.

First the odd numbered lines are drawn, then the even numbered lines.

Each video frame has two fields: the odd field and the even field.

The fields are adjacent to each other on the video tape.

[Figure: interlaced scan lines alternating odd (O) and even (E), and the order of fields on the tape: Frame n odd, Frame n even, Frame n+1 odd, Frame n+1 even, Frame n+2 odd]

There are two video formats:

– 525 lines, 30 frames per second (USA)
  Minus 35 blank lines, leaving 490 lines per frame
  60 fields per second, 245 lines per field

– 625 lines, 25 frames per second (European)
  Minus 37 blank lines, leaving 588 lines per frame
  50 fields per second, 294 lines per field

Three samples could be stored on each line, allowing:

60 x 245 x 3 = 44,100 samples per second
or
50 x 294 x 3 = 44,100 samples per second

44.1 kHz remains the standard sampling rate for CD audio.
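The arithmetic behind the figure can be reproduced from the line counts given above. In this sketch the function name and variable names are illustrative; the numbers are the ones from the slides.

```python
def samples_per_second(total_lines, blank_lines, fields_per_second, samples_per_line=3):
    """Samples stored per second when each usable video line holds a few samples."""
    usable_lines_per_frame = total_lines - blank_lines
    lines_per_field = usable_lines_per_frame // 2  # two fields per frame
    return fields_per_second * lines_per_field * samples_per_line

usa_rate = samples_per_second(525, 35, 60)       # 60 x 245 x 3
european_rate = samples_per_second(625, 37, 50)  # 50 x 294 x 3
```

Both formats arrive at the same rate, which is what made 44,100 samples per second a convenient common choice.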


Quantization

This has a few names:
– Sample size
– Bit depth
– Word size

The term “quantization” takes its origin from quantum physics:
– Electrons orbit an atom’s nucleus in one of a number of well-defined layers;
– An electron may be knocked from one layer to another, but it can never stay between layers.


In the discussion of sampling rate, we only considered how often the amplitude of the wave was measured.

We did not discuss how accurate these measurements were.

The effectiveness of any measurement depends on the precision of our ruler. (Measuring the thickness of something with many small indentations with a ruler only marking feet will probably not give a very accurate measurement; we have to estimate many measurements.)

Just as there are limits to how often we can sample, there are limits to the resolution of our ruler.

Like all numbers stored in computers, the amplitude values are stored as binary numbers.

The value that gets stored is the closest available binary number, akin to the nearest marking on a ruler.

The accuracy of our measurement depends on how many bits we have to represent these values. Clearly, the more bits we have, the finer the resolution of our ruler.

[Figure: a wave quantized with 2 bits (levels 00, 01, 10, 11). Each change of bit represents a change in voltage level.]

[Figure: the same wave quantized with 3 bits (levels 000 to 111) and with 4 bits (levels 0000 to 1111). Each change of bit represents a change in voltage level.]
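The "nearest marking on the ruler" idea might be sketched like this. Several things here are assumptions for the sake of illustration rather than details from the slides: the input range of -1.0 to 1.0, evenly spaced levels that include both ends of that range, and the function name `quantize`.

```python
def quantize(x, bits):
    """Snap x (assumed to lie in -1.0..1.0) to the nearest of 2**bits levels.

    Returns the binary code of the chosen level and the value it stands for.
    """
    levels = 2 ** bits
    step = 2.0 / (levels - 1)             # spacing between adjacent levels
    code = round((x + 1.0) / step)        # index of the nearest level
    code = max(0, min(levels - 1, code))  # keep out-of-range input on the scale
    return format(code, f"0{bits}b"), -1.0 + code * step

code_2bit, value_2bit = quantize(0.3, 2)  # coarse ruler: only four markings
code_4bit, value_4bit = quantize(0.3, 4)  # finer ruler: sixteen markings
```

With 2 bits the value 0.3 is stored as roughly 0.333; with 4 bits it is stored as roughly 0.333 as well in this particular case, but the worst-case distance to a marking shrinks as bits are added.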


CD audio uses 16-bit quantization.


While aliasing is eliminated if our signal contains no frequencies above the Nyquist frequency, quantization error can never be completely eliminated.

Every sample is within a margin of error that is half the quantization level (the voltage change represented by the least significant bit).
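The half-level bound can be checked numerically. This sketch assumes, purely for illustration, a 16-bit scale spanning -1.0 to 1.0 and a sine wave of amplitude 0.9; the helper name `quantize_to_step` is likewise made up.

```python
import math

def quantize_to_step(x, step):
    """Round x to the nearest multiple of the quantization level `step`."""
    return step * math.floor(x / step + 0.5)

step = 2.0 / 2 ** 16  # hypothetical: 16 bits spread over the range -1.0..1.0

# One cycle of a sine wave, measured at 1000 points.
wave = [0.9 * math.sin(2 * math.pi * n / 1000) for n in range(1000)]

errors = [abs(quantize_to_step(s, step) - s) for s in wave]
max_error = max(errors)  # never more than step / 2
```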


For a sine wave signal represented with n bits, the signal to error ratio is:

S/E (dB) = 6.02n + 1.76

The problem is that low-level signals do not use all available bits, and therefore the error level is greater relative to the signal.
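Plugging bit depths into the formula shows how much each bit is worth. The function name is illustrative; the 8-bit case is included only for comparison with the 16 bits used on CDs.

```python
def signal_to_error_db(bits):
    """Signal-to-error ratio, in dB, of a full-scale sine wave stored with `bits` bits."""
    return 6.02 * bits + 1.76

se_16 = signal_to_error_db(16)  # about 98.08 dB
se_8 = signal_to_error_db(8)    # about 49.92 dB
```

Each additional bit adds about 6 dB. A signal that only exercises the lowest 8 of the 16 bits is, in effect, being stored at the 8-bit figure.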

While quantization error may be masked at high audio levels, it can become audible at low levels.

[Figure: a low-level sine wave drawn against a 4-bit quantization grid (levels 0000 to 1111)]

Worst case: a sine wave fluctuating within one quantization increment is stored as a square wave.

Thus, unlike the constant hissing noise of analog recordings, quantization error is correlated with the signal, and is thus a type of distortion, rather than noise.


The problem of quantization distortion is addressed by dither.

Dither is low-level noise added to the audio signal before it is sampled.

[Figure: low-level audio signal with dither added]


Dither adds random errors to the signal, so quantization results in added noise rather than distortion.

The noise is a constant factor, not correlated with the signal like quantization distortion.

The result is a noisy signal, rather than a signal broken up by distortion.
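A small numerical experiment can illustrate the difference. Everything specific in this sketch is an assumption made for the demonstration: the sine wave peaking at 0.4 of a quantization level, the dither drawn uniformly from plus or minus half a level (one common choice, not necessarily the one the slides have in mind), the fixed random seed, and the helper names. Here the dither is added numerically just before rounding, standing in for noise added before the converter.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole quantization level (one level = 1.0)."""
    return math.floor(x + 0.5)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# A sine wave whose peaks stay below half a quantization level.
signal = [0.4 * math.sin(2 * math.pi * n / 50) for n in range(5000)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + rng.uniform(-0.5, 0.5)) for s in signal]

def signal_gain(reference, output):
    """How much of `reference` survives in `output` (least-squares projection)."""
    return sum(r * o for r, o in zip(reference, output)) / sum(r * r for r in reference)

plain_gain = signal_gain(signal, plain)        # the signal has vanished entirely
dithered_gain = signal_gain(signal, dithered)  # the signal survives, buried in noise
```

Without dither every sample rounds to zero, so the sine wave is simply gone. With dither the individual samples are noisy, but on average they still track the original wave.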


The auditory system averages the signal at all times. We do not hear individual samples.

With dither, this averaging allows the musical signal to co-exist with the noise, rather than be temporarily eliminated due to distortion.


Dither allows resolution below the least significant quantization bit.

Without dither, digital recordings would be far less satisfactory than analog recordings. A plucked guitar string, for example, fades into something close to a sine tone; without dither, the decaying guitar sound would gradually turn into the sound of a square wave.

With dither, there is significantly less noise in digital recordings than in analog recordings.


Quantization and Sampling Rate

The sampling rate determines the signal’s frequency content: only frequencies up to the Nyquist frequency can be represented.

The number of quantization bits determines the amount of quantization error.


Size of Audio Files

44,100 samples per second
x 2 bytes per sample (16 bits)
x 2 channels (for stereo audio)
x 60 seconds per minute
≈ 10 MB/minute
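The same multiplication in Python, with the CD values as defaults. The function name is illustrative, and the result counts raw sample data only, ignoring any file header.

```python
def bytes_per_minute(sample_rate_hz=44_100, bytes_per_sample=2, channels=2):
    """Raw size of one minute of uncompressed audio, in bytes."""
    return sample_rate_hz * bytes_per_sample * channels * 60

cd_bytes = bytes_per_minute()       # 10,584,000 bytes
cd_megabytes = cd_bytes / 1_000_000  # roughly 10.6 MB per minute
```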