The Secret Lives of MP3 Files
Doug Kaye
The Conversations Network and GigaVox Media
Formats & Encoders
• Lossless (WAV, AIFF)
• Lossy
- MPEG-1 Layer 3 (MP3)
- AAC (AAC, M4A, M4B)
- MPEG-1 Layer 2 (MP2)
MPEG Confusion
• Lossy Perceptual/Psychoacoustical Codecs
• MP3 = MPEG-1 Layer 3
• MP2 = MPEG-1 Layer 2 (not MPEG-2)
Moving Picture Experts Group
• MPEG-1: Video CDs, MP3 Audio
• MPEG-2: Digital TV, Set-Top Boxes
• MPEG-4: Online Multimedia (Video)
• MPEG-7: Audio and Video Search
• MPEG-21: Multimedia Framework
MPEG-1 for Geeks
• Layer 1
- Simple 32-Band Algorithm
- Philips DCC (Digital Compact Cassette)
• Layer 2 (a.k.a. MUSICAM)
- Also 32 Bands
- International Standard for Broadcasting
MPEG-1 Layer 3 (MP3) for Geeks
• Psychoacoustic Masking
• 32 Subbands, Each Split into 18 Frequency Lines (576 Total)
• More Accurate Masking Thresholds
• Redundancy Reduction
• Lossless Huffman Encoding
• Bit-Reservoir Buffering
• Joint Stereo
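The "Lossless Huffman Encoding" bullet can be illustrated with a minimal sketch. MP3 encoders choose among Huffman tables predefined by the standard rather than building one per file, so the table construction below is an assumption made purely for illustration, and `huffman_code_lengths` and the toy data are hypothetical. The sketch only shows why frequent values end up costing fewer bits.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

# Assumed toy data: mostly zeros, a few small values, one rare large value.
quantized = [0] * 12 + [1] * 4 + [-1] * 3 + [5]
lengths = huffman_code_lengths(quantized)
total_bits = sum(lengths[s] for s in quantized)
print(lengths[0], lengths[5])  # 1 3
print(total_bits)              # 32 (a fixed 2-bit code would need 40)
```

The step is lossless because the code lengths change only how values are written, not the values themselves.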
Sample Rate for Geeks
• The Nyquist Theorem
• Sample at 2x (or More) the Highest Frequency
• 22.05kHz Sample Rate for 11kHz Audio
• Sample Rate Is a Property of Uncompressed Source (WAV or AIFF)
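The Nyquist relationship above is simple arithmetic, sketched here with two hypothetical helper names.

```python
def min_sample_rate(highest_freq_hz):
    """Nyquist: sample at (at least) twice the highest frequency to capture."""
    return 2 * highest_freq_hz

def highest_representable_freq(sample_rate_hz):
    """The highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2

print(min_sample_rate(11_025))             # 22050
print(highest_representable_freq(44_100))  # 22050.0
```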
Sample Rate in Practice
• Standardize on 44.1kHz Sample Rate
• Flash & Other Players Require n*11.025kHz
• Resample if Source is 48kHz from DVDs
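A quick way to check a source file against the n*11.025kHz rule, and to see what resampling from 48kHz involves, might look like the following. The function names are illustrative assumptions, and the sketch only computes the ratio. It does not perform the resampling.

```python
from fractions import Fraction

BASE_RATE_HZ = 11_025  # the slide's rule: players accept n * 11.025 kHz

def is_player_friendly(sample_rate_hz):
    """True if the rate is a whole multiple of 11.025 kHz (11025, 22050, 44100, ...)."""
    return sample_rate_hz > 0 and sample_rate_hz % BASE_RATE_HZ == 0

def resample_ratio(source_hz, target_hz=44_100):
    """Exact ratio of output samples to input samples when resampling."""
    return Fraction(target_hz, source_hz)

print(is_player_friendly(44_100))  # True
print(is_player_friendly(48_000))  # False
print(resample_ratio(48_000))      # 147/160
```

The 147/160 ratio is why 48kHz material needs a real resampler and cannot be converted by simply dropping samples.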
Bit Rate for Geeks
• Independent of Sample Rate
• Specifies Encoder Output File Size (CBR)
• @64kbps, 1 hour ≈ 27MB
• Variable Bit Rate (VBR)
- For Higher Bit Rates Only
- Not Universally Supported (Avoid It)
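The "@64kbps, 1 hour ≈ 27MB" figure can be reproduced with a few lines of arithmetic. This is a rough sketch that ignores tags and other container overhead, and `cbr_file_size_bytes` is a hypothetical helper.

```python
def cbr_file_size_bytes(bit_rate_kbps, duration_seconds):
    """Approximate CBR audio size: bits per second times duration, divided by 8."""
    return bit_rate_kbps * 1000 * duration_seconds // 8

one_hour = 60 * 60
size = cbr_file_size_bytes(64, one_hour)
print(size)                    # 28800000
print(round(size / 2**20, 1))  # 27.5 (counting 1 MB as 2**20 bytes)
```

Neither the sample rate nor the channel count appears in the calculation, which is the sense in which bit rate alone sets the size of a CBR file.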
Bit Rate in Practice
• “Use Higher Bit Rates for Music?”
• It’s a Myth!
• Human Voices Are Complex
• Music Masks Its Own Artifacts
• 64kbps is Most Common Today
• 96kbps is Gaining
Podcasting Bit-Rate History
• June 2003: 32kbps. “Files too large”
• April 2004: 48kbps. “No problem”
• September 2004: 64kbps. “Quality is low”
• Today: Still 64kbps.
• Tomorrow??
Stereo Encoding
• “Stereo MP3s are twice as large as mono.”
• It’s a Myth!
• Only Bit Rate Specifies Output File Size
• You May Want to Use Higher Bit Rates for Stereo
Stereo Encoding for Geeks
• Dual Channel or Independent Channel (IC)
- Entirely Separate Left and Right
• But Most L/R Information is Redundant
• Intensity Stereo (IS)
• Mid/Side Stereo (MS)
• Joint Stereo (JS) Allows IS/MS Combination
Stereo Encoding (Even Geekier)
• JS Encodes L+R and L-R
• If L=R then L-R=0
• Since Bit Rate is Constant, L=R Uses Fewer Bits for Stereo Information
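The L+R / L-R idea can be sketched on plain lists of samples. The division by 2 is a scaling assumption added here to keep values in range. An actual encoder applies the transform to its internal frequency data, not to raw samples like this.

```python
def to_mid_side(left, right):
    """Convert L/R sample lists to mid (L+R)/2 and side (L-R)/2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Recover L = M + S and R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [0.25, -0.5, 0.75]
mid, side = to_mid_side(left, left)  # digitally identical channels
print(side)                          # [0.0, 0.0, 0.0]
```

With identical channels the side signal is all zeros, so nearly the whole bit budget can go to the mid signal.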
Stereo Encoding in Practice
• Stereo vs. Mono (not Music vs. Voice) is a Good Reason to Use Higher Bit Rates
• Greater Separation Suggests Higher Rates
• If Mostly Speech, Consider 100% Mono
• If Mono, Make L&R Digitally Identical
• Always Encode in Stereo for Compatibility
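"100% mono" inside a stereo file amounts to mixing both channels together and writing the same samples to left and right. A minimal sketch, assuming samples are floats in the range -1.0 to 1.0 and using a hypothetical helper name:

```python
def make_dual_mono(left, right):
    """Mix L and R to mono, then return two identical channels."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    return list(mono), list(mono)

new_left, new_right = make_dual_mono([0.5, -0.25], [0.25, 0.25])
print(new_left)               # [0.375, 0.0]
print(new_left == new_right)  # True
```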
Mastering for MP3
• Help the Encoder: Eliminate Unnecessary Data
- High-Pass Filter at 80Hz
- Low-Pass Filter at 11kHz (@64kbps encoding)
- Normalize
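The two filters can be sketched as first-order (RC-style) filters in plain Python. This is an illustration of the idea only: the first-order design is an assumption made here, and mastering tools typically use much steeper filters.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass: attenuates content below cutoff_hz."""
    if not samples:
        return []
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order low-pass: attenuates content above cutoff_hz."""
    if not samples:
        return []
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out = [alpha * samples[0]]
    for x in samples[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out

rate = 44_100
dc_offset = [1.0] * rate  # one second of 0 Hz content, i.e. pure "unnecessary data"
cleaned = low_pass(high_pass(dc_offset, 80, rate), 11_000, rate)
print(abs(cleaned[-1]) < 1e-6)  # True: the offset has been filtered away
```

Content the filters remove is content the encoder no longer has to spend bits on.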
Which is Louder?
• It’s Not the Height of the Peaks (voltage)
• It’s the Area Under the Curve (power)
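The difference between peak height and power can be shown with two made-up signals. The sample values below are assumptions chosen for illustration: one has the taller peak, the other has more power.

```python
import math

def peak(samples):
    """Largest absolute sample value (the height of the peaks)."""
    return max(abs(s) for s in samples)

def rms(samples):
    """Root mean square, a simple stand-in for power (the area under the curve)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

spiky = [1.0] + [0.1] * 99  # one tall peak, otherwise quiet
dense = [0.8, -0.8] * 50    # lower peaks, but consistently strong

print(peak(spiky), round(rms(spiky), 3))  # 1.0 0.141
print(peak(dense), round(rms(dense), 3))  # 0.8 0.8
```

The spiky signal wins on peak level, yet the dense one carries far more power and will sound louder.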
Loudness
• What’s the Standard?
• We Asked:
- Podcasters
- Audio Engineers
- Radio Engineers
• Answer: There Isn’t One
• It’s a Hard Problem to Solve
Normalization
• Peak Normalization (common)
- Maximizes Voltage, not Power
• RMS Normalization
- Maximizes Power (=Loudness)
• Determine a Standard Loudness Level
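The two normalization styles differ only in what they measure before applying gain. A minimal sketch follows. The target levels are placeholder assumptions, since the slides leave the standard level for each producer to pick.

```python
import math

def rms(samples):
    """Root mean square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def peak_normalize(samples, target_peak=1.0):
    """Scale so the largest absolute sample equals target_peak."""
    level = max(abs(s) for s in samples)
    if level == 0:
        return list(samples)  # silence: nothing to scale
    return [s * target_peak / level for s in samples]

def rms_normalize(samples, target_rms=0.1):
    """Scale so the RMS level equals target_rms (an assumed house level)."""
    level = rms(samples)
    if level == 0:
        return list(samples)  # silence: nothing to scale
    return [s * target_rms / level for s in samples]

quiet_talk = [0.02, -0.03, 0.05, -0.01, 0.04, -0.02]
by_peak = peak_normalize(quiet_talk)
by_rms = rms_normalize(quiet_talk)
print(round(max(abs(s) for s in by_peak), 6))  # 1.0
print(round(rms(by_rms), 6))                   # 0.1
```

RMS normalization can push individual peaks past full scale, so in practice it is usually paired with a limiter or followed by a peak check.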
Avoid Recording to MP3!
• MP3 is a final/release format.
• Not designed to be decoded and re-encoded.
• Use MP2 Instead...
• or the highest MP3 bit rate possible.
AAC/M4B Files?
• Yes, AAC is Better Than MP3
• We Added AAC to Support iPod Bookmarks
• Painful: Only iTunes Could Encode M4B
• Doubled Much of Our Workflow
• Can’t Be Easily Assembled
MP2: Why and When?
• MPEG-1 Layer 2
• Designed as an Intermediate Format
• The Standard in Broadcast Radio
• 128kbps per Track
• 44.1kHz Sample Rate Preferred
Audio Lessons Learned
• MP3 Options
• Audio-File Myths
• RMS Normalization (Loudness)
• AAC/M4B Files (iTunes & iPods)
• MP2 Files
To Summarize
• Record at 44.1kHz Sample Rate (not in MP3!)
• Mastering
- RMS Normalization (Pick a Standard Level)
- 80Hz High-Pass, 11kHz Low-Pass (for voice)
- If Mono, Make L&R Digitally Identical
• Encoding
- 64kbps when L=R
- Consider ≥96kbps for L≠R
- Always Use Joint Stereo
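The encoding rules in the summary can be written down as a small decision helper. The function names and the settings dictionary are hypothetical and not tied to any particular encoder. 96kbps is used as the floor of the "≥96kbps" suggestion.

```python
def suggest_bit_rate_kbps(left, right):
    """64 kbps when the channels are digitally identical, otherwise 96 kbps."""
    return 64 if list(left) == list(right) else 96

def encoding_settings(left, right):
    """Collect the summary's recommendations into one settings dictionary."""
    return {
        "sample_rate_hz": 44_100,
        "bit_rate_kbps": suggest_bit_rate_kbps(left, right),
        "stereo_mode": "joint",
    }

voice = [0.1, -0.2, 0.3]
print(encoding_settings(voice, voice)["bit_rate_kbps"])            # 64
print(encoding_settings(voice, [0.0, 0.1, 0.2])["bit_rate_kbps"])  # 96
```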