Post on 29-Aug-2018
Opus, the Swiss Army Knife of Audio codecs
Jean-Marc Valin, Koen Vos, Timothy B. Terriberry, Gregory Maxwell
Mozilla, Xiph.Org Foundation
2
What Is the Opus Codec?
● IETF standard under development
● Targets interactive audio over the Internet
● Aims to be royalty-free: BSD code with free license to all patents
● Effort involves: Xiph.Org, Mozilla, Skype, Octasic, Broadcom and more
● Combination of the SILK and CELT codecs
3
History
● January 2007: SILK codec gets started at Skype
● November 2007: CELT codec gets started
● January 2009: CELT presented at LCA
● March 2009: Skype asks IETF to create a WG to standardize an “Internet wideband audio codec” (SILK)
● February 2010: After heated debate, IETF codec working group created
● July 2010: First prototype of a SILK+CELT hybrid codec
● March 2011: Opus beats HE-AAC and Vorbis in HA test
● Nov 2011: WGLC, last minor bitstream changes
4
Characteristics
● Sampling rate: 8 – 48 kHz (narrowband-fullband)
● Bitrates: 6 – 510 kb/s
● Frame sizes: 2.5 – 20 ms
● Mono and stereo support
● Speech and music support
● Seamless switching between all of the above
● It just works for everything
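The ranges above translate directly into per-frame sizes. The following is a small arithmetic sketch, not part of the original slides; the helper names `frame_samples` and `frame_bytes` are hypothetical, and the 5 ms and 10 ms values are included only as illustrative points inside the stated 2.5 – 20 ms range.

```python
def frame_samples(sample_rate_hz: int, frame_ms: float) -> int:
    """Number of PCM samples per channel covered by one frame."""
    return round(sample_rate_hz * frame_ms / 1000)


def frame_bytes(bitrate_bps: int, frame_ms: float) -> float:
    """Average compressed size of one frame, in bytes, at a given bitrate."""
    return bitrate_bps * frame_ms / 1000 / 8


if __name__ == "__main__":
    # Fullband (48 kHz) audio at 64 kb/s for a few frame sizes.
    for ms in (2.5, 5, 10, 20):
        print(ms, "ms:", frame_samples(48000, ms), "samples,",
              frame_bytes(64000, ms), "bytes")
```

Shorter frames lower the delay but leave fewer bytes per packet, which is the trade-off behind the frame-size range.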
5
Codec Landscape
[Figure: codec landscape. Axes: bitrate (kbps/channel) and delay (ms). Bandwidth regions: narrowband, wideband, > wideband (phone quality to high fidelity). Delay regions: storage vs. real-time (live). Codecs shown: Vorbis, AAC, MP3, AMR-WB+, AAC-LD, Opus, G.729, G.729.1, Speex (NB, WB), G.722.1C.]
6
Applications
● VoIP and videoconference
● Music/video streaming and storage
● Remote music jamming
● Wireless speakers/headphones/mic
● Audio books
● Virtualization/sound servers
● Everything except:
– Lossless (use FLAC)
– Ultra low bitrate satellite/ham radio (use codec2)
7
Architecture
● Three operating modes:
– SILK-only (speech up to wideband)
– Hybrid (super-wideband/fullband speech)
– CELT-only (music)
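The three modes can be read as a simple decision on content type and audio bandwidth. The sketch below only restates the slide; `choose_mode` and the `BANDWIDTHS` list are hypothetical names, and a real encoder is expected to weigh further factors (such as bitrate and frame size) that are not modeled here.

```python
SILK_ONLY, HYBRID, CELT_ONLY = "SILK-only", "Hybrid", "CELT-only"

# Audio bandwidth classes named in the slides, narrowest first.
BANDWIDTHS = ["narrowband", "wideband", "super-wideband", "fullband"]


def choose_mode(is_speech: bool, bandwidth: str) -> str:
    """Pick an operating mode following the slide's three-way split."""
    if bandwidth not in BANDWIDTHS:
        raise ValueError(f"unknown bandwidth: {bandwidth}")
    if not is_speech:
        return CELT_ONLY  # music
    if BANDWIDTHS.index(bandwidth) <= BANDWIDTHS.index("wideband"):
        return SILK_ONLY  # speech up to wideband
    return HYBRID         # super-wideband/fullband speech


if __name__ == "__main__":
    for speech in (True, False):
        for bw in BANDWIDTHS:
            print(speech, bw, "->", choose_mode(speech, bw))
```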
8
Technology (SILK)
● Speech codec
● Based on linear prediction (LPC)
– A bit like Speex, but much better
● Very good at coding narrowband and wideband speech
– Up to ~32 kb/s
● Not very good on music
● Heavily modified to integrate within Opus
– Not compatible with the original SILK codec
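To make "linear prediction" concrete, here is a generic textbook sketch of LPC analysis using the autocorrelation method and the Levinson-Durbin recursion. It is not SILK's actual analysis; the function names and the test signal are assumptions chosen for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n - lag], for lag = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) so that x[n] is predicted by sum_k a[k-1] * x[n-k]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy so far
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err        # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def prediction_residual(x, coeffs):
    """What is left of x after subtracting the linear prediction."""
    order = len(coeffs)
    return [x[n] - sum(coeffs[k] * x[n - 1 - k] for k in range(order))
            for n in range(order, len(x))]


if __name__ == "__main__":
    signal = [math.sin(0.3 * n) for n in range(400)]
    coeffs, err = levinson_durbin(autocorrelation(signal, 2), 2)
    residual = prediction_residual(signal, coeffs)
    print("coefficients:", coeffs)
    print("signal energy:", sum(v * v for v in signal))
    print("residual energy:", sum(v * v for v in residual))
```

A predictable signal leaves a residual with much less energy than the input, which is what makes the residual cheaper to code.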
9
Technology (CELT)
● “Constrained-Energy Lapped Transform”
● Speech+music codec
– Can work with very low delay
● Uses modified discrete cosine transform (MDCT)
● Most efficient on fullband (48 kHz) audio
– Useful for 40 kb/s and above
● Not very good on low bit-rate speech
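As a reminder of how a lapped transform works, the sketch below implements a direct (slow, O(N²)) MDCT and inverse MDCT with a sine window and 50% overlap-add. It is a textbook formulation, not CELT's implementation: the sine window, the scaling convention, and all function names are assumptions made for this example.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_coeffs))
            for n in range(2 * n_coeffs)]


def mdct(frame):
    """2N windowed samples -> N coefficients."""
    N = len(frame) // 2
    return [sum(frame[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs):
    """N coefficients -> 2N time-aliased samples (to be windowed and overlap-added)."""
    N = len(coeffs)
    return [2.0 / N * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                          for k in range(N))
            for n in range(2 * N)]


def analyze_synthesize(signal, n_coeffs):
    """Window, transform, inverse-transform, window again, and overlap-add."""
    w = sine_window(n_coeffs)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coeffs + 1, n_coeffs):
        frame = [signal[start + n] * w[n] for n in range(2 * n_coeffs)]
        rec = imdct(mdct(frame))
        for n in range(2 * n_coeffs):
            out[start + n] += rec[n] * w[n]
    return out


if __name__ == "__main__":
    N = 8
    x = [math.sin(0.2 * n) + 0.3 * math.cos(1.1 * n) for n in range(8 * N)]
    y = analyze_synthesize(x, N)
    # Samples covered by two overlapping frames are reconstructed.
    print(max(abs(a - b) for a, b in zip(x[N:-N], y[N:-N])))
```

Each frame overlaps its neighbours by half, yet only N coefficients are produced per N new samples; the aliasing introduced by one frame is cancelled by the next.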
10
CELT Overview
● Transform codec (MDCT)
– Long blocks up to 20 ms, short blocks of 2.5 ms
● Key is preserving the energy in each Bark band
● Algebraic VQ for band “details”
● Minimal side information
[Figure: CELT encoder/decoder block diagram. Labels: Input, Pre-filter, Window, MDCT, band energy (Q1), residual (Q2), MDCT⁻¹, WOLA, Post-filter, Output; side information (period and gain).]
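The idea of coding each band's energy separately from its "details" can be sketched as a gain-shape split. The code below is a simplified illustration under stated assumptions: the energy is passed through unquantized, the shape quantizer is a greedy search that places a fixed number of unit pulses, and every function name is hypothetical. It is not the codec's actual quantizer.

```python
import math


def band_energy(band):
    """Square root of the summed squares of the band's coefficients."""
    return math.sqrt(sum(c * c for c in band))


def pvq_quantize(shape, pulses):
    """Greedy search: place `pulses` unit pulses to best match `shape`."""
    magnitudes = [0] * len(shape)
    for _ in range(pulses):
        best_index, best_score = 0, -1.0
        for i in range(len(shape)):
            magnitudes[i] += 1  # try one more pulse at position i
            corr = sum(abs(s) * m for s, m in zip(shape, magnitudes))
            score = corr / math.sqrt(sum(m * m for m in magnitudes))
            magnitudes[i] -= 1
            if score > best_score:
                best_index, best_score = i, score
        magnitudes[best_index] += 1
    # Copy the sign of each input coefficient onto its pulse count.
    return [m if s >= 0 else -m for s, m in zip(shape, magnitudes)]


def encode_band(band, pulses):
    """Split a band into (energy, integer pulse vector)."""
    energy = band_energy(band)
    if energy == 0.0:
        return 0.0, [0] * len(band)
    shape = [c / energy for c in band]
    return energy, pvq_quantize(shape, pulses)


def decode_band(energy, codeword):
    """Rescale the pulse vector to unit norm, then apply the coded energy."""
    norm = math.sqrt(sum(v * v for v in codeword))
    if norm == 0.0:
        return [0.0] * len(codeword)
    return [energy * v / norm for v in codeword]


if __name__ == "__main__":
    band = [3.0, -4.0, 0.0, 1.0]
    energy, codeword = encode_band(band, pulses=4)
    decoded = decode_band(energy, codeword)
    print(codeword, decoded, band_energy(decoded), band_energy(band))
```

However coarse the shape is, the decoded band has exactly the coded energy, which is the property the slide calls the key.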
11
CELT Presentation, LCA 2009
13
Bitstream Changes
● Many changes required by Opus
– Changes to band layout
– 20 ms frames
● Static bit allocation tuning
– Stop starving the high frequencies
14
Static Bit Allocation Tuning
● Comparison for 64 kb/s stereo
15
Bitstream Changes
● Many changes required by Opus
– Changes to band layout
– 20 ms frames
● Static bit allocation tuning
– Stop starving the high frequencies
● Anti-collapse
16
Anti-Collapse
● Pre-echo avoidance can cause collapse
– Solution: fill holes with noise
[Figure: two panels, “No anti-collapse” and “With anti-collapse”]
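"Fill holes with noise" can be illustrated as follows: when a band's coefficients for one short block decode to all zeros, replace them with pseudo-random noise at a chosen energy. This is a minimal sketch, not the codec's actual rule; how the target energy is chosen is left as a parameter, the random generator is a plain LCG used only for reproducibility, and the function names are hypothetical.

```python
import math


def lcg_noise(seed, count):
    """Deterministic pseudo-random sequence of +1.0 / -1.0 values."""
    values = []
    for _ in range(count):
        seed = (1664525 * seed + 1013904223) % 2**32
        values.append(1.0 if seed & 0x8000 else -1.0)
    return values, seed


def anti_collapse(blocks, target_energy, seed=1):
    """blocks: one band's coefficients for each short block of a frame.

    Any block that decoded to all zeros is filled with noise whose
    L2 norm equals target_energy; other blocks are copied unchanged.
    """
    out = []
    for block in blocks:
        if all(c == 0.0 for c in block):
            noise, seed = lcg_noise(seed, len(block))
            scale = target_energy / math.sqrt(len(block))
            block = [scale * v for v in noise]
        out.append(list(block))
    return out


if __name__ == "__main__":
    band_blocks = [[0.5, -0.5], [0.0, 0.0], [0.1, 0.2]]
    print(anti_collapse(band_blocks, target_energy=0.25))
```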
17
Bitstream Changes
● Many changes required by Opus
– Changes to band layout
– 20 ms frames
● Static bit allocation tuning
– Stop starving the high frequencies
● Anti-collapse
● Per-band time-frequency modifications
– Long vs short blocks on a per-band basis
18
Time-Frequency Resolution
● Tones and transients can happen simultaneously
[Figure: four time vs. frequency tilings: good frequency resolution, good time resolution, standard short blocks, per-band TF resolution]
ΔT·Δf ≥ constant
(also known as Heisenberg's uncertainty principle)
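The trade-off can be checked with simple arithmetic for the long and short block sizes mentioned earlier (20 ms and 2.5 ms), assuming 48 kHz audio and an MDCT that yields one coefficient per new input sample. The helper name `mdct_resolution` is hypothetical.

```python
def mdct_resolution(sample_rate_hz, block_ms):
    """Return (coefficients per block, time step in s, bin spacing in Hz)."""
    n = round(sample_rate_hz * block_ms / 1000)  # coefficients per block
    dt = n / sample_rate_hz                      # time covered by one block
    df = sample_rate_hz / (2 * n)                # bins span 0..fs/2
    return n, dt, df


if __name__ == "__main__":
    for ms in (20, 2.5):
        n, dt, df = mdct_resolution(48000, ms)
        print(f"{ms} ms: {n} coefficients, dt = {dt} s, df = {df} Hz, "
              f"dt*df = {dt * df}")
```

Shortening the block by a factor of 8 improves time resolution by 8 and worsens frequency resolution by 8, so the product stays the same.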
19
Time-Frequency Resolution Example
[Figure: time-frequency resolution example; axes: time, frequency]
22
Dynamic Allocation
● CELT still has mostly static allocation
– Part of the bit-stream, tuned since 2009
● Now two ways to deviate from static allocation
– Allocation tilt
  ● Controls HF vs LF allocation trade-off
– Band boost
  ● Gives more bits to a band in particular
  ● WIP: Use for leakage compensation
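A rough way to picture the two deviations: start from a static per-band allocation, apply a linear tilt across bands, then add boosts to selected bands. The sketch below is purely illustrative; the function `adjust_allocation`, its units, and the linear form of the tilt are assumptions, and it does not renormalize to a bit budget the way a real encoder would have to.

```python
def adjust_allocation(static_bits, tilt=0.0, boosts=None):
    """Apply a tilt and per-band boosts to a static allocation.

    static_bits: bits per band, lowest-frequency band first.
    tilt: bits added per band step away from the centre band
          (positive favours high frequencies, negative favours low).
    boosts: optional {band_index: extra_bits}.
    """
    boosts = boosts or {}
    centre = (len(static_bits) - 1) / 2
    adjusted = []
    for band, bits in enumerate(static_bits):
        value = bits + tilt * (band - centre) + boosts.get(band, 0)
        adjusted.append(max(0.0, value))  # never allocate negative bits
    return adjusted


if __name__ == "__main__":
    static = [10, 10, 10, 10, 10]
    print(adjust_allocation(static, tilt=-1.5))       # favour low frequencies
    print(adjust_allocation(static, boosts={3: 6}))   # boost one band
```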
25
Stereo Coupling
● Three modes: Dual, mid-side, intensity
● Mid-side in the normalized domain
– Safe, cannot cause cross-talk or bad artefacts
– Based on preservation of the mid/side magnitude ratio
–
– Bit allocation depends on theta
● Same mechanism now used to split bands with more bits than largest codebook
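Mid-side coding of a normalized band can be sketched with a single angle. The example below assumes that theta is defined as atan2(‖side‖, ‖mid‖), which is an assumption made for illustration since the slide's formula did not survive extraction; the function names are hypothetical and no quantization is applied.

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def unit(v):
    n = norm(v)
    return [x / n for x in v] if n > 0.0 else list(v)


def encode_mid_side(left, right):
    """left/right: one band per channel, each already scaled to unit norm."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    # Assumed definition: the angle captures the mid/side magnitude ratio.
    theta = math.atan2(norm(side), norm(mid))
    return theta, unit(mid), unit(side)


def decode_mid_side(theta, mid_shape, side_shape):
    """Rebuild both channels from the angle and the two unit-norm shapes."""
    m = [math.cos(theta) * x for x in mid_shape]
    s = [math.sin(theta) * x for x in side_shape]
    left = [a + b for a, b in zip(m, s)]
    right = [a - b for a, b in zip(m, s)]
    return left, right


if __name__ == "__main__":
    left, right = [0.6, 0.8], [0.8, 0.6]
    theta, mid_shape, side_shape = encode_mid_side(left, right)
    print(theta, decode_mid_side(theta, mid_shape, side_shape))
```

For unit-norm inputs, ‖mid‖² + ‖side‖² = 1, so cos(theta) and sin(theta) recover both magnitudes and only the angle plus two unit-norm shapes need to be coded. A small theta means the band is nearly mono, which is why the bit split between mid and side can follow theta.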
28
Pitch prefilter/postfilter
● Contributed by Broadcom
● Shapes noise for highly harmonic content
[Figure: two panels, “Prefilter” and “Postfilter”]
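The prefilter/postfilter pair can be pictured as a comb filter and its inverse, driven by a pitch period and a gain. The sketch below uses a single tap per filter, which is a simplification chosen for this example; the function names are hypothetical and this is not the codec's actual filter.

```python
import math


def prefilter(x, period, gain):
    """Encoder side: attenuate the pitch harmonics (feed-forward comb)."""
    return [x[n] - gain * (x[n - period] if n >= period else 0.0)
            for n in range(len(x))]


def postfilter(y, period, gain):
    """Decoder side: restore the harmonics (feedback comb, inverse of prefilter)."""
    out = []
    for n in range(len(y)):
        past = out[n - period] if n >= period else 0.0
        out.append(y[n] + gain * past)
    return out


if __name__ == "__main__":
    period, gain = 40, 0.6
    x = [math.sin(2 * math.pi * n / period) for n in range(200)]
    restored = postfilter(prefilter(x, period, gain), period, gain)
    print(max(abs(a - b) for a, b in zip(x, restored)))
```

Because the postfilter boosts the harmonics and attenuates what lies between them, coding noise added between the two filters ends up shaped to follow the harmonic structure.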
29
Subjective Testing
● Comparison with other codecs
– AMR-NB, AMR-WB, Speex, Vorbis, AAC, ...
● Many tests performed during development
● Tests on the final version:
– Google (7 MUSHRA tests)
– Nokia (2 MOS tests)
– HydrogenAudio (ABC/HR test)
30
Google Tests
● Narrowband tests (English+Mandarin)
– Opus clearly better than Speex and iLBC
– Opus better than AMR-NB at 12 kb/s
● Wideband/fullband tests (English+Mandarin)
– Opus clearly better than Speex, G.722.1, G.719
– Opus better than AMR-WB at 20 kb/s
● Opus clearly better than MP3 on music, inconclusive with AAC
● No transcoding issues with AMR-NB/AMR-WB
31
Nokia (clean+noisy speech)
● Narrowband – fullband MOS speech test
Anssi Rämö, Henri Toukomaa, "Voice Quality Characterization of IETF Opus Codec", Proc. Interspeech, 2011.
32
HydrogenAudio
● 64 kb/s stereo music ABC/HR test
33
Demo
● Music at 64 kb/s
– u-law (G.711)
– Opus
– Reference
– MP3
● Bitrate sweep
– 8 kb/s to 64 kb/s
34
Current Development
● Tools
– Ogg encoder/decoder
– Matroska encoder/decoder
– Firefox support
● Quality improvements
– Better tuning of encoder decisions
– Improved unconstrained VBR
– Automatic speech/music detection
35
Coming Up
● IETF process
– IETF Last call
– RFC
● Industry adoption
– RTCWeb
– Browser support (streaming/HTML5)
– Skype
– World domination
36
Resources
● Website: http://www.opus-codec.org/
● Git repository: git://git.opus-codec.org/opus.git
● Mailing list: codec@ietf.org
● IETF website: http://www.ietf.org/
● IRC: #opus on irc.freenode.net
37
Questions?