VoIP QOS

Voice quality is critical to the business of the network

Voice over Internet Protocol Voice Quality of Service

This white paper addresses the four key elements in providing and maintaining good voice quality in a Voice over Internet Protocol network:

• Network design and configuration

• Call admission control

• Security

• Voice quality monitoring


Contents

Introduction
Network Design and Configuration
Call Admission Control
Security
Voice Quality Monitoring
Voice Quality Standards
Mean Opinion Scores
Perceptual Models
E-Model
Network Planning Using the E-Model
Voice Quality Monitoring
References
About the authors

Introduction

Voice over Internet Protocol (VoIP) is rapidly growing worldwide due to the new services that it can provide more easily as well as the cost savings derived from using a converged IP data network. However, as with any new technology, the methods and procedures used to design and manage the network must change to reflect new issues and constraints. One issue in particular that merits special attention is the fact that voice quality for calls that get admitted into the network can be degraded for many reasons. It is important not only to monitor the network to measure voice quality, but also to predict expected voice quality under various network conditions and traffic loads so that steps can be taken in advance to prevent potential problems.

Voice quality is critical to the business of the network. For service providers, poor voice quality may cause some customers to switch carriers and result in significant lost revenue. Another, more subtle point is that lower voice quality will eventually result in shorter average call holding times, as conversation is just not as natural. Some enterprise customers conduct a significant percentage of their business over the phone, and excellent voice quality is essential to the success of their business.

There are four key elements in providing and maintaining good voice quality in a VoIP network: network design and configuration, call admission control, security, and voice quality monitoring.

Network Design and Configuration

The first step in delivering good voice quality is to provide enough capacity in the network to handle the voice calls. This starts with a forecast of point-to-point traffic loads for voice as well as the other data traffic on the network, and of the performance and restoration requirements for each type of traffic. It is important that standards-based methods are used to reserve bandwidth on each link for voice calls and that the voice packets be marked accordingly. Policing is required at the edge of the network to guarantee that packets are marked correctly and that traffic is within contract limits. In addition, Multi-Protocol Label Switching (MPLS) should be used for traffic engineering and restoration service. For larger networks, multi-class network design algorithms and tools will be used to plan a network that meets the performance and restoration requirements and minimizes capital expenditures. In addition, signaling traffic, softswitch and gateway sizing and placement, messaging servers, etc. need to be planned and implemented. For more details on VoIP network design, please refer to [1].
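As a rough illustration of the capacity step, the sketch below estimates the IP bandwidth of one call and the number of calls that fit in a link's voice reservation. The 40-byte IPv4/UDP/RTP header, the omission of layer-2 overhead by default, the absence of silence suppression, and the function names are all assumptions made for this example rather than figures from this paper.

```python
# Rough per-call bandwidth and link capacity for VoIP (illustrative only).

IP_UDP_RTP_HEADER_BYTES = 40  # 20 (IPv4) + 8 (UDP) + 12 (RTP), no header compression


def call_bandwidth_kbps(codec_kbps: float, packet_ms: float,
                        l2_overhead_bytes: int = 0) -> float:
    """One-way bandwidth of a single call in kbit/s, headers included."""
    payload_bytes = codec_kbps * packet_ms / 8.0  # kbit/s * ms = bits; / 8 = bytes
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    return packet_bytes * 8.0 / packet_ms  # bits per ms equals kbit/s


def max_calls(voice_reservation_kbps: float, per_call_kbps: float) -> int:
    """Whole number of simultaneous calls that fit in the reserved bandwidth."""
    return int(voice_reservation_kbps // per_call_kbps)
```

Under these assumptions, G.711 with 20 ms packets needs 80 kbit/s per call, so a 10 Mbit/s voice reservation carries 125 simultaneous calls; G.729 with 20 ms packets needs 24 kbit/s per call.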

Call Admission Control

Another key element for providing Quality of Service (QoS) for VoIP is the Call Admission Control (CAC) capabilities of the Session Management / Call Session Control Function or gateway. Even though the network may be designed to meet a given performance and restoration objective for the engineered traffic loads, the actual traffic may be significantly higher. While famous overloads, such as those during September 11 or after the Loma Prieta earthquake, which struck the San Francisco Bay Area during the World Series, may come to mind as extraordinary events, there are significant traffic overloads on a localized portion of the network on a regular basis. Without a CAC function in a VoIP network during overloads, links become congested and new calls keep getting admitted. All calls in progress, not just the new calls, start dropping packets and experiencing longer delays. Contrast this with a circuit-switched network, where new calls get blocked but calls in progress experience good call quality. In the VoIP case, packet loss could become large enough that calls become unintelligible, callers hang up, and most will reattempt. Historical studies have shown reattempt rates of about 80%, independent of the number of previous reattempts. This can cause the number of call attempts in the network to skyrocket and cause significant switch processor overloads. The net effect is a serious degradation or even total collapse of the network. Disasters and emergencies that might cause such an overload are exactly the times when the proper operation of this critical national infrastructure matters most.

Reference [2] gives an overview of potential CAC approaches, highlighting four basic alternatives: endpoint performance measurements, path-based bandwidth management, link-based bandwidth management, and per-call bandwidth reservation. That paper recommends a link bandwidth management approach for its scalability and efficacy. Reference [3] demonstrates that the endpoint measurement approach can perform quite well under some circumstances. In addition, [4] gives end-to-end VoIP architectures and methods to guarantee Absolute QoS. In fact, patents associated with this reference were named among the most important patents of the year 2003 by the MIT Technology Review. There are many alternatives that meet different customer needs, and a proper choice requires some careful network modeling.
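The effect of the reattempt rate quoted above can be made concrete with a little arithmetic. The sketch below assumes that every attempt fails with the same probability and that a caller retries with a fixed probability after each failure, regardless of earlier tries; the constant failure probability and the function name are assumptions of this example.

```python
def expected_attempts(failure_prob: float, reattempt_prob: float = 0.8) -> float:
    """Mean number of attempts generated by one original call.

    Each attempt fails with probability failure_prob. After a failure the
    caller retries with probability reattempt_prob, independent of how many
    times they have already tried, so the number of attempts is geometric
    with continuation probability failure_prob * reattempt_prob.
    """
    q = failure_prob * reattempt_prob
    if not 0.0 <= q < 1.0:
        raise ValueError("failure_prob * reattempt_prob must be in [0, 1)")
    return 1.0 / (1.0 - q)
```

If every attempt fails during an overload, an 80% reattempt rate turns each original call into five attempts on average, which is the kind of load amplification that can overwhelm switch processors.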

Security

Another key element in ensuring voice quality is security. With new architectures and protocols riding on a shared network, there are many new security threats. The main VoIP security concerns, and the guarantees each one calls for, include:

• Theft of service

– Only authorized subscribers can receive service

• Integrity

– Caller identity is verified

– Calls will be routed to the intended party

– Call content will not be tampered with

• Confidentiality

– Call content is kept private

• Availability

– Authorized subscribers should be able to receive service

– The underlying network and the servers should be highly available

• Spam

– Undesired calls should be restricted

The International Telecommunication Union (ITU) standard X.805, based on a Bell Labs model, provides a comprehensive framework upon which a VoIP security audit could be performed. Reference [5] provides more details surrounding the issues of VoIP security.

Voice Quality Monitoring

The last element of voice quality is the ability to predict and monitor voice quality in an IP network. This is the main focus of the remainder of this paper, which is organized as follows. Section 2 will review the definitions of voice quality and various standards that have been adopted. Section 3 shows how the E-Model can be used for planning the voice quality in a network, while Section 4 shows how the E-Model, supplemented with some end-to-end measurements, can be used to monitor the voice quality in an operational network.

Voice Quality Standards

In circuit-switched networks, good voice quality, for those calls admitted into the network, was generally a given. However, IP networks introduce many new sources of distortion that can degrade voice quality. Before adding new services or network components to the existing network infrastructure, a service provider needs to determine the potential impact of the changes.

The end-to-end call quality consists of voice quality, call setup time, call blocking rate, call tear down time, and other call or service related defects. After a call is properly set up, voice quality is probably the most important characteristic, and it must be maintained end to end for the entire call duration.

Voice quality can be affected by various impairment factors such as codecs, delay, and packet loss. Such impairments are caused by the configuration of network equipment, network performance, and routing path of calls. Among them, the network performance must be monitored continuously because it changes dynamically. A well-managed network is necessary to provide the desired level of VoIP service. If the voice quality falls below the desired level, it is necessary to perform root-cause analysis based on measurements of network performance. Once the causes of the degraded voice quality are diagnosed, the problems must be fixed, and it is important to verify that the solution really fixed them and did not cause any new problems.

Mean Opinion Scores

Voice quality is a subjective measure of how individual users perceive the speech quality and ease of conversing. The gold standard for measuring voice quality is specified by ITU Recommendation P.800 and is known as the Mean Opinion Score (MOS). The recommendation defines a method to derive a mean opinion score of voice quality after collecting scores between 1 (bad) and 5 (excellent) from human listeners (see Figure 1). This is a form of subjective testing because human listeners are involved. In subjective testing, subjects (human listeners) are required to classify the perceived quality into categories (excellent, good, fair, poor, bad). The MOS may differ from one subjective experiment to another, even for the same condition, depending on the design of the experiment, the range of conditions included in the study, etc.
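The scoring step itself is simple arithmetic, as the sketch below shows for one test condition. The 95% confidence half-width uses a normal approximation and is an addition made for illustration, not part of the procedure described in this paper; it is included only to show why a small panel gives a statistically weak result.

```python
import math
import statistics


def mean_opinion_score(scores):
    """Return (MOS, 95% confidence half-width) for listener scores from 1 to 5."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("scores must be integers from 1 to 5")
    mos = statistics.mean(scores)
    # Normal approximation; only reasonable for a fairly large listener panel.
    half_width = 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))
    return mos, half_width
```

With only five listeners scoring 4, 4, 5, 3, 4, the MOS is 4.0 but the half-width is about 0.6, far too wide to separate "good" from "fair" reliably.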

Figure 1: MOS Score Ratings

Rating   Speech Quality   Level of Distortion
5        Excellent        Imperceptible
4        Good             Just perceptible but not annoying
3        Fair             Perceptible and slightly annoying
2        Poor             Annoying but not objectionable
1        Bad              Very annoying and objectionable

MOS of 4.0 = Toll Quality

A rating of 4.0 or higher is often referred to as “toll quality” even though many Public Switched Telephone Network (PSTN) connections would be rated at about 4.3. The measurements have to be done very carefully in a lab setting and require many subjects to be statistically valid. Thus this test may be useful in rating specific pieces of equipment or a stable reference connection, but it is expensive, time-consuming and inappropriate for general network measurements.

Perceptual Models

There has been great interest in developing objective measures of voice quality that approximate the subjective human measures and could be deployed in a network setting. In the mid-1990s, the ITU began to standardize objective speech quality measures designed to estimate subjective voice quality. A robust objective speech quality measure should correlate well with subjective speech quality. There are two types of objective speech quality measures: perceptual models and the E-Model. Perceptual models estimate the voice quality by comparing the received speech signal to the sent speech signal in a psychoacoustic domain (see Figure 2).

Figure 2: Perceptual Model Diagram

[Figure 2 diagram labels: Encoder, Packetizer, IP Network, Jitter buffer, De-Packetizer, Decoder; a packet stream of talkspurts separated by silence; Source Codec Impairment; a Perceptual Models block producing a “MOS” estimate on a 1 to 5 scale.]

The perceptual models focus on the effects of one-way speech distortion and they do not consider other impairments related to two-way interaction such as delay. The perceptual models are not scalable because they need to inject the speech samples at one end point and receive them at another end point in order to measure voice quality between two end points. If the voice quality becomes degraded, the perceptual models do not show the causes of degradations. These measures only get a snapshot of system performance by monitoring synthetic calls or average calls, not “real” calls. Additionally, by adding synthetic calls on the network, these measures can exacerbate conditions being tested by increasing load on the network. This tends to make the perceptual models more suitable for lab or prototype environments for capacity planning type activities.

One of the first perceptual models standardized by the ITU was the Perceptual Speech Quality Measure (PSQM) model, P.861. It quickly became apparent that the PSQM model had many problems and was not accurate enough for use in VoIP networks. A competition was held among the many competing algorithms and Perceptual Evaluation of Speech Quality (PESQ) was the clear winner [6]. This was adopted as ITU Recommendation P.862 and made the P.861 standard obsolete.

PESQ, while achieving some success, has been shown to lack the desired accuracy needed to, for example, determine whether a given Service Level Agreement (SLA) had been met. Pennock [7] has shown that there are limitations to using PESQ for verification of speech quality performance, competitive analysis, and system optimization.

E-Model

The ITU has developed another class of objective measures, known as the E-Model and specified in Recommendation G.107 [8]. The E-Model is a tool for predicting how an “average user” would rate the voice quality of a phone call with known characterizing transmission parameters. It estimates the user satisfaction of a narrowband, handset conversation, as perceived by the listener. The E-Model calculates the transmission rating factor R, using the network impairment factors, which were obtained after an extensive set of subjective experiments. Typical network impairment factors used in VoIP are codecs, delay, and packet loss. After computing the R-value based on the impairment factors, the R-value is converted into a MOS. Since the E-Model is based on the measurements of impairments, it is appropriate for root-cause analysis in terms of impairment factors as well as network segments, and can be easily incorporated within the Network Management System. The E-Model is also scalable because it does not require the speech samples between many pairs of nodes to estimate the voice quality.
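The conversion from R to MOS is a closed-form expression. The sketch below uses the cubic mapping published with Recommendation G.107; this paper does not reproduce the formula, so treat the coefficients as taken from the Recommendation rather than from the authors, and the function name as hypothetical.

```python
def r_to_mos(r: float) -> float:
    """Map a transmission rating factor R to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6
```

Under this mapping a rating of 93.19 corresponds to a MOS of about 4.41, and R = 80 corresponds to about 4.02, which is consistent with treating a MOS of 4.0 as toll quality.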

The E-Model consists of several models that relate specific impairment parameters and their interactions to end-to-end performance. The total end-to-end performance, taking into account all factors, is estimated using the Impairment Factor method. The equation for the transmission rating factor R is:

R = R0 – Is – Id – Ie + A

Where,

• R0: the basic signal-to-noise ratio based on sender and receiver loudness ratings and the circuit and room noise

• Is: the sum of real-time or simultaneous speech transmission impairments, e.g., loudness levels, sidetone and PCM quantizing distortion

• Id: the sum of delay impairments relative to the speech signal, e.g., talker echo, listener echo and absolute delay

• Ie: the equipment impairment factor for special equipment, e.g., low bit-rate coding (determined subjectively for each codec and for each % packet loss and documented in Appendix I to ITU-T Recommendation G.113) [9]

• A: the advantage factor, which adds to the total and improves the R-value for new services.
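The equation above can be evaluated directly once the terms are known. The sketch below lumps R0 – Is into a single default of 93.19, the reference-connection value quoted in this paper; treating that figure as the starting point before Id, Ie and A are applied is a simplifying assumption of this example, and the function and parameter names are hypothetical.

```python
# Assumption: R0 - Is under default settings, taken as the paper's 93.19.
DEFAULT_R0_MINUS_IS = 93.19


def transmission_rating(id_delay: float = 0.0, ie_equipment: float = 0.0,
                        advantage: float = 0.0,
                        r0_minus_is: float = DEFAULT_R0_MINUS_IS) -> float:
    """R = R0 - Is - Id - Ie + A, with R0 - Is supplied as one lumped term."""
    return r0_minus_is - id_delay - ie_equipment + advantage
```

For example, an equipment impairment of Ie = 11 with no delay impairment and no advantage factor gives R = 82.19 under these assumptions.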

Assuming that echo is properly controlled by echo cancellation modules, let us look at the impairments of the E-Model in terms of delay, codec impairments, and packet loss.

The curve in Figure 3 plots the transmission rating factor R versus one-way delay for the reference connection. The right-hand side of Figure 3 includes the “User Satisfaction” scale for reference. The reference connection curve uses the E-Model default values (which give R = 93.19) for all parameters except the variable delay. This gives the best possible performance for a narrowband handset conversation over the range of one-way delay, and therefore will be used as the “relative reference”. Based on this curve, if delay is the only VoIP impairment, a “very satisfied” rating requires a one-way delay of less than about 140 ms.

Figure 3. Delay impairment of Reference Connection

[Figure 3: transmission rating factor R (vertical axis, 50 to 100) versus one-way delay in ms (horizontal axis, 0 to 500), with a “User Satisfaction” scale on the right: Very satisfactory, Satisfactory, Some users dissatisfied, Many users dissatisfied, Nearly all users dissatisfied.]

The E-Model is flexible enough to deal with the impairments introduced by speech codecs and packet loss via the equipment impairment factor (Ie). Ie values for several codecs are listed in ITU-T Recommendation G.113, and Table 1 lists a few of them. The Ie values in Table 1 were determined in subjective experiments with ideal software implementations of the codecs; the performance provided by commercial codecs may vary.

Table 1. Speech codecs and their Ie values

Codec Type   Codec            Bit Rate (kbps)   Ie Value
PCM          G.711            64                0
ADPCM        G.726            40                2
ADPCM        G.726            32                7
ADPCM        G.726            24                25
LD-CELP      G.728            16                7
CS-ACELP     G.729-A + VAD    8                 11
RPE-LTP      GSM-Full Rate    13                20
VSELP        GSM-Half Rate    5.6               23
ACELP        GSM-EFR          12.2              5
ACELP        G.723.1          5.3               19
MP-MLQ       G.723.1          6.3               15

Packet loss impacts various codecs in different ways and must be evaluated with subjective testing to compute the impairment factors. For example, the equipment impairment factors (Ie) for G.711, G.729A, and G.723.1 codecs under conditions of packet loss are listed in Table 2. A packet loss concealment (PLC) algorithm is strongly recommended when a G.711 codec is used under conditions of packet loss.

Table 2. Packet loss impairments and their Ie values

Packet     G.711         G.711 + PLC    G.711 + PLC    G.729A + VAD   G.723.1 + VAD
Loss (%)   without PLC   random loss    bursty loss    8 kbps         6.3 kbps
0          0             0              0              11             15
1          25            5              5              15             19
2          35            7              7              19             24
3          45            10             10             23             27
4          -             -              -              26             32
5          55            15             30             -              -

Notes: the G.711 + PLC columns are for 10 ms speech packets; G.729A uses 2 speech frames per packet; G.723.1 uses 1 speech frame per packet. A "-" marks a value not listed.

Network Planning Using the E-Model

Using the information from the above section, one can use the E-Model to plan new VoIP networks or predict the impact of various potential changes. From the network design, router configurations, VoIP traffic model, and the call routing, one can estimate the delay and packet loss across the network using approximate queueing models. Thus trade-offs can be explored between equipment/codec choices and network configurations to deliver good quality voice. For example, Figure 4 shows the R-factor as a function of delay for various codec choices.

Figure 4: Voice Quality Versus Delay For Different Codec Choices

[Figure 4: R-Value (vertical axis, 50 to 100) versus one-way delay in ms (horizontal axis, 0 to 500), with curves labeled G.711 (64K); G.726 (32K), G.728 (16K); G.729 (8K); and G.723.1 (6.3K), and the toll-quality level marked.]

Thus we see that the question, “What is the delay requirement in a VoIP network?” is not so well-defined. For a fixed voice quality measure, we get different delay requirements depending on the codec used. In this figure, all other factors were assumed to be at their nominal ideal values. With other impairments, voice quality could be worse. In particular, packet loss can vary dynamically and needs to be monitored. Figure 5 shows an example with G.729A and silence suppression for different packet loss assumptions. Here we see that in a network with no packet loss, the delay requirement would be about 120 ms, and if there were 1% packet loss then no delay could be tolerated. Since some delay is always present, such a network can no longer provide toll quality voice. Of course, these performance estimates and the E-Model are approximate measures of voice quality, but they give a reasonable assessment of future performance. Thus the E-Model is a handy way to evaluate design trade-offs when planning a new VoIP network or major changes to an existing one: one can evaluate design parameters such as codec choice, packet size, link utilization, and jitter buffer design in terms of expected voice quality. It should be noted that a large jitter buffer will result in excessive delay, thus reducing the voice quality, while a jitter buffer that is too small will result in larger packet loss. Most equipment vendors implement some form of dynamic jitter buffer to account for this problem.
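To see how packet size, link utilization and jitter buffer size feed into the one-way delay used above, a simple delay budget can be sketched. This paper only says that approximate queueing models are used; the choice of an M/D/1 queue per hop (Poisson arrivals, fixed-size packets, a single traffic class) and all parameter names below are assumptions of this example.

```python
def md1_wait_ms(utilization: float, service_ms: float) -> float:
    """Mean queueing wait of an M/D/1 queue: rho * S / (2 * (1 - rho))."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization * service_ms / (2.0 * (1.0 - utilization))


def one_way_delay_ms(packet_ms: float, codec_lookahead_ms: float, hops: int,
                     link_kbps: float, packet_bytes: float, utilization: float,
                     propagation_ms: float, jitter_buffer_ms: float) -> float:
    """Sum of packetization, codec, per-hop, propagation and jitter-buffer delay."""
    service_ms = packet_bytes * 8.0 / link_kbps  # bits / (kbit/s) = ms
    per_hop_ms = service_ms + md1_wait_ms(utilization, service_ms)
    return (packet_ms + codec_lookahead_ms + hops * per_hop_ms
            + propagation_ms + jitter_buffer_ms)
```

As a hypothetical example, 20 ms packets of 60 bytes with 5 ms of codec look-ahead, five 1544 kbit/s hops at 70% utilization, 25 ms of propagation and a 40 ms jitter buffer come to roughly 93 ms one way, most of it from packetization and the jitter buffer rather than from queueing.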

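Figure 5: G.729A With Voice Activity Detection

[Figure 5: R-Value (vertical axis, 50 to 100) versus one-way delay in ms (horizontal axis, 0 to 500) for G.729A with voice activity detection at 0%, 1%, 2%, 3%, and 4% packet loss (PL), with the toll-quality level marked.]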
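The packet-loss comparison for G.729A can be checked roughly from the Table 2 values. The sketch below assumes a starting rating of 93.19 before delay and equipment impairments and takes R = 80 as the toll-quality level (the standard G.107 mapping puts R = 80 at a MOS of about 4.0); both figures, and the names used, are assumptions of this example rather than values read off the figure.

```python
DEFAULT_R = 93.19      # assumption: rating before Id and Ie are subtracted
TOLL_QUALITY_R = 80.0  # assumption: R of about 80 taken as a MOS of 4.0

# Ie for G.729A + VAD by packet loss (%), copied from Table 2.
IE_G729A_VAD = {0: 11, 1: 15, 2: 19, 3: 23, 4: 26}


def delay_impairment_headroom(ie: float) -> float:
    """Largest delay impairment Id that still keeps R at or above toll quality."""
    return DEFAULT_R - ie - TOLL_QUALITY_R


for loss_pct, ie in IE_G729A_VAD.items():
    headroom = delay_impairment_headroom(ie)
    verdict = "some delay tolerable" if headroom > 0 else "below toll quality at any delay"
    print(f"{loss_pct}% loss: Id headroom {headroom:+.2f} -> {verdict}")
```

Only the 0% loss case leaves positive headroom for delay impairment; at 1% loss the headroom is already negative, which matches the observation that no delay can be tolerated at that loss level.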

Voice Quality Monitoring

In an operating VoIP network it is no longer necessary to estimate delay and packet loss; they can be measured directly. The same E-Model can be used to monitor voice quality for any specified reference connection. Here a network management system would poll the routers for packet loss for the voice class of traffic on each interface (see Figure 6).
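A minimal sketch of the polling arithmetic follows. It assumes the management system reads two cumulative per-interface counters for the voice class (packets sent and packets dropped) at each poll, and that losses on different links are independent; the counter semantics and function names are hypothetical, and counter wrap-around is ignored.

```python
from math import prod


def interface_loss(prev_sent: int, prev_dropped: int,
                   cur_sent: int, cur_dropped: int) -> float:
    """Loss fraction over one polling interval from cumulative counters."""
    sent = cur_sent - prev_sent
    dropped = cur_dropped - prev_dropped
    offered = sent + dropped
    return dropped / offered if offered > 0 else 0.0


def path_loss(link_losses) -> float:
    """End-to-end loss along a path, assuming independent loss on each link."""
    return 1.0 - prod(1.0 - p for p in link_losses)
```

The resulting path loss for a reference connection is what would be used to look up the packet-loss-dependent Ie value for the codec in use.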

Figure 6: E-Model Analysis of VoIP Network

[Figure 6 diagram labels: Wireless Access, PSTN Access, DSL Access, Cable Access, and Enterprise LAN on each side; GW, TDM Switch, DSLAM, CMTS, and IP PBX; Edge Routers and Routers in the IP Network; Network Management System with E-Model.]

Delay measurements are available on some routers, for example through Cisco SAA. Otherwise probes will be required. In this case selected endpoint pairs will be used to measure delay across high priority paths. When links are showing high utilization or packet loss, paths that use them would be good candidates to monitor for delay and deteriorated performance. Since the E-Model is based on lab testing of specific equipment models, it would not know if a specific codec, for example, was not working properly. Thus in some cases it will be useful to supplement the E-Model results with some PESQ tests on a routine or on-demand basis.

Another item of interest is root-cause analysis when the voice quality is insufficient. The E-Model is ideally suited for this purpose as it explicitly identifies each of the impairment sources and can point out the major degradation causes. Again the predictive model could be used to understand the impact of potential remedial actions. The E-Model measures the average voice quality along a specific reference connection. Additional modeling would be required to estimate the impact on individual calls. Alternatively, when the RTP Control Protocol (RTCP) is available, performance measurements on a call-by-call basis would be more useful for that purpose. Again, these measurements would feed into the E-Model to estimate voice quality. Since these measurements are not typically available until the end of the call, the additional delay would slow down any responsive actions that might be taken. They still offer a good measure of past voice quality.
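As a sketch of how per-call RTCP measurements could be turned into E-Model inputs, the helpers below convert two fields of an RTCP receiver report into engineering units. The field encodings follow RFC 3550 (fraction lost as an 8-bit fixed-point value, interarrival jitter in RTP timestamp units); the use of receiver reports as the data source, the 8000 Hz default clock rate and the function names are assumptions of this example.

```python
def rtcp_fraction_lost_pct(fraction_lost_field: int) -> float:
    """Convert the 8-bit 'fraction lost' field (value / 256) to percent."""
    if not 0 <= fraction_lost_field <= 255:
        raise ValueError("fraction lost is an 8-bit field")
    return 100.0 * fraction_lost_field / 256.0


def rtcp_jitter_ms(jitter_field: int, clock_rate_hz: int = 8000) -> float:
    """Convert interarrival jitter from RTP timestamp units to milliseconds."""
    return 1000.0 * jitter_field / clock_rate_hz
```

The loss percentage would select the codec's Ie value, while the jitter figure helps judge whether the jitter buffer is adding delay or discarding late packets.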

In summary, voice quality is a key metric in a VoIP network (along with call blocking and call set-up delay). Objective measures of voice quality are typically used as an estimate of subjective voice quality. Of these objective models, the E-Model has proven to be the most useful and accurate in a network setting. In the predictive setting, the E-Model is used as part of a VoIP Network Readiness Assessment Service. In an operational network, the E-Model is used as part of a Voice Quality Assessment service.

References

1. Jayant Deshpande and David Houck, “Challenges in VoIP Network Design for Service Providers”,

2. David Houck and Gopal Meempat, “Call Admission Control and Load Balancing for Voice over IP,” Performance Evaluation 47 (2002) 243-253.

3. David Houck, Eunyoung Kim, Huseyin Uzunalioglu, and Larry Wehr, “A Measurement-Based Admission Control Algorithm for VoIP,” Bell Labs Technical Journal 8 (2) 97-110, 2003.

4. Doshi, Eggenschwiler, Rao, Samadi, Wang, and Wolfson, “VoIP Network Architectures and QoS Strategies,” Bell Labs Technical Journal 7 (4) 41-59, 2003.

6. ITU-T Recommendation P.862, “Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs,” 02/2001.

7. Scott Pennock, “Accuracy of the Perceptual Evaluation of Speech Quality (PESQ) algorithm,” Proceedings of the Measurement of Speech and Audio Quality in Networks On-line Workshop, MESAQIN’02, January 2002.

8. ITU-T Recommendation G.107, “The E-model, a computational model for use in transmission planning,” 03/2003.

9. ITU-T Recommendation G.113, “Transmission impairments due to speech processing,” 02/2001.

10. Timothy Hall, “Objective Speech Quality Measures for Internet Telephony,” in Voice over IP (VoIP) Technology, Petros Mouchtaris, Editor, Proceedings of SPIE, Vol. 4522, 2001.

To learn more about our comprehensive portfolio, please contact your Lucent Technologies Sales Representative.

Visit our web site at www.lucent.com.

This document is for planning purposes only, and is not intended to modify or supplement any Lucent Technologies specifications or warranties relating to these products or services. The publication of information in this document does not imply freedom from patent or other protective rights of Lucent Technologies or others.

Cisco is a registered trademark of Cisco Systems, Inc.

Copyright © 2004 Lucent Technologies Inc. All rights reserved

LWS VoIP v2 09/04

About the authors:

David J. Houck
Lucent Technologies – Bell Laboratories

David is a technical manager in the QoS Management and Assessment Group in Holmdel, New Jersey. David leads a team that focuses on performance modeling and traffic management of converged packet networks with QoS requirements. David received a B.A. in mathematics in 1970 and a Ph.D. in operations research in 1974, both from The Johns Hopkins University.

Wonho Yang
Lucent Technologies – Bell Laboratories

Wonho is a member of Technical Staff at Lucent Technologies Bell Labs, Holmdel, NJ, working on QoS management and assessment in the areas of VoIP, MPLS, and cable telephony. Wonho received a B.S. in physics in 1989 from Seoul National University, Seoul, Korea. He received an M.Eng. in 1996 and a Ph.D. in 1999, both in electrical and computer engineering, from Temple University, Philadelphia, PA.
