VOIP Seminar Report

Voice over Internet Protocol

Contents

• Introduction

• Benefits of Integrating Data and Voice Networks

• Converting Voice to Data

• Efficient and Reliable Network Infrastructures

• Conclusion

Introduction

The human voice is still one of the most effective forms of communication. Therefore, it

comes as no surprise that recent advances in voice-driven technology and the ever-

increasing need to stay competitive in the business world are driving the development of

Voice over Internet Protocol (VoIP) as a dependable union of telephony and data

networks.

Many businesses, from large corporations to smaller e-commerce start-up companies, are

hoping this emerging technology will ultimately lead to greater efficiency, productivity

and cost savings. Not only will VoIP reduce redundant network expenditures, but

information will also transfer smoothly from computer to computer, computer to phone,

phone to computer, and fax to computer. While the benefits of integrating data and voice

networks are numerous, the network infrastructure modifications required are often

misunderstood. This is especially true when considering data cabling’s role in bandwidth

management.

What is VoIP?

VoIP is a type of network convergence. Specifically, it is the unification of voice and

data onto a single network infrastructure. The idea behind VoIP is to successfully convert

digital and analog communications (such as phone calls and faxes) into IP packets to be

sent through one network instead of a separate telephony network. The voice signals are

first digitized, then split into information packets and sent through an IP

network.

Definition

A voice-over-Internet protocol (VoIP) application meets the challenges of combining

legacy voice networks and packet networks by allowing both voice and signaling

information to be transported over the packet network. A fax-over-Internet protocol

(FoIP) application enables the interworking of standard fax machines with packet

networks. It accomplishes this by extracting the fax image from an analog signal and

carrying it as digital data over the packet network.

VoIP is the ability to make telephone calls and send faxes over IP-based data networks

with a suitable quality of service (QoS) and superior cost/benefit. Everyone is talking

about VoIP and everyone wants a piece of the pie.

• Equipment developers and manufacturers see a window of opportunity to

innovate and compete. They are busy developing new VoIP-enabled equipment

attempting to break into the market in time.

• Internet service providers see the possibility of competing with the PSTN for

customers.

• Users are interested in the integration of voice and data applications in addition to

the cost savings.

Although VoIP seems very attractive, the technology has not yet been developed to the

point where it can replace the services and quality provided by the PSTN. First, it must be

clear that VoIP will indeed be cost effective. In order to compete with today's PSTN,

the total cost of operation must be significantly lower. These savings should initially be

seen in the area of long distance calls. VoIP provides a competitive threat to providers of

traditional telephone services that will clearly stimulate improvements in cost and

function throughout the industry.

Another immediate application for IP telephony is real-time facsimile transmission. Fax

transmission quality is typically affected by network delays, machine compatibility and

analog signal quality. To send faxes over packet networks, an interface unit must convert

the data to packet form, handle the conversion of signalling and control protocols and

ensure complete delivery of the scan data in the correct order. Packet loss and end-to-end

delay are even more critical here than in voice applications.

Many other applications for VoIP have been identified. For example,

voice messages can be prepared using a telephone and then delivered to an integrated

voice/data mailbox using Internet or intranet services. Voice-annotated documents,

multimedia files, etc. can easily become standard within office suites in the near future.

Overview

Organizations around the world seek to reduce rising communications costs. The

consolidation of separate voice, fax, and data resources offers an opportunity for

significant savings. Accordingly, the challenge of integrating voice, fax, and data is

becoming a rising priority for many network managers. Organizations are pursuing

solutions that will enable them to take advantage of excess capacity on broadband

networks for voice, fax, and data transmission, as well as utilize the Internet and

company intranets as alternatives to costlier media. This report covers the principles of

implementing real-time voice- and fax-over-packet networks. An overview of the

embedded software architecture is presented, and a system is described for sending voice,

fax image, data, and signaling information over the packet network. Benefits to designers

and manufacturers of this embedded approach are lower cost of goods sold, faster time to

market, and lower development costs. Customers can gain a considerable advantage in

time to market in building their communication systems. The discussion applies to a general

class of packet networks, as the modular software objects allow networks such as asynchronous transfer

mode (ATM), frame relay, and Internet/intranet (Internet protocol [IP]) to transport voice

and fax. Currently, the Frame Relay Forum and the International Telecommunication

Union (ITU) have defined protocols for transmission of fax over a packet network.

However, the principles described are equally applicable to ATM networks.

The main justifications for development of VoIP can be summarized as follows:

1. Cost reduction. As described, there can be a real savings in long distance

telephone costs which is extremely important to most companies, particularly

those with international markets.

2. Simplification. An integrated voice/data network allows more standardization and

reduces total equipment needs.

3. Consolidation. The ability to eliminate points of failure, consolidate accounting

systems and combine operations is obviously more efficient.

4. Advanced Applications. The long run benefits of VoIP include support for

multimedia and multiservice applications, something which today's telephone

system can't compete with.

Growth in the VoIP market is expected to be considerable over the next 5 years.

Estimates put the annual growth rate for IP-enabled telephone equipment at 132%

between 1997 and 2002 with an expected market of some $3.16B in 2002. Annual

revenues for the IP fax gateway market are expected to increase to over $100M by the

year 2000 (from less than $20M in 1996).

This expected growth is encouraging to prospective developers of VoIP products.

However, many challenges still face developers of VoIP equipment,

in terms of voice quality, latency, and packet loss as well as call control and system

management. For more information on these and other issues regarding VoIP, see the full

article: Voice over IP (VoIP) at techguide.com (http://www.techguide.com/comm/voiceip.shtml), sponsored by Telogy Networks.

Benefits of Integrating Data and Voice Networks

VoIP eliminates redundant networks and network resources, simplifies network

management and facilitates communication.1 Consequently, the major advantages of

VoIP are cost savings and increased productivity.

• Reduction of Redundant Networks—Because typical PBXs (private branch

exchanges) are also costly to operate, VoIP would result in substantial savings in

infrastructure costs even within a single building or facility.

• Toll Bypass—The corporate environment anticipates that convergence will result in

savings on long-distance toll calls, especially international voice calls where a substantial

part of the cost derives from regulatory fees. In most cases, these surcharges don’t apply

to circuits carrying data traffic. VoIP would result in a much less expensive way to make

voice calls.

• Diverse Voice Call Routing—Despite very reliable telephone service, large

companies often purchase multiple, diverse circuits to the local telephone company’s

exchange to act as a reserve system. These circuits are rarely used, yet they

double the monthly cost the company must pay for leased-line telephone

access circuits. With VoIP, however, if a failure occurs on the primary telephone circuit

at one location, the data network could be used to route calls temporarily to PBXs at

other company locations. This eliminates the need for a redundant telephone circuit by

leveraging the company’s data network.4

• Facilitation of Adds/Moves/Changes—It’s costly for a company to maintain

a typical work area that includes data and telephone connectivity, especially

when individuals move from one desk to another. With DHCP (Dynamic Host

Configuration Protocol, which enables dynamic assignment of IP addresses to

devices on a network), moving a PC from one LAN to another becomes simpler due

to auto-configuration functionality. However, moving a phone extension from one desk

to another (or to another building) is not as simple, usually requiring reconfiguration of

the office PBX systems. With the increased popularity of VoIP, a new type of PBX and

IP telephone station has entered the marketplace. An IP Phone can automatically

configure itself as a DHCP client. Now a desk requires only a single data jack, and the IP

Phone can act as a hub or switch to provide a data port for the user’s PC. One or more

PCs, or perhaps an IP fax machine, can be directly connected to the IP Phone, and all of

them can utilize the local IP network. Moving employees from one desk to another is

now less costly.5

• Falling IP Equipment Costs—Packetizing voice puts computing power for

data networking equipment near the endpoints of the network, where the packet

switching equipment shows a faster improvement in price/performance than

circuit-switched equipment. With many companies feverishly developing new features for

IP equipment, the price/performance of packet-switching equipment should

continue to improve. Packet-switching equipment technology is able to keep pace

with the increase in demand for IP.6

• Data Manipulation—A single network connection on the Web (via VoIP) also

allows the Internet to host multiple forms of communication such as voice mail,

e-mail and video images instead of using a conventional (and costly) telephone

line. Information can be accessed and delivered in several forms.

Internet-based information (such as e-mail and e-commerce) can now be accessed

by telephone by using voice commands. A desktop computer (in addition to

receiving e-mail) can now function as a business telephone and fax machine.

Therefore, because of VoIP, organizations now recognize the potential for closer

interaction with their customers (and potential customers). For example, instead of

breaking a phone connection to transfer a potential buyer to customer service

(or worse, losing them to a competitor’s customer service rep or Web site), a sales

representative can access the appropriate customer service rep or Web page and

provide the information while maintaining the original connection.

Converting Voice to Data

In order for an existing TCP/IP network to successfully carry VoIP traffic, several

modifications must be made to make the network act like a circuit-switched network.8

Traditional voice networks are circuit-switched. They use a dedicated communications

path for every call, with a constant bit rate of 64 kbps. However, silence consumes more

than half of the average telephone call and if no calls are made, this bandwidth remains

unavailable for other traffic. This unused capacity is an inefficient use of bandwidth.9

However, this inefficiency is also what makes telephony networks reliable. Compared

to a telephony system, a data network is more likely to experience problems such as

server crashes or other similar delays. According to Lucent Technologies, telephony

problems are rare because 80 percent of voice network issues are resolved without

human intervention. This is partly the result of telephony networks’ circuit-switched

design and the dedicated bandwidth for each line. Since there’s a fixed amount of

bandwidth used during a circuit-switched voice call, voice data is never received out

of sequence at the end of the connection. When data travels over a packet network, it’s

divided up and transmitted in information packets. These packets can be lost, or they

can arrive late or out of order because they took different paths.

Today, one of the most important design considerations in implementing voice is

minimizing one-way, end-to-end delay. Retransmission is not an option with voice and

video traffic flows. This traffic needs to stream in real time; if there is too long a delay in

packet delivery, the speech becomes unintelligible.

Sequence and real-time transmission make bandwidth management crucial. With

VoIP, the data network infrastructure will need to be able to handle more traffic—

consistently—than ever before.

Data compression, routing and processing are crucial.

Compression—The bit rate for an uncompressed telephone call approaches 64 Kbps

(and uses a single Digital Signal level 0, or DS0, channel). This degree of bandwidth is

considered excessive, and is usually reduced for delivery over a typical data network.

Several encoding and compression algorithms are available to reduce the bandwidth

consumed by a telephone call. These compression mechanisms are usually found at the

VoIP gateways and not within the data network routers or switches.10

Network Quality of Service—A TCP/IP network must have mechanisms in place

to prioritize VoIP traffic above all other traffic on the network (except other real-time

application traffic such as video). A protocol called Resource Reservation Protocol

(RSVP) has been designed to reserve resources across the network for real-time

transmissions. Quality of Service (QoS) mechanisms within TCP/IP have also recently

been implemented by a number of TCP/IP router and switch vendors. ATM networks

and, to a lesser degree, Frame Relay networks have QoS functionality already built

into them. Generally, TCP/IP routers and switches will use a priority queuing system to

buffer non-VoIP packets and send them only after all of the VoIP packets have been

transferred to the next network element. Large IP packets (non-VoIP) are buffered to the

side so that the smaller VoIP packets can be sent on time. Other mechanisms will predict

times of congestion over the wide area link and throttle back bandwidth demands from

non-real-time applications.11
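
The priority queuing behaviour described above can be illustrated with a minimal Python sketch. It assumes just two traffic classes and a hypothetical `PriorityScheduler` class; real routers and switches implement this in their own firmware and configuration, so the names here are illustrative only.

```python
from collections import deque

VOICE, DATA = "voice", "data"  # hypothetical traffic classes for illustration


class PriorityScheduler:
    """Strict-priority scheduler: data waits until the voice queue is empty."""

    def __init__(self):
        self.queues = {VOICE: deque(), DATA: deque()}

    def enqueue(self, traffic_class, packet):
        self.queues[traffic_class].append(packet)

    def dequeue(self):
        # Always serve voice first; fall back to data only when no voice waits.
        for traffic_class in (VOICE, DATA):
            if self.queues[traffic_class]:
                return self.queues[traffic_class].popleft()
        return None  # nothing left to send


def drain(scheduler):
    """Return every queued packet in the order it would be transmitted."""
    sent = []
    packet = scheduler.dequeue()
    while packet is not None:
        sent.append(packet)
        packet = scheduler.dequeue()
    return sent
```

Queuing the packets `d1`, `v1`, `d2`, `v2` in that order and calling `drain` sends both voice packets before either data packet, which is the buffering behaviour the paragraph describes.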

IP Packet Precedence—IP precedence bits should be set at the edge of the

network, with VoIP traffic given the highest possible precedence. Data networks with

protocols other than TCP/IP running on them are not as well suited for VoIP as a

purely TCP/IP network because it’s more difficult to give traffic priority when it is not

a TCP/IP packet. Whenever possible, all bridged traffic should be segregated

from the TCP/IP WAN links where VoIP will flow. Bridging of any sort over the wide

area will hinder TCP/IP QoS implementations.12
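
As a rough illustration of how precedence bits are set, the sketch below places a 3-bit precedence value in the top bits of the IPv4 type-of-service (ToS) byte. The function names and the choice of precedence 5 for voice are assumptions made for this example (values 6 and 7 are conventionally left to network control traffic), and `mark_socket` relies on `socket.IP_TOS`, which is exposed on most Unix-like platforms.

```python
import socket

VOICE_PRECEDENCE = 5  # assumed value for voice traffic in this sketch


def tos_from_precedence(precedence):
    """Place a 3-bit IP precedence value in the top bits of the IPv4 ToS byte."""
    if not 0 <= precedence <= 7:
        raise ValueError("IP precedence is a 3-bit field (0-7)")
    return precedence << 5


def precedence_from_tos(tos):
    """Recover the 3-bit precedence from a ToS byte."""
    return (tos >> 5) & 0b111


def mark_socket(sock, precedence=VOICE_PRECEDENCE):
    """Ask the operating system to stamp outgoing packets with this precedence."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS,
                    tos_from_precedence(precedence))
```

For example, `tos_from_precedence(5)` yields `0xA0`, the ToS byte an edge device would write for traffic marked with precedence 5.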

Weighted Fair Queuing—Weighted fair queuing (WFQ) is a buffering mechanism

that will buffer TCP/IP packets, classify them based on a number of different criteria,

and then de-buffer the packets based on IP precedence or traffic flow. The

classifications available are: source and destination address, protocol, and session

identifier. During the de-queuing procedure, packets are given privilege based on the

three IP precedence bits in the packet’s IP header.13
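
The following is a simplified sketch of the weighted fair queuing idea, not a vendor implementation. It uses a self-clocked approximation in which each packet receives a finish number of `size / weight` beyond the previous packet of its flow, and the packet with the smallest finish number is sent first. The class name, the `flow_id` key, and the rule `weight = precedence + 1` are assumptions chosen for illustration.

```python
import heapq
from itertools import count


class WeightedFairQueue:
    """Simplified, self-clocked weighted fair queue (illustrative only)."""

    def __init__(self):
        self._heap = []            # entries: (finish_number, arrival_order, packet)
        self._last_finish = {}     # finish number of the newest packet per flow
        self._virtual_time = 0.0   # finish number of the last packet sent
        self._order = count()      # tie-breaker that keeps arrival order

    def enqueue(self, flow_id, size_bytes, precedence, packet):
        weight = precedence + 1    # assumed mapping from precedence to weight
        start = max(self._virtual_time, self._last_finish.get(flow_id, 0.0))
        finish = start + size_bytes / weight
        self._last_finish[flow_id] = finish
        heapq.heappush(self._heap, (finish, next(self._order), packet))

    def dequeue(self):
        if not self._heap:
            return None
        finish, _, packet = heapq.heappop(self._heap)
        self._virtual_time = finish
        return packet
```

With one 1500-byte data packet at precedence 0 queued ahead of three 60-byte voice packets at precedence 5, the voice packets receive finish numbers 10, 20, and 30 against 1500 for the data packet, so they leave the queue first even though they arrived later.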

Voice Data Processing—Because of voice portals and voice sites in VoIP, voice

recognition is now processed on the network server and not on the telephone itself.

This allows the system to support millions of calls and to also recognize the various

ways callers may state or ask for information. VoIP is handled by a voice gateway,

which is dedicated PC-grade equipment outfitted with the hardware necessary for

Internet telephony. The gateway is placed between the public switched telephone

network (PSTN) and the Internet protocol (IP) network. The gateway assists with the

signals between the phone networks, reception of telephone numbers, conversion

between telephone numbers and the IP addressing in the IP network and the

conversion of voice to packets.14

While these applications will certainly reduce data traffic problems, it is doubtful

that they alone will address increasing bandwidth demands.

Efficient and Reliable Network Infrastructures

Presently, more than 50 percent of data network problems are caused by the

infrastructure.15 With VoIP, the data network infrastructure will need to be able to

handle more data traffic consistently than ever before. Successful implementation of

VoIP applications requires an efficient and reliable data network infrastructure.

Fortunately, efficient and reliable data cabling solutions are available to help manage

data traffic better.

The true gauge of a network’s performance is its throughput efficiency. Throughput

refers to the amount of data that is transferred from server to user. If the network is

supposed to transmit data at 100 Mbps, but it’s transferring data at a lesser rate, it isn’t

working at optimum levels, and that network’s efficiency is compromised as a result.

These delays in data transference are usually the result of a network infrastructure that

degrades a signal to the point it is no longer intelligible to the receiver. This requires a

retransmission of the signal and results in delays and inefficient network performance.

Today, this scenario is relatively common, but with VoIP it won’t be acceptable.

Retransmissions are not an option with voice and video traffic flows. This traffic needs to stream in

real time; in fact, a delay of more than 200 milliseconds is considered unacceptable.

While packet prioritizing and routing applications will help, compression programs often

make data signals more susceptible to degradation. This issue will be exacerbated by

the increased data traffic and bandwidth requirements VoIP will bring.

Choosing the right data cabling solution requires not just an understanding that

throughput efficiency is a necessity, but also that reliability is directly related to

headroom. Once the expected data traffic has been estimated, the level of

reliability—the amount of headroom—needs to be determined. These two factors

determine which high-performance cabling solution is required.

Certain network infrastructures are designed to maximize data throughput well beyond

minimum standard requirements. Unfortunately, no data cabling standard adequately

addresses the performance requirements associated with VoIP. However, there are

other sources for information, like Anixter’s Levels® Program, which can provide

guidance. Anixter’s Levels Lab® tests network infrastructures by running real-world

applications in real-world networking environments to determine which solutions

provide enough data throughput and headroom. Taking the time to ensure that the

cabling infrastructure can consistently provide the necessary bandwidth is crucial to

implementing VoIP successfully.

1. VoIP Applications

A wide variety of applications are enabled by transmitting voice over packet networks.

The first application, shown in Figure 1, is a network configuration of an organization

with many branch offices (e.g., a bank) that wants to reduce costs and combine traffic to

provide voice and data access to the main office. This is accomplished by using a packet

network to provide standard data transmission while at the same time enhancing it to

carry voice traffic along with the data. Typically, this network configuration will benefit

if the voice traffic is compressed as a result of the low bandwidth available for this access

application. Voice over packet provides the interworking function (IWF), which is the

physical implementation of the hardware and software that allows the transmission of

combined voice and data over the packet network. The interfaces the IWF must support

in this case are analog interfaces, which directly connect to telephones or key systems.

The IWF must emulate both the functions of a private branch exchange (PBX) for the

telephony terminals at the branches and the functions of the telephony terminals

for the PBX at the home office. The IWF accomplishes this by implementing signaling

software that performs these functions.

Figure 1. Branch Office Application

A second VoIP application, shown in Figure 2, is a trunking application. In this scenario,

an organization wishes to send voice traffic between two locations over the packet

network and replace the tie trunks used to connect the PBXs at the locations. This

application usually requires the IWF to support a higher-capacity digital channel than the

branch application, such as a T1/E1 interface of 1.544 or 2.048 Mbps. The IWF emulates

the signaling functions of a PBX, resulting in significant savings to companies'

communications costs.

Figure 2. Interoffice Trunking Application

A third application of VoIP software is interworking with cellular networks, as shown in

Figure 3. The voice data in a digital cellular network is already compressed and

packetized for transmission over the air by the cellular phone. Packet networks can then

transmit the compressed cellular voice packet, saving a tremendous amount of

bandwidth. The IWF provides the transcoding function required to convert the cellular

voice data to the format required by the public switched telephone network (PSTN).

Figure 3. Cellular Network Interworking Application

2. VoIP QoS Issues

The advantages of reduced cost and bandwidth savings of carrying voice-over-packet

networks are associated with some quality-of-service (QoS) issues unique to packet

networks.

Delay

Delay causes two problems: echo and talker overlap. Echo is caused by the signal

reflections of the speaker's voice from the far-end telephone equipment back into the

speaker's ear. Echo becomes a significant problem when the round-trip delay becomes

greater than 50 milliseconds. As echo is perceived as a significant quality problem, voice-

over-packet systems must address the need for echo control and implement some means

of echo cancellation.

Talker overlap (or the problem of one talker stepping on the other talker's speech)

becomes significant if the one-way delay becomes greater than 250 milliseconds. The

end-to-end delay budget is therefore the major constraint and driving requirement for

reducing delay through a packet network.

The following are sources of delay in an end-to-end, voice-over-packet call:

Accumulation Delay (Sometimes Called Algorithmic Delay)

This delay is caused by the need to collect a frame of voice samples to be processed by

the voice coder. It is related to the type of voice coder used and varies from a single

sample time (0.125 milliseconds) to many milliseconds. A representative list of standard

voice coders and their frame times follows:

• G.726 adaptive differential pulse-code modulation (ADPCM) (16, 24, 32, 40

kbps)—0.125 milliseconds

• G.728 LD–code excited linear prediction (CELP)(16 kbps)—2.5 milliseconds

• G.729 CS–ACELP (8 kbps)—10 milliseconds

• G.723.1 Multirate Coder (5.3, 6.3 kbps)—30 milliseconds

Processing Delay

This delay is caused by the actual process of encoding and collecting the encoded

samples into a packet for transmission over the packet network. The encoding delay is a

function of both the processor execution time and the type of algorithm used. Often,

multiple voice-coder frames will be collected in a single packet to reduce the packet

network overhead. For example, three frames of G.729 code words, equaling 30

milliseconds of speech, may be collected and packed into a single packet.
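
The trade-off between packet overhead and delay can be worked through numerically. The sketch below assumes the voice is carried over RTP/UDP/IPv4 with 40 bytes of headers per packet and ignores link-layer framing; that header size and the function name `packet_stats` are assumptions for this example, and other packet networks carry different overheads.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers, no options (assumed transport)


def packet_stats(codec_kbps, frame_ms, frames_per_packet):
    """Return (packet duration in ms, payload bytes, IP-level bandwidth in kbps)."""
    packet_ms = frame_ms * frames_per_packet
    payload_bytes = codec_kbps * packet_ms / 8      # kbit/s times ms gives bits
    packets_per_second = 1000 / packet_ms
    ip_kbps = ((payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
               * packets_per_second / 1000)
    return packet_ms, payload_bytes, ip_kbps


# G.729 (8 kbps, 10 ms frames): one frame per packet versus three frames per packet.
one_frame = packet_stats(8, 10, 1)
three_frames = packet_stats(8, 10, 3)
```

Under these assumptions, one G.729 frame per packet costs about 40 kbps on the wire, while three frames per packet cost about 18.7 kbps, at the price of holding 30 milliseconds of speech before each packet can be sent.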

Network Delay

This delay is caused by the physical medium and protocols used to transmit the voice

data and by the buffers used to remove packet jitter on the receive side. Network delay is

a function of the capacity of the links in the network and the processing that occurs as the

packets transit the network. The jitter buffers add delay, which is used to remove the

packet-delay variation to which each packet is subjected as it transits the packet network.

This delay can be a significant part of the overall delay, as packet-delay variations can be

as high as 70 to 100 milliseconds in some frame-relay and IP networks.
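
A delay budget can be checked by adding up the three sources above and comparing the totals with the thresholds given earlier in this section (50 milliseconds round trip for echo, 250 milliseconds one way for talker overlap). The sketch below is a minimal illustration; the function name, the example component values, and the assumption that the return path has the same delay as the forward path are all hypothetical.

```python
ECHO_ROUND_TRIP_LIMIT_MS = 50          # echo becomes significant above this
TALKER_OVERLAP_ONE_WAY_LIMIT_MS = 250  # talker overlap becomes significant above this


def assess_delay(delay_components_ms):
    """Sum one-way delay components and flag echo and talker-overlap risk."""
    one_way = sum(delay_components_ms.values())
    round_trip = 2 * one_way  # assumes a symmetric return path
    return {
        "one_way_ms": one_way,
        "round_trip_ms": round_trip,
        "needs_echo_cancellation": round_trip > ECHO_ROUND_TRIP_LIMIT_MS,
        "talker_overlap": one_way > TALKER_OVERLAP_ONE_WAY_LIMIT_MS,
    }


# Hypothetical budget: G.729 framing, some processing time, and network plus jitter buffer.
budget = assess_delay({"accumulation": 10, "processing": 20, "network": 100})
```

With these illustrative numbers the one-way delay is 130 milliseconds, so talker overlap is not yet a problem, but the 260-millisecond round trip is well past the point where echo cancellation is needed.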

Jitter

The delay problem is compounded by the need to remove jitter, a variable interpacket

timing caused by the network a packet traverses. Removing jitter requires collecting

packets and holding them long enough to allow the slowest packets to arrive in time to be

played in the correct sequence. This causes additional delay.

The two conflicting goals of minimizing delay and removing jitter have engendered

various schemes to adapt the jitter buffer size to match the time-varying requirements of

network jitter removal. This adaptation has the explicit goal of minimizing the size and

delay of the jitter buffer, while at the same time preventing buffer underflow caused by

jitter.

Two approaches to adapting the jitter buffer size are detailed below. The approach

selected will depend on the type of network the packets are traversing.

The first approach is to measure the variation of packet level in the jitter buffer over a

period of time and incrementally adapt the buffer size to match the calculated jitter. This

approach works best with networks that provide a consistent jitter performance over time,

such as ATM networks.

The second approach is to count the number of packets that arrive late and create a ratio

of these packets to the number of packets that are successfully processed. This ratio is

then used to adjust the jitter buffer to target a predetermined, allowable late-packet ratio.

This approach works best with the networks with highly variable packet-interarrival

intervals—such as IP networks.
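
The second approach can be sketched as follows. This is a minimal illustration rather than a production algorithm: the class name, the default parameters, the fixed adjustment step, and the choice to shrink the buffer only when the ratio falls below half the target are all assumptions made for the example.

```python
class LateRatioJitterBuffer:
    """Adapt play-out depth toward a target late-packet ratio (illustrative only)."""

    def __init__(self, target_late_ratio=0.01, initial_ms=40, step_ms=10,
                 min_ms=10, max_ms=200, window=100):
        self.target_late_ratio = target_late_ratio
        self.depth_ms = initial_ms
        self.step_ms = step_ms
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.window = window      # packets observed between adjustments
        self._late = 0
        self._on_time = 0

    def on_packet(self, delay_variation_ms):
        """Record one packet; it is late if its delay variation exceeds the depth."""
        if delay_variation_ms > self.depth_ms:
            self._late += 1
        else:
            self._on_time += 1
        if self._late + self._on_time >= self.window:
            self._adapt()

    def _adapt(self):
        # Ratio of late packets to packets that were processed successfully.
        ratio = self._late / self._on_time if self._on_time else float("inf")
        if ratio > self.target_late_ratio:
            self.depth_ms = min(self.max_ms, self.depth_ms + self.step_ms)
        elif ratio < self.target_late_ratio / 2:
            self.depth_ms = max(self.min_ms, self.depth_ms - self.step_ms)
        self._late = self._on_time = 0
```

Growing the buffer trades extra delay for fewer late packets, and shrinking it does the reverse, which is the conflict between minimizing delay and removing jitter described above.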

In addition to the techniques described, the network must be configured and managed to

provide minimal delay and jitter, enabling a consistent QoS.

Lost-Packet Compensation

Lost packets can be an even more severe problem, depending on the type of packet

network that is being used. Because IP networks do not guarantee service, they will

usually exhibit a much higher incidence of lost voice packets than ATM networks. In

current IP networks, all voice frames are treated like data. Under peak loads and

congestion, voice frames will be dropped equally with data frames. The data frames,

however, are not time sensitive, and dropped packets can be appropriately corrected

through the process of retransmission. Lost voice packets, however, cannot be dealt with

in this manner.

Some schemes used by voice-over-packet software to address the problem of lost frames

are as follows:

• interpolate for lost speech packets by replaying the last packet received during the

interval when the lost packet was supposed to be played out; this scheme is a

simple method that fills the time between noncontiguous speech frames; it works

well when the incidence of lost frames is infrequent; it does not work well if there

are a number of lost packets in a row or a burst of lost packets

• send redundant information at the expense of bandwidth utilization; this basic

approach replicates and sends the nth packet of voice information along with the

(n+1)th packet; this method has the advantage of being able to correct for the lost

packet exactly; however, this approach uses more bandwidth and also creates

greater delay

• use a hybrid approach with a much lower bandwidth voice coder to provide

redundant information carried along in the (n+1)th packet; this reduces the

problem of the extra bandwidth required but fails to solve the problem of delay
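
The first two schemes above can be sketched in a few lines. The example treats each packet payload as an opaque value and marks a lost packet with `None`; the function names and this representation are assumptions made for illustration only.

```python
def conceal_by_replay(received):
    """Scheme 1: fill each gap by replaying the last packet that did arrive."""
    played, last = [], None
    for payload in received:
        if payload is None:
            payload = last        # stays None if nothing has arrived yet
        else:
            last = payload
        played.append(payload)
    return played


def add_redundancy(frames):
    """Scheme 2, sender side: packet n+1 also carries a copy of frame n."""
    packets, previous = [], None
    for frame in frames:
        packets.append((frame, previous))
        previous = frame
    return packets


def recover_with_redundancy(packets):
    """Scheme 2, receiver side: rebuild a lost frame from the next packet's copy."""
    frames = [None] * len(packets)
    for i, packet in enumerate(packets):
        if packet is None:
            continue
        frame, previous = packet
        frames[i] = frame
        if i > 0 and frames[i - 1] is None:
            frames[i - 1] = previous
    return frames
```

A single lost packet is recovered exactly by the redundant copy, but the receiver must wait for the following packet before it can play the missing frame, and two consecutive losses still leave a gap. This matches the bandwidth and delay costs noted in the list.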

Echo Compensation

Echo in a telephone network is caused by signal reflections generated by the hybrid

circuit that converts between a four-wire circuit (a separate transmit and receive pair) and

a two-wire circuit (a single transmit and receive pair). These reflections of the speaker's

voice are heard in the speaker's ear. Echo is present even in a conventional circuit-

switched telephone network. However, it is acceptable because the round-trip delays

through the network are smaller than 50 milliseconds and the echo is masked by the

normal side tone every telephone generates.

Echo becomes a problem in voice-over-packet networks because the round-trip delay

through the network is almost always greater than 50 milliseconds. Thus,

echo-cancellation techniques are always used. ITU standard G.165 defines the performance

requirements currently applied to echo cancellers. The ITU is defining much

more stringent performance requirements in the G.IEC specification.

Echo is generated toward the packet network from the telephone network. The echo

canceller compares the voice data received from the packet network with voice data

being transmitted to the packet network. The echo from the telephone network hybrid is

removed by a digital filter on the transmit path into the packet network.
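
The text does not say which digital filter is used. As one common possibility, the sketch below uses a normalized least-mean-squares (NLMS) adaptive FIR filter, which is an assumption of this example rather than the method required by G.165. It learns an estimate of the hybrid's echo path from the voice data received from the packet network and subtracts the estimated echo from the signal on the transmit path. The tap count, step size, and the synthetic hybrid response are illustrative values.

```python
import random


def nlms_echo_cancel(far_end, transmit, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the transmit path.

    far_end:  samples received from the packet network (sent toward the hybrid).
    transmit: samples heading into the packet network (near-end speech plus echo).
    Returns the echo-reduced transmit samples.
    """
    weights = [0.0] * taps
    history = [0.0] * taps              # recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, transmit):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate            # what remains after removing the estimate
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        out.append(error)
    return out


def simulate(n=2000, seed=1):
    """Synthetic check: far-end noise reflected by an assumed short hybrid response."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    hybrid = [0.0, 0.5, 0.0, -0.2]      # assumed echo path, not measured data
    echo = [sum(h * far_end[i - k] for k, h in enumerate(hybrid) if i - k >= 0)
            for i in range(n)]
    residual = nlms_echo_cancel(far_end, echo)  # no near-end speech in this check
    return echo, residual
```

In this synthetic case the residual shrinks toward zero as the filter converges. A real canceller must also cope with near-end speech (double talk) and with echo tails longer than the filter, which this sketch does not attempt.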

3. VoIP–Embedded Software Architecture

Two major types of information must be handled to interface telephony equipment to a

packet network: voice and signaling information.

As shown in Figure 4, VoIP software interfaces to both streams of information from the

telephony network and converts them to a single stream of packets transmitted to the

packet network. The software functions are divided into four general areas.

Figure 4. VoIP Software Architecture

Voice Packet Software Module

This software, also known as the voice-processing module, typically runs on a

digital-signal processor (DSP) and prepares voice samples for transmission over the packet network.

Its components perform echo cancellation, voice compression, voice-activity detection,

jitter removal, clock synchronization, and voice packetization.

Telephony-Signaling Gateway Software Module

This software interacts with the telephony equipment, translating signaling into state

changes used by the packet protocol module to set up connections. These state changes

are on-hook, off-hook, trunk seizure, etc. This software supports E&M (ear and mouth, also

expanded as earth and magneto) Type I, II, III, IV, and V; loop or ground start foreign exchange station

(FXS); foreign exchange office (FXO); and integrated services digital network (ISDN)

basic rate interface (BRI) and primary rate interface (PRI).

Packet Protocol Module

This module processes signaling information and converts it from the telephony-signaling

protocols to the specific packet-signaling protocol used to set up connections over the

packet network (e.g., Q.933 and voice-over-frame relay signaling). It also adds protocol

headers to both voice and signaling packets before transmission into the packet network.

Network-Management Module

This module provides the voice-management interface to configure and maintain the

other modules of the voice-over-packet system. All management information is defined

in Abstract Syntax Notation One (ASN.1) and complies with Simple Network

Management Protocol (SNMP) V1 syntax. A proprietary voice packet management

information base (MIB) is supported until standards evolve in the forums.

The software is partitioned to provide a well-defined interface to the DSP software usable

for multiple voice packet protocols and applications. The DSP processes voice data and

passes voice packets to the microprocessor with generic voice headers.

The microprocessor is responsible for moving voice packets and adapting the generic

voice headers to the specific voice packet protocol that is called for by the application,

such as real-time transport protocol (RTP), voice over frame relay (VoFR), and voice telephony

over ATM (VToA). The microprocessor also processes signaling information and

converts it from supported telephony-signaling protocols to the packet network signaling

protocol (e.g., H.323 for IP, frame relay, or ATM signaling).


This partitioning provides a clean interface between the generic voice-processing

functions, such as compression, echo cancellation, and voice-activity detection, and the

application-specific signaling and voice protocol processing.
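The header adaptation performed by the microprocessor can be illustrated with a minimal sketch for the RTP case. It assumes a hypothetical generic voice header carrying only a sequence number and a timestamp (the field names `seq`, `timestamp`, and `payload` are illustrative, not taken from any particular DSP interface) and wraps one codec frame in the 12-byte fixed RTP header. Payload type 18 is the static RTP payload type assigned to G.729.

```python
import struct

RTP_VERSION = 2
PT_G729 = 18  # static RTP payload type for G.729

def to_rtp(generic: dict, ssrc: int, payload_type: int = PT_G729) -> bytes:
    """Wrap one codec frame in a 12-byte fixed RTP header.

    `generic` is a hypothetical generic voice header plus payload, as it
    might be handed over by the DSP: {"seq", "timestamp", "payload"}.
    """
    byte0 = RTP_VERSION << 6        # version 2, no padding, no extension, no CSRCs
    byte1 = payload_type & 0x7F     # marker bit cleared
    header = struct.pack(
        "!BBHII",                   # network byte order: 1 + 1 + 2 + 4 + 4 bytes
        byte0,
        byte1,
        generic["seq"] & 0xFFFF,
        generic["timestamp"] & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + generic["payload"]

# One 10-byte G.729 frame with illustrative header values.
packet = to_rtp({"seq": 7, "timestamp": 560, "payload": bytes(10)}, ssrc=0x1234ABCD)
```

A VoFR or VToA adapter would consume the same generic header and emit a different encapsulation, which is the point of keeping the DSP output protocol-neutral.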

4. Voice Packet Module

This section describes the functions performed by the software in the voice packet

module, also known as the voice-processing module, which is primarily responsible for

processing the voice data. This function is usually performed in a DSP. The voice-

processing module consists of the following software:

• PCM interface—This receives pulse code modulation (PCM) samples from the

digital interface and forwards them to appropriate DSP software modules for

processing, forwards processed PCM samples received from various DSP

software modules to the digital interface, and performs continuous-phase

resampling of output samples to the digital interface to avoid sample slips.

• tone generator—This generates dual-tone multifrequency (DTMF) tones and

call progress tones under command of the host (e.g., telephone, fax, modem,

PBX, or telephone switch) and is configurable for support of U.S. and

international tones.

• echo canceller—This performs G.165–compliant echo cancellation on sampled,

full-duplex voice port signals. It has a programmable range of tail lengths.

• voice activity detector/idle noise measurement—This monitors the received

signal for voice activity. When no activity is detected for the configured period of

time, the software informs the packet voice protocol. This prevents the encoder

output from being transported across the network when there is silence, resulting

in additional bandwidth savings. This software also measures the idle noise

characteristics of the telephony interface. It reports this information to the packet

voice protocol to relay this information to the remote end for noise generation

when no voice is present.


• tone detector—This detects the reception of DTMF tones and performs voice/fax

discrimination. Detected tones are reported to the host so that the appropriate

speech or fax functions are activated.

• voice codec software—This compresses the voice data for transmission over the

packet network. It is capable of numerous compression ratios through the modular

architecture. A compression ratio of 8:1 is achievable with the G.729 voice codec

(thus, the normal 64–kbps PCM signal is transmitted using only 8 kbps).

• fax software—This performs a fax-relay function by demodulating PCM data,

extracting the relevant information, and packing the fax-line scan data into frames

for transmission over the packet network. Significant bandwidth savings can be

achieved by this process.

• voice playout unit—This buffers voice packets received from the packet network

and sends them to the voice codec for playout.

The voice playout unit supports the following features:

• a first in, first out (FIFO) buffer that stores voice code words before playout and

removes timing jitter from the incoming packet sequence

• a continuous-phase resampler that removes timing-frequency offset without

causing packet slips or loss of data for voice- or voiceband-modem signals

• a timing jitter measurement that allows adaptive control of FIFO delay

The voice-packetization protocols use a sequence-number field in the transmit packet

stream to maintain temporal integrity of voice during playout. Using this approach, the

transmitter inserts the contents of a free-running, modulo-16 packet counter into each

transmitted packet, allowing the receiver to detect lost packets and to reproduce silence

intervals during playout properly.

• packet voice protocol—This encapsulates compressed voice and fax data for

end-to-end transmission over a backbone network between two ports.

• control interface software—This coordinates the exchange of monitor and

control information between the DSP and host via a mailbox mechanism.

Information exchanged includes software downline load, configuration data, and

status reporting.


• real-time portability environment—This provides the operating environment

for the software residing on the DSP. It provides synchronization functions, task

management, memory management, and timer management.
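The modulo-16 sequence counter described for the voice playout unit can be sketched as follows. This is a minimal illustration, assuming packets arrive in order; the function names are hypothetical. Because the counter wraps at 16, a burst of 16 or more consecutive lost packets cannot be distinguished from a shorter loss.

```python
SEQ_MODULUS = 16  # free-running modulo-16 packet counter

def lost_between(prev_seq: int, seq: int) -> int:
    """Number of packets missing between two consecutively received packets."""
    return (seq - prev_seq - 1) % SEQ_MODULUS

def scan(received):
    """Yield (seq, lost_before) for each packet, in arrival order."""
    prev = None
    for seq in received:
        lost = 0 if prev is None else lost_between(prev, seq)
        yield seq, lost
        prev = seq

# The counter wraps from 15 to 0; packets 1 and 2 never arrive.
report = list(scan([14, 15, 0, 3, 4]))
# report == [(14, 0), (15, 0), (0, 0), (3, 2), (4, 0)]
```

The receiver can use the `lost_before` count to insert the right amount of silence or concealment so that the timing of later speech is preserved.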

Figure 5 diagrams the architecture of the DSP software. The DSP software processes

PCM samples from the telephony interface and converts them to a digital format suitable

for transmission through a packet network.
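The 8:1 compression figure quoted for G.729 refers to the codec payload only. The sketch below works through the arithmetic for an IP network under stated assumptions: a 20-millisecond packetization interval (an illustrative choice, not a value from this report), IPv4 without options, UDP, and the fixed RTP header. Link-layer overhead is ignored.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 (no options) + UDP + fixed RTP header

def payload_bytes(codec_bps: int, packet_ms: int) -> int:
    """Codec bytes collected during one packetization interval."""
    return codec_bps * packet_ms // 8000

def wire_bps(codec_bps: int, packet_ms: int) -> int:
    """Bit rate on the IP network, including IP/UDP/RTP headers."""
    packets_per_second = 1000 // packet_ms
    size = payload_bytes(codec_bps, packet_ms) + IP_UDP_RTP_HEADER_BYTES
    return size * 8 * packets_per_second

pcm_payload = payload_bytes(64_000, 20)    # 160 bytes per packet
g729_payload = payload_bytes(8_000, 20)    # 20 bytes per packet
pcm_rate = wire_bps(64_000, 20)            # 80000 bps
g729_rate = wire_bps(8_000, 20)            # 24000 bps
```

Under these assumptions the headers are larger than the G.729 payload itself, so the saving on the wire is closer to 3:1 than 8:1 unless header compression or longer packetization intervals are used.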

Figure 5. Voice Packet Module


5. Signaling, Protocol and Management Modules


The VoIP software performs telephony signaling to detect the presence of a new call and

to collect address (dial digit) information, which is used by the system to route a call to a

destination port. It supports a wide variety of telephony-signaling protocols and can be

adapted to many environments. The software and configuration data for the voice card

can be downloaded from a network-management system to allow customization, easy

installation, and remote upgrades.

The software interacts with the DSP for tone detection and generation, as well as mode of

operation control based on the line supervision, and interacts with the telephony interface

for signaling functions. The software receives configuration data from the network-

management agent and utilizes operating-system services.

Telephony-Signaling Gateway Module

Figure 6 diagrams the architecture of the signaling software, which consists of the

following components:

• telephony interface unit software—This periodically monitors the signaling

interfaces of the module and provides basic debouncing and rotary digit collection

for the interface.

• signaling protocol unit—This contains the state machines implementing the

various telephony-signaling protocols, such as E&M.

• network control unit—This maps telephony-signaling information into a format

compatible with the packet voice session establishment signaling protocol.

• address translation unit—This maps the E.164 dial address to an address that

can be used by the packet network (e.g., an IP address or a data link connection

identifier (DLCI) for a frame-relay network).

• DSP interface driver—This relays control information between the host

microprocessor and DSPs.

• DSP downline loader—This is responsible for downline load of the DSPs at

start-up, configuration update, or mode changes (e.g., switching from voice mode

to fax mode when fax tones are detected).
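The address translation unit can be pictured as a lookup from dialed digits to a packet-network address. The sketch below is one possible approach, assuming a longest-prefix match over a small routing table; the table contents, the `translate` function, and the tuple format are all hypothetical, and the IP addresses come from the range reserved for documentation.

```python
from typing import Optional, Tuple

# Hypothetical routing table: E.164 digit prefix -> (network type, address).
ROUTES = {
    "1408": ("ip", "192.0.2.10"),
    "1408555": ("ip", "192.0.2.20"),
    "4420": ("frame-relay", 100),  # DLCI 100 on a frame-relay link
}

def translate(dialed: str, routes=ROUTES) -> Optional[Tuple[str, object]]:
    """Map a dialed E.164 number to a packet-network address, or None."""
    digits = "".join(ch for ch in dialed if ch.isdigit())
    # Try the longest prefix first so that more specific routes win.
    for length in range(len(digits), 0, -1):
        hit = routes.get(digits[:length])
        if hit is not None:
            return hit
    return None

specific = translate("+1 408 555 0100")    # ("ip", "192.0.2.20")
frame_relay = translate("+44 20 7946 0000")  # ("frame-relay", 100)
unknown = translate("+33 1 23 45 67 89")   # None: no matching prefix
```

Longest-prefix matching lets a general route for an area code coexist with a more specific route for one exchange within it.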


Figure 6. Signaling Modules


Network-Protocol Module

• IP signaling stack—This involves H.323 call control and transport software,

including H.225, H.245, real-time transport protocol (RTP) and RTP control protocol

(RTCP), transmission control protocol (TCP), IP, and user datagram protocol

(UDP).

• ATM signaling protocol stack—This includes the ATM Forum VToA voice-encapsulation

protocol and an ATM Forum–compliant user-network interface (UNI) signaling

protocol stack for establishing, maintaining, and clearing point-to-point and point-

to-multipoint switched virtual circuits (SVCs).

• frame-relay protocol stack—This includes Frame Relay Forum VoFR voice-

encapsulation protocol, permanent virtual circuit (PVC) and SVC support, local

management interface (LMI), congestion management, traffic monitoring, and

committed information rate (CIR) enforcement.

Network-Management Module

The network-management software consists of three major services addressed in the

MIB:

1. physical interface to the telephone endpoint

2. voice channel service for the following:

o processing signaling on a voice channel

o converting between PCM samples and compressed voice packets

3. call-control service for parsing call-control information and establishing calls

between telephony endpoints

The VoIP software is configured and maintained through the use of a proprietary voice

service MIB.


6. VoIP Summary

A VoIP software architecture has been described for the interworking of legacy telephony

systems and packet networks. Some of the key features enabling this application to

function successfully are as follows:

• an approach that minimizes the effects of delay on voice quality

• an adaptive playout to minimize the effect of jitter

• features that address lost-packet compensation, clock synchronization, and echo

cancellation

• a flexible DSP system architecture that manages multiple channels per single DSP

Carrying voice over packet networks provides the most bandwidth-efficient method of integrating

these divergent technologies. While the challenges to this integration are substantial, the

potential savings make the investment in a quality implementation compelling.

7. FoIP Applications

Traditionally, there have been two approaches for sending fax over packet networks: real-time

methods and store-and-forward methods. The primary difference in service between these

two approaches lies in the delivery and method of receipt confirmation. The Frame Relay

Forum has defined a real-time protocol for the transmission of fax over frame-relay

networks. Likewise, the ITU and Internet Engineering Task Force (IETF) are working

together to continue to evolve both the real-time fax-over-IP standard (T.38) and

the store-and-forward fax-over-IP standard (T.37). Both T.37 and T.38 were approved

by the ITU in June 1998. Furthermore, T.38 is the fax transmission protocol selected for

H.323.


There are tremendous opportunities for cost savings by transmitting fax calls over packet

networks. Fax data in its original form is digital. However, it is modulated and converted

to analog for transmission over the PSTN. This analog form uses 64 kbps of bandwidth in

both directions.

The FoIP IWF reverses this analog conversion, transmitting digital data over the packet

network and then reconverting the digital data to analog for the receiving fax machine.

This conversion process reduces the overall bandwidth required to send the fax, because

the digital form is much more efficient, and the fax transmission is half-duplex (i.e., only

one direction is used at any time). The peak rate for a fax transmission is 14.4 kbps in one

direction. A representation of this process is shown in Figure 7.
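The bandwidth argument can be made concrete with a short calculation using only the figures quoted above. It compares payload rates only; packet header overhead on the packet network is deliberately left out, so the real saving is somewhat smaller.

```python
PSTN_CHANNEL_BPS = 64_000  # one direction of the PSTN circuit
FAX_PEAK_BPS = 14_400      # peak fax rate, one direction at a time

pstn_total_bps = 2 * PSTN_CHANNEL_BPS  # the circuit reserves both directions
relay_peak_bps = FAX_PEAK_BPS          # half-duplex: only one direction is active
saving_ratio = pstn_total_bps / relay_peak_bps

print(f"{pstn_total_bps} bps reserved vs {relay_peak_bps} bps peak "
      f"(about {saving_ratio:.1f} times less)")
```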


Figure 7. FoIP Conversion Process

An application for fax over packet, shown in Figure 8, is a network configuration of a

company with numerous branch offices that wants to use the packet network, instead of

the long-distance network, to provide fax access to the main office. The IWF is the

physical implementation of the hardware and software that enables the transmission of

fax over the packet network. The IWF must support analog interfaces that directly

interface to fax machines at the branches and to a PBX at the central site. The IWF must

emulate the functions of a PBX for the fax machines.

Figure 8. FoIP Application

8. PSTN Fax-Call Procedure


This section describes the stages of a standard fax call over the PSTN so that the

processing required for a reliable fax transmission over a packet network can be explored.

Fax machines in common use today implement ITU Recommendations T.30 and T.4.

The T.30 protocol describes the formatting of non-page data, such as messages

that are used for capabilities negotiation. The T.4 protocol describes formatting of page

image data.

T.30 and T.4 have evolved substantially over time and are now quite complex, given that

they attempt to describe the behavior of an evolving set of fax machines. The timing

related to the message interaction and phases of the call is critical and is one of the major

causes of problems in the transmission of fax over packet networks.

The PSTN fax call is divided into five phases, as shown in Figure 9. This example

assumes that the call is accomplished without errors. The procedure becomes somewhat

more complicated if errors occur or if there is a need for modem retraining. The five

phases are as follows:

• call establishment

• control and capabilities exchange

• page transfer

• end of page and multipage signaling

• call release

Figure 9. PSTN Fax-Call Flow


Call Establishment


The fax call is established either through a manual process, according to which someone

dials a call and puts the machine into fax mode, or by automatic procedures, according to

which no human interaction is required. In both cases, the answering fax machine returns

an answer tone, the called station identification (CED) tone, which is the high-pitched

tone that you would hear when you call a fax machine. If the call is automatically dialed,

the calling station will also indicate the fax call with a calling tone (CNG), which is a

short, periodic tone that begins immediately after the number is dialed. These tones are

generated to allow a human participant to realize that a machine is present on the other

end of the call. They are also, although not reliably, used to recognize the presence of a fax call.

Control and Capabilities Exchange

The control and capabilities exchange phase of the fax call is used to identify the

capabilities of the fax machine at the other end of the call. It also negotiates the

acceptable conditions for the call. Control messages throughout the fax

call are sent using the low-speed (300 bps) modulation mode. Every control message is

preceded by a one-second preamble, which allows the communication channel to be

conditioned for reliable transmission.

The called fax machine begins the procedure by sending a digital identification signal

(DIS) message, which contains the capabilities of the fax machine. An example of a

capability that could be identified in this message is the support of the V.17 (14,400 bps)

data signaling rate. At the same time, the called subscriber identification (CSI) and

nonstandard facilities (NSF) messages are optionally sent.

NSF are capabilities that a particular fax manufacturer has built into a fax machine to

distinguish its product from others. These facilities are not required to be supported for

interoperability.

Once the calling fax machine receives the DIS message, it determines the conditions for

the call by examining its own capabilities table. The calling machine responds with the

digital command signal (DCS), which defines the conditions of the call.


At this stage, high-speed modem training begins. The high-speed modem will be used in

the next phase of the fax call to transfer page data. The calling fax machine sends a

training check field (TCF) through the modulation system to verify the training and

ensure that the channel is suitable for transmission at the accepted data rate. The called

fax machine responds with a confirmation to receive (CFR), which indicates that all

capabilities and the modulation speed have been confirmed and the fax page may be sent.

Page Transfer

The high-speed modem is used to transmit the page data that has been scanned in and

compressed. It uses the ITU T.4 protocol standard to format the page data for

transmission over the channel.

End-of-Page and Multipage Signaling

After the page has been successfully transmitted, the calling fax machine sends an end-

of-procedures (EOP) message if the fax call is complete and all of the pages have been

transmitted. If only one page has been sent and there are additional ones to follow, it

sends a multipage signal (MPS). The called machine responds with a message

confirmation (MCF) to indicate that the message has been successfully received and that it is

ready to receive more pages.

Call Release

The release phase is the final phase of the call, in which the calling machine sends a

disconnect message (DCN). While the DCN message is a positive indication that the fax

call is over, it is not a reliable indication, as the fax machine can disconnect prematurely

without ever sending the DCN message.
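The five phases can be summarized as an ordered message exchange. The sketch below generates the sequence for an error-free call of a given number of pages. It assumes an automatically dialed call (so CNG is present), omits the optional CSI and NSF messages, assumes no modem retraining between pages, and assumes the called machine also confirms the final page with MCF. The `fax_call` function and the tuple format are illustrative only.

```python
def fax_call(pages: int):
    """Return (sender, signal) pairs for an error-free fax call, in order."""
    if pages < 1:
        raise ValueError("a fax call carries at least one page")
    # Phase 1: call establishment
    flow = [("calling", "CNG"), ("called", "CED")]
    # Phase 2: control and capabilities exchange, then high-speed training
    flow += [("called", "DIS"), ("calling", "DCS"),
             ("calling", "TCF"), ("called", "CFR")]
    for page in range(1, pages + 1):
        # Phase 3: page transfer over the high-speed modem
        flow.append(("calling", f"page {page} (T.4 data)"))
        # Phase 4: end-of-page and multipage signaling
        flow.append(("calling", "MPS" if page < pages else "EOP"))
        flow.append(("called", "MCF"))
    # Phase 5: call release
    flow.append(("calling", "DCN"))
    return flow

two_page_call = fax_call(2)
signals = [signal for _, signal in two_page_call]
```

A FoIP gateway has to carry each of these messages across the packet network while preserving the timing relationships between them.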

9. FoIP QoS


The advantages of reduced cost and bandwidth savings of carrying fax over packet networks are

associated with some QoS issues that are unique to packet networks and can affect the

reliability of the fax transmission.

Timing

A major issue in the implementation of fax over packet networks is the problem of inaccurate

timing of messages caused by delay through the network. The delay of fax packets

through a packet network causes the precise timing that is required for many portions of

the fax protocol to be skewed and can result in the loss of the call. The FoIP protocol in

the IWF must compensate for the loss of a fixed timing of messages over the packet

network so that the T.30 protocol operates without error.

There are two sources of delay in an end-to-end FoIP call: network delay and processing

delay.

1. network delay—This is caused by the physical medium and protocols that are

used to transmit the fax data and by buffers used to remove packet jitter on the

receiving end. This delay is a function of the capacity of the links in the network

and the processing that occurs as the packets transit the network. The jitter buffers

add delay when they remove the packet-delay variation of each packet as it

transits the packet network. This delay can be a significant part of the overall

delay, as packet-delay variations can be as high as 70 to 100 milliseconds in some

frame-relay networks and even higher in IP networks.

2. processing delay—This is caused by the process of demodulating and collecting

the digital fax information into a packet for transmission over the packet network.

The encoding delay is a function of both the processor execution time and the

amount of data collected before sending a packet to the network. Low-speed data,

for instance, is usually sent out with a single byte per packet, as the time to collect

a byte of information at 300 bps is 30 milliseconds.
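The collection component of the processing delay follows directly from the payload size and the modem rate. A minimal sketch, with a hypothetical helper name: one 8-bit byte at 300 bps takes roughly 27 milliseconds to collect, in line with the approximately 30 milliseconds quoted above, while at 14,400 bps the same time budget covers dozens of bytes.

```python
def collection_delay_ms(payload_bytes: int, bit_rate_bps: int) -> float:
    """Time needed to collect `payload_bytes` from a modem running at `bit_rate_bps`."""
    return payload_bytes * 8 * 1000 / bit_rate_bps

low_speed = collection_delay_ms(1, 300)       # about 26.7 ms for a single byte
high_speed = collection_delay_ms(36, 14_400)  # 20.0 ms for 36 bytes
```

This is why low-speed control data is packetized a byte at a time: collecting more bytes per packet would add tens of milliseconds to an exchange whose timing is already tight.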

Jitter


Delay issues are compounded by the need to remove jitter, a variable interpacket timing

caused by the network that a packet traverses. An approach to removing the jitter is to

collect packets and hold them long enough so that the slowest packets to arrive are still in

time to be played in the correct sequence. This approach, however, causes additional

delay. In most FoIP protocols, a time stamp is incorporated in the packet to ensure that

the information is played out at the proper instant.
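The hold-and-play approach can be sketched with a small playout buffer keyed on the packet time stamp. This is a simplified illustration, not a description of any specific FoIP protocol: it assumes a fixed playout delay, time stamps and arrival times expressed on the same millisecond time base, and a hypothetical `PlayoutBuffer` class. Packets that arrive after their playout instant are counted as late and discarded.

```python
import heapq

class PlayoutBuffer:
    """Hold packets until time stamp + fixed delay, then release them in order."""

    def __init__(self, delay_ms: int):
        self.delay_ms = delay_ms
        self._heap = []  # entries: (playout_time_ms, timestamp_ms, payload)
        self.late = 0

    def push(self, timestamp_ms: int, arrival_ms: int, payload: bytes) -> bool:
        playout = timestamp_ms + self.delay_ms
        if arrival_ms > playout:
            self.late += 1  # arrived too late to be played in sequence
            return False
        heapq.heappush(self._heap, (playout, timestamp_ms, payload))
        return True

    def pop_due(self, now_ms: int):
        """Return (playout_time_ms, payload) for every packet due by now_ms."""
        due = []
        while self._heap and self._heap[0][0] <= now_ms:
            playout, _, payload = heapq.heappop(self._heap)
            due.append((playout, payload))
        return due

buf = PlayoutBuffer(delay_ms=60)
buf.push(0, 25, b"a")
buf.push(40, 50, b"c")    # overtakes packet "b" in the network
buf.push(20, 75, b"b")    # delayed, but still ahead of its playout time (80 ms)
buf.push(60, 130, b"d")   # misses its playout time (120 ms) and is dropped
played = buf.pop_due(100)
```

A larger `delay_ms` rescues more slow packets at the cost of more end-to-end delay, which is the trade-off described above.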

Lost-Packet Compensation

Lost packets can be an even more severe problem, depending on the type of packet

network that is being used. In a VoIP application, the loss of packets can be addressed by

replaying the last packet or by other methods of interpolation. A FoIP application, however,

has more severe constraints on the loss of data, as the fax protocol can fail if information

is lost. This problem varies, depending on the type of fax machine used and whether

error-correction mode is enabled.

Two schemes that are used by FoIP software to address the problems of lost frames are as

follows:

• repeating information in subsequent frames so that the error can be corrected by

the receiver's playout mechanism

• using a reliable, retransmitting protocol such as TCP to transport the fax data at the

expense of added delay
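The first scheme, repeating information in subsequent frames, can be sketched as follows. The packet layout (a dictionary with a sequence number and a map of frame index to frame data) and the function names are hypothetical. With one level of redundancy, each packet carries the current frame plus a copy of the previous one, so any single isolated packet loss can be repaired by the next packet to arrive.

```python
def packetize(frames, redundancy=1):
    """Build packets that carry the current frame plus `redundancy` earlier frames."""
    packets = []
    for seq, frame in enumerate(frames):
        first = max(0, seq - redundancy)
        carried = {i: frames[i] for i in range(first, seq + 1)}
        packets.append({"seq": seq, "frames": carried})
    return packets

def reassemble(received, total):
    """Rebuild the frame sequence; frames that cannot be recovered are None."""
    recovered = {}
    for packet in received:
        recovered.update(packet["frames"])
    return [recovered.get(i) for i in range(total)]

frames = [b"f0", b"f1", b"f2", b"f3"]
packets = packetize(frames)

# Packet 2 is lost in the network; packet 3 still carries a copy of frame 2.
single_loss = reassemble([p for p in packets if p["seq"] != 2], len(frames))

# Two consecutive losses exceed one level of redundancy.
double_loss = reassemble([p for p in packets if p["seq"] not in (2, 3)], len(frames))
```

The cost is extra bandwidth in every packet, whereas the TCP-style alternative spends nothing until a loss occurs but then adds retransmission delay.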

10. FoIP Software Architecture

The facsimile interface unit (FIU) is the software module that resides within a FoIP IWF.

It demodulates voiceband signals from an analog interface and converts them to a digital

format suitable for transport over a packet network. It also remodulates data received

from the packet network and transmits it to the analog interface. In doing so, the FIU


performs protocol conversion between Group-3 facsimile protocols and the digital

facsimile protocol employed over the packet network.

The FIU, shown in Figure 10, consists of the following three units:

fax-modem unit (FM)—This processes PCM samples based on the current modulation mode and supports

the following functions:

• V.21 Channel 2 (300 bps) binary signaling modulation and demodulation

• high-level data link control (HDLC) framing (zero-bit insertion/removal, cyclic

redundancy check (CRC) generation/checking)

• V.27 ter (2400/4800 bps) high-speed data modulation and demodulation

• V.29 (7200/9600 bps) high-speed data modulation and demodulation

• V.17 (7200/9600/12000/14400 bps) high-speed data modulation and

demodulation

• V.33 (12000/14400 bps) high-speed modulation and demodulation

• CED detection and generation

• CNG detection and generation

• V.21 Channel-2 detection
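The zero-bit insertion and removal listed under HDLC framing works as follows: the transmitter inserts a 0 after every run of five consecutive 1 bits so that frame data can never imitate the 01111110 flag, and the receiver removes that 0 again. The sketch below operates on lists of bits for clarity and assumes the receiver is handed only the bits between flags; flag and abort detection are not shown, and the function names are illustrative.

```python
def stuff(bits):
    """Insert a 0 after every run of five consecutive 1 bits."""
    out, ones = [], 0
    for bit in bits:
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        if ones == 5:
            out.append(0)  # stuffed bit
            ones = 0
    return out

def unstuff(bits):
    """Remove the bit that follows every run of five consecutive 1 bits."""
    out, ones, skip = [], 0, False
    for bit in bits:
        if skip:           # this is the stuffed 0: drop it
            skip = False
            ones = 0
            continue
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        if ones == 5:
            skip = True
    return out

data = [0, 1, 1, 1, 1, 1, 1, 0]
on_the_wire = stuff(data)        # [0, 1, 1, 1, 1, 1, 0, 1, 0]
round_trip = unstuff(on_the_wire)
```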

Figure 10. FoIP Module


fax protocol unit (FP)—This compensates for the effects of timing and lost packets

caused by the packet network. The FP prevents the local fax machine from timing out

while waiting for a response from the other end by generating HDLC flags. If the

response from the remote fax machine has not arrived when the timeout expires, the FP also sends a

command repeat (CRP) frame to request that the frame be resent. This unit monitors the facsimile

transaction timing, the direction of current transmission, and the proper modem

configuration. It performs the following functions:

• protocol processing (group-3 facsimile)

• examination/alteration of binary signaling messages to ensure compatibility of the

facsimile transfer with the constraints of the transmission channel

• network channel interface data formatting

• line state transitions

fax network driver unit (FND)—This assembles and disassembles fax packets to be

transmitted over the network and is the interface unit between the FP and network

modules. The control information packets consist of header and time stamp information.

In the direction of the PCM to the packet network, the FND collects the specified number

of bytes and transmits the packet to the network. In the receive direction, the FND

provides data with the proper timing (as generated on the transmit side and reproduced

through the received time stamp information) to the rest of the FIU. The FND formats the

network packets for transmission to the network based on the specific network protocol.

The FND delays the data to remove timing jitter from the packet arrival times and

performs the following functions:

• formatting of control information

• formatting of fax data

• properly timed playout of data

• elastic (slip) buffering

• lost-packet compensation

11. FoIP Summary


A fax-over-packet software architecture has been described for the interworking of fax

machines and packet networks. Some of the key features enabling these applications to

function successfully are as follows:

• an approach that addresses the effect of delay through the network

• a process that minimizes the effect of jitter

• features that address lost-packet compensation

Though the QoS issues associated with carrying fax over packet networks are significant, the future

of this approach will be driven by the substantial cost savings and exciting applications

made possible with FoIP software technology.

Conclusion

Many businesses are hoping VoIP will ultimately lead to greater efficiency,

productivity and cost savings. VoIP eliminates redundant networks and network

resources, simplifies network management and facilitates communication.

However, for an existing TCP/IP network to successfully carry VoIP traffic, several

modifications must be made to the network infrastructure. While data compression

and QoS mechanisms will help reduce data traffic, an efficient and reliable data

cabling solution is needed.

Data network reliability concerns are not insurmountable. There are data cabling

solutions available that provide efficient throughput and sufficient headroom to ensure

that even bandwidth-hungry applications will run smoothly.

Migration and Adoption

Some businesses are already in the process of implementing VoIP. Other companies

are preparing for the eventual integration of voice and data networks.


In both cases, identifying the right high-performance data cabling system can be

difficult. No data cabling standard adequately addresses the performance

requirements associated with VoIP.

Fortunately, there are other sources for information on throughput efficiency and

headroom performance, like Anixter’s Levels Program. Taking the time to ensure that

the data cabling can provide the necessary throughput and headroom is crucial to

implementing VoIP successfully.

Glossary

ADPCM adaptive differential pulse-code modulation

ANSI American National Standards Institute

ATM asynchronous transfer mode

BRI basic rate interface

CED CallED station IDentification

CELP code excited linear prediction

CFR confirmation to receive

CIR committed information rate

CNG calling tone

CRC cyclic redundancy check

CRP command repeat

CSI called subscriber identification

DCN disconnect message

DCS digital command signal

DIS digital identification signal


DLCI data-link connection identifier

DSP digital signal processor

DTMF dual-tone multifrequency

E&M ear and mouth; earth and magneto

EOP end of procedures

FIFO first in, first out

FIU facsimile interface unit

FM fax modem unit

FND fax network driver unit

FoIP fax over Internet protocol

FP fax protocol unit

FXO foreign exchange office

FXS foreign exchange station

HDLC high-level data link control

IETF Internet Engineering Task Force

IP Internet protocol

ISDN integrated services digital network

ITU International Telecommunication Union

IWF interworking function

LMI local management interface

MCF message confirmation

MIB management information base

MPS multipage signal

NSF nonstandard facilities

PBX private branch exchange

PCM pulse code modulation

POTS plain old telephone service

PRI primary rate interface

PSTN public switched telephone network

PVC permanent virtual circuit

QoS quality of service


RTCP RTP control protocol

RTP real-time transport protocol

SVC switched virtual circuit

TCF training check field

TCP transmission control protocol

UDP user datagram protocol

UNI user network interface

V/FoIP voice and fax over Internet protocol

VoFR voice over frame relay

VoIP voice over Internet protocol

VToA voice telephony over ATM

