
An Overview on Delivering and Testing Mobile Video Streaming Services

White Paper

Prepared by: Dr. Irina Cotanis
Date: June 20, 2013

© Ascom (2013). All rights reserved. TEMS is a trademark of Ascom. All other trademarks are the property of their respective holders.


Contents

1  Video Services Drive Today's Mobile Data Trends
2  At a Glance: Mobile Video Streaming Service and its Testing Implications
2.1  Mobile Video Signal
2.2  Mobile Streaming Services: Architectures, Features and Test Requirements
2.3  Mobile Video Streaming Client Protocols Stack
3  Mobile Video Streaming Degradation
3.1  Customer Perceived Video Degradations and their Roots
3.2  Network-Centric Video Degradation
3.3  Device - Video Player Client-Based Video Degradations
3.4  Limitations of Using QoS/SLA as QoE Quantifiers
4  Customer Experience Metrics for the Evaluation of Mobile Video Streaming Integrity
4.1  Perceptual Mobile Video Quality QoE
4.2  A Parametric Evaluation Approach
5  Evaluation of the Quality of Experience for Video Streaming Services
5.1  Mobile Video Streaming Services: KPI, QoS and QoE: relationship, dependencies and evaluation
5.2  An Example of Mobile Video Streaming Service QoE Integrity Evaluation and Analysis
6  Conclusions
7  References


1 Video Services Drive Today's Mobile Data Trends

In the last few years, infrastructure vendors, operators, phone vendors, and application developers have witnessed the results of their common contributions to the fast evolution of 3G networks and the development of 4G/LTE. Myriad converged services have been developed, from voice and data (e.g., web browsing, email, and FTP) to video telephony, MMS, video/multimedia streaming and sharing, mobile TV, and gaming. The sustained focus of infrastructure vendors and operators on optimizing 3G networks, and on developing 4G/LTE networks for low latencies and high, stable throughputs, has allowed phone vendors and application developers to deploy smartphones that have changed how end users perceive and use mobile networks and their services.

Analysts' estimates [1] show an 82.5% smart device penetration rate (Figure 1) and a 78% increase in mobile data traffic consumption (Figure 2) by 2016. With the introduction of smartphones like the iPhone and Android-based platforms, the emergence of new tablets like the iPad, and the continued growth of netbooks and laptops, there is an explosion of powerful mobile devices capable of displaying high-quality video content. In addition, these devices can support various streaming applications and interactive video applications like videoconferencing, and they can also capture video for video sharing and video broadcasting applications. Therefore, commercial mobile network traffic is expected to be dominated by video services (Figure 2), such that by 2016, 70.5% of mobile data will be made up of video applications.

Figure 1: Devices Distribution
Figure 2: Mobile Data Distribution

As a consequence, modern mobile networks have to be optimized for delivery of a broad range of video content and video-based applications. Subscribers taking advantage of new multimedia content, applications and devices will consume all available bandwidth and still expect the same, or even better, quality than they were used to from fixed-line services (in terms of video quality, start-up time, and reactivity to interaction), and they will not expect mobility to be used as an excuse for an unacceptable quality of experience (QoE).

Therefore, bandwidth limitations and the challenge of meeting the high-reliability, high-quality and low-latency demands of rich multimedia applications mean that a new way must be found to optimize modern mobile networks to support higher user capacity along with enhanced quality of experience. In addition, the optimization process needs to be suited to the network and the application protocols specific to each mobile video service. Video streaming services like Netflix, Hulu, and YouTube are moving from the real-time protocol, RTP/UDP/IP [12], to the more quality-oriented and robust Transmission Control Protocol, TCP/IP, over HTTP [13], [14], which provides the delay behavior required for high-quality, real-time streaming applications by using advanced TCP congestion mechanisms. In addition, adaptive HTTP streaming techniques ensure significant bandwidth efficiency while maintaining high video quality. Unlike streaming, conversational video (a voice-integrated service over IMS in the seamless Rich Communication Suite [25]) uses delay-free RTP/UDP-based applications at the expense of quality sensitivity to transmission errors. The IMS profiles for conversational integrated voice and video are defined by GSMA IR.92 [26] and GSMA IR.94 [27], respectively, and are implemented based on the 3GPP specifications for Multimedia Telephony over IMS [28].

The mobile video service QoE emerges not only from the network (e.g., latency, delay, jitter, and loss), but also from video processing techniques like video coding and compression, and from video resolutions (e.g., 240p, 360p, 480p, 640p, 720p) that require different bit rates ranging from tens to thousands of kbps, and frame rates from 5 to 30 frames per second (fps). Mobile devices range from a typical phone's 240p resolution, which needs about 300 kbps, to the iPhone 5 Retina display with 640p, which requires about 700 kbps. Depending on the video content, the combination of all these factors contributes to the subscriber's perception of the quality of the video. In addition, the performance of the video player clients implemented on the devices (e.g. Flash Player for Android-based phones, QuickTime for iPhone) as well as the devices' characteristics (e.g. display size, form factor) are key factors in understanding video streaming QoE while pursuing the most efficient bandwidth optimization. However, as is discussed later on in this paper, assuring video streaming service quality does not only depend on the integrity QoE, but also on the accessibility of the video content and the retainability of the session.

Thus, the whole ecosystem, including content providers, network operators, service providers, device manufacturers and infrastructure vendors, needs to ensure that these demands can be met.

This white paper refers to the mobile video streaming service as defined by 3GPP [13], [14] and is organized as follows. Section 2 provides a general description of the mobile video signal structure and of the streaming service as supported on mobile networks. Network architecture, service features and their test implications are discussed. The protocol stack used by different video streaming services' solutions is described. Section 3 discusses video streaming degradations from the subscriber, network and device perspective. In addition, a subjective Mean Opinion Score test on simulated video streaming scenarios is used to explain why relying only on network-based quality of service (QoS) service level agreements (SLAs) is insufficient when trying to accurately assess the QoE perceived by subscribers.

Section 4 addresses the QoE models best suited for the evaluation of the mobile video streaming service integrity. Section 5 discusses the key performance indicators (KPIs), QoS and QoE metrics that impact quality assurance for the video streaming service; their relationships; and how these can be collected and evaluated. An example of the performance evaluation of the HTTP/TCP-based video streaming service is discussed.

2 At a Glance: Mobile Video Streaming Service and its Testing Implications

2.1 Mobile Video Signal

With mobile video services, subscribers bring high expectations learned from their fixed-line video services; mobility is taken for granted. However, mobile video not only uses completely different signal processing techniques than fixed video (like TV broadcasting), but it also uses different transport methods (e.g. RTP/UDP, MPEG-2/4, HTTP/TCP), and it is displayed by devices with various form factors.

The video signal is transmitted as a series of frames, typically at a rate between 5 and 30 frames per second (fps) for mobile applications. The display size for digital video is expressed as the number of horizontal and vertical pixels that comprise the screen; higher resolutions and frame rates ensure higher quality, at the cost of higher video transmission bandwidth.
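As a rough sense of scale, the sketch below contrasts the uncompressed bit rate of a small mobile frame sequence with an example delivery bit rate. The resolution, frame rate, sampling format and target bit rate are illustrative assumptions, not figures taken from this paper.

```python
# A toy calculation (illustrative assumptions only) showing why compression is
# unavoidable: the raw bit rate of even a small mobile video vastly exceeds
# typical mobile streaming bit rates.

def raw_bitrate_kbps(width: int, height: int, fps: float, bits_per_pixel: float = 12) -> float:
    """Uncompressed bit rate; 12 bits/pixel corresponds to YUV 4:2:0 sampling."""
    return width * height * fps * bits_per_pixel / 1000.0

if __name__ == "__main__":
    raw = raw_bitrate_kbps(320, 240, 15)   # a 240p-class clip at 15 fps (assumed)
    target = 300.0                          # kbps, an example delivery bit rate
    print(f"raw ~{raw:.0f} kbps, i.e. ~{raw / target:.0f}x an example {target:.0f} kbps stream")
```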

In order to be sent across the network, the video frames are compressed (encoded). They are decompressed (decoded) at the receiving side so they can be played back. The compression technique used needs to suit the video quality requirements, the available bandwidth, and whether real-time video delivery is required (e.g., video conferencing or live TV broadcasts) or not (e.g., video streaming or sharing) [2].

During the compression process, the video frames are arranged into a series called a Group of Pictures (GOP) of a specific structure and length. A single GOP consists of an ordered sequence of one or more of the following three frame types: intracoded frame (I frame), forward predictive coded frame (P frame), and bidirectional predictive coded frame (B frame). The I frame is the first in the GOP, and it is independently coded without reference to any other frame. Therefore, the I frame is the largest, requiring the greatest number of bits, and can be decoded on the receiving end without requiring data from any other frame. The P frame encodes motion changes from the most recent I or P frame. The B frame encodes motion changes from the most recent I or P frame, the following I or P frame, or a combination of both. Therefore, B frames require the fewest number of bits, but QoE can be affected if too many B frames are used.

Each encoded video stream consists of a successive series of GOPs. GOP structure and length are variable, with the typical length for a single GOP being 15 to 250 frames. A GOP contains one I frame and some number of P and potentially B frames (e.g., I, B, B, P, B, B, P, B, B, P, B, B, P, B, B) sent at 30 frames per second, depending on the amount and type of redundancy that can be removed without affecting the video quality at the receiving side. The video frames contain spatial redundancy (similarity between neighboring pixels in a single frame), as well as temporal redundancy (similarity between neighboring frames in the video sequence).

The video codecs take advantage of both types of redundancy to compress the video in two ways: they use intraframe compression (image compression) to reduce spatial redundancy in individual frames, and interframe compression (motion estimation) to reduce temporal redundancy between frames. In the GOP, I frames are encoded using intraframe compression, and P and B frames are encoded using interframe compression.

Most video codecs use both compression types in order to meet bandwidth requirements while maintaining the best possible video quality. Video quality is highly dependent on the GOP structure, because it depends on the frame types the GOP contains. If an I frame is corrupted (e.g., due to encoding errors or packet loss), the error will propagate through all remaining B and P frames in the GOP, causing degradation that might be perceived for several seconds. A corrupted P frame will affect only the remaining B and P frames. A corrupted B frame will only affect that frame (about 15-30 ms) and might not be seen via subjective observation.
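The dependency rules above can be summarized in a few lines of code. The sketch below is a minimal illustration only: it uses the example GOP pattern from the text and the simplification that a corrupted I or P frame impairs every later frame in the GOP, while a corrupted B frame impairs only itself.

```python
# A minimal sketch of the error-propagation rules described above. Not a decoder
# model; it only counts which frames would be visibly impacted.

GOP = list("IBBPBBPBBPBBPBB")   # the example GOP pattern from the text

def frames_affected_by_loss(gop: list[str], lost_index: int) -> list[int]:
    """Return indices of frames impacted when gop[lost_index] is corrupted."""
    if gop[lost_index] == "B":
        return [lost_index]                  # only the corrupted frame itself
    # I and P frames: the error propagates to every later frame in the GOP,
    # because subsequent P/B frames are predicted (directly or indirectly) from them.
    return list(range(lost_index, len(gop)))

if __name__ == "__main__":
    for idx in (0, 3, 4):                    # an I, a P, and a B frame
        hit = frames_affected_by_loss(GOP, idx)
        print(f"corrupt {GOP[idx]} at index {idx}: {len(hit)} frame(s) affected")
```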

In addition, video quality is affected by the GOP length; a longer GOP could reduce bandwidth consumption at the expense of exposing the video quality to degradations due to encoding errors or packet loss/discard. A shorter GOP could reduce the impact of errors, but the cost would be a larger number of I frames, increasing both the bandwidth requirements and the processor load on the encoder and decoder. Regardless of the compression or frame type, each frame is divided into blocks (typically 16x16 pixels) that are transformed using a Discrete Cosine Transform (DCT). The coefficients are quantized and compressed further into the actual entities that are packetized and then sent out.

Mobile video compression involves a variety of standardized codecs (e.g., MPEG-2, H.261/H.263, MPEG-4 Part 10/H.264/Advanced Video Codec (AVC) and its successor, the High Efficiency Video Codec (HEVC)) and commercial video codecs (e.g., Microsoft VC-1 or Google VP6/VP8) that use different error concealment techniques suitable for the different types of protocols that support mobile video services. This variety in video compression types is expected to result in different performance, requiring various optimization techniques as well as comprehensive analysis and quality evaluation.


The compressed frames are divided into some number of transport units and then encapsulated in transport packets at three layers: (1) RTP or MPEG-2/4, (2) UDP or TCP, and (3) IP.

 A typical video IP stream structure is:

[IP header] [UDP or TCP header] [RTP header or MPEG-2 Transport] [Video payload]

In order to cope with errors and/or loss, Forward Error Correction (FEC) is applied to the packet stream, allowing some proportion of lost packets to be replaced at the receiving end. In addition, protocols use retransmission to replace lost packets (e.g., reliable UDP, TCP, or multicast with unicast retransmission). Details of this are discussed in section 5.
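As a simple illustration of the packet-level FEC principle mentioned above, the sketch below adds one XOR parity packet per block of media packets so that the receiver can rebuild a single lost packet. Real schemes are considerably more capable; this is not the mechanism of any specific standard.

```python
# A minimal sketch of packet-level FEC: one XOR parity packet per block of
# equally sized media packets lets the receiver rebuild one lost packet.
from functools import reduce
from typing import Optional

def xor_packets(packets: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized packets."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

def recover(block_with_gap: list[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing packet (None) from the survivors and the parity."""
    survivors = [p for p in block_with_gap if p is not None]
    return xor_packets(survivors + [parity])

if __name__ == "__main__":
    block = [bytes([i] * 8) for i in range(4)]   # four dummy media packets
    parity = xor_packets(block)                  # the XOR parity packet for this block
    damaged = block[:2] + [None] + block[3:]     # packet 2 lost in transit
    assert recover(damaged, parity) == block[2]
    print("lost packet recovered from XOR parity")
```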

2.2 Mobile Streaming Services: Architectures, Features and Test Requirements

The mobile streaming services defined by 3GPP [11] cover a multitude of different applications that can generally be classified into on-demand (e.g. music, news, movies) and live (e.g. radio, TV programs). The 3GPP-defined IP-based framework specifies protocols for control signalling, capability exchange, media transport, rate adaptation and protection, as well as codecs for voice, audio, video, still images, bitmap graphics, vector graphics, timed text and text. The mobile streaming services introduced with 3GPP Release 4 referred to simple streaming (downloading) and evolved over various releases (Release 6-11) with additional features and protocols allowing high-quality, bandwidth-efficient mobile video-audio streaming [12], [13]. These features include:

- Link-aware bandwidth adaptation, which adjusts the session bandwidth to the potentially time-varying cellular network bandwidth;

- Mechanisms to gather streaming session QoS directly related to QoE behavior (e.g. HTTP request/response transitions, average throughput, user-based representation switch events, the media presentation description (MPD), and initial buffering; see section 5.1);

- Capability exchange mechanisms, which enable Packet Switched Streaming (PSS) servers to provide a wide range of devices with content suitable for the particular device involved in the mobile video streaming service.

All these complex features, designed to ensure high-performance mobile video streaming services, require thorough monitoring and troubleshooting. Monitoring the adjustments to time-varying mobile bandwidth, correlated with QoE-centric performance indicators, can help assess an optimal trade-off between quality and bandwidth allocation, and consequently optimize capacity based on QoE requirements. Monitoring the capability exchanges can help to easily detect poor service accessibility that is not rooted in the network itself, but rather is due to a particular device that does not have the proper capabilities for the mobile video service it attempts to access.

Last, but not least, the mechanisms gathering QoE-centric metrics allow fast troubleshooting and service optimization when used in correlation with QoE evaluation metrics (described in section 4) running simultaneously with the mobile video streaming service.

Figure 3 [13] shows the most important service-specific entities involved in a 3G (GPRS core) packet-switched streaming service. In addition to the content server placed behind the Gi interface and the streaming client on the device, various components also located behind the Gi interface might be involved in providing additional services or in improving the overall service quality. Portals allowing convenient access to streamed media content stored on content servers, and user and device profile servers used to store user preferences and device capabilities, contain information that can be used to control the presentation of streamed media content to a mobile user.

Figure 3

3GPP Release 8 already introduces streaming services over IMS [14], which brings enablers and features that can enhance the experience of PSS services as seamless IMS services, while audio-video codecs, file formats and protocols remain as defined by the PSS service specifications [11], [12], [13]. The 4G (Evolved Packet Core) network elements involved in this scenario are presented in Figure 4 [14].

Figure 4

[Figures 3 and 4: network architecture diagrams. Figure 3 shows the streaming client reaching content servers, a content cache, portals, and user and terminal profile servers on the IP network through the GERAN/UTRAN access and the SGSN/GGSN (Gi) UMTS core. Figure 4 shows the corresponding 4G setup, with the UE PSS client connected via E-UTRAN (and legacy GERAN/UTRAN/SGSN), the MME, HSS, PCRF and P-GW (SGi) toward the operator's IP network hosting the content servers, content caches, portals and user and terminal profiles.]


In addition to IMS enablers, PSS IMS introduces the time-shifting streaming feature, which is designed to enhance access to live streaming sessions. The PSS server maintains a time-shift buffer for each live feed and allows the PSS client to pause live sessions and even navigate (rewind, fast forward) within the offered time-shift buffer range. This VCR-like feature in the mobile environment contributes to the mobile user's overall QoE, and it requires dedicated testing and evaluation (details are presented in section 5.1). Therefore, evaluating mobile video services over IMS involves, on top of typical 3G techniques, testing related to the IMS core as well as to its enablers and features.

2.3 Mobile Video Streaming Client Protocols Stack

Mobile streaming services require a PSS client to be implemented on the device invoking the service. The client's protocol stack (Figure 5) is divided into two sections: the lower one embeds the RAN protocols (RLC, MAC, PHY), and the upper part the Evolved Packet System (EPS) protocols (IP/UDP/RTP or IP/TCP/HTTP), which are mainly controlled by the Evolved Packet System Session Management. The stack's upper part is transparent to the RAN stack, which can be either 3G (WCDMA/EV-DO) or 4G (LTE), and supports the following functions: control, scene description, media codecs (for video, still images, vector graphics, bitmap graphics, text, timed text, natural and synthetic audio, and voice) and the transport of media and control data.

Figure 5

The control-related elements are session establishment (invoking a PSS session), capability exchange (adaptation of the media stream to device capabilities) and session control (set-up and control of the individual media streams between a PSS client and one or several PSS servers). The user equipment (UE) is expected to have an active PDP context or another type of radio bearer (e.g. the Dedicated Bearer establishment/deactivation procedure for LTE/EPC mobile video streaming) that enables IP packet transmission at the start of session establishment signalling. The client may be able to ask for more information about the content and then initiate the provisioning of a bearer with appropriate QoS for the streaming media.

The scene description consists of the spatial layout and a description of the temporal relation (synchronization) between the different media (video, audio); it is included in the media presentation description (MPD), depending on the protocol used for transport. In the case of HTTP/IP transport, the MPD also provides an overview of elements and attributes (e.g. bit rate, resolution, quality ranking, codec) that describe the components and properties of the media. Transport of media and control data consists of the encapsulation of the coded media and control data in a transport protocol.
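To make the role of the MPD concrete, the sketch below models the kind of per-representation information it conveys. The field names and values are illustrative only and do not follow the normative 3GPP/MPEG-DASH XML schema.

```python
# A loose sketch of the per-representation information an MPD conveys
# (bit rate, resolution, codec, fragment length). Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Representation:
    representation_id: str
    bandwidth_bps: int          # average bit rate the client should budget for
    width: int
    height: int
    codec: str                  # e.g. an H.264 profile/level string

@dataclass
class MediaPresentationDescription:
    media_duration_s: float
    segment_duration_s: float   # each representation is cut into fragments of this length
    representations: list[Representation] = field(default_factory=list)

# Example: three quality levels of the same clip a DASH-style client can switch between.
example_mpd = MediaPresentationDescription(
    media_duration_s=120.0,
    segment_duration_s=4.0,
    representations=[
        Representation("low", 300_000, 320, 240, "avc1.42E01E"),
        Representation("mid", 700_000, 640, 360, "avc1.4D401F"),
        Representation("high", 1_500_000, 1280, 720, "avc1.64001F"),
    ],
)
```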

Initial streaming, until Release 9, uses push-based streaming based on a stateful protocol such as RTSP/RTP, which uses an RTP payload format [12]. Once a client connects to the streaming server, the server keeps track of the client's state until the client disconnects. In this case, frequent communication happens between the parties, with the server sending the media as a continuous stream of packets over UDP. The UDP transport is suited for real-time (live) video streaming (e.g. news), which from the user's point of view needs to be delay-free at the possible price of lower content quality. However, lately it has become clear that TCP's congestion control mechanism and reliability requirement, which were thought to drastically affect video streaming, do not necessarily hurt the performance of mobile video streaming. In addition, a well-designed video player adapting to large throughput variations can easily compensate for any possible glitches in TCP's congestion mechanism.

The need for higher user capacity and enhanced QoE for a rich set of video applications required the development of adaptive streaming, which optimizes and adapts the video configuration over time in order to deliver the best possible video quality to the user at any given time, taking into account changing link or network conditions, device capabilities, and content characteristics. Therefore, the video client holds the central role by carrying the intelligence that drives the optimization and adaptation of the delivered video stream. This scenario can be supported only by a stateless protocol which follows the pull-based streaming paradigm, rather than the RTSP push-based one. If an HTTP client requests some data, the server responds by sending the data and the transaction is terminated. Each HTTP request is handled as a completely standalone one-time transaction, freeing the network from heavy client-server signaling.

The 3GPP TCP-over-HTTP video streaming service specifies [13] progressive download (media starts playing while still downloading) and dynamic adaptive streaming over HTTP (DASH) as solutions for adaptive streaming. Although it supports a variety of today's mobile video streaming applications (e.g. YouTube), the progressive download solution has some weaknesses that may prevent it from being a definitive solution for mobile applications. One major issue affecting the mobile network case is that it is not bit-rate adaptive, because all clients receive the same encoding of the video, despite large variations in the underlying available bandwidth across different clients and across time for the same client. Frequent occurrences of re-buffering, which translate into significant perceived video quality degradation, occur due to the absence of video quality and/or bit rate adaptation, especially when the network is unable to support the fixed rate during low-throughput periods caused by unfavorable link conditions. In addition, progressive download does not support live video streaming (e.g. news, sports). Dynamic adaptive streaming addresses all of these issues; it is especially effective in tackling mobile bandwidth limitations, and it also allows for more intelligent video streaming that is device-aware and content-aware. The DASH client can dynamically select the video representation, ensuring continuous playback while also optimizing quality for a given link throughput. In this way, the client finds the best possible compromise between high video quality and minimal occurrences of re-buffering events.
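As an illustration of the adaptation loop described above, the sketch below shows a toy rate-selection heuristic: before each fragment request, the client picks the highest representation whose bit rate fits its recently measured throughput, with a safety margin and extra caution when the playback buffer is low. Real DASH players use more elaborate buffer- and throughput-aware logic; the thresholds here are invented for the example.

```python
# A toy DASH-style rate-selection heuristic. Illustrative only.

def select_bitrate(available_bps: list[int], measured_throughput_bps: float,
                   buffer_level_s: float, safety_margin: float = 0.8,
                   low_buffer_s: float = 5.0) -> int:
    """Return the bit rate (bps) to request for the next fragment."""
    budget = measured_throughput_bps * safety_margin
    if buffer_level_s < low_buffer_s:
        budget *= 0.5                        # be conservative when close to re-buffering
    candidates = [b for b in sorted(available_bps) if b <= budget]
    return candidates[-1] if candidates else min(available_bps)

if __name__ == "__main__":
    ladder = [300_000, 700_000, 1_500_000]   # bit rates advertised in the MPD (example values)
    print(select_bitrate(ladder, 1_200_000, buffer_level_s=12.0))  # healthy buffer: 700_000
    print(select_bitrate(ladder, 1_200_000, buffer_level_s=2.0))   # low buffer: 300_000
```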

More of today's applications, like Netflix and YouTube, are moving to DASH solutions. In addition, there is a strong industry push towards this approach [15], [16], [17], [18], [19], [20].

Regardless of the solution, evaluating mobile video streaming requires QoE metrics that decode the protocol stack, process IP- and TCP-level QoE-centric performance, and are then used to score and troubleshoot subscriber-perceived quality. Due to the significant role played by the client in an HTTP/TCP-based video streaming solution, it is also very important to be able to analyze the client's behavior and to correlate it with the network performance. In this way, an efficient bandwidth-QoE trade-off can be controlled and optimized.

3 Mobile Video Streaming Degradation

3.1 Customer Perceived Video Degradations and their Roots

The most common customer-perceived video degradations are blockiness, blurriness, colorfulness, freezing, blackout, jerkiness, and noise [2].

Blockiness is an artifact of small blocks on the image, generally due to processing less than the entire video stream during the encoding process. Because the details of each image cannot be accurately represented, single "mean values" are used to represent all pixels in a large area.

Blurriness is an artifact that shows objects with less sharp edges in the image. This effect can be caused by high-frequency attenuation during recording and/or encoding, or by content with extreme movement (e.g., sports or movies) sent under bandwidth constraints where sufficient bit rate is not available to encode the content action properly.

Colorfulness is perceived as the intensity of saturation of colors and their spread and distribution in the video image. This can be caused by either the recording or the encoding process.


Freezing (with skipping) occurs when the image freezes for a time and then reappears with lost content. This is typical when experiencing packet loss. If the lost frames are I frames, then the image will reappear only when the first uncorrupted I frame arrives. At the extreme, the video can run into blackout, where complete loss of signal occurs due to drastic network failure (a dropped connection), generally caused by bandwidth issues or by handover to areas with little or no coverage (mobile-case specific).

Freezing (without skipping) happens when the image reappears at the same point at which it was left. This re-buffering effect is caused by bandwidth limitations.

Jerkiness is perceived as a jerky image by the user; the image sequences appear as a series of "jumps" rather than a display of smooth motion. This could be due to a bandwidth issue (reduced frame rate, as low as 5 fps in mobile cases) or a network jitter effect (overflow or underflow of the buffer).

Noise typically occurs as visible "snow" over most of the image, generally caused by quantization processes during compression.

Therefore, there are three main sources of video degradation: the processing of the video signal, the network, and the device. Each of these sources comes into play with different dimensions and impacts customer-perceived quality depending on the video application and content. Therefore, cost-efficient mobile video service optimization relies on thorough evaluation techniques/methods that not only provide an estimator of the customer experience, but can also help diagnose its sources.

3.2 Network-Centric Video Degradation

The customer-perceived video degradations discussed in the previous section are mainly rooted in network performance. The QoS metrics known to affect the video quality are jitter, limited bandwidth and packet loss.

Jitter is a short-term variation in packet arrival time. It is typically due to network congestion, but buffers in player clients (including set-top boxes in the IPTV case) can compensate for the most frequently seen jitter values (5-20 ms). Larger delay variations, caused for example by server congestion, can cause problems due to play-out buffer starvation. In these scenarios, adaptive buffers are recommended (see section 4).

Limited bandwidth (or low available dedicated throughput) causes packets to be discarded, which leads to quality degradation.

Packet loss impact can differ dramatically depending on a combination of factors including the video content itself (texture, brightness, amount of motion, etc.), the compression standard used and the transmission protocol. Therefore, not all packets are equally important to video quality. Packet loss can be bursty, with periods of high loss, and it might occur due to network congestion, link failure, insufficient link bandwidth, or transmission errors. The type of quality degradation that occurs due to packet loss will depend on the protocol being used to carry the video (UDP or TCP, reliable UDP, or Forward Error Correction [FEC] schemes on UDP).

3.3 Device - Video Player Client-Based Video Degradations

As with any other mobile service, the device plays a key role in the perceived quality; the implemented codec, the video player client as well as the phone's characteristics (form factor, display size and resolution) are key. Depending on the video content and video frame structure, the codec's bit rates and compression scheme can have various levels of robustness to withstand error values and patterns and, consequently, will impact the quality of the video stream.

Although they are all designed to compensate for network degradation, various clients perform this task differently. Generally, the video client's characteristics are defined by the transport protocols supporting a particular mobile video service, such as RTSP/UDP or RTP/UDP for traditional push-based streaming, and HTTP/TCP for progressive download and adaptive streaming. Regardless of the client's characteristics, its ability to cope with various error patterns (e.g. bursty or random, out-of-order received packets), values and lengths can impact the video quality as perceived by users. The compensation is seen as re-buffering events, which, if too frequent or too long (generally longer than 1 second), can negatively impact the perception of the video quality.
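The sketch below is a simplified playout-buffer simulation showing how re-buffering events arise when the delivered bit rate falls behind the encoded bit rate. The trace values are invented for the example, and the one-second threshold for a noticeable stall follows the rule of thumb above.

```python
# A simplified playout-buffer simulation: the player drains one second of media
# per second and stalls whenever less than one second of media is buffered.

def simulate_rebuffering(download_kbps: list[float], media_kbps: float,
                         start_buffer_s: float = 1.0) -> list[float]:
    """Return the durations (seconds) of playback stalls over a per-second trace."""
    buffer_s, stalls, current_stall = start_buffer_s, [], 0.0
    for rate in download_kbps:
        buffer_s += rate / media_kbps          # seconds of media downloaded this second
        if buffer_s >= 1.0:
            buffer_s -= 1.0                    # one second of media played out
            if current_stall:
                stalls.append(current_stall)
                current_stall = 0.0
        else:
            current_stall += 1.0               # player is starved: a re-buffering second
    return stalls + ([current_stall] if current_stall else [])

if __name__ == "__main__":
    trace = [800, 200, 200, 200, 200, 200, 800, 800]   # kbps delivered each second (example)
    stalls = simulate_rebuffering(trace, media_kbps=700)
    noticeable = [s for s in stalls if s > 1.0]
    print(f"{len(stalls)} stall(s), {len(noticeable)} longer than 1 s: {stalls}")
```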

The client's behavior and performance have an even more significant impact on the video quality in scenarios of dynamic adaptive HTTP/TCP-based applications, due to the client's interaction with the TCP layer. In adaptive streaming, the server maintains multiple profiles of the same video, encoded with different bit rates and quality levels. The video object is partitioned into fragments (typically a few seconds long) and a player can then request different fragments at different encoding bit rates, depending on the underlying network conditions. Therefore, the video player client (e.g. Apple QuickTime X, Adobe Dynamic Streaming for Flash, Microsoft Smooth Streaming/Silverlight) is the one deciding on the bit rate requested for any fragment, and it controls the playback buffer size by dynamically adjusting the rate at which new fragments are requested. Although these features ensure improved scalability at the server side, if not well designed they can significantly affect the quality of the video stream, especially under dynamic network conditions with varying short-term available bandwidth (available bandwidth showing positive and negative spikes lasting only a few seconds). It has been shown that the video and audio streams behave completely differently in this scenario [19]. Therefore, each stream exhibits a different quality which, depending on the content, affects the overall QoE in different ways.

 

Therefore, there are several topics that should be evaluated in order to ensure that the client's performance behavior does not cause a misapplication of the network optimization process for mobile video streaming. Three main topics should be analyzed:


- The client's reaction (time to reach the maximum sustainable bit rate) to either persistent or short-term changes in the underlying network bandwidth availability. If the maximum sustainable bit rate is not reached quickly enough, then the available bandwidth is poorly used. On the one hand, performing large changes to quickly reach the maximum bit rate could be annoying to the viewer. On the other hand, reacting too late to short-term available bandwidth spikes can cause sudden drops in the playback buffer or unnecessary bit-rate reductions.

- The client's behavior, exhibited as rate-adaptation oscillations, in the presence of a competing client sharing the same resources. If these oscillations occur, they can be falsely interpreted as high traffic load and misleadingly used for bandwidth re-allocation when it is not really needed.

- The client's performance in the presence of live streaming, defined by its ability to sustain a short playback delay.

In addition to all these client performance aspects, it is important to understand and control the interaction between the client's adaptive mechanism, which runs as a nested feedback loop on top of TCP, and TCP's congestion control. Several studies have been performed [19] and there is still ongoing work to ensure TCP streaming that leads to satisfactory video performance.

Therefore, monitoring and evaluating TCP performance (e.g. window size, throughput) in correlation with the client's behavior helps highlight possibly meaningful degradations of the mobile video service as perceived by mobile subscribers. In addition, IP and RAN parameters, such as bandwidth allocation and multi-user scheduling, target QoS for the IP core and RAN modulation. These, in addition to coding schemes and time-frequency resource allocation, need to be evaluated in order to properly troubleshoot and optimize the video streaming service.

QoE parametric metrics capturing parameters at the session, transport, and IP network layers, along with RAN analysis (section 5), play an important role in delivering high-quality mobile video streaming with efficient bandwidth allocation.

3.4 Limitations of Using QoS/SLA as QoE Quantifiers

Thoroughly understanding the video quality behavior and how it impacts subscriber perception is essential for cost-efficient video service delivery, troubleshooting and optimization.

A subjective audio-video test [9] on a set of simulated network scenarios resulting in various QoS performance levels proved the need to go beyond QoS evaluation in order to understand QoE behavior and to cost-efficiently troubleshoot and optimize the mobile network. The subjective tests were performed on the audio-visual quality [10] of a set of video clips of about 30 seconds in length. Four original clips (sports, film, music, news) with QCIF (Quarter Common Intermediate Format) resolution were degraded with various network conditions, applied separately in order to accurately evaluate a single dimension of the degradation. The outputs of the tests are the total audio-visual mean opinion score (MOS) values for each of the degraded clips.

The tests showed that regardless of the video content (film, music, news, and sports), encoding QCIF resolution above 200 kb/s does not provide any significant perceived video quality increase. It has been shown [24] that the optimal encoding bit rate threshold is higher for higher resolutions (e.g. QVGA, HVGA), up to 500 kb/s. A quality comparison between an un-encoded movie video clip (movie-high-bitrate) and its 256 kb/s encoded version can be viewed here: movie-256kpps.

The test also proved that, for the same video content, packet loss values higher than about 10% compress the video quality within a 0.5 MOS range below 2 MOS over ten times the range of throughput. The QoE is significantly sensitive to throughput variations below roughly 5% packet loss; above this threshold, the perceived degradation becomes so severe that human perception cannot differentiate anymore. A video clip encoded at 128 kb/s ("Automatic_128-0-PL") and its respective 10% packet loss version can be reviewed here.

One can conclude, therefore, that optimization techniques and error concealment schemes should aim to maintain packet losses below 5% in order to efficiently use the available bandwidth.

The impact of the re-buffering and initial buffering effects has also been studied (Figure 6).

Figure 6

A re-buffering event is triggered in the video player client whenever jitter or packets received out of order need to be compensated for. It has been found that the audio-video MOS drops by more than 0.5 MOS, or a full 10% of the MOS scale, if more than one re-buffering event occurs. In comparison to re-buffering, the dependency on the initial buffering (time elapsed before the actual video starts playing out) showed a milder QoE impact, and that milder impact was generally noticed only for higher throughput. It has been found that for an initial buffering time ranging from 1 second to 14 seconds, the MOS scores decrease by 0.1 MOS to 0.5 MOS for double the increase in throughput. In addition, the audio-visual MOS drops by 40% of the MOS scale at half the throughput with one re-buffering event. More importantly, regardless of the throughput, if the re-buffering percentage goes above roughly 15%, then the perceived quality reaches an asymptotic low performance level, meaning that no significant further degradation is perceived beyond this threshold.
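The thresholds reported above can be collected into a simple set of flags for automated screening of test results. The helper below is a toy illustration that only encodes those reported thresholds; it is not a validated QoE model and deliberately produces no MOS value.

```python
# A toy helper that turns the thresholds reported above into screening flags.
# It is not a validated QoE model.

def flag_qoe_risks(packet_loss_pct: float, rebuffer_events: int,
                   rebuffer_ratio_pct: float) -> list[str]:
    flags = []
    if packet_loss_pct > 10.0:
        flags.append("packet loss >10%: perceived quality saturated at a low level")
    elif packet_loss_pct > 5.0:
        flags.append("packet loss >5%: beyond the throughput-sensitive region")
    if rebuffer_events > 1:
        flags.append("more than one re-buffering event: expect a drop of >0.5 MOS")
    if rebuffer_ratio_pct > 15.0:
        flags.append("re-buffering ratio >15%: quality at its asymptotic floor")
    return flags

if __name__ == "__main__":
    print(flag_qoe_risks(packet_loss_pct=6.0, rebuffer_events=2, rebuffer_ratio_pct=4.0))
```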

The results presented above prove that using just the QoS/SLA throughput as a quantifier of the QoE is not enough to accurately evaluate and understand how users perceive the video streaming quality. In addition, when it comes to the QoS/SLA throughput, it is very important to understand that, due to human perception, the higher the throughput, the more sensitive the perceived video quality becomes to errors. This makes the role of all the other factors (e.g. video content and resolution, optimal encoding bit rates, packet loss rates, re-buffering effects caused by delays on the transmission path), as well as their interactions and interdependencies, equally important in ensuring a high QoE.

Therefore, the optimal audio-video QoE requires not only high QoS/SLA throughput, but also an optimal bandwidth in correlation with network performance, compression bit rates and media content. Last, but not least, using the right quantifier for an accurate evaluation of the QoE is also significantly important for cost-efficiently ensuring a high-quality video streaming service. The recommended QoE evaluation metrics for mobile video services are discussed in the next section.

4 Customer Experience Metrics for the Evaluation of Mobile Video Streaming Integrity

The significant increase in the volume of video content processed and transmitted over mobile networks, in different forms (e.g. broadcasting, streaming) and services (e.g. streaming, Multimedia Broadcast Multicast Service MBMS, video sharing, video on demand, video telephony), has imposed on service providers, operators, and infrastructure vendors the need for real-time video quality assessment metrics that can accurately estimate end-user perception.

Several types of metrics have been developed through the years for TV broadcast, mobile video services as well as IPTV (IP Broadcast, Internet TV, Video on Demand) [3].

These metrics are algorithms developed, trained, and tested against subjective video clip databases scored by real users in ITU-T standardized subjective tests [10]. The objective algorithms provide an objective estimator of the end-user perception of video quality. The algorithms are classified as perceptual (based on the video signal), parametric (using network parameters), or hybrid (based on both the video signal and network parameters).

The perceptual algorithms process the video signal based on human visual and cognition models, and can work with full, reduced or no access to the original (or reference) video samples. The algorithms that have full or reduced access to the reference are called full-reference [6] and reduced-reference [7] metrics, respectively, and they are intrusive metrics, which require that test stimuli be sent over the network. In the latter case, the algorithm is called non-reference [8], and it is a non-intrusive solution. All three solutions [5], [6], [7] provide only the video quality score and video-signal-centric performance metrics (e.g. blurriness, blockiness). The use of the human visual and cognitive model, as well as full access to the reference signal, ensures high accuracy for the full-reference model, making it suitable for end-to-end user perception evaluation. Although they have the same algorithmic complexity as the full-reference solution, the non-reference and reduced-reference metrics are less accurate. For scenarios where the video has been strongly distorted, these metrics are expected to exhibit accuracy comparable to, or even lower than, the parametric models.

Currently, ongoing standardization work is focused on perceptual-based models for audio and audio-visual quality [3].

The parametric models use transport and IP network parameters, as well as codec/player/client information, to provide an estimation of the subscriber's perceived QoE. Unlike the currently available perceptual solutions, these algorithms give all three MOS scores (audio, video and audio-video), providing a complete picture of the video service quality as perceived by users. Being based on both the network and the device's video player client parameters, these algorithms allow a direct network diagnosis, ensuring both easy detection and easy ruling out of non-network-based degradations (depending on the cause of the perceived video quality degradation). In addition, the parametric solutions do not require test stimuli, so they are non-intrusive [4], [5] and thus perfectly suitable for testing video streaming services (e.g. YouTube) as consumed by real users, with no additional test set-up requirements. Testing without test stimuli becomes crucial for the evaluation of streaming services which are delivered on highly loaded networks under bandwidth constraints. The algorithms' elegant architecture allows for fast processing and straightforward implementation, suitable for real-time testing and serving as an excellent proxy for real subscriber experience, such as with drive test tools and/or solutions that measure QoE directly on mobile broadband devices. Unlike perceptual metrics, the parametric solution allows high time and space granularity (e.g. every 1 second), which has a twofold advantage. First, it is suitable for drive testing and on-device measurements when channel conditions are fast and non-stationary, and per-second scores help illustrate where and when the quality started to deteriorate. Secondly, it provides much earlier notice of video degradation than a perceptual metric.


All these features make parametric solutions suitable not only for video service quality monitoring and troubleshooting, but also for resource allocation and/or SLA management. Being parameter-based (not media-signal-based), the parametric algorithms have accuracy limitations, especially in scenarios where video degradations emerge from elements that the algorithm does not compensate for, like the type of video terminal or display being used by the subscriber. Even so, parametric algorithms do show very high correlation to MOS gradients, so declines and increases in perceived QoE are accurately captured and fully adequate for a proper characterization of QoE. Thus parametric-based algorithms will accurately highlight instances of degraded QoE and, therefore, where network optimization engineers should focus and prioritize troubleshooting resources.

In the last few years, video experts [3] decided to develop a new type of metric that has access to the distorted video signal as well as to the IP bit stream. This metric is called hybrid, and work on its development is ongoing. The benefit of this type of metric is expected to be an optimal compromise between accuracy, complexity, processing time and network troubleshooting capabilities.

However, until the hybrid work is finalized, tested and validated, an optimal assessment of the quality of the rapidly growing mobile video streaming service is needed now. The optimal solution has to meet requirements related to operators' goals to cost-efficiently deliver, troubleshoot and optimize a high-quality video streaming service. Therefore, operators need to use a mobile video quality assessment that provides not only a highly accurate MOS estimate, but also the means for troubleshooting and diagnosis, as well as quick correlation of the MOS score with measurements performed by methods that reflect transport/IP network characteristics representing possible root causes of the video distortions. Based on the discussion above, it can be concluded that an optimal solution embeds both parametric (e.g. ITU-T P.1201/P.1202-like, [4], [5]) and perceptual (e.g. PEVQ, [6]) metrics, as presented in Figure 7 below.


Figure 7

Although it only provides the video score, the perceptual metric is very accurate and transparent to the protocols and clients supporting the evaluated mobile video service. In addition, it can be used for in-depth troubleshooting of the most sensitive dimension of the media, the video component, in correlation with the parametric methods. Due to the video signal's complexity and the human perception of video, the video component is more prone to network degradations than the audio component. While the perceptual solution is the unique ITU-T standard [6], in the case of the parametric solution there are both newly standardized solutions [4], [5] as well as proprietary ones like Telchemy's VQmon. In order to evaluate a suitably robust parametric solution, it is very important to understand whether it copes with the protocols (e.g. HTTP/TCP), clients (e.g. DASH) and codecs used by the majority of today's mobile video streaming services. In addition, as described earlier in section 2, the video content has a significant impact on the subscriber-perceived quality. Therefore, it is very valuable to use a parametric solution that at least uses limited content information, such as the time and space variability between consecutive frames, as is the case with Telchemy's VQmon parametric solution.

4.1 Perceptual Mobile Video Quality QoE

The perceptual intrusive PEVQ algorithm provides a highly accurate mean opinion score (MOS) estimate for the end-to-end link of a large range of video services, such as streaming, mobile IP TV, and video conferencing/telephony.


PEVQ is a full-reference, end-to-end measurement based on signal analysis. The degraded video signal output from a network is analyzed by comparing it with the undistorted original reference video signal on a perceptual basis. PEVQ detects anomalies in the video signal based on the human visual system model, and quantifies them according to a multitude of key video performance indicators.

The algorithm is built from four separate blocks. The first block represents the pre-processing phase and performs the spatial and temporal alignment of the reference signal and the impaired signal. The second block calculates the perceptual difference of the aligned signals ("perceptual" means that only differences that can be perceived by a human viewer are considered in the calculations). The third block classifies the previously calculated indicators and detects certain types of distortions. In the fourth block, all the indicators appropriate to the detected distortions are aggregated, forming the final result: the estimate of the MOS.
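For orientation, the sketch below mirrors the four-stage structure just described. The stage bodies are empty placeholders and do not reproduce the actual PEVQ processing.

```python
# A structural sketch of the four-stage pipeline described above; placeholder only.
from typing import Any

def align(reference: Any, degraded: Any) -> tuple:
    """Stage 1: spatial and temporal alignment of reference and impaired signals."""
    return reference, degraded                       # placeholder

def perceptual_difference(ref: Any, deg: Any) -> dict:
    """Stage 2: keep only differences a human viewer would perceive."""
    return {"luma_diff": 0.0, "temporal_diff": 0.0}  # placeholder indicators

def classify_distortions(indicators: dict) -> dict:
    """Stage 3: classify indicators and detect distortion types (e.g. blur, blocking)."""
    return indicators                                # placeholder

def aggregate(classified: dict) -> float:
    """Stage 4: aggregate the relevant indicators into the final MOS estimate."""
    return 5.0 - sum(classified.values())            # placeholder mapping to a 1..5 scale

def pevq_like_score(reference: Any, degraded: Any) -> float:
    ref, deg = align(reference, degraded)
    return aggregate(classify_distortions(perceptual_difference(ref, deg)))
```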

In addition to MOS scores, the perceptual PEVQ, designed for highly accurate end-to-end quality evaluation, provides a set of end-user-centric (QoE) video quality factors including jerkiness, blurriness, blockiness, brightness, and effective frame rate. These QoE factors can represent an important starting point in understanding where and why video degradation occurs. Then, correlation with the network-centric QoS metrics (like packet loss and jitter) and with the client information and IP parameters provided by a parametric solution during the same mobile video session can be used for network diagnosis based on user perception of the video quality.

4.2 A Parametric Evaluation Approach

As discussed, parametric methods for the evaluation of video QoE afford high space and time granularity, fast processing, and testing of real-life mobile video applications (no test stimuli required), which ensures the evaluation reflects the service not only as perceived, but also as consumed by subscribers.

The three MOS scores (video, audio, and audio-video/multimedia) allow straightforward troubleshooting and root cause analysis as perceived by users. Poor video quality on a sports clip will affect subscriber perception more than poor audio on the same clip, while the reverse is true of a news clip.

In addition, correlating each of these separate scores with the throughput at different layers of the network allows an operator to determine a judicious bandwidth allocation per user and per service and can, therefore, provide meaningful input for capacity management.

The quality prediction for the mobile video service takes buffering with and without skipping into account, as well as corruption duration, which is correlated with the GOP and the slicing of the video. Degradation from bit and frame rate variations and from packet loss values and patterns is also used to predict scores for both QVGA and QCIF formats combined with various codecs/players such as H.264, H.263, REAL, and MPEG-4. Similarly, the audio quality is estimated for a variety of audio codecs (e.g. AMR-NB, AMR-WB+, AAC-LC, HE-AAC, MPEG-2/4, MPEG-1 Layer 2) and bit rates. All these formats and codecs/players can be used in various mobile video services. As already mentioned, parametric solutions using content information, such as Telchemy's VQmon, are a preferred solution for mobile video services.

A high-level description of the VQmon solution is given by the function below:

QI (MOSa, MOSv, MOSa-v) = F(PLR/BLER, Buffering Type and Characteristics, Client State, Client Parameters (Player/Codec), Error Concealment Technique, GOP & Slice Structure, Video/Audio Bit Rate/Bandwidth, Video Format/Frame Rate, Content Indicators)

Where: QI (MOSa, MOSv, MOSa-v) is the quality index, which represents the MOS estimate determined by VQmon.

F is a non-linear function with an analytical expression and coefficients determined using high-order optimization techniques applied to databases of subjective MOS and corresponding parametric output scores.

Client State means one of the following: initially waiting for data, reproducing the stream, waiting for data after draining the jitter buffer, or frozen due to bad data.

Buffering Type and Characteristics and Error Concealment Technique comprise the buffering with or without skipping and, respectively, the corruption duration.

GOP and Slice Structure helps increase the algorithm's accuracy with respect to the video content by weighting quality degradation more heavily when I frames, rather than P or B frames, are corrupted. Content Indicators are represented by the time and space variability between consecutive frames, which gives the amount of motion embedded in the clip. High amounts of motion weight the degraded evaluated parameters more heavily, while low amounts of motion leave degraded parameters unweighted.

Since the MOS estimate is based on a series of parameters, all these parameters are available as outputs of the Telchemy VQmon parametric solution, and they can be used for a straightforward network diagnosis and root cause analysis. An example is presented in section 5.2.
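To illustrate the shape of such a function, the sketch below folds a few of the inputs listed above into placeholder MOS estimates. The combining logic and weights are invented for the example; they are not Telchemy's VQmon nor ITU-T P.1201/P.1202.

```python
# A sketch of the *shape* of a parametric scoring function; weights are invented.
from dataclasses import dataclass

@dataclass
class StreamParameters:
    packet_loss_ratio: float        # PLR/BLER seen by the client (0..1)
    rebuffer_events: int            # buffering with/without skipping characteristics
    client_state: str               # "waiting", "playing", "starved", "frozen"
    video_bitrate_kbps: float
    frame_rate_fps: float
    motion_indicator: float         # 0..1 time/space variability between frames

def parametric_mos(p: StreamParameters) -> dict:
    """Return placeholder MOS_a, MOS_v and MOS_av estimates on a 1..5 scale."""
    base_v = 1.0 + 4.0 * min(p.video_bitrate_kbps / 700.0, 1.0)    # assumed saturation point
    loss_penalty = 8.0 * p.packet_loss_ratio * (0.5 + p.motion_indicator)
    buffer_penalty = 0.5 * p.rebuffer_events + (1.0 if p.client_state == "frozen" else 0.0)
    mos_v = max(1.0, min(5.0, base_v - loss_penalty - buffer_penalty))
    mos_a = max(1.0, 4.5 - 4.0 * p.packet_loss_ratio)              # audio less motion-sensitive
    return {"MOS_a": mos_a, "MOS_v": mos_v, "MOS_av": round(0.4 * mos_a + 0.6 * mos_v, 2)}
```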

5 Evaluation of the Quality of Experience for Video Streaming Services

5.1 Mobile Video Streaming Services: KPI, QoS and QoE: relationship, dependencies and evaluation


The evaluation of the mobile video streaming service quality as perceived by users requires not only the assessment of the video quality (QoE integrity), but also of the accessibility of the video service (QoE accessibility) and the streaming session's retainability (QoE retainability). As can be seen in Figure 8, the three QoE dimensions are impacted by one or more of the streaming service's phases (player session, video service access, video download and display [21], [22]), and correspondingly by each phase's overall QoS. The individual phases' overall QoS is defined by a series of KPIs, as depicted in Figure 8.


Figure 8

Figure 8 provides a detailed mapping of the streaming phases' overall QoS metrics to the three-dimensional (accessibility, integrity, retainability) QoE metrics (Figure 8, northbound), as well as the main KPIs contributing to each of these QoS metrics (Figure 8, southbound). As the pyramid in Figure 8 shows, the higher up the hierarchy - that is, the closer to the subscriber perception (QoE) - the smaller the number of metrics defining the service performance. Therefore, a cost-efficient evaluation of the mobile video streaming service involves a top-down analysis approach.

The video player session's overall QoS contributes to the QoE accessibility of the video streaming service (Figure 8) and is defined by the following KPIs: player service IP accessibility and player download data transfer rate ([21], [22]). The player service IP accessibility is defined by the DNS resolution time and by the HTML context set-up. Although not related to the video streaming performance itself, the DNS resolution is subjectively perceived as part of the time to get access to the service. The HTML context set-up refers to the HTML context and the player configuration download request. The player download consists of the HTML context information and the player configuration script download. The download data transfer rate depends on TCP configuration values, such as window size, and/or on TCP performance, such as the TCP congestion control mechanism's KPIs.
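As an illustration of how these player-session KPIs could be derived from raw test events, the short sketch below aggregates them from captured timestamps; the trace fields, the rate approximation and the function names are assumptions, not a standardized definition.

```python
from dataclasses import dataclass

@dataclass
class PlayerSessionTrace:
    """Illustrative timestamps (seconds) and byte counts from one test session."""
    dns_request_t: float
    dns_response_t: float
    html_context_request_t: float
    player_config_received_t: float
    player_download_bytes: int

def player_session_kpis(trace: PlayerSessionTrace) -> dict:
    """Derive the player-session KPIs named in the text from raw trace events."""
    dns_resolution_time = trace.dns_response_t - trace.dns_request_t
    html_context_setup_time = trace.player_config_received_t - trace.html_context_request_t
    # The perceived "time to get access to the service" spans both contributions.
    access_time = dns_resolution_time + html_context_setup_time
    # Approximate the player download rate over the HTML context / config download window.
    download_rate_kbps = (trace.player_download_bytes * 8 / 1000.0) / max(html_context_setup_time, 1e-6)
    return {
        "dns_resolution_time_s": dns_resolution_time,
        "html_context_setup_time_s": html_context_setup_time,
        "player_access_time_s": access_time,
        "player_download_rate_kbps": download_rate_kbps,
    }

print(player_session_kpis(PlayerSessionTrace(0.00, 0.04, 0.05, 0.85, 120_000)))
```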

The video download / display overall QoS defines the QoE integrity of the video streaming service and is characterized by a series of KPIs (Figure 8, right) which are media-centric (e.g. video/audio resolution, number of interruptions and skips, and their duration) and network/client-centric, the latter being mainly reflected in the client buffer's behavior, such as the number of re-buffering events, the re-buffering time, and the re-buffering failure ratio. Since the buffer is expected to compensate for some of the possible video-related degradations, any degradations that do occur will also be reflected in the buffering behavior. For example, perceived video interruptions (freezing without skipping) and/or video skips (freezing with skipping) are identified by re-buffering events occurring during the video service - a phenomenon caused by long delays and/or packet losses, possibly rooted in either IP congestion or poor RAN performance (e.g. lack of capacity resources, coverage, or interference). However, both media and buffer KPIs need to be evaluated to assess and troubleshoot network and service performance. While the client buffer KPIs translate into an overall network-centric degradation caused by various possible factors (such as IP traffic load / bandwidth limitations or poor RAN performance), the media KPIs can further help to identify the likely cause of the degradation. In the given example, the presence of video skips indicates packet loss, while interruptions more likely point to longer delays. In addition, the two video-centric KPIs - skips and interruptions - also provide an indicator of the level of impact on QoE. As expected, video skips, which represent information loss, negatively impact the QoE more than video interruptions, which indicate that information has been delayed rather than lost.


In addition, media-centric KPIs are used to estimate the perceived video quality, while client-buffer-related KPIs can also provide additional information on bandwidth availability and allocation. The client buffer's KPIs also express the impact of the buffer's characteristics and performance on the video service quality.

The QoE video streaming retainability can be defined by two components. One is the streaming reproduction failure during the video service's access phase, when the video fails to start playing, possibly due to TCP performance. If the video does not fail to start playing but is merely delayed, this is perceived as initial buffering and impacts the QoE integrity dimension, being reflected in the MOS score of the video stream's quality. The other component of QoE retainability has its roots in the video download (display) phase and is defined by the streaming reproduction cut-off ratio. This scenario can occur due to TCP congestion and/or lower-layer performance, such as IP and RAN resource allocation, or physical-layer performance (e.g. coverage or interference). Therefore, for both QoE accessibility and retainability, it is recommended to perform a TCP analysis first, before going into lower-layer details.

Similar to the voice service, the QoE accessibility and retainability of video streaming are generally expressed as success ratio percentages, rather than as a direct MOS score equivalent. The QoE integrity is expressed by the MOS estimate of the perceived video (media) quality, based on either a perceptual metric (e.g. PEVQ, section 4.1) directly using the video signal, or a parametric solution (e.g. Telchemy's VQmon) using the IP stream's parameters.
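A minimal sketch of this convention, assuming a hypothetical per-session result record: accessibility and retainability are aggregated as percentages, while integrity is summarized from the per-session MOS estimates.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class SessionResult:
    access_succeeded: bool        # the stream started playing
    completed_without_cutoff: bool
    video_mos_estimate: float     # e.g. PEVQ or parametric (VQmon-style) output

def qoe_summary(sessions: list[SessionResult]) -> dict:
    """Accessibility/retainability as success ratios, integrity as a mean MOS."""
    accessed = [s for s in sessions if s.access_succeeded]
    return {
        "qoe_accessibility_pct": 100.0 * len(accessed) / len(sessions),
        "qoe_retainability_pct": 100.0 * sum(s.completed_without_cutoff for s in accessed) / max(len(accessed), 1),
        "qoe_integrity_mos": mean(s.video_mos_estimate for s in accessed) if accessed else None,
    }

print(qoe_summary([
    SessionResult(True, True, 3.8),
    SessionResult(True, False, 2.4),
    SessionResult(False, False, 0.0),
]))
```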

The QoS of each of the video streaming phases is determined based on the contributing KPIs. These KPIs (Figure 8), which contribute to the QoS and in turn the QoE of the video streaming service, can be determined by capturing and decoding the packet stream, by collecting various transport-related events, or by deriving them from the client's report of the video streaming session's QoS, which is directly related to QoE ([13], [20]). However, the latter case applies only to HTTP/TCP-based video streaming - progressive download as well as adaptive streaming - which appears to be the near future for all mobile video streaming services (e.g. YouTube, Netflix), due to its bandwidth efficiency and the opportunity to deliver higher video quality, as discussed in the previous sections.

These client-reported QoS metrics are the HTTP request/response transitions and the initial play-out delay, used for estimating QoE accessibility, and the average throughput observed at the client, contributing to QoE integrity. In the case of adaptive streaming, the Media Presentation Description (MPD) can be used to determine media characteristics (e.g. bit rate, resolution, quality ranking, codec). In addition, the client-reported Representation Switch Events metric can be used to test and evaluate the VCR-like feature (section 2.2), which enhances the user experience in the mobile environment enabled by video over IMS. The Representation Switch Events metric contributes to the QoE integrity dimension.
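The sketch below shows one way such client reports could be mapped onto the QoE dimensions; the report structure, threshold and switch-counting logic are illustrative assumptions and do not reproduce the 3GPP TS 26.247 reporting schema.

```python
from dataclasses import dataclass, field

@dataclass
class ClientQoeReport:
    """Hypothetical subset of client-reported metrics for one adaptive-streaming session."""
    initial_playout_delay_s: float
    avg_client_throughput_kbps: float
    representation_switches: list = field(default_factory=list)  # (time_s, new_bitrate_kbps)

def classify_report(report: ClientQoeReport) -> dict:
    """Map client-reported metrics onto the QoE dimensions discussed in the text."""
    # Accessibility: a long initial play-out delay is perceived as poor service access.
    accessibility_ok = report.initial_playout_delay_s < 5.0  # illustrative threshold

    # Integrity: frequent downward representation switches suggest bandwidth pressure.
    downswitches = sum(
        1 for (_, before), (_, after) in zip(report.representation_switches,
                                             report.representation_switches[1:])
        if after < before
    )
    return {
        "accessibility_ok": accessibility_ok,
        "avg_client_throughput_kbps": report.avg_client_throughput_kbps,
        "representation_downswitches": downswitches,
    }

report = ClientQoeReport(2.1, 1800.0, [(0.0, 2000), (12.0, 1200), (40.0, 2000)])
print(classify_report(report))
```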


5.2 An Example of Mobile Video Streaming Service QoE Integrity Evaluation and Analysis

The evaluation and understanding of the QoE integrity behavior of the mobile video streaming service, with the aim of achieving the best QoE value, requires the following main steps:

• Evaluate the QoE based on the appropriate QoE evaluation metric, using either a perceptual and/or a parametric evaluation metric suited to the HTTP/TCP streaming application under test. While the perceptual metric is transparent to the protocol, the parametric solution must be able to decode, and be trained for, HTTP/TCP; Telchemy's VQmon is such a solution.

• Draw correlations and mappings from the contributing KPIs and QoS to the QoE integrity performance, based on the relationships and dependencies described in section 5.1.

• Use the defined mappings and correlations to build troubleshooting/diagnosis flowcharts.

• Analyze the bandwidth availability, allocation and usage.

• Identify client and server performance.

The example considers a YouTube session (adaptive-streaming-based video) in a drive test scenario using Ascom's TEMS™ Investigation. The test uses a device-based video client in order to accurately emulate the subscriber experience. The scope is to evaluate and analyze the QoE integrity of the streaming session. During the session, audio-video parameters as well as TCP, IP and RAN parameters are collected. A parametric audio-video metric (Telchemy VQmon) is used to decode the packet stream and to calculate and report parameters at the media application level (including client-buffer-centric KPIs) and at the HTTP/TCP and IP levels. The drive test tool decodes the RAN layers and calculates the related KPIs. Therefore, during the video streaming session the entire protocol stack (Figure 5) is evaluated. The Telchemy VQmon algorithm is also used to provide estimated MOS values for the audio, video and audio-video QoE. The reporting, evaluation and analysis can be performed during the post-processing phase, when troubleshooting/diagnosis flowcharts can be built in TEMS™ Discovery, Ascom's post-processing analysis and reporting software.

The VQmon estimation results show the following QoE integrity values:

MOSa = 3.0, MOSv = 2.4, MOSa-v = 2.9

These results indicate that the video stream has been affected, while the audio-video component still exhibits good performance. This can be explained by the presence of news-type content. Although poorly viewed, the audio has been well perceived, leaving users satisfied with the overall audio-video quality. The reason is that, in the case of a news clip, the audio component is the one carrying the rich QoE message information. Therefore, good quality of this component determines an overall good audio-video QoE.


A sports clip, in contrast, has the video component as its rich QoE message information component. Therefore, if a YouTube sports clip had been downloaded under the same network conditions, the video component would have impacted the overall audio-video QoE more significantly.

A consistently good video streaming service QoE requires both the audio and the video component to show comparably good quality. Therefore, the QoE of the video streaming service needs to be troubleshot whenever one of the components is affected, regardless of the overall audio-video QoE. The presented example discusses the video component, since this showed the poorer performance.
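The content dependence described above can be illustrated with a toy weighting of the two component scores; the weights and the linear combination are purely illustrative assumptions and not the model that produced the MOS values in this example.

```python
def combined_av_mos(mos_audio: float, mos_video: float, content_type: str) -> float:
    """Toy content-aware combination of audio and video MOS estimates.

    The weights are illustrative only: for news-like content the audio carries
    most of the message, whereas for sports-like content the video does.
    """
    weights = {
        "news":    (0.7, 0.3),   # (audio weight, video weight)
        "sports":  (0.3, 0.7),
        "generic": (0.5, 0.5),
    }
    wa, wv = weights.get(content_type, weights["generic"])
    return wa * mos_audio + wv * mos_video

# With the example scores, news-type content stays close to the audio score,
# while the same scores on a sports clip would be dragged down by the video.
print(combined_av_mos(3.0, 2.4, "news"))    # ~2.8
print(combined_av_mos(3.0, 2.4, "sports"))  # ~2.6
```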

Although not exhaustive, guidance for an at-a-glance troubleshooting of the scenario under discussion is presented in Figure 9. The KPIs used to identify the most likely main cause of degradation are listed, together with a high-level analysis, and further recommended investigations are also presented. The KPIs are reported in the post-processing tool (TEMS Discovery), and an automated analysis can be performed by creating post-processing scripts based on troubleshooting/diagnosis flowcharts and the KPI-QoS-QoE mappings discussed in section 5.1.

It can be seen that a top-down approach is taken and, more importantly, that media-centric KPIs are recommended to be analyzed first for a more efficient troubleshooting process. Then, bandwidth usage indicators as well as client-buffer-centric KPIs need to be considered in order to understand the network behavior.
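The top-down order recommended here (MOS estimate first, then media-centric KPIs, then bandwidth and client-buffer indicators) can be sketched as a simple diagnosis helper; the thresholds, field names and messages are illustrative assumptions and do not reproduce the Figure 9 flowchart.

```python
from dataclasses import dataclass

@dataclass
class IntegrityKpis:
    """Illustrative per-session KPIs collected during post-processing."""
    video_mos: float
    skip_count: int            # freezing with skipping (information lost)
    interruption_count: int    # freezing without skipping (information delayed)
    rebuffering_time_s: float
    avg_throughput_kbps: float
    video_bitrate_kbps: float

def diagnose(kpis: IntegrityKpis, mos_target: float = 3.5) -> list[str]:
    """Top-down analysis: start from the MOS estimate, then media KPIs, then bandwidth."""
    findings = []
    if kpis.video_mos >= mos_target:
        return ["QoE integrity within target; no troubleshooting needed."]

    # Media-centric KPIs first: skips point to packet loss, interruptions to long delays.
    if kpis.skip_count > 0:
        findings.append("Video skips observed: suspect packet loss (check IP congestion / RAN quality).")
    if kpis.interruption_count > 0 or kpis.rebuffering_time_s > 0:
        findings.append("Interruptions/re-buffering observed: suspect long delays (check TCP behaviour first).")

    # Then bandwidth usage: compare the delivered throughput with the media bit rate.
    if kpis.avg_throughput_kbps < kpis.video_bitrate_kbps:
        findings.append("Client throughput below media bit rate: check bandwidth allocation / capacity.")

    return findings or ["Low MOS without an obvious media/bandwidth cause: inspect encoding parameters."]

print(diagnose(IntegrityKpis(2.4, 0, 2, 6.5, 900.0, 1200.0)))
```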


Figure 9

6 Conclusions

The predicted exponential increase in mobile video streaming consumption comes not only with an urgent need for bandwidth-efficient service delivery, but also with the mobile subscriber's expectation of a highly reliable, high-quality and low-latency multimedia experience. HTTP/TCP transport using advanced TCP congestion mechanisms is the most appropriate for adapting bandwidth consumption to the video service's requirements and the network's performance. Adaptive video streaming clients implemented on devices are critical for a high-quality, bandwidth-efficient delivery of the mobile video service. Therefore, device-based video clients, along with network and video content, resolution, frame size, and encoding characteristics (bit rate, error concealment schemes), represent the key components of the video quality perceived by users. The interaction of these components, as well as the human dimension of perceiving the video experience in the mobile environment, makes QoS/SLA metrics (e.g. throughput) poor and even misleading QoE quantifiers. Therefore, it is necessary to use perceptual and/or parametric video quality algorithms to assess the QoE of the video streaming service. Parametric solutions are more suitable for the mobile environment for several reasons, among which are the following: minimal complexity and test intrusiveness, testing like a real user (real-life video service, on-device clients, no test stimuli), providing all three MOS scores (audio, video and audio-video), and, crucially, also providing the attendant measured network information element parameters used for determining the MOS score. Access to these information elements is essential for troubleshooting the wireless network.

The optimization and troubleshooting of the mobile video streaming service requires an understanding of how the various video streaming phases' KPIs and overall QoS map to the three dimensions of QoE: retainability, accessibility, and integrity. Top-down analysis, from QoE down to the underlying KPIs, together with correlations between client-based, network-layer-based and video-characteristics-based KPIs/QoS, allows bandwidth-efficient, high-video-quality troubleshooting.

7 References

[1] Cisco VNI Mobile, 2011

[2] S. Winkler, "Digital Video Quality", John Wiley & Sons, Inc., 2005

[3] Video Quality Experts Group (VQEG), www.its.bldrdoc.gov/vqeg

[4] ITU-T P.1201, "IP header based parametric video quality evaluation model for low and high bit rates", consented September 2012

[5] ITU-T P.1202, "Bit stream based parametric video quality evaluation model for low and high bit rates", consented September 2012


[6] ITU-T J.247, "Objective perceptual multimedia video quality measurement in the presence of a full reference (PEVQ)", August 2008

[7] ITU-T J.246, "Objective perceptual multimedia video quality measurement in the presence of a reduced reference", August 2008

[8] VQEG, Multimedia Models - Test Results, January 2009

[9] "Video Streaming Subjective Testing", Ericsson Research Internal Report, December 2008

[10] ITU-T P.911, "Subjective testing of audio-visual quality for multimedia services", 1998

[11] 3GPP TS 26.233, "Transparent End to End Packet Switched Streaming Service (PSS) - General description", Rel. 11, September 2012

[12] 3GPP TS 26.234, "Transparent End to End Packet Switched Streaming Service (PSS) - Protocols and Codecs", Rel. 11, March 2012

[13] 3GPP TS 26.247, "Transparent End to End Packet Switched Streaming Service (PSS) - Progressive Download and Dynamic Adaptive Streaming over HTTP", Rel. 11, June 2012

[14] 3GPP TS 26.237, "IP Multimedia Subsystem (IMS) based Packet Switch Streaming (PSS) and Multimedia Broadcast/Multicast Service (MBMS) User Service; Protocols", June 2012

[15] T. Stockhammer, "Dynamic Adaptive Streaming over HTTP - Design Principles and Standards", Qualcomm, 2011

[16] Adobe, "HTTP Dynamic Streaming on the Adobe Flash Platform", Adobe Systems Inc., 2010

[17] A. Zambelli, "HTTP Smooth Streaming technical overview", Microsoft Corporation, 2009

[18] Apple, "HTTP Live Streaming (HLS)", IETF Internet-Draft draft-pantos-http-live-streaming, October 2011

[19] L. De Cicco, S. Mascolo, "An Experimental Investigation of the Akamai Adaptive Video Streaming", Politecnico di Bari, Bari, Italy, 2010

[20] O. Oyman, "QoE for HTTP Adaptive Streaming Services", IEEE Communications Magazine, April 2012

[21] ETSI TR 101 578, "QoS Aspects of TCP-based video services like YouTube", v.3, September 2012

[22] ETSI TS 102 250-2, "Definition of QoS Parameters and their Computation", September 2011

[23] I. Cotanis, A. Hedlund, "Voice over LTE and Test Implications", Ascom white paper, October 2012

[24] G. Hervouet, "Video Quality Optimization - Multi-rate Video Encoding", Envivio white paper, July 2010

[25] GSMA, "Rich Communication Suite - Enhanced", v.1.2.2


[26] GSMA PRD IR-92, "IMS Profile for Voice and SMS", v.3, GSMA, December 2010

[27] GSMA PRD IR-94, "IMS Profile for Conversational Video Service", v.1, GSMA, December 2011

[28] 3GPP TS 26.114, "IP Multimedia Subsystem (IMS); Multimedia Telephony; Media Handling (Release 11)", September 2012