Neural Network Based Available Bandwidth Estimation in the ETOMIC Infrastructure Péter Hága,...

Post on 03-Jan-2016


Neural Network Based Available Bandwidth Estimation in the ETOMIC Infrastructure

Péter Hága, Sándor Laki, Ferenc Tóth, István Csabai, József Stéger, Gábor Vattay

ETOMIC Project, Eötvös University, Budapest, Hungary

Péter Hága - TridentCom 2007

Motivation

To estimate network or path parameters with a method which is:
– accurate
– fast, with high time resolution
– not demanding in computational power

Approaches:
– new empirical methods?
– new analytical models?
– something else?
– try artificial intelligence!


Outline

• Available bandwidth estimation
• Neural networks
• Performance analysis in simulations
• Verification in laboratory experiments
• Etomic Infrastructure, data collection
• Summary

Available bandwidth


Active probing methods

Goal: estimate network parameters (available bandwidth, physical bandwidth, cross traffic statistical properties, etc.) with end-to-end methods

[Figure: sender and receiver nodes, each with its own monitor, connected by a path that also carries background traffic]


Packet pair methods

• probe pairs: fixed inter-packet delay
• background traffic: stochastic process
• input spacing, sender node: $\delta = t_2 - t_1$
• output spacing, receiver node: $\delta' = t'_2 - t'_1$


Dispersion curve for packet pairs

• The curve is based on self-induced congestion, since the probe pair congests the bottleneck link at a certain inter-packet spacing δ.
• Fluid model: correct asymptotic behavior, deviation in the transition range.

[Figure: dispersion curve. Horizontal axis: input spacing at the sender node (time); vertical axis: output spacing at the receiver node (time); curve labels: δ′ = p/C + … and p/(C − Cc).]
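The fluid model referred to on this slide can be explored numerically. The following is a minimal sketch, assuming the commonly used single-hop fluid approximation: the output spacing follows p/C + δ·Cc/C while the pair congests the link, and equals the input spacing beyond the turning point p/(C − Cc). The function name and the example values are illustrative assumptions, not taken from the slides.

```python
def fluid_output_spacing(delta, p, C, Cc):
    """Output spacing of a probe pair under a single-hop fluid model.

    delta: input spacing at the sender [s]
    p:     probe packet size [bits]
    C:     physical bandwidth of the bottleneck [bit/s]
    Cc:    average cross traffic rate [bit/s], with Cc < C
    """
    # While the pair congests the link, the second probe waits for the
    # first probe plus the cross traffic that arrived in between.
    congested = p / C + delta * Cc / C
    # Beyond the turning point delta = p / (C - Cc) the spacing is preserved.
    return max(delta, congested)


# Assumed example values: 1500-byte probes, 10 Mbps link, 4 Mbps cross traffic.
p, C, Cc = 1500 * 8, 10e6, 4e6
turning_point = p / (C - Cc)  # 2 ms for these values
for delta in (0.0005, 0.001, turning_point, 0.004):
    print(f"{delta:.4f} s -> {fluid_output_spacing(delta, p, C, Cc):.4f} s")
```

Below the turning point the output spacing grows linearly with slope Cc/C; above it the curve follows the diagonal. The granular model discussed next refines the transition range, which this sketch does not attempt.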


Granular model and available bandwidth

• Describes the packet pair dispersion curve
• Two well known parameters are
– Physical bandwidth
– Cross traffic / link utilization
• New parameter is the granularity, the effective cross traffic packet size
• Granular model as a reference for the neural network

Neural networks


Neural Networks, in general

• A neural network is a group of interconnected artificial neurons that uses a mathematical model (a function) for information processing
• Massively parallel architecture which can be used to speed up the evaluation
• Neural networks have been applied to many problems:
– Function approximation, time series prediction
– Classification, pattern recognition
– Data processing, filtering, clustering
• Two phases: training and evaluation
• Not prevalent in the field of bandwidth estimation


Neural Networks, Neurons

• Artificial neurons and neural networks have a biological inspiration.
• inputs (x1, ..., xN)
• the inputs are weighted (w1, ..., wN)
• activation function f()
• one output (y)
• the output value is calculated from the input values:

$$y = f\left(\sum_{i=1}^{N} w_i x_i\right)$$

• common activation functions:
– step
– ramp
– sigmoid
– Gaussian
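The neuron described on this slide can be written down in a few lines. The sketch below is only illustrative: the function names, the example weights, and the choice of the sigmoid and step activations are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Smooth activation mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def step(z):
    """Step activation: 1 for non-negative input, 0 otherwise."""
    return 1.0 if z >= 0 else 0.0


def neuron_output(inputs, weights, activation=sigmoid):
    """Compute y = f(sum_i w_i * x_i) for a single artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


# Assumed example: three inputs, weighted sum 0.2 - 0.3 + 0.2 = 0.1.
y = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1])
print(y)
```

Swapping the `activation` argument is enough to try the other activation functions listed above.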


Multilayer Feedforward Neural Network

• The neural network: a black box which has several inputs and outputs.
• The inputs of a neuron are connected either to an input of the NN (these neurons are the input neurons) or to the output of another neuron.
• The output of a neuron is connected either to the output of the NN (these neurons are the output neurons) or to the input of another neuron.
• Neurons which are connected neither to the inputs nor to the outputs of the NN are called the hidden neurons.
• There are no cycles or loops in the network.

The standard multilayer feed-forward networks are universal approximators in C(R^m), which is the reason we applied them.
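As an illustration of the layered, cycle-free structure, a forward pass through a small feed-forward network might look as follows. This is a minimal sketch: the layer sizes, the weights, and the use of a sigmoid in every layer are assumptions for the example, and bias terms are omitted for brevity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weight_rows):
    """Each row of weights belongs to one neuron of the layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)))
            for row in weight_rows]


def feedforward(inputs, layers):
    """Propagate values layer by layer; data only flows forward."""
    values = inputs
    for weight_rows in layers:
        values = layer_forward(values, weight_rows)
    return values


# Assumed toy network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]]
output = [[1.0, -1.0, 0.5]]
result = feedforward([1.0, 2.0], [hidden, output])
print(result)
```

Because each layer reads only the outputs of the previous one, the evaluation order is fixed and no loops can occur.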


Neural Networks, using neural networks for parameter estimation

Using neural networks for estimation has two phases:

– Training, supervised learning
  • to adjust the architecture and the weights of the connections between the neurons in order to approximate the function determined by the training examples
  • during the training we try to minimize the error between the output of the NN and the values of the training examples
– Evaluation, use it as a function
  • on a different data set than the training set
  • training on well-known data sets
  • evaluation on the experimental data sets
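The two phases can be illustrated with a deliberately tiny stand-in for a neural network. The sketch below trains a single linear neuron by gradient descent on a known function and then evaluates it on a different data set. The target function, learning rate, and data sets are assumptions chosen for the example and are unrelated to the bandwidth data used in this work.

```python
def train_linear_neuron(examples, epochs=2000, rate=0.05):
    """Supervised training: adjust w and b to reduce the squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = (w * x + b) - target
            # Gradient step on the squared error of this example.
            w -= rate * error * x
            b -= rate * error
    return w, b


def mean_squared_error(examples, w, b):
    """Evaluation: use the trained neuron as a plain function."""
    return sum(((w * x + b) - t) ** 2 for x, t in examples) / len(examples)


# Training on a well-known data set: samples of y = 2x + 1 (assumed example).
train_set = [(i / 10, 2 * (i / 10) + 1) for i in range(10)]
# Evaluation on a different data set, shifted between the training points.
eval_set = [(i / 10 + 0.05, 2 * (i / 10 + 0.05) + 1) for i in range(10)]

w, b = train_linear_neuron(train_set)
print(w, b, mean_squared_error(eval_set, w, b))
```

Keeping the evaluation set separate from the training set is what shows whether the trained function generalizes rather than memorizes.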


Neural Networks, Training

We made experiments with ...

– ... fixed structure neural networks.
  • Connections between the neurons are fixed. During the training only the weights of the connections change.
  • The most common solution of this problem is the backpropagation algorithm.
– ... cascade neural networks.
  • Cascade training methods begin training with only input neurons connected directly to output neurons.
  • Neurons are added one by one to the network and are connected to all previous hidden and input neurons.
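The cascade topology described above can be sketched as a forward pass in which every hidden neuron sees all inputs and all earlier hidden neurons. This illustrates the connection pattern only, not the cascade training procedure itself. The function name, the tanh activation, the linear outputs, and the omission of bias terms are assumptions for the example.

```python
import math


def cascade_forward(inputs, hidden_weights, output_weights):
    """Evaluate a cascade network.

    hidden_weights[k] holds len(inputs) + k weights: one per input and one
    per previously added hidden neuron.
    output_weights[j] holds len(inputs) + len(hidden_weights) weights.
    """
    values = list(inputs)
    for k, weights in enumerate(hidden_weights):
        if len(weights) != len(inputs) + k:
            raise ValueError(f"hidden neuron {k} has the wrong fan-in")
        # The new neuron reads every input and every earlier hidden output.
        values.append(math.tanh(sum(w * v for w, v in zip(weights, values))))
    return [sum(w * v for w, v in zip(weights, values))
            for weights in output_weights]


# With no hidden neurons the inputs feed the outputs directly,
# which is how cascade training starts.
print(cascade_forward([1.0, 2.0], [], [[0.5, 0.25]]))
```

Adding a hidden neuron means appending one more weight list to `hidden_weights` and one more weight to each output neuron.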


Neural Networks

[Figure: Estimation error of available bandwidth for different neural networks. Vertical axis: estimation error (0 to 2.5); horizontal axis: test scenarios (ShortC, LongC, ShortC_c, LongC_c); legend: Cascade, 36v1C(long), 36v1C(short), 777C(long), 36v2C_Cc(short), 36v1Cc(long), 36v1Cc(short), 777Cc(long).]


Neural Networks, Our choice

• Cascade networks are more efficient in the current situation than fixed-structure networks.
• All the experiments presented here were made with cascade neural networks.
• We used Fahlman's Cascade2 algorithm with the following parameters:
– 35 input neurons, connected to the δ′ sequences
– 36 hidden neurons
– one or two output neurons, connected to the physical and available bandwidth.

Simulations: performance analysis


Performance analysis

• Wide range of packet level simulations
• To compare the real parameters to the estimated ones
• To compare the two estimators to each other
• Three main classes:
– Single-hop scenarios with long term averages on 6000 δ′ values
– Single-hop scenarios with short term averages on 100 δ′ values
– Multi-hop scenarios with short term averages on 100 δ′ values
• Simulated configurations:
– Physical bandwidth: 2-20 Mbps
– Cross traffic: 0-18 Mbps, Poissonian arrival process, trimodal packet size distribution
• Relative error to compare the accuracy of the estimators
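The slides do not spell out the relative error formula. A minimal sketch, assuming the usual definition |estimated − real| / |real|, could be:

```python
def relative_error(estimated, real):
    """Relative deviation of an estimate from the known real value."""
    if real == 0:
        raise ValueError("relative error is undefined for a real value of 0")
    return abs(estimated - real) / abs(real)


# Assumed example: 9 Mbps estimated where the simulated value is 10 Mbps.
print(relative_error(9.0, 10.0))
```

In simulations the real value is known by construction, which is what makes this comparison possible.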


Single-hop, long term averages

Single-hop, short term averages

Multi-hop, short term averages

Laboratory experiments

• Our testbed consisted of 5 computers.
– A traffic generator and sink.
– Probe traffic sender and receiver.
– Central router with a kernel module, to emulate the bottleneck link.
• They were separated from other machines.
• Measurement traffic was also separated from management traffic.
• Bandwidth: 1..10 Mbps
• The δ′ values were averaged over 100 consecutive trains.
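The averaging over consecutive trains could be done as in the sketch below. It assumes that each train yields one δ′ sequence of fixed length and that the averages are taken element-wise over non-overlapping blocks; the slides do not state these details, and the function name is hypothetical.

```python
def average_trains(trains, block=100):
    """Element-wise average of δ′ sequences over consecutive blocks of trains.

    trains: list of δ′ sequences, one per probe train, all of equal length.
    Trailing trains that do not fill a complete block are ignored.
    """
    averaged = []
    for start in range(0, len(trains) - block + 1, block):
        chunk = trains[start:start + block]
        # zip(*chunk) groups the i-th δ′ value of every train in the block.
        averaged.append([sum(column) / block for column in zip(*chunk)])
    return averaged


# Tiny assumed example with a block size of 2 instead of 100.
print(average_trains([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], block=2))
```

Each averaged sequence would then serve as one input vector for the estimator.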


Performance analysis

Estimation accuracy (RMS) in the different scenarios (Mbps).
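For reference, the RMS accuracy measure named here can be computed as in the following sketch. It uses the standard root-mean-square definition; the function name and the sample values are assumptions.

```python
import math


def rms_error(estimated, real):
    """Root-mean-square deviation between estimates and real values."""
    if len(estimated) != len(real) or not real:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimated, real))
    return math.sqrt(squared / len(real))


# Assumed example in Mbps: both estimates are off by 1 Mbps.
print(rms_error([1.0, 3.0], [2.0, 2.0]))
```

Unlike the relative error, the RMS value carries the unit of the estimated quantity, here Mbps.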


Performance analysis, evaluation time

• Theoretical analysis [operation/sec]
– Neural network:
  • For one neuron: #inputs + k
  • For a cascade network the overall number of inputs is O(n²), where n = #nodes ~ #inputs
– Granular model:
  • #inputs × #C × #Cc × NPg
  • Usually: #C × #Cc × NPg

• Practice
– Evaluation of the cascade network: ~16 sec.
– Evaluation of the analytical model: ~100 msec.
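The operation counts above can be made concrete with a short calculation. The sketch below counts the weighted connections of a cascade network with the parameters given earlier (35 inputs, 36 hidden neurons, two outputs) and the size of a parameter grid for the granular model. Bias terms and the per-neuron activation cost k are ignored, and the grid resolution is a purely hypothetical example.

```python
def cascade_connection_count(n_inputs, n_hidden, n_outputs):
    """Number of weighted connections in a cascade network."""
    # Hidden neuron k sees all inputs plus the k hidden neurons before it.
    hidden = sum(n_inputs + k for k in range(n_hidden))
    # Every output neuron sees all inputs and all hidden neurons.
    outputs = n_outputs * (n_inputs + n_hidden)
    return hidden + outputs


def granular_grid_size(n_C, n_Cc, n_Pg):
    """Number of (C, Cc, Pg) combinations to evaluate on a parameter grid."""
    return n_C * n_Cc * n_Pg


print(cascade_connection_count(35, 36, 2))
print(granular_grid_size(10, 10, 10))  # hypothetical grid resolution
```

The hidden part grows quadratically with the number of neurons, which matches the O(n²) estimate on the slide.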

Etomic wide area experiments


ETOMIC Infrastructure

• The European Traffic Observatory Measurement InfrastruCture (etomic) was created in 2004-05 within the Evergrow Integrated Project.
• Central management system by Navarra University, Spain
• The measurement stations are hosted by European universities in the Evergrow project, EuroLab members, MoMe members, and associate partners of CNDA
• Its goals:
– to provide an open-access, public testbed for researchers experimenting with the Internet
– to serve as a Virtual Observatory of active measurement data on the European part of the Internet

Best Testbed Award


Hardware architecture

• Each station consists of:
– Server PC architecture
– DAG 3.6 GE card with packet sending capability (Endace)
– own GPS antenna (Garmin) for time synchronization
• Repository and data processing:
– Everlab IBM blade center (112 blades)


Data collection

• End-to-end packet pair (chirp) experiments between the sender and receiver nodes
• Appropriate probe packet pattern
• DAG cards for precise timing in dispersion curves
• Appropriate neural network was used (due to the different scale than the simulations)
• Series of experiments performed between 11 ETOMIC nodes in all-to-all fashion
• Periodically collected data since the autumn of 2006
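An all-to-all measurement schedule between 11 nodes can be enumerated as below. This is a sketch under the assumption that every ordered (sender, receiver) pair is measured separately; the node names are placeholders and not real ETOMIC host names.

```python
from itertools import permutations

# Placeholder node names (hypothetical).
nodes = [f"node{i:02d}" for i in range(1, 12)]

# Every ordered pair of distinct nodes: direction matters for path measurements.
pairs = list(permutations(nodes, 2))

print(len(pairs))
for sender, receiver in pairs[:3]:
    print(f"{sender} -> {receiver}")
```

Under this assumption one full round covers 11 × 10 = 110 directed paths.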


Estimated physical bandwidth

• The exact values are 100 Mbps.
• All the estimated values are within a 5% range.


Estimated available bandwidth

• The exact values are not known,
• but the estimated values are consistent with each other for the same destination.

Summary


Summary

• Main steps of using neural networks
• Estimation accuracy
– better than a single-hop analytical model
– Packet level simulations
– Laboratory experiments
• Evaluation time
• Data collection in real world experiments in the Etomic active probing infrastructure
• New approach in network parameter estimation - neural networks
– Fast
– Accurate
– Low resource requirement
• Near future: raw data and estimated values will be organized in our Network Measurement Virtual Observatory ("Building a Prototype for Network Measurement Virtual Observatory", ACM Sigmetrics, MineNet 2007)


This work was partially supported by the National Science Foundation (OTKA T37903), the National Office for Research and Technology (NKFP 02/032/2004 and NAP 2005/KCKHA005) and the EU IST FET Complexity EVERGROW Integrated Project.

Thank you for your attention!

IST Future and Emerging Technologies