Investigating Network Performance – A Case Study
Transcript of Investigating Network Performance – A Case Study
![Page 1: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/1.jpg)
Investigating Network Performance – A Case Study
Ralph Spencer, Richard Hughes-Jones, Matt Strong and Simon Casey
The University of Manchester
G2 Technical Workshop, Cambridge, Jan 2006
![Page 2: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/2.jpg)
Very Long Baseline Interferometry
eVLBI – using the Internet for data transfer
![Page 3: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/3.jpg)
GRS 1915+105: a 15 solar-mass black hole in an X-ray binary – MERLIN observations
[Image label: “receding”; scale: 600 mas = 6000 AU at 10 kpc]
![Page 4: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/4.jpg)
Sensitivity in Radio Astronomy
• Noise level ∝ 1/√(Bτ), where B = bandwidth and τ = integration time
• High sensitivity requires large bandwidths as well as large collecting areas, e.g. Lovell, GBT, Effelsberg, Cambridge 32-m
• Aperture synthesis needs signals from individual antennas to be correlated together at a central site
• Need for interconnection data rates of many Gbit/s
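The radiometer scaling above can be checked with a couple of lines; the numbers below are illustrative, not values from the talk:

```python
import math

def noise_ratio(bandwidth_hz, integration_s, ref_bandwidth_hz, ref_integration_s):
    """Relative noise level: noise scales as 1/sqrt(B * tau)."""
    return math.sqrt((ref_bandwidth_hz * ref_integration_s) /
                     (bandwidth_hz * integration_s))

# Widening the band from 128 MHz to 512 MHz at a fixed integration time
# lowers the noise by a factor of 2 (sqrt of the 4x bandwidth increase).
print(noise_ratio(512e6, 60, 128e6, 60))  # -> 0.5
```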
![Page 5: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/5.jpg)
New instruments are making the best use of bandwidth:
• eMERLIN: 30 Gbps
• Atacama Large mm Array (ALMA): 120 Gbps
• EVLA: 120 Gbps
• Upgrade to European VLBI (eVLBI): 1 Gbps
• Square Km Array (SKA): many Tbps
![Page 6: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/6.jpg)
The European VLBI Network (EVN)
• Detailed radio imaging uses antenna networks over 100s–1000s km
• Currently use disk recording at 512 Mb/s (Mk5)
• Real-time connection allows greater response, reliability and sensitivity
• Needs the Internet: eVLBI
![Page 7: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/7.jpg)
EVN–NREN
[Map: Westerbork, Netherlands (dedicated Gbit link); Onsala, Sweden (Chalmers University of Technology, Gothenburg; Gbit link); Jodrell Bank, UK (MERLIN); Cambridge, UK (MERLIN); Dwingeloo (DWDM link); Medicina, Italy; Toruń, Poland (Gbit link)]
![Page 8: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/8.jpg)
Testing the Network for eVLBI
Aim is to obtain the maximum BW compatible with VLBI observing systems in Europe and the USA.
First sustained data-flow tests in Europe: iGRID 2002, 24–26 September 2002, Amsterdam Science and Technology Centre (WTCW), The Netherlands.
“We hereby challenge the international research community to demonstrate applications that benefit from huge amounts of bandwidth!”
![Page 9: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/9.jpg)
iGRID2002 Radio Astronomy VLBI Demo
• Web-based demonstration sending VLBI data – a controlled stream of UDP packets at 256–500 Mbit/s
• Production network: Manchester – SuperJANET – GÉANT – Amsterdam
• Dedicated lambda: Amsterdam – Dwingeloo
![Page 10: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/10.jpg)
The Works:
[Diagram: sender and receiver, each with a RAID0 disc and a ring buffer; a UDP data stream (n bytes per packet, wait time between sends); a TCP control channel; a web interface]
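The sender in the diagram sends n bytes, then waits a fixed time before the next packet – i.e. it paces UDP packets at a chosen inter-packet interval. The sketch below shows the principle only (it is not the UDPmon/vlbiUDP code, and the address is hypothetical); a busy-wait is used because `sleep()` is too coarse for microsecond spacing:

```python
import socket
import time

def paced_udp_send(dest, payload_size, spacing_us, n_packets):
    """Send fixed-size UDP packets separated by spacing_us microseconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = bytes(payload_size)
    next_send = time.perf_counter()
    for _ in range(n_packets):
        sock.sendto(payload, dest)
        next_send += spacing_us * 1e-6
        while time.perf_counter() < next_send:  # busy-wait for sub-ms precision
            pass
    sock.close()

# e.g. 1472-byte packets every 15 us to a hypothetical receiver:
# paced_udp_send(("192.0.2.10", 5001), 1472, 15, 10000)
```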
![Page 11: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/11.jpg)
UDP Throughput on the Production WAN
• Manc–UvA (SARA): 750 Mbit/s over SJANET4 + GÉANT + SURFnet – 75% of the Manchester access link
• Manc–UvA (SARA): 825 Mbit/s
[Plots: UDP Man–UvA Gig, 19 May 02 and 28 Apr 02 — received wire rate (Mbit/s) vs transmit time per frame (µs), packet sizes 50–1472 bytes]
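The shape of these throughput-versus-spacing curves follows simple packet geometry: the wire rate is bounded by the frame size divided by the larger of the requested inter-frame spacing and the time the NIC needs to serialize the frame. A sketch of that ceiling (the overhead figures are textbook Ethernet approximations, not measured values from the talk):

```python
def wire_rate_mbps(payload_bytes, spacing_us, line_rate_mbps=1000,
                   overhead_bytes=28 + 18 + 20):
    """Upper bound on received wire rate for paced UDP over Gigabit Ethernet.

    overhead_bytes approximates UDP/IP headers (28) plus Ethernet framing (18)
    plus preamble and inter-frame gap (20).
    """
    frame_bits = (payload_bytes + overhead_bytes) * 8
    serialize_us = frame_bits / line_rate_mbps   # us to put the frame on a 1 Gb/s wire
    effective_us = max(spacing_us, serialize_us)
    return frame_bits / effective_us             # Mbit/s on the wire

# Large frames saturate the link at small spacings; small frames cannot.
print(round(wire_rate_mbps(1472, 5)))  # -> 1000 (line rate)
print(round(wire_rate_mbps(50, 5)))    # -> 186
```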
![Page 12: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/12.jpg)
![Page 13: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/13.jpg)
How do we test the network?
• Simple connectivity test from the telescope site to the correlator (at JIVE, Dwingeloo, The Netherlands, or MIT Haystack Observatory, Massachusetts): traceroute, bwctl
• Performance of link and end hosts: UDPmon, iperf
• Sustained data tests: vlbiUDP (under development)
• True eVLBI data from a Mk5 recorder: pre-recorded (Disk2Net) or real time (Out2Net)

Mk5s are 1.2 GHz P3s with StreamStor cards and 8-pack exchangeable disks (1.3 Tbytes storage), capable of 1 Gbps continuous recording and playback. Made by Conduant, Haystack design.
![Page 14: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/14.jpg)
Telescope connections
[Map: connections to JIVE — Jodrell Bank (UK), Onsala (Sweden), Medicina (Italy), Toruń (Poland), Effelsberg (Germany), Westerbork (Netherlands), Cambridge (UK, MERLIN); link rates of 1 Gb/s (×3), 155 Mb/s, and 2 × 1G; “1 Gb/s light now”; “MERLINe ?? end 06 ??”]
![Page 15: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/15.jpg)
eVLBI Milestones
• January 2004: disk-buffered eVLBI session
  – Three telescopes at 128 Mb/s for the first eVLBI image
  – On–Wb fringes at 256 Mb/s
• April 2004: three-telescope, real-time eVLBI session
  – Fringes at 64 Mb/s
  – First real-time EVN image at 32 Mb/s
• September 2004: four-telescope real-time eVLBI
  – Fringes to Toruń and Arecibo
  – First EVN eVLBI science session
• January 2005: first “dedicated light-path” eVLBI
  – ?? Gbyte of data from the Huygens descent transferred from Australia to JIVE
  – Data rate ~450 Mb/s
![Page 16: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/16.jpg)
• 20 December 2004
  – Connection of JBO to Manchester by 2 × 1 GE
  – eVLBI tests between Poland, Sweden, the UK and the Netherlands at 256 Mb/s
• February 2005
  – TCP and UDP memory-to-memory tests at rates up to 450 Mb/s (TCP) and 650 Mb/s (UDP)
  – Tests showed inconsistencies between Red Hat kernels; rates of only 128 Mb/s obtained on 10 Feb
  – Haystack (US) – Onsala (Sweden) runs at 256 Mb/s
• 11 March 2005: science demo
  – JBO telescope stowed due to wind; a short run on a calibrator source was done
![Page 17: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/17.jpg)
![Page 18: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/18.jpg)
Summary of EVN eVLBI tests
• Regular tests with eVLBI Mk5 data every ~6 weeks
  – 128 Mbps OK; 256 Mbps often; 512 Mbps Onsala–JIVE occasionally
  – but not JBO at 512 Mbps – WHY NOT? (NB using jumbo packets, 4470 or 9000 bytes)
• Note the correlator can cope with large error rates (up to ~1%), but high throughput is needed for sensitivity – implications for protocols, since throughput on TCP is very sensitive to packet loss.
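TCP's sensitivity to loss can be made concrete with the standard Mathis et al. approximation, throughput ≈ (MSS/RTT) · 1.22/√p for loss rate p. This is a textbook model, not a calculation from the talk, but it shows why even modest loss is fatal to gigabit flows on long paths:

```python
import math

def mathis_throughput_mbps(mss_bytes, rtt_s, loss_rate):
    """Mathis model: steady-state TCP throughput ~ (MSS/RTT) * 1.22/sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * (1.22 / math.sqrt(loss_rate)) / 1e6

# With a 15 ms RTT (the JBO-JIVE scale) and 1460-byte segments, even a
# 0.1% loss rate caps standard TCP far below 1 Gbit/s:
print(round(mathis_throughput_mbps(1460, 0.015, 0.001)))  # -> 30
```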
![Page 19: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/19.jpg)
UDP Throughput Oct–Nov 2003, Manchester–Dwingeloo (production network)
• Manchester: 2.0 GHz Xeon; Dwingeloo: 1.2 GHz PIII
• Near wire rate, 950 Mbps (UDPmon)
• 4th-year project: Adam Mathews, Steve O’Toole
[Plots: Gnt5–DwMk5 (11 Nov 03) and DwMk5–Gnt5 (13 Nov 03), 1472-byte packets — received wire rate (Mbit/s), % packet loss, and % CPU kernel load at sender and receiver vs inter-frame spacing (µs)]
![Page 20: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/20.jpg)
ESLEA
• Packet loss will cause low throughput in TCP/IP
• Congestion will result in routers dropping packets: use switched light paths!
• Tests with MB-NG network Jan-Jun 05
• JBO connected to JIVE via UKLight in June (thanks to John Graham, UKERNA)
• Comparison tests between UKLight connections JBO-JIVE and production (SJ4-Geant)
![Page 21: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/21.jpg)
Project Partners
Project Collaborators
The Council for the Central Laboratory of the Research Councils
Funded by
EPSRC GR/T04465/01
www.eslea.uklight.ac.uk
£1.1 M, 11.5 FTE
![Page 22: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/22.jpg)
UKLight Switched light path
![Page 23: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/23.jpg)
Tests on the UKLight switched light-path, Manchester–Dwingeloo
• Throughput as a function of inter-packet spacing (2.4 GHz dual Xeon machines)
• Packet loss for small packet sizes
• Maximum-size packets can reach full line rate with no loss, and there was no re-ordering (plot not shown)
[Plots: gig03–jiveg1 UKLight, 25 Jun 05 — received wire rate (Mbit/s) and % packet loss (log scale) vs inter-frame spacing (µs), packet sizes 50–1472 bytes]
![Page 24: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/24.jpg)
Tests on the production network, Manchester–Dwingeloo
• Throughput
• Small (0.2%) packet loss was seen
• Re-ordering of packets was significant
[Plot: gig6–jivegig1, 31 May 05 — % packet loss (log scale) vs inter-frame spacing (µs), packet sizes 50–1472 bytes]
![Page 25: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/25.jpg)
UKLight using Mk5 recording terminals
![Page 26: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/26.jpg)
e-VLBI at the GÉANT2 Launch, Jun 2005
[Map: Jodrell Bank (UK), Dwingeloo (DWDM link), Medicina (Italy), Toruń (Poland)]
![Page 27: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/27.jpg)
UDP Performance: 3 Flows on GÉANT
• Throughput: 5-hour run, 1500-byte MTU
• Jodrell → JIVE: 2.0 GHz dual Xeon – 2.4 GHz dual Xeon, 670–840 Mbit/s
• Medicina (Bologna) → JIVE: 800 MHz PIII – Mk5 (623) 1.2 GHz PIII, 330 Mbit/s, limited by the sending PC
• Toruń → JIVE: 2.4 GHz dual Xeon – Mk5 (575) 1.2 GHz PIII, 245–325 Mbit/s, limited by security policing (>600 Mbit/s → 20 Mbit/s)?
• Throughput over a 50-min window: period is ~17 min
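A periodicity like the ~17 min oscillation noted above is the sort of feature one can extract from a throughput trace with a simple autocorrelation. A sketch on synthetic data (the real traces are not reproduced here):

```python
import math

def autocorr_period(samples, step_s):
    """Return the lag (seconds) of the first autocorrelation peak after lag 0."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]
    var = sum(v * v for v in x)
    ac = [sum(x[i] * x[i + lag] for i in range(n - lag)) / var
          for lag in range(n // 2)]
    # first local maximum after the zero-lag peak
    for lag in range(1, len(ac) - 1):
        if ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]:
            return lag * step_s
    return None

# Synthetic trace: 10 s samples with a 1020 s (17 min) oscillation.
trace = [700 + 100 * math.sin(2 * math.pi * t / 1020) for t in range(0, 6000, 10)]
print(autocorr_period(trace, 10))  # close to 1020 s
```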
[Plots: BW, 14 Jun 05 — received wire rate (Mbit/s) vs time (10 s steps) for Jodrell, Medicina and Torun; full run and a zoom on steps 200–500]
![Page 28: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/28.jpg)
18-Hour Flows on UKLight: Jodrell–JIVE, 26 June 2005
• Throughput, Jodrell → JIVE (2.4 GHz dual Xeon – 2.4 GHz dual Xeon): 960–980 Mbit/s
• Traffic through SURFnet
• Packet loss: only 3 groups of 10–150 lost packets each; no packets lost the rest of the time
• Packet re-ordering: none
[Plots: man03–jivegig1, 26 Jun 05 — received wire rate (Mbit/s) vs time (10 s steps) for the full run, a zoom on 900–1000 Mbit/s, and packet loss (log scale) vs time]
![Page 29: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/29.jpg)
Recent Results 1:
• iGRID 2005 and SC 2005
  – Global eVLBI demonstration
  – Achieved 1.5 Gbps across the Atlantic using UKLight
  – 3 × VC-3-13c (~700 Mbps) SDH links carrying data across the Atlantic from the Onsala, JBO and Westerbork telescopes
  – 512 Mbps K4 – Mk5 data from Japan to the USA
  – 512 Mbps Mk5 real-time interferometry between the Onsala, Westford and Maryland Point antennas, correlated at Haystack Observatory
  – Used VLSR technology from the DRAGON project in the US to set up light paths.
![Page 30: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/30.jpg)
[Photos: JBO Mk2, Westerbork array, Onsala 20-m, Kashima 34-m]
![Page 31: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/31.jpg)
Recent results 2:
• Why can Onsala achieve 512 Mbps from Mk5 to Mk5, even transatlantic?
  – Identical Mk5 to JBO’s, longer link
• iperf TCP, JBO Mk5 to Manchester: rtt ~1 ms, 4420-byte packets get 960 Mbps
• iperf TCP, JBO Mk5 to JIVE: rtt ~15 ms, 4420-byte packets get 777 Mbps
• Not much wrong with the networks!
• – shows 94.7% kernel usage and 1.5% idle
  – shows 96.3% kernel usage and 0.06% idle – no CPU left!
• Likelihood is that the Onsala Mk5 has a marginally faster CPU – at the critical point for 512 Mbps transmission
• Solution: better motherboards for the Mk5s – about 40 machines to upgrade!
[Plots: mk5-606–jive, 9 Dec 05 — % CPU in kernel, user, nice and idle modes per trial; mk5-606–g7, 10 Dec 05 — throughput (Mbit/s) vs nice value (large value = low priority), with no CPU load]
![Page 32: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/32.jpg)
The Future:
• Regular eVLBI tests in the EVN continue
• Testing Mk5 StreamStor interface <-> network interaction
• Test upgraded Mk5 recording devices
• Investigate alternatives to TCP/UDP – DCCP, vlbiUDP, tsunami, etc.
• ESLEA comparing UKLight with production
• EU’s EXPReS eVLBI project starts March 2006
  – Connection of the 100-m Effelsberg telescope in 2006
  – Protocols for distributed processing
  – Onsala–JBO correlator test link at 4 Gbps in 2007
• eVLBI will become routine in 2006!
![Page 33: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/33.jpg)
VLBI Correlation: GRID Computation task
[Diagram: Controller/Data Concentrator linked to Processing Nodes]
![Page 34: Investigating Network Performance – A Case Study](https://reader034.fdocuments.us/reader034/viewer/2022051517/56815734550346895dc4d2a7/html5/thumbnails/34.jpg)
Questions ?