Solving Network Throughput Problems at the Diamond Light Source
![Page 1: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/1.jpg)
Alex White, Campus network engineering workshop, 19/10/2016
Solving Network Throughput Problems at the Diamond Light Source
![Page 2: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/2.jpg)
Introduction to Diamond Light Source
Solving Network Throughput Problems at the Diamond Light Source
Alex White, [email protected]
![Page 3: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/3.jpg)
![Page 4: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/4.jpg)
So, what do we actually do?
The Diamond machine is a type of particle accelerator
CERN = smash high-energy particles together and analyse the “crash”!
Diamond = accelerate electrons to produce synchrotron light
Use this light to study matter – like a “super microscope”
![Page 5: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/5.jpg)
The Diamond machine
Three particle accelerators:
Linear accelerator
Booster Synchrotron
Storage ring (48 straight sections angled together, 562m long)
![Page 6: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/6.jpg)
Simultaneous Experiments
![Page 7: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/7.jpg)
Data-intensive research
Lustre and GPFS filesystems: 430TB, 900TB, 3.3PB as of 2016
Typical X-ray camera: 4MB * 100Hz
An experiment can easily produce 300GB-1TB
Scientists want to take their data home
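The camera figures above imply a raw data rate that is easy to work out. The following is a minimal sketch of that arithmetic, assuming decimal units (1MB = 10^6 bytes) and continuous acquisition at the quoted frame rate; the function names are illustrative and do not come from the slides.

```python
# Assumptions (not from the slides): decimal units, sustained capture.
FRAME_BYTES = 4 * 10**6   # 4MB per frame
FRAME_RATE_HZ = 100       # 100 frames per second


def camera_rate_bits_per_s(frame_bytes=FRAME_BYTES, frame_rate_hz=FRAME_RATE_HZ):
    """Raw detector output in bits per second."""
    return frame_bytes * frame_rate_hz * 8


def seconds_to_produce(dataset_bytes, rate_bits_per_s):
    """Time for the detector to generate a dataset of the given size."""
    return dataset_bytes * 8 / rate_bits_per_s


if __name__ == "__main__":
    rate = camera_rate_bits_per_s()
    print(f"Camera rate: {rate / 1e9:.1f} Gb/s")
    for size_gb in (300, 1000):
        minutes = seconds_to_produce(size_gb * 1e9, rate) / 60
        print(f"{size_gb}GB produced in about {minutes:.1f} minutes")
```

Under these assumptions a single camera produces about 3.2Gb/s, so a 300GB dataset corresponds to roughly 12.5 minutes of continuous capture.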
![Page 8: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/8.jpg)
Site Limitations
Scientific data download speeds from Diamond to visiting users’ institutes were inconsistent and slow, even though the facility had a “10Gb/s” JANET connection from STFC.
The limit on download speeds was delaying post-experiment data analysis by academics at their home institutes.
![Page 9: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/9.jpg)
How did we characterise the problem?
We set ourselves an initial target of “a stable 50Mb/s over a 10ms path”
![Page 10: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/10.jpg)
Initial Findings
10Gb/s inside our network, with no packet loss
Low speeds found with iperf over the STFC/JANET segment between Diamond's edge and the Physics Department at Oxford
We saw a small amount of packet loss over the STFC/JANET link
![Page 11: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/11.jpg)
TCP Performance and the Mathis equation
Packet size
Latency (AKA Round Trip Time)
Packet Loss
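These three quantities are the inputs to the Mathis et al. bound on steady-state TCP throughput. The slide's own rendering of the equation is not preserved in this transcript; the commonly quoted form is:

$$
\text{Rate} \;\le\; \frac{MSS}{RTT} \cdot \frac{C}{\sqrt{p}}
$$

Here $MSS$ is the maximum segment size (the packet size), $RTT$ is the round trip time, $p$ is the packet loss probability, and $C$ is a constant of order 1 whose exact value depends on the loss model and acknowledgement strategy assumed. Throughput therefore falls with the square root of the loss rate and in direct proportion to latency.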
![Page 12: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/12.jpg)
“Interesting” effects of packet loss
![Page 13: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/13.jpg)
Packet Loss
According to Mathis, to achieve our initial goal of 50Mb/s over a 10ms path, the maximum tolerable packet loss is 0.026%.
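A loss budget like this can be estimated by inverting the Mathis bound, Rate ≤ C · MSS / (RTT · √p), for p. The sketch below does that; the MSS of 1460 bytes and the values of the constant C are assumptions, since the slides state neither, and the function name is illustrative.

```python
def max_loss_fraction(target_bps, rtt_s, mss_bytes=1460, c=1.0):
    """Largest loss probability p that still permits target_bps.

    Inverts rate <= c * MSS / (RTT * sqrt(p)), giving
    p <= (c * MSS / (RTT * rate)) ** 2.
    mss_bytes and c are assumed values, not taken from the slides.
    """
    mss_bits = mss_bytes * 8
    return (c * mss_bits / (rtt_s * target_bps)) ** 2


if __name__ == "__main__":
    # Target from the slides: 50Mb/s over a 10ms path.
    for c in (1.0, 0.7):
        p = max_loss_fraction(50e6, 0.010, c=c)
        print(f"C={c}: maximum loss about {p * 100:.3f}%")
```

With these assumptions, C = 1 gives a budget of about 0.055% and C = 0.7 gives about 0.027%, the latter close to the figure on the slide. Whichever constant is used, the tolerable loss is a small fraction of one percent.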
![Page 14: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/14.jpg)
Finding the problem – the Last Mile
We worked with STFC to connect a perfSONAR server directly to the Harwell site border router.
Tests with this extra server allowed us to pinpoint the STFC firewall (our “last mile”) as the source of the insidious packet loss.
![Page 15: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/15.jpg)
The Fix: Science DMZ
![Page 16: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/16.jpg)
The Fix: Science DMZ
![Page 17: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/17.jpg)
Globus GridFTP
Uses parallel TCP streams
Simple, web-based interface
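One way to see why parallel streams help is to apply the Mathis bound to each stream separately: if every stream is limited by the same loss rate and latency, the aggregate can approach n times the single-stream bound until the link itself is full. The sketch below illustrates this under strong assumptions (independent, identical streams; MSS of 1460 bytes; C = 1; a 0.1% loss rate chosen purely for illustration). It is a simplified model, not a description of how Globus GridFTP schedules its streams.

```python
from math import sqrt


def mathis_bps(mss_bytes, rtt_s, loss, c=1.0):
    """Single-stream Mathis bound in bits per second."""
    return c * mss_bytes * 8 / (rtt_s * sqrt(loss))


def aggregate_bps(streams, link_bps, mss_bytes, rtt_s, loss, c=1.0):
    """Idealised aggregate for n identical streams, capped by the link."""
    return min(link_bps, streams * mathis_bps(mss_bytes, rtt_s, loss, c))


if __name__ == "__main__":
    # Hypothetical path: 10ms RTT, 0.1% loss, 10Gb/s link.
    for n in (1, 4, 8):
        rate = aggregate_bps(n, 10e9, mss_bytes=1460, rtt_s=0.010, loss=0.001)
        print(f"{n} stream(s): about {rate / 1e6:.0f} Mb/s")
```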
![Page 18: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/18.jpg)
Performance with Science DMZ
Test data: 2Gb/s+ consistently between DLS and Brookhaven National Labs (USA)!
Actual transfers in August 2016:
Fastest: Crystallography dataset from DLS to Newcastle: 260GB @ 480Mb/s
Biggest: Electron Microscope data from DLS to Imperial: 1120GB @ 290Mb/s
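To put these rates in terms a visiting scientist cares about, the transfer durations can be computed directly from size and average speed. This is a small sketch assuming decimal units (1GB = 10^9 bytes, 1Mb/s = 10^6 bits per second) and a constant average rate; the comparison against the original 50Mb/s target is added here for illustration.

```python
def transfer_seconds(size_gb, rate_mbps):
    """Duration of a transfer at a constant average rate (decimal units)."""
    return size_gb * 1e9 * 8 / (rate_mbps * 1e6)


if __name__ == "__main__":
    cases = [
        ("Newcastle, 260GB @ 480Mb/s", 260, 480),
        ("Imperial, 1120GB @ 290Mb/s", 1120, 290),
        ("260GB at the initial 50Mb/s target", 260, 50),
    ]
    for label, size_gb, rate_mbps in cases:
        hours = transfer_seconds(size_gb, rate_mbps) / 3600
        print(f"{label}: about {hours:.1f} hours")
```

On these assumptions the Newcastle transfer took a little over an hour and the Imperial transfer about eight and a half hours, while the same 260GB at 50Mb/s would take more than eleven hours.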
![Page 19: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/19.jpg)
Security in the Science DMZ
![Page 20: Solving Network Throughput Problems at the Diamond Light Source](https://reader030.fdocuments.us/reader030/viewer/2022032711/5872eb8e1a28abfa548b720b/html5/thumbnails/20.jpg)
In Summary
1. Use real-world testing to find packet loss
2. Zero packet loss is crucial
3. The last mile is usually the problem
4. Firewalls have been shown to introduce packet loss – this is backed up by ESnet's own testing
5. Don't use SCP as the common implementation has a fixed TCP window size – it will never grow to fill your link
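The last point follows from a simple relation: a sender with a fixed window can have at most one window of data in flight per round trip, so throughput is capped at window / RTT regardless of link speed. The sketch below compares that cap with the bandwidth-delay product needed to fill a link. The 64KiB window is a hypothetical value chosen for illustration; the slides do not state the actual size.

```python
def window_limited_bps(window_bytes, rtt_s):
    """Throughput ceiling when at most one window is in flight per RTT."""
    return window_bytes * 8 / rtt_s


def bdp_bytes(link_bps, rtt_s):
    """Bandwidth-delay product: window needed to keep the link full."""
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    window = 64 * 1024  # hypothetical fixed window, not from the slides
    for rtt_ms in (10, 100):
        cap = window_limited_bps(window, rtt_ms / 1000)
        print(f"RTT {rtt_ms}ms: at most about {cap / 1e6:.1f} Mb/s")
    needed = bdp_bytes(10e9, 0.010)
    print(f"Window to fill 10Gb/s at 10ms: about {needed / 1e6:.1f} MB")
```

With this assumed window the cap is about 52Mb/s at 10ms and about 5Mb/s at 100ms, while filling a 10Gb/s link at 10ms would need roughly 12.5MB in flight.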