Moving more data, faster (yours!) – the Science DMZ as an enabler
Presented at the DIRISA National Research Data Workshop
by Kasandra Pillay – Senior Engineer, SANReN
Overview
Architectures and tools for optimising big data transfers, especially for science and research – how fast can you go?
• Science DMZ network architecture
• Data transfer nodes and tools
• perfSONAR monitoring toolkit
• Motivation for Science DMZ
• What is SANReN doing?
Science DMZ
“A Network Design Pattern for Data-Intensive Science”
• Trademark of the Energy Sciences Network (ESnet – USA)
• Built at or near the lab or campus network perimeter
• Optimised for high-performance scientific applications
• Not for general-purpose / everyday business computing
• Addresses common network performance problems
• Tailored to high-performance science applications and high-volume bulk data transfer
The following slides are used with permission (http://fasterdata.es.net/science-dmz/science-dmz-community-presentation/)
Science DMZ Design Pattern (Abstract)
[Diagram: a border router connects the WAN over a clean, high-bandwidth 10G/10GE path to a Science DMZ switch/router, which serves a high-performance Data Transfer Node with high-speed storage. The enterprise border router/firewall sits between the border router and the site/campus LAN, giving the site/campus access to Science DMZ resources. perfSONAR measurement points sit at the border, in the Science DMZ, and on the campus side, with per-service security policy control points protecting DMZ services.]
© 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory, and is licensed under CC BY-NC-ND 4.0
Key components of the Science DMZ
• Dedicated network enclave
• Dedicated software and systems for data transfer
• Integrated performance measurement and monitoring (perfSONAR)
• Tailored, performant security
Motivation
• Networks are an essential part of data-intensive science
  – Connect data sources to data analysis
  – Connect collaborators to each other
  – Enable machine-consumable interfaces to data and analysis resources (e.g. portals), automation, scale
• Performance is critical
  – Exponential data growth
  – Constant human factors
  – Data movement and data analysis must keep up
• Effective use of wide-area (long-haul) networks by scientists has historically been difficult – “but we’ve always shipped disks!”
© 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory, and is licensed under CC BY-NC-ND 4.0
Data Mobility in a Given Time Interval
Time to transfer 1 TB of data vs network speed:
  10 Mbps    300 hrs (12.5 days)
  100 Mbps   30 hrs
  1 Gbps     3 hrs
  10 Gbps    20 mins
• The disk subsystem can also be a bottleneck
• Use parallel streams
• Don't try to saturate the network – be nice
• Rule of thumb: use 1/4 to 1/3 of a shared path that has nominal background load
• E.g. on a 1 Gbps host: target 150–200 Mbps (20–25 MB/s)
This table is available at: http://fasterdata.es.net/fasterdata-home/requirements-and-expectations/
© 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory, and is licensed under CC BY-NC-ND 4.0
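The arithmetic behind figures like these is easy to check. Below is a minimal sketch (the function name is my own) that converts a file size and a line rate into an ideal transfer time; it assumes 1 TB = 10^12 bytes and ignores protocol overhead, so it yields somewhat shorter times than the conservative figures in the table, which allow for effective throughput below line rate.

```python
def transfer_time_hours(size_tb: float, rate_mbps: float) -> float:
    """Ideal time to move size_tb terabytes over a rate_mbps link.

    Assumes 1 TB = 10**12 bytes and zero protocol overhead, so real
    transfers take longer than this lower bound.
    """
    bits = size_tb * 1e12 * 8            # payload in bits
    seconds = bits / (rate_mbps * 1e6)   # line-rate transfer time
    return seconds / 3600

for mbps in (10, 100, 1000, 10000):
    print(f"1 TB at {mbps:>5} Mbps: {transfer_time_hours(1, mbps):8.2f} hrs")
```

Each tenfold increase in line rate cuts the transfer time by the same factor, which is why a 10 Gbps-attached DTN turns a multi-day transfer into minutes.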
A small amount of packet loss makes a huge difference in TCP performance
[Chart: measured TCP throughput vs distance – Local (LAN), Metro Area, Regional, Continental, International – comparing measured TCP Reno, measured HTCP, theoretical TCP Reno, and measured with no loss. Annotated with the Pretoria–Cape Town (PTA–CPT) path at 10 G vs 1 G or less.]
With loss, high performance beyond metro distances is essentially impossible
© 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory, and is licensed under CC BY-NC-ND 4.0
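The distance sensitivity shown in the chart follows from the well-known Mathis et al. model for loss-limited TCP Reno throughput: rate ≈ (MSS/RTT) · (C/√p), where p is the loss probability and C ≈ √(3/2). A quick sketch of the model (the RTT and loss values below are illustrative assumptions, not measurements from the slide):

```python
import math

C = math.sqrt(1.5)  # Mathis constant for periodic loss

def mathis_mbps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Upper bound on loss-limited TCP Reno throughput (Mathis model), in Mb/s."""
    return (mss_bytes * 8 / rtt_s) * (C / math.sqrt(loss)) / 1e6

# Same 0.01% loss rate, increasing round-trip times:
for label, rtt in [("LAN", 0.001), ("Metro", 0.005),
                   ("Continental", 0.100), ("International", 0.250)]:
    print(f"{label:13s} RTT {rtt*1000:5.0f} ms -> "
          f"{mathis_mbps(1460, rtt, 1e-4):8.1f} Mb/s")
```

Because achievable rate falls as 1/RTT at a fixed loss rate, the same tiny loss that is harmless on a LAN collapses throughput on continental and international paths, which is exactly the slide's point.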
The Science DMZ Design Pattern
• Science DMZ (Network Architecture)
  – Dedicated network location for high-speed data resources
  – Appropriate security
  – Easy to deploy – no need to redesign the whole network
• Data Transfer Node (Dedicated Systems for Data Transfer)
  – High performance
  – Configured specifically for data transfer
  – Proper tools: GridFTP/Globus, etc.
• perfSONAR (Performance Testing & Measurement)
  – Enables fault isolation
  – Verifies correct operation
  – Widely deployed in ESnet and other networks, as well as sites and facilities
• Engagement (Engagement with Network Users)
  – Resources and knowledgebase
  – Partnerships
  – Education and consulting
© 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory, and is licensed under CC BY-NC-ND 4.0
Why Science DMZ?
• Performance
  – Improve performance of data-intensive research
  – High-speed access to cloud resources
• Usability
  – Software to enable high-speed data transfer, e.g. Globus
• Cost
  – Delay expensive firewall upgrades
  – High-speed switches, rather than expensive routers
• Security
  – Maintain layered security, applied on both network and host
PERT: Performance Enhancement Response Team
“Performance Enhancement Response Teams (PERTs) provide an investigation and consulting service to academic and research users on their network performance issues.”
Source: GEANT eduPERT
SANReN Proof of Concept
Three locations have been chosen and DTNs deployed for the first phase:
§ Wits University (Johannesburg)
§ CSIR (Pretoria)
§ Teraco Data Centre (Rondebosch, Cape Town)
§ CHPC have their own Globus node that is operational and has been tested with the POC nodes.
§ Test DTN in the SANReN lab
Typical Use Cases for the Service
§ CSIR LandSAT generates 40 TB of raw data and sends it in 1 TB chunks to the CHPC for processing. The processed data results in 160 TB of output, which needs to be transferred back to CSIR.
§ H3BioNet regularly transfers > 100 TB of human genome data, nationally and internationally.
§ The South African Weather Service regularly transfers TBs of data between their systems and the CHPC for processing.
§ And many, many others.
Initial results?
§ All DTNs are connected to the SANReN network at 10 Gb/s.
§ In real tests we are achieving between 1 Gb/s and 6 Gb/s of real throughput nationally.
§ And around 1 Gb/s internationally.
§ For example, a 500 GB file was transferred from CSIR to Wits in 36 minutes at an effective throughput of ~1.7 Gb/s – but this was at 1:20pm, when SANReN is typically highly loaded.
§ A 100 GB file was transferred from ESnet in a time of 21 minutes at an effective throughput of 1 Gb/s – also in the middle of the work day.
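Effective throughput here is simply file size over elapsed time. A sketch of the arithmetic (the function name is my own; it assumes 1 GB = 10^9 bytes, so the raw figure for the CSIR-to-Wits case comes out slightly above the quoted effective ~1.7 Gb/s, which presumably nets out protocol overhead):

```python
def effective_gbps(size_gb: float, minutes: float) -> float:
    """Raw throughput in Gb/s for size_gb gigabytes moved in `minutes`.

    Assumes 1 GB = 10**9 bytes; quoted effective figures may differ
    slightly due to protocol overhead or GiB-based file sizes.
    """
    return size_gb * 8 / (minutes * 60)

print(f"{effective_gbps(500, 36):.2f} Gb/s")  # CSIR -> Wits, 500 GB in 36 min
```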
What’s next?
§ Engagement with potential users will happen to determine specific requirements, use cases and required tools.
§ Additional data transfer tools will be set up on the DTNs based on user/project requirements (using Globus).
§ Monitor the POC.
§ Engage with users and potential sites for more DTNs.
§ Workshop additional requirements with users of the service.
§ International tests.