1st SCAMPI Workshop – 27 Jan. 2003 – DataGRID Network Monitoring
Network monitoring in DataGRID project
Franck Bonnassieux (CNRS) franck.bonnassieux@ens-lyon.fr
Outline
DataGRID network
Specificity of Grid network monitoring
Network metrics and measures
Architecture of DataGRID network monitoring
Network sensors and tools
High level tools
Perspectives
DataGRID http://www.eu-datagrid.org/
• 7 applications distributed among 6 virtual organisations
• 11 organisations across 15 countries
• 40 sites in Europe
• 12 work packages
• WP7: network work package
– Provisioning of infrastructure
– Network and Transport Services
– Network and Grid traffic monitoring
– Grid Security
European Topology: Geant, NRNs, Sites
[Figure: network map showing SuperJANET4, SURFnet, NIKHEF, RAL and CERN]
Why monitor network performance in Grid environments?
For provisioning:
• To make network status available, through visualization, to human observers (users, network and system administrators)
• To provide tools for measuring network performance and for identifying and resolving problems (bottlenecks, points of unreliability, quality of service needs, topology…)
• To forecast and optimize network performance (capacity planning)
For resource brokers:
• Network performance parameters are used to optimize resource allocation (replication, remote file access…)
“Physical” view of the Network
Public network:
• No security
• No predictable performance
• No control over the traffic
The flat INTERNET
Resource = CE (computing element) or SE (storage element)
“Logical” view of the Grid Network
Measurement methods
Active methods
• Inject traffic into the network to test performance between two points
• Problem: may be intrusive (TCP/UDP throughput)
Passive methods
• Collect traffic information at one point of the network: router, switch, dedicated passive host, computing element…
• Problem: gives network usage, not capacity
Network metrics and tools
One-way delay => RIPE boxes
Round-trip delay => PingEr
Packet loss => PingEr
TCP throughput => IPerfEr
UDP throughput, jitter => UDPmon
Router traffic => NetLoad Agent
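As an illustration of how round-trip delay, packet loss and jitter can be derived from one run of probes, here is a minimal sketch. It is not the code of PingEr or UDPmon: the function name, the input format (a list of RTTs with None for unanswered probes) and the jitter definition (mean absolute difference between consecutive replies) are all assumptions made for the example.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Summarize one probe run.

    rtts_ms: round-trip times in milliseconds, with None for every
    probe that received no reply (hypothetical input format).
    """
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    # Loss is the fraction of probes that were never answered.
    loss = (sent - len(replies)) / sent if sent else 0.0
    if not replies:
        return {"loss": loss, "rtt_min": None, "rtt_avg": None, "jitter": None}
    # Assumed jitter definition: mean absolute difference between
    # consecutive replies. Other definitions exist.
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    return {
        "loss": loss,
        "rtt_min": min(replies),
        "rtt_avg": mean(replies),
        "jitter": mean(diffs) if diffs else 0.0,
    }
```

Calling summarize_probes([10.0, 12.0, None, 11.0]) describes a run of four probes in which the third one was lost.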
RIPE boxes
RIPE box or Surveyor
• One-way delay + loss, over UDP
• Specialized boxes
• Measurement between 2 RIPE boxes
• GPS time synchronization
[Figure: RIPE 1 and RIPE 2 boxes, each taking its time from GPS]
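The reason for GPS synchronization is that a one-way delay is the difference between a timestamp taken at the sender and one taken at the receiver, so both clocks must agree. A minimal sketch, assuming timestamps in integer microseconds keyed by a packet sequence number (this data layout is hypothetical and is not the RIPE box format):

```python
def one_way_delays(sent_us, received_us):
    """Compute one-way delays and losses for a set of probe packets.

    sent_us / received_us: dicts mapping a packet sequence number to a
    timestamp in microseconds. Both clocks are assumed to be
    synchronized (for example through GPS); without that, the
    difference below would include the clock offset.
    """
    delays = {
        seq: received_us[seq] - t_sent
        for seq, t_sent in sent_us.items()
        if seq in received_us
    }
    # Packets that were sent but never received count as lost.
    lost = sorted(seq for seq in sent_us if seq not in received_us)
    return delays, lost
```

With a round-trip measurement the same clock stamps both ends of the exchange, which is why tools measuring round-trip delay do not need this synchronization.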
RIPE output between UK and CERN
Network Monitoring Architecture
[Figure: layered architecture diagram]
• Measure on each site: PingEr, IPerfEr, UDPmon, NetLoad Agent (raw data), with PCP
• Collect and storage: Distributed Data Collector, archive, Globus MDS, R-GMA
• Processing: Forecaster, NetworkCost
• Visualization: MapCenter, TopoGrid, rTPL
• Users: network managers, replica managers & resource brokers
PCP (Probe Coordination Protocol)
Scheduling of measurements by clique
[Figure: diagram with elements numbered 1 to 7]
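One way to picture scheduling by clique is a timetable in which every ordered pair of hosts in the clique gets its own time slot, so that two active measurements never run at the same moment and disturb each other. The sketch below only illustrates that idea; it is not the PCP protocol itself, and the function name, host names and slot length are assumptions.

```python
from itertools import permutations


def clique_schedule(hosts, slot_seconds):
    """Yield (start_offset_s, source, destination) for one clique.

    Every ordered pair of hosts gets a distinct slot, so no two
    measurements inside the clique overlap in time (assumed goal).
    """
    for slot, (src, dst) in enumerate(permutations(hosts, 2)):
        yield slot * slot_seconds, src, dst


# Hypothetical clique of three sites with one-minute slots.
for start, src, dst in clique_schedule(["siteA", "siteB", "siteC"], 60):
    print(f"t+{start:>3}s  {src} -> {dst}")
```

A clique of n hosts needs n × (n − 1) slots, which is one reason to keep cliques small and to schedule independent cliques separately.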
PingEr
RTT
LOSS
IperfEr
TCP Throughput
UDPmon
UDP Throughput
Loss
NetLoad Agent
In/Out Traffic
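In/out traffic on a router interface is typically obtained by reading a byte counter twice and dividing the difference by the elapsed time. The sketch below shows that arithmetic under stated assumptions: the slides do not describe how NetLoad Agent works internally, and the function name, the 32-bit counter width and the single-wrap handling are choices made for the example.

```python
def traffic_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average rate in bits per second between two counter readings.

    prev_octets / curr_octets: successive values of a byte counter
    (assumed to wrap around at 2**counter_bits, at most once between
    the two readings).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # The modulo makes the difference correct across one counter wrap.
    delta = (curr_octets - prev_octets) % (2 ** counter_bits)
    return delta * 8 / interval_s
```

This is a passive measurement in the sense used earlier in the talk: it reports how much of the link is used, not how much capacity it has.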
Metrics stored in LDAP (Globus MDS)
• R-GMA and Globus MDS Producer for storage of network metrics
• Storage and aggregation of all historical data in archives
rTPL
• High-level performance presentation interface
• Network metric matrices built on demand
• Customizable views for network administrators of all testbed sites
TopoGRID http://ccwp7.in2p3.fr/topogrid
• Automatic discovery of routers and end nodes
• Identification of routes and bottlenecks
• Clickable Java applet for in-depth analysis of network paths
MapCenter http://ccwp7.in2p3.fr/mapcenter
• Open and flexible tool to visualize Grid status and network performance in real time
• Automatic discovery and positioning of resources in graphical maps
• Advanced stealth monitoring techniques (ICMP, TCP, UDP, HTTP, LDAP)
NetworkCost functions
getNetworkCost: FileSize = 10 MB, results = time to transfer (sec.)
Sites: CERN, RAL, NIKHEF, IN2P3, CNAF

Site      Transfer times (sec.)
CNAF      13.08   4.04    6.53    4.5
IN2P3     7.08    6.24    10.38   5.03
NIKHEF    2.66    11.86   3.25    11.13
RAL       4.35    7.12    2.44    7.46
CERN      35.44   44.87   77.78   46.75
NetworkCost model
• The current cost model is designed for data-intensive computing, especially large file transfers
• The most relevant metric for that cost model is available throughput
Implementation:
• Iperf measurements (current)
• GridFTP logs (future)
• Other metrics (future): UDP, RTT, jitter, ...
• Synchronisation (PCP)
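Since the cost model rests on available throughput, the simplest reading is that the cost of a transfer is the file size divided by the measured throughput. The sketch below works under that assumption only; it is not the DataGRID implementation, the function names and site names are hypothetical, and 1 MB is taken as 8 Mbit.

```python
def network_cost_seconds(file_size_mb, throughput_mbps):
    """Estimated transfer time in seconds (assumed model: size / throughput)."""
    if throughput_mbps <= 0:
        raise ValueError("throughput_mbps must be positive")
    return file_size_mb * 8 / throughput_mbps


def cheapest_source(file_size_mb, destination, sources, throughput_mbps):
    """Pick the source site with the lowest estimated transfer time.

    throughput_mbps: dict mapping (source, destination) to the last
    measured throughput in Mbit/s (hypothetical data layout).
    """
    costs = {
        src: network_cost_seconds(file_size_mb, throughput_mbps[(src, destination)])
        for src in sources
    }
    return min(costs, key=costs.get), costs


# Hypothetical throughput measurements between three sites.
measured = {("siteA", "siteC"): 20.0, ("siteB", "siteC"): 40.0}
best, costs = cheapest_source(10, "siteC", ["siteA", "siteB"], measured)
print(best, costs)
```

This is the kind of comparison a resource broker or replica manager can make when the same file is available at several sites.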
MapCenter – Network Monitoring Page
Perspectives
GRID -> SCAMPI
• Testbed infrastructure
• Huge traffic generation on gigabit links
• End-to-end metrics
• …
SCAMPI -> GRID
• Traffic analysis (protocol distribution)
– GridFTP, remote file access, …
• Grid traffic measurement (per protocol and/or per address domain)
• …