ESnet Trends and Pressures
and Long Term Strategy
ESCC, July 21, 2004
William E. Johnston, ESnet Dept. Head and Senior Scientist
R. P. Singh, Project Manager
Michael S. Collins, Stan Kluz, Joseph Burrescia, and James V. Gagliardi, ESnet Leads
and the ESnet Team
Lawrence Berkeley National Laboratory
DOE Science Bandwidth Requirements
• Bandwidth requirements are established by the scientific community by looking at
o the increase in the rates at which supercomputers generate data
o the geographic scope of the community that must analyze that data
o the types of distributed applications that must run on geographically diverse systems
- e.g. whole system climate models
o the data rates, and analysis and collaboration style, of the next generation science instruments
- e.g. SNS, Fusion, LHC/Atlas/CMS
Evolving Quantitative Science Requirements for Networks
Science Areas | Today End2End Throughput | 5 years End2End Throughput | 5-10 Years End2End Throughput | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB/20 sec. burst) | N x 1000 Gb/s | time critical throughput
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s + QoS for control channel | high throughput and steering
Evolving Qualitative Requirements for Network Infrastructure
In the near term applications need high bandwidth
2-4 yrs requirement is for high bandwidth and QoS.
3-5 yrs requirement is for high bandwidth and QoS and network resident cache and compute elements.
4-7 yrs requirement is for high bandwidth and QoS and network resident cache and compute elements, and robust bandwidth (multiple paths)
[Figure: network infrastructure at 1-3 yrs, 2-4 yrs, 3-5 yrs, and 4-7 yrs. Legend: I = instrument, C = compute, S = storage, C&C = cache & compute. Labels: 1-40 Gb/s, end-to-end; guaranteed bandwidth paths; 100-200 Gb/s, end-to-end.]
Point to Point Connections
• 10 Gb/s connections between major data sites provide the ability to move about 100 TBy/day – a petabyte every 10 days
• A few 10 Gb/s connections between ½ dozen Labs will probably be feasible in the next few years
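The 100 TBy/day figure can be checked with a short calculation. This is a minimal sketch that assumes decimal units (1 TBy = 10^12 bytes) and a fully utilized link; the function name and the `utilization` parameter are illustrative and do not come from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tbytes_per_day(link_gbps: float, utilization: float = 1.0) -> float:
    """Return the data volume (decimal TBy) a link can move in one day."""
    bits_per_day = link_gbps * 1e9 * utilization * SECONDS_PER_DAY
    return bits_per_day / 8 / 1e12


if __name__ == "__main__":
    volume = tbytes_per_day(10)
    print(f"10 Gb/s moves about {volume:.0f} TBy/day")
    print(f"1 PBy takes about {1000 / volume:.1f} days")
```

A saturated 10 Gb/s link gives about 108 TBy/day, so a petabyte takes a little over 9 days, consistent with the rounded figures on the slide. Real transfers would fall short of this because of protocol overhead and end-system limits.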
ESnet’s Evolution over the Next 10-20 Years
• Upgrading ESnet to accommodate the anticipated increase from the current 100%/yr traffic growth to 300%/yr over the next 5-10 years is priority number 7 out of 20 in DOE’s “Facilities for the Future of Science – A Twenty Year Outlook”
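To see why the jump from 100%/yr to 300%/yr matters, the growth can be compounded over the planning horizon. This is a rough sketch that assumes "100%/yr" means traffic doubles each year and "300%/yr" means it quadruples; the function name is illustrative.

```python
def growth_factor(percent_per_year: float, years: int) -> float:
    """Traffic multiple after compounding a yearly percentage increase."""
    return (1 + percent_per_year / 100) ** years


if __name__ == "__main__":
    for rate in (100, 300):
        for years in (5, 10):
            factor = growth_factor(rate, years)
            print(f"{rate}%/yr for {years} yrs -> x{factor:,.0f}")
```

Under that reading, five years at 300%/yr produces the same multiple (about 1,000x) as ten years at 100%/yr.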
ESnet’s Evolution over the Next 10-20 Years
• Based on the requirements of the OSC High Impact Science Workshop and Network 2008 Roadmap, ESnet must address
I. Capable, scalable, and reliable production IP networking
- University and international collaborator connectivity
- Scalable, reliable, and high bandwidth site connectivity
II. Network support of high-impact science
- provisioned circuits with guaranteed quality of service (e.g. dedicated bandwidth)
III. Evolution to optical switched networks
- Partnership with UltraScienceNet
- Close collaboration with the network R&D community
IV. Science Services to support Grids, collaboratories, etc
I. Production IP: University and International Connectivity
Connectivity between any DOE Lab and any Major University should be as good as ESnet connectivity between DOE Labs and Abilene connectivity between Universities
o Partnership with Internet2/Abilene
o Multiple high-speed peering points
o Routing tailored to take advantage of this
o Continuous monitoring infrastructure to verify correct routing
o Status: In progress
- 4 cross-connects are in place and carrying traffic
- first phase monitoring infrastructure is in place
Monitoring DOE Lab - University Connectivity
• Normal, continuous monitoring (full mesh – need auto detection of bandwidth anomalies)
• All network hubs will have monitors
• Monitors = network test servers (e.g. OWAMP) + stratum 1 time source
[Figure: map of the ESnet/Qwest and Abilene networks. Legend: DOE Labs w/ monitors; Universities w/ monitors; network hubs; high-speed cross connects with Internet2/Abilene. Hubs: SEA, SNV, LA, SDG, DEN, ALB, ELP, KC, HOU, CHI, IND, ATL, DC, NYC. Other labels: ORNL, Japan, AsiaPac, CERN/Europe.]
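Full-mesh monitoring with automatic anomaly detection can be illustrated with a small sketch. It assumes each monitor tests toward every other monitor and flags samples that sit far from the median, scaled by the median absolute deviation. The detection rule, the threshold, and the sample values are all hypothetical; the slides do not say how anomalies are to be detected.

```python
from itertools import permutations
from statistics import median


def mesh_pairs(monitors):
    """All ordered (source, destination) pairs; one-way tests are directional."""
    return list(permutations(monitors, 2))


def find_anomalies(samples, threshold=5.0):
    """Return indices of samples far from the median, scaled by the MAD.

    samples: measurements for one (source, destination) pair.
    threshold: hypothetical cut-off in units of median absolute deviation.
    """
    center = median(samples)
    mad = median(abs(s - center) for s in samples)
    if mad == 0:
        # No spread at all: anything that differs from the median is suspect.
        return [i for i, s in enumerate(samples) if s != center]
    return [i for i, s in enumerate(samples) if abs(s - center) / mad > threshold]


if __name__ == "__main__":
    print(len(mesh_pairs(["LBNL", "SLAC", "FNAL"])), "directed test pairs")
    # Made-up measurements with one obvious outlier at index 4.
    print(find_anomalies([41.0, 41.2, 40.9, 41.1, 62.5, 41.0]))
```

The number of directed pairs grows as n(n-1), which is one reason a full mesh needs automated detection rather than manual inspection.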
Monitoring DOE Lab - University Connectivity
• Diagnostic monitoring (e.g. follow path from SLAC to IU)
[Figure: map of the ESnet/Qwest and Abilene networks with DOE Lab and University monitors, network hubs, and high-speed cross connects with Internet2/Abilene.]
Monitoring DOE Lab - University Connectivity
• Initial set of site monitors
[Figure: map of the ESnet/Qwest and Abilene networks with DOE Lab and University monitors, network hubs, and high-speed cross connects with Internet2/Abilene, marking the prototype site monitors.]
Initial Monitor Results (http://measurement.es.net)
Initial Monitoring Prototype LBNL/ESnet -> NCSU/Abilene
 1  128.109.41.1 (128.109.41.1)  0.188 ms  0.124 ms  0.116 ms  NCSU sentinel host
 2  rlgh1-gw-to-nc-itec.ncren.net (128.109.66.1)  1.665 ms  1.579 ms  1.572 ms
 3  abilene-wash.ncni.net (198.86.17.66)  9.829 ms  8.849 ms  13.470 ms  Abilene-regional peering
 4  nycmng-washng.abilene.ucaid.edu (198.32.8.84)  13.096 ms  27.682 ms  13.084 ms  Abilene DC
 5  aoa-abilene.es.net (198.124.216.117)  13.151 ms  13.154 ms  13.173 ms  Abilene NYC
 6  aoacr1-ge0-aoapr1.es.net (134.55.209.109)  13.269 ms  13.157 ms  13.166 ms  Abilene -> ESnet 1 GE
 7  chicr1-oc192-aoacr1.es.net (134.55.209.57)  33.516 ms  33.589 ms  33.579 ms  ESnet CHI
 8  snvcr1-oc192-chicr1.es.net (134.55.209.53)  81.528 ms  81.514 ms  81.499 ms  ESnet SNV
 9  lbl-snv-oc48.es.net (134.55.209.6)  82.867 ms  82.853 ms  82.959 ms  ESnet-LBL peering
10  lbnl-ge-lbl2.es.net (198.129.224.1)  85.412 ms  83.736 ms  84.405 ms  LBNL
11  ir1000gw.lbl.gov (131.243.128.210)  83.243 ms  82.960 ms  82.906 ms
12  beetle.lbl.gov (131.243.2.45)  83.138 ms  83.075 ms  83.045 ms  LBNL sentinel host
[Plot: 48 hour sample; labels 42 ms and 41 ms]
Thanks to Chintan Desai, NCSU; Jin Guojun, LBNL; Joe Metzger, ESnet; Eric Boyd, Internet2
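One way to read the traceroute is to see which hop adds the most round-trip time. The sketch below hard-codes the minimum of the three RTTs reported for each hop and computes the increment from the previous hop. The helper names are illustrative. Per-hop traceroute times are only approximate, since routers may answer probes at low priority.

```python
# (hop number, host, minimum RTT in ms) taken from the traceroute output
HOPS = [
    (1, "128.109.41.1", 0.116),
    (2, "rlgh1-gw-to-nc-itec.ncren.net", 1.572),
    (3, "abilene-wash.ncni.net", 8.849),
    (4, "nycmng-washng.abilene.ucaid.edu", 13.084),
    (5, "aoa-abilene.es.net", 13.151),
    (6, "aoacr1-ge0-aoapr1.es.net", 13.157),
    (7, "chicr1-oc192-aoacr1.es.net", 33.516),
    (8, "snvcr1-oc192-chicr1.es.net", 81.499),
    (9, "lbl-snv-oc48.es.net", 82.853),
    (10, "lbnl-ge-lbl2.es.net", 83.736),
    (11, "ir1000gw.lbl.gov", 82.906),
    (12, "beetle.lbl.gov", 83.045),
]


def rtt_increments(hops):
    """Return (hop, host, added_ms) for each hop relative to the previous one."""
    result = []
    previous = 0.0
    for number, host, rtt in hops:
        result.append((number, host, rtt - previous))
        previous = rtt
    return result


def largest_increment(hops):
    """Hop that adds the most round-trip time."""
    return max(rtt_increments(hops), key=lambda item: item[2])


if __name__ == "__main__":
    hop, host, added = largest_increment(HOPS)
    print(f"hop {hop} ({host}) adds about {added:.1f} ms")
```

With these numbers the largest step is at hop 8, the Chicago to Sunnyvale segment of the ESnet core, which adds roughly 48 ms of round-trip time.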
I. Production IP: University and International Connectivity
[Figure: map of the ESnet/Qwest and Abilene cores with DOE Labs, network hubs, and high-speed cross connects with Internet2/Abilene. Link labels: 10 Gb/s ESnet core; 10 Gb/s Abilene core; 2.5 Gb/s and 10 Gb/s links. Other labels: Starlight/NW, MAN LAN, CERN, Europe, Japan, AsiaPac, ORNL.]
I. Production IP: University and International Connectivity
• 10 Gb/s ring in NYC to MANLAN for
o 10 Gb/s ESnet – Abilene x-connect
o international links
• 10 Gb/s ring to StarLight for CERN link, etc.
o 10 GE switch for ESnet aggregation at Starlight in the procurement process
o 10 GE interface in ESnet Chi router in the procurement process
o will try and get use of second set of fibers from ESnet Chi router to Starlight so that we
• Status: Both of these are in progress
I. Production IP: A New ESnet Architecture
Local rings, architected like the core, will provide multiple paths for high reliability and scalable bandwidth from the ESnet core to the sites
o No single points of failure
o Fiber / lambda ring based Metropolitan Area Networks can be built in several important areas
- SF Bay Area
- Chicago
- Long Island
- maybe VA
- maybe NM
MAN Rings
• The ESnet Metropolitan Area Network (MAN) rings are a critical first step in addressing both increased bandwidth and reliability
• The MAN architecture approach is to replace the current hub and tail circuit arrangement with local fiber rings that provide
o diverse paths to the Labs
o multiple high-speed configurable circuits
ESnet MAN Architecture
[Figure: a local fiber ring connecting the site gateway routers and Site LANs at ANL and FNAL to the Chicago hub (Qwest hubs, T320, ESnet core network) at the core ring – MAN intersection. Also shown: StarLight, a vendor neutral facility, the DOE funded CERN link, other international peerings, monitors, and ESnet management and monitoring. The ring carries ESnet production IP service, ESnet managed circuit services (circuits to site equip.), and spare capacity.]
New ESnet Architecture – Chicago MAN as Example
[Figure: site gateway routers, site equip., and Site LANs at ANL and FNAL connected to the Qwest hub (T320, ESnet core) and to StarLight (vendor neutral telecom facility), with the CERN (DOE funded) link, other high-speed international peerings, monitors, and ESnet production IP service.]
Figure annotations:
• all interconnects from the sites back to the core ring are high bandwidth and have full module redundancy
• No single point failure can disrupt
• Current approach of point-to-point tail circuits from hub to site
The Case for ESnet MANs – Addressing the Requirements
• All paths are based on 10 Gb/s Ethernet interfaces that are about ½ the cost of the 10 Gb/s OC192 interfaces of the core network
o This addresses the next increment in site access bandwidth (from 622 Mb/s and 2.5 Gb/s today to 10 Gb/s in the MANs)
• Logically the MAN ring intersects the core ring twice (though at one physical location)
o This means that no single component or fiber failure can disrupt communication between any two points in the network
- Today we have many single points of failure
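The difference between tail circuits and a ring under a single fiber failure can be checked on a toy topology. This is a simplified sketch: the node names are hypothetical, and it models link failures only, not equipment or hub-site failures.

```python
from collections import deque


def connected(nodes, links):
    """True if every node is reachable from the first node over the links."""
    neighbors = {n: set() for n in nodes}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in neighbors[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == len(nodes)


def survives_any_single_link_failure(nodes, links):
    """True if removing any one link still leaves all nodes connected."""
    return all(
        connected(nodes, links[:i] + links[i + 1:]) for i in range(len(links))
    )


# Hypothetical topologies: one hub and three sites.
NODES = ["hub", "siteA", "siteB", "siteC"]
TAILS = [("hub", "siteA"), ("hub", "siteB"), ("hub", "siteC")]
RING = [("hub", "siteA"), ("siteA", "siteB"), ("siteB", "siteC"), ("siteC", "hub")]

if __name__ == "__main__":
    print("tail circuits survive:", survives_any_single_link_failure(NODES, TAILS))
    print("ring survives:", survives_any_single_link_failure(NODES, RING))
```

With tail circuits, cutting any one link isolates a site. With the ring, every site keeps a second path to the hub.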
SF BA MAN – Engineering Study Configuration
[Figure: the logical ring over existing CENIC fiber paths. Sites and facilities: LBNL, JGI, NERSC, SLAC, PAIX (Palo Alto peering point), OAK Level3 POP (Emeryville), 1380 Kifer (Level3 Comm. hub), 1400 Kifer (Qwest Comm., ESnet hub) with the ESnet T320 core router. Locations: Stanford, Berkeley, Walnut Creek, Oakland, Sunnyvale. Connections to the ESnet core network and National Lambda Rail. Legend: optoelectronics; 10G Ethernet switch; 10GE.]
Phase 2 adds LLNL and SNLL
Chicago MAN – Engineering Study Configuration
[Figure: one optical fiber pair with DWDM connecting site gateway routers and site equip. at ANL and FNAL to the ESnet Qwest hub (T320, ESnet core) and to Starlight (CERN). Fiber labels: Shared w/ FNAL; Shared w/ IWire; ESnet. Legend: optoelectronics; Ethernet switch.]
I. Production IP: A New ESnet Architecture
• Status: In progress
o Migrate site local loops to ring structured Metropolitan Area Networks and regional nets in some areas
o Preliminary engineering study completed for San Francisco Bay Area and Chicago area
- Have received funding to build the San Francisco Bay Area ring
I. Production IP: Long-Term ESnet Connectivity Goal
• The case for dual core rings
o For high reliability ESnet should not depend on a single core/backbone because of the possibility of hub failure
o ESnet needs high-speed connectivity to places where the current core does not provide access
o A second core/backbone would provide both redundancy for highly reliable production service and extra bandwidth for high-impact science applications
- The IP production traffic would normally use the primary core/backbone (e.g. the current Qwest ring)
I. Production IP: Long-Term ESnet Connectivity Goal
• Connecting MANs with two cores to ensure against hub failure (for example, NLR is shown as the second core – in blue – below)
[Figure: map of the ESnet/Qwest core and NLR with MANs, high-speed cross connects with Internet2/Abilene, and major DOE Office of Science sites. Hubs: SEA, SNV, SDG, DEN, ALB, ELP, CHI, ATL, DC, NYC. Other labels: ORNL, Japan, AsiaPac, CERN/Europe, Europe.]
The Need for Multiple Backbones
• The backbones connect to the MANs via “hubs” – the router locations on the backbone ring
• These hubs present several possibilities for failure that would isolate the MAN rings from the backbone, thus breaking connectivity with the rest of ESnet for significant lengths of time
• The two most likely failures are that
o The ESnet hub router could suffer a failure that takes it completely out of service (e.g. a backplane failure) – this could result in several days of isolation of all of the sites connected to that hub
o The hub site could be disabled by fire, physical attack, physical damage from an earthquake or tornado, etc. – this could result in several weeks or more of isolation of all of the sites connected to that hub
• A second backbone would connect to the MAN ring at a different location from the first backbone, thus mitigating the impact of a backbone hub failure
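The effect of a hub failure with one or two backbones can be sketched as a reachability check with the failed hub removed. The topologies and node names below are hypothetical simplifications, not the actual ESnet layout.

```python
def reachable(start, links, failed):
    """Set of nodes reachable from start when the nodes in `failed` are down."""
    if start in failed:
        return set()
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for a, b in links:
            if a == node:
                other = b
            elif b == node:
                other = a
            else:
                continue
            if other not in failed and other not in seen:
                seen.add(other)
                frontier.append(other)
    return seen


def site_isolated(site, links, failed_hub, cores):
    """True if the site cannot reach any backbone once failed_hub is down."""
    return not (reachable(site, links, {failed_hub}) & set(cores))


# Hypothetical MAN ring attached to one backbone through a single hub.
SINGLE_CORE = [("hub1", "siteA"), ("siteA", "siteB"), ("siteB", "hub1"),
               ("hub1", "core1")]
# The same ring with a second hub at a different location on a second backbone.
DUAL_CORE = [("hub1", "siteA"), ("siteA", "hub2"), ("hub2", "siteB"),
             ("siteB", "hub1"), ("hub1", "core1"), ("hub2", "core2")]

if __name__ == "__main__":
    print(site_isolated("siteA", SINGLE_CORE, "hub1", ["core1"]))
    print(site_isolated("siteA", DUAL_CORE, "hub1", ["core1", "core2"]))
```

In the single-core case the sites can still reach each other around the ring but are cut off from the rest of the network. In the dual-core case the second hub keeps them connected.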
ESnet MAN Architecture with Single Core Ring
[Figure: a Metropolitan Area Network ring with three sites, attached to one core ring at a single hub site with a hub router. Labels: production IP; provisioned circuits carried over lambdas; provisioned circuits carried as tunnels through the ESnet IP backbone; one POS flow between ESnet routers; one optical fiber pair with DWDM. Legend: optical channel (λ) management equipment; Layer 2 management equipment (e.g. 10 Gig Ethernet switch); Layer 3 (IP) management equipment (router).]
ESnet MAN Architecture with Optimally Connected Dual Core Rings
[Figure: a Metropolitan Area Network ring with three sites, attached to core ring #1 at hub site #1 and to core ring #2 at hub site #2, each with a hub router. Labels: production IP; provisioned circuits carried over lambdas; provisioned circuits carried as tunnels through the ESnet IP backbone.]
I. Production IP: Long-Term ESnet Connectivity Goal
• What we want
[Figure: Qwest hub (ESnet core), Level3 hub (NLR core), and the SF BA MAN]
• What we will probably get
[Figure: Qwest hub (ESnet core), Level3 hub (NLR core), and the SF BA MAN shown in two alternative positions, labeled “A or B”]
I. Production IP: Long-Term ESnet Connectivity Goal
• Using NLR as a second backbone improves the reliability situation with respect to sites connected to the two proposed MANs, but is not a complete solution because instead of each core ring independently connecting to the MAN ring, the two core hubs are connected together, and the MAN is really intersected by only one ring (see below) – true for both SF Bay and Chicago MANs
• For full redundancy, need to keep some current circuits in place to connect both cores to the MAN ring, as below
[Figure: SF Bay Area example. Qwest hub (ESnet core), Level3 hub (NLR core), SF BA MAN, North Bay site (NERSC, JGI, LBNL), and the existing Qwest circuit.]
Tactics
• The planned Southern core route upgrade from OC48 (2.5Gb/s) to OC192 (10Gb/s) will cost nearly $3M
o This is the equipment cost for ESnet
o This has nothing to do with the So. Core route per se – that remains absolutely essential to ESnet
o Qwest optical ring (“Qwave service”) - what I refer to as the No. core route and the So. core route - is the basis of ESnet high-speed, production IP service. And this ring, or something like it, will continue to be at the heart of ESnet's production service.
Tactics
• What benefit will this upgrade have for ESnet science users?
• The answer - now that ORNL will be peering with ESnet at 10 Gb/s in Chicago – is that this upgrade will have zero positive impact on OSC science users.
o With ORNL connecting at Atlanta, there was a strong case for OC192 on the So. core route. However, their strategy has changed, and they are now connecting to the No. core route.
o Therefore, while originally the upgrade made sense, it no longer does. 2.5Gb/s on So. route is adequate for foreseeable future.
o All that is happening here is that the networking situation with the OSC Labs has changed fairly significantly over the last several years, and we are just adapting our planning to accommodate those changes.
[Figure: map showing the Northern core route and the Southern core route]
Tactics
• ESnet will postpone the southern route upgrade
1) Pursue getting a lambda on NLR from Chicago to Seattle to Sunnyvale to San Diego
o This will have considerable positive impact on our science users. It will give us
- a) a high-speed link from SNV to Seattle and San Diego (we currently have a ridiculous OC3)
- b) the potential to provide alternate backbone service to the MANs
- c) the ability to get PNNL on the ESnet core at high speed
- d) another resource on which we can provision end-to-end circuits for high impact science
2) Collaborate with NYSERNet to build a MAN around Long Island, which will give us the means to get BNL on the ESnet core network at high-speed.
Tactics
• If it turns out that the NNSA labs in the SW need more bandwidth to the ESnet core in the future, we can always upgrade the So. core route piecemeal, starting with the El Paso to Sunnyvale link.
Tactics
Leverage and Amplify Non-ESnet Network Connectivity to Labs
• When ESnet has not been able to afford to increase the site bandwidth, the Labs have sometimes gotten their own high-speed connections
• ESnet can take advantage of this to provide reliable, production high-speed access to the Labs
When possible, incorporate the existing non-ESnet connections into the new ESnet architecture to provide a better and more capable service than the Labs can provide on their own
• ANL, SLAC, LANL, PNNL, FNAL, ORNL
• BNL, JLab
Tactics
ORNL Connection to ESnet
[Figure: map of the ESnet/Qwest core and NLR with MANs, high-speed cross connects with Internet2/Abilene, and major DOE Office of Science sites. Hubs: SEA, SNV, SDG, DEN, ALB, ELP, CHI, ATL, DC, NYC. Other labels: ORNL, Japan, AsiaPac, CERN/Europe, Europe.]
The ORNL contributed circuit + the existing ESnet circuit effectively incorporate ORNL into a secondary ESnet core ring
Outline
• Trends, Opportunities, and Pressures
• ESnet is Driven by the Needs of DOE Science
• New Strategic Directions for ESnet
o I. Capable, scalable, and reliable production IP networking
o II. Network support of high-impact science
o III. Evolution to optical switched networks
o IV. Science Services to support Grids, collaboratories, etc
• Draft Outline Strategy, 2005-2010
II. Network Support of High-Impact Science
Dynamic provisioning of private “circuits” in the MAN and through the core can provide “high impact science” connections with Quality of Service guarantees
o A few high and guaranteed bandwidth circuits and many lower bandwidth circuits (e.g. for video, remote instrument operation, etc.)
o The circuits are secure and end-to-end, so if
- the sites trust each other, and
- they have compatible security policies
then they should be able to establish direct connections by going around site firewalls to connect specific systems – e.g. HPSS <-> HPSS
II. Hi-Impact Science Bandwidth
[Figure: two sites, each with a Site LAN, DMZ, site gateway router, and MAN optical fiber ring, joined through ESnet border circuit cross connects and the ESnet core. Hubs shown: New York (AOA), Washington, Atlanta (ATL), El Paso (ELP). A private “circuit” runs from a specific host, instrument, etc. at one site to a specific host, instrument, etc. at the other under a common security policy. The production IP network is shown separately.]
II. Network Support of High-Impact Science
• Status: Initial progress
o Proposal funded by MICS Network R&D program for initial development of basic circuit provisioning infrastructure in ESnet core network (site to site)
o Will work with UltraScience Net to import advanced services technology
ESnet On-Demand Secure Circuits and Advance Reservation System (OSCARS)
• The procedure of a typical path setup will be as follows
• A user submits a request to the ESnet Reservation Manager (RM) (using an optional web front-end) to schedule an end-to-end path (e.g. between an experiment and computing cluster) specifying start and end times, bandwidth requirements, and specific source IP address and port that will be used to provide application access to the path.
• At the requested start time, the RM will configure the ESnet router (at the start end of the path) to create a Label Switched Path (LSP) with the specified bandwidth.
• Each router along the route receives the path setup request (via RSVP) and commits bandwidth (if available) creating an end-to-end LSP. The RM will be notified by RSVP if the end-to-end path cannot be established. The RM will then pass on this information to the user.
• Packets from the source (e.g. experiment) will be routed through the LAN’s production path to ESnet’s edge router. On entering the edge router, these packets are identified and filtered using flow specification parameters (e.g. source/destination IP address/port numbers) and policed at the specified bandwidth. The packets are then injected into the LSP and switched (using MPLS) through the network to their destination (e.g. computing cluster).
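The bandwidth-commitment step of this procedure can be illustrated with a toy admission-control model: a reservation is accepted only if every link on its path has enough uncommitted bandwidth for the whole requested interval. This is a hypothetical sketch. The class names, fields, link names, and capacities are invented for illustration and are not the OSCARS design, and RSVP signalling, MPLS switching, and policing are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """A requested path; field names are illustrative, not an OSCARS schema."""
    links: tuple   # links the path would traverse, e.g. ("chi-snv",)
    start: int     # start time (arbitrary time units)
    end: int       # end time, exclusive
    mbps: int      # requested bandwidth


class ReservationManager:
    """Toy admission control: commit bandwidth on every link or reject."""

    def __init__(self, capacity_mbps):
        self.capacity = dict(capacity_mbps)  # link -> reservable Mb/s
        self.accepted = []

    def _committed(self, link, start, end):
        """Peak bandwidth already committed on link within [start, end)."""
        # Load only rises at reservation start times, so checking those suffices.
        points = {start} | {r.start for r in self.accepted if start <= r.start < end}
        peak = 0
        for t in points:
            load = sum(r.mbps for r in self.accepted
                       if link in r.links and r.start <= t < r.end)
            peak = max(peak, load)
        return peak

    def request(self, res):
        """Accept only if every link has room for the whole duration."""
        for link in res.links:
            used = self._committed(link, res.start, res.end)
            if used + res.mbps > self.capacity[link]:
                return False
        self.accepted.append(res)
        return True


if __name__ == "__main__":
    rm = ReservationManager({"aoa-chi": 1000, "chi-snv": 1000})
    print(rm.request(Reservation(("aoa-chi", "chi-snv"), 0, 10, 600)))  # fits
    print(rm.request(Reservation(("chi-snv",), 5, 15, 500)))   # overlaps, too much
    print(rm.request(Reservation(("chi-snv",), 10, 20, 500)))  # starts after first ends
```

A rejected request corresponds to the case in which the end-to-end path cannot be established and the user is informed.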
ESnet On-Demand Secure Circuits and Advance Reservation System
• Issues
o Scalability in numbers of paths may require shapers as part of the architecture
o Allocation management (!)
o In a single lambda MAN, may have to put a router at the site (previous thinking was no router at the sites as a cost savings – just Ethernet switches) – otherwise you cannot carry the circuit all the way to the site
III. Evolution to Optical Switched Networks
• Optical transparency
o On-demand, rapid setup of “transparent” optical paths
o G.709 standard optical interfaces – evolution of SONET for optical networks
III. Evolution to Optical Switched Networks
Partnership with DOE’s network R&D program
o ESnet will cross-connect with UltraNet / National Lambda Rail in Chicago and Sunnyvale, CA
o ESnet can experiment with UltraScience Net virtual circuits tunneled through the ESnet core (up to 5 Gb/s between UltraNet and appropriately connected Labs)
o One important element of importing DOE R&D into ESnet
o Status: In progress
- Chicago ESnet – NLR/UltraNet x-connect based on the IWire ring is engineered
- Qwest – ESnet Sunnyvale hub x-connect is dependent on Qwest permission, which is being negotiated (almost complete)
III. Evolution to Optical Switched Networks
• ESnet is building partnerships with the Federal and academic R&D networks in addition to DOE network R&D programs and UltraScienceNet
o Internet2 Hybrid Optical Packet Internet (HOPI) and National Lambda Rail for R&D on the next generation hybrid IP packet – circuit switched networks
- ESnet will be participating in the Internet2 HOPI design team (where UltraScience Net also participates)
o ESnet co-organized a Federal networking workshop on the future issues for interoperability of Optical Switched Networks
o Lots of good material at JET Web site
• These partnerships will provide ESnet with direct access to, and participation in, next generation technology for evaluation and early deployment in ESnet
III. Evolution to Optical Switched Networks
UltraNet – ESnet Interconnects
[Figure: map of the ESnet/Qwest core, NLR, and UltraNet with MANs, ESnet – UltraScienceNet cross connects, high-speed cross connects with Internet2/Abilene, and major DOE Office of Science sites. Hubs: SEA, SNV, SDG, DEN, ALB, ELP, CHI, ATL, DC, NYC. Other labels: ORNL, Japan, AsiaPac, CERN/Europe, Europe.]
Conclusions
• ESnet is working hard to meet the current and future networking needs of DOE mission science in several ways:
o Evolving a new high speed, high reliability, leveraged architecture
o Championing several new initiatives that will keep ESnet’s contributions relevant to the needs of our community