Slide 1
UCSC 100 Gbps Science DMZ – 1 Year 9 Month Update
Brad Smith & Mary Doyle
Slide 2
Goal 1 - 100 Gbps DMZ - Complete!
Slide 3
Goal 2 – Collaborate with users to use it!
• MCD Biologist doing brain wave imaging
• SCIPP analyzing LHC ATLAS data
• HYADES cluster doing Astrophysics visualizations
• CBSE Cancer Genomics Hub
Slide 4
Exploring Mesoscale Brain Wave Imaging Data
James Ackman, Assistant Professor
Department of Molecular, Cell, & Developmental Biology
University of California, Santa Cruz
1. Record brain activity patterns – local computing
2. Analyze cerebral connectivity – external computing, via the Science DMZ
• Acquire 60 2.1 GB TIFF images/day (~120 GB/day total).
• Initial transfer rate of 20 Mbps = 12–15 min/TIFF = 15 hrs/day!
• With the Science DMZ, 354 Mbps = ~1 min/TIFF = 1 hr/day!
• Expected to grow 10x over the near term.
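As a sanity check, a quick back-of-envelope calculation (my own arithmetic, not from the slides) reproduces the per-image and per-day transfer times quoted above:

```python
# Back-of-envelope check of the transfer times quoted on this slide.
# Assumes decimal units (1 GB = 8,000 megabits), as network rates usually are.

def transfer_minutes(size_gb, rate_mbps):
    """Minutes to move size_gb gigabytes at rate_mbps megabits per second."""
    return size_gb * 8_000 / rate_mbps / 60

per_tiff_before = transfer_minutes(2.1, 20)    # ~14 min per 2.1 GB TIFF at 20 Mbps
per_tiff_after  = transfer_minutes(2.1, 354)   # under a minute at 354 Mbps

daily_before = 60 * per_tiff_before / 60       # ~14 hours/day for 60 images
daily_after  = 60 * per_tiff_after / 60        # under an hour/day

print(f"before: {per_tiff_before:.0f} min/TIFF, {daily_before:.0f} h/day")
print(f"after:  {per_tiff_after:.1f} min/TIFF, {daily_after:.1f} h/day")
```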
Slide 5
Ryan ([email protected])
Santa Cruz Institute for Particle Physics
SCIPP Network Usage for Physics with ATLAS
Slide 6
[Diagram: the ATLAS detector, shown next to a T. rex and humans for scale; labels mark the proton beams (p+), collision point, Tracker, Calorimeter, and Muon Spectrometer.]

ATLAS is a 7-story-tall, 100-megapixel “camera”
• taking 3-D pictures of proton-proton collisions 20 million times per second,
• saving 10 PB of data per year.
Slide 7
Data Volume
• The LHC run of 2009–2012 produced ~100 PB
– Currently ~10 PB/year
• SCIPP processes and skims that on the LHC computing grid, and brings ~10 TB of data to SCIPP each year.
– A 12-hr transfer time impacts our ability to provide input for the next experiment
• Expect ≈4 times the data volume in the next run, 2015–2018.
• Our bottleneck is downloading the skimmed data to SCIPP.
• Current download rate: ~a few TB every few weeks.
Slide 8
Throughput: 1 Gbps link – 400 Mbps

[Diagram: the SCIPP cluster. atlas01 (headprv), atlas02 (int0prv), atlas03 (nfsprv), and atlas04 (int1prv) bridge the public and private networks through a Dell 6248 switch (2007); worker nodes wrk0prv–wrk7prv (128 CPUs) hold two ≈20 TB stores served over XROOTD and NFS data flows. 1 Gb links connect the switch to the campus network for grid downloads and users.]
Slide 9
[Diagram: the same cluster, with the switch-to-campus links upgraded from 1 Gb to 10 Gb.]

Throughput: 10 Gbps link – still 400 Mbps?!
Slide 10
[Diagram: the same cluster, with grid downloads moved onto a dedicated 10 Gb path that bypasses the Dell switch.]

Offload the Dell switch – 1.6 Gbps, with help from ESnet!
Slide 11
SCIPP Summary
• Quadrupled throughput
– Reduced download time from 12 hrs to 3 hrs
• Still a long way from the 10 Gbps potential
– ~30 min (factor of 8)
• Probably not going to be enough for the new run
– ~4x data volume
• Possible problems:
– atlas03 storage (not enough spindles)
– WAN or protocol problems
– 6-year-old Dell switch
– Investigating a GridFTP solution and a new LHC data-access node from SDSC
• We are queued up to help them when they’re ready…
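The timing figures in this summary are consistent with a roughly 2 TB download per transfer (the “few TB every few weeks” from Slide 7). A quick sketch of that arithmetic (my own, assuming a 2.16 TB dataset so the slide's 12-hour figure lands exactly):

```python
# Check of the SCIPP download times: a ~2.16 TB chunk at the three
# throughputs discussed on the preceding slides (decimal units: 1 TB = 8,000 Gb).

def transfer_hours(size_tb, rate_gbps):
    """Hours to move size_tb terabytes at rate_gbps gigabits per second."""
    return size_tb * 8_000 / rate_gbps / 3600

before = transfer_hours(2.16, 0.4)   # ~12 h at the original 400 Mbps
after  = transfer_hours(2.16, 1.6)   # ~3 h after offloading the Dell switch
ideal  = transfer_hours(2.16, 10)    # ~half an hour at the full 10 Gbps

print(f"{before:.0f} h -> {after:.0f} h (potential: {ideal * 60:.0f} min)")
```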
Slide 12
Hyades
• Hyades is an HPC cluster for Computational Astrophysics
• Funded by a $1 million grant from NSF in 2012
• Users from the departments of Astronomy & Astrophysics, Physics, Earth & Planetary Sciences, Applied Math & Statistics, Computer Science, etc.
• Many are also users of national supercomputers
Slide 13
Hyades Hardware
• 180 Compute Nodes
• 8 GPU Nodes
• 1 MIC Node
• 1 big-memory Analysis Node
• 1 3D Visualization Node
• Lustre storage, providing 150 TB of scratch space
• 2 FreeBSD file servers, providing 260 TB of NFS space
• 1 PetaByte cloud storage system, using Amazon S3 protocols
Slide 14
Slide 15
Data Transfer
• 100+ TB between Hyades and NERSC
• 20 TB between Hyades and NASA Pleiades
• In the process of moving 60+ TB from Hyades to NCSA Blue Waters
• 10 TB from Europe to Hyades
• Shared 10 TB of simulation data with collaborators in Australia, using the Huawei Cloud Storage
Slide 16
Remote Visualization
• Ein is a 3D Visualization workstation, located in an Astronomy office (200+ yards from Hyades)
• Connected to Hyades via a 10G fiber link
• Fast network enables remote visualization in real time:
– Graphics processing locally on Ein
– Data storage and processing remotely, either on Hyades or on NERSC supercomputers
Slide 17
CBSE CGHub
• NIH/NCI archive of cancer genomes
• 10/2014 – 1.6 PB of genomes uploaded
• 1/2014 – 1 PB/month downloaded(!)
• Located at SDSC… managed from UCSC
• Working with CGHub to explore L2/“engineered” paths
Slide 18
Innovations…
• “Research Data Warehouse”
– DTN with long-term storage
• Whitebox switches
– On-chip packet buffer – 12 MB
– 128 10 Gb/s SERDES... so 32 40-gig ports
– SOC... price leader, uses less power
– Use at network edge
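The SERDES-to-port count above follows from bonding four 10 Gb/s lanes per 40 GbE port (standard 40GBASE lane bonding; the arithmetic here is my own):

```python
# How 128 10 Gb/s SERDES lanes yield 32 forty-gig ports on a switch-on-chip.
serdes_lanes = 128        # 10 Gb/s SERDES lanes on the chip
lanes_per_40g_port = 4    # a 40 GbE port bonds four 10 Gb/s lanes

ports_40g = serdes_lanes // lanes_per_40g_port
print(ports_40g)  # 32 forty-gig ports, as on the slide
```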
Slide 19
Project Summary
• 100 Gbps Science DMZ completed
• Improved workflow for a number of research groups
• Remaining targets:
– Extend the Science DMZ to more buildings
– Further work with SCIPP... when they need it
– L2 (“engineered”) paths with CBSE (genomics)
– SDN integration
• Innovations:
– “Research Data Warehouse” – DTN as long-term storage
– Whitebox switches
Slide 20
Questions?
Brad Smith
Director Research & Faculty Partnerships, ITS
University of California Santa Cruz