The ROB SDO data system

All of the Sun all of the time: Distributing 1TB/day from the Solar Dynamics Observatory satellite, 24/7 for 5+ years. David Boyes, Véronique Delouille, Benjamin Mampaey, Tobias Berghoff, Cis Verbeeck, Jean-François Hochedez (Royal Observatory of Belgium).

Page 1: The ROB SDO data system

ROB for BNC 2009 Brussels Slide 1

The ROB SDO data system

All of the Sun all of the time: Distributing 1TB/day from the Solar Dynamics Observatory satellite, 24/7 for 5+ years

Slide 2

The Presentation

The science - studying the Sun

The system - satellite, network, data centres

In practice - getting it all to work

The people

David Boyes, Véronique Delouille, Benjamin Mampaey, Tobias Berghoff, Cis Verbeeck, Jean-François Hochedez (Royal Observatory of Belgium)

Slide 3

Why look at the Sun?

Weather - it affects us but can be forecast

Science - there is still a lot to find out!

– why is the solar corona so hot?

– what drives mass ejections?

– why is there an 11-year cycle?

– … and much more

Slide 4

Why have satellites?

The Earth's atmosphere blocks a lot - UV and higher-energy radiation

At these wavelengths the structure of the Sun is revealed

Slide 5

What are the effects on Earth?

Radiation - effects are immediate, goal is to predict them

Particles - arrive in hours or days - we can give warning

This is solar weather forecasting - ROB is a Regional Warning Center

Graphic: NASA

Slide 6

What's up there?

Telescopes to observe the Sun's atmosphere at multiple UV wavelengths (AIA)

Telescopes to measure specific wavelengths to allow calculation of magnetic fields and seismic activity (HMI)

Wide band UV sensor to measure total UV spectrum (EVE)

Slide 7

Some numbers about SDO

A massive increase in data quantity and precision - 1000 to 10000 times as much data as current satellites

Flies at about 36 000 km altitude, in geosynchronous orbit

AIA - images at 10 wavelengths from visible to 131Å, one image every 1.25s

HMI – one magnetic image every 45s

EVE – irradiance time series from 10 to 1050Å

Images are 4Kx4K - 32MB per image

A lot of data - more than 1TB/day
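The image size and daily volume quoted on this slide can be checked with a little arithmetic. The sketch below assumes 2 bytes per pixel (the slide does not state the pixel depth; this is the value that reproduces 32MB for a 4Kx4K image) and looks only at AIA at its stated cadence. The result is a raw, uncompressed figure, so it is best read as an upper bound rather than the delivered volume.

```python
# Back-of-envelope check of the SDO image size and raw AIA data volume.
# Assumption (not stated on the slide): each pixel is stored in 2 bytes.
BYTES_PER_PIXEL = 2
SIDE_PIXELS = 4096              # "4Kx4K" image
SECONDS_PER_DAY = 24 * 60 * 60
AIA_CADENCE_S = 1.25            # one AIA image every 1.25 s (from the slide)


def image_bytes(side=SIDE_PIXELS, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one uncompressed square image in bytes."""
    return side * side * bytes_per_pixel


def images_per_day(cadence_s):
    """Number of images produced in 24 hours at a fixed cadence."""
    return SECONDS_PER_DAY / cadence_s


def raw_bytes_per_day(cadence_s):
    """Uncompressed bytes per day for one instrument at this cadence."""
    return images_per_day(cadence_s) * image_bytes()


if __name__ == "__main__":
    print(image_bytes() / 2**20, "MiB per image")
    print(images_per_day(AIA_CADENCE_S), "AIA images per day")
    print(round(raw_bytes_per_day(AIA_CADENCE_S) / 1e12, 2), "TB/day, raw AIA only")
```

Under these assumptions the raw AIA stream alone comes to roughly 2.3 TB/day, which is consistent with "more than 1TB/day" once the data are held in a more compact form.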

Slide 8

The challenges of SDO

Huge bandwidth

Lots of data to be made available

Too much data for humans to absorb

Slide 9

The solution

A worldwide network of data stores holding the current quarter and popular data

Joined by a high-speed network

Pushing a full copy of data to as wide an area as possible in compact form

Software system (netDRMS with internal PostgreSQL) at each data store provides virtual storage for file requests from users

Transparent access to any data, if needed going down to original source data

Local users have the impression they have file access

Web-based mediation interface for remote use

Automatic processing by high performance computing

Slide 10

What's down here?

Ground station - main station at White Sands

Mission Operations Center at Goddard

Joint Science Operations Centers (JSOC) - Stanford and Colorado

Knowledge base – Lockheed, Virtual Observatory - Harvard

Storage at White Sands, JSOC and Data Centres

Compute clusters and data servers at Data Centres

A network of Data Centres...

Slide 11

The Data Centres

Slide 12

What this enables

Many groups working in parallel on the unprecedented flow of data

Simultaneous access and processing of bulk data in many high-performance systems

Online access for forecasters to complete data to refine their techniques

Completely open, low-cost access to all data, both for researchers with specific interests and for researchers with limited budgets

Slide 13

Does it work?

Yes it does – e.g. two weeks at 320Mb/s from Harvard

Slide 14

Network requirements

Throughput

– One set of data takes around 200Mb/s

– Requires 320Mb/s to handle catch ups

– Practical limit is network chain topology
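To make the throughput figures concrete, here is a minimal sketch of what they imply. It assumes one particular reading of the slide: 200Mb/s is the steady rate of one data set and 320Mb/s is the total rate available while catching up after an interruption. The function names and the example outage lengths are illustrative only.

```python
# Assumed reading of the slide: 200 Mb/s steady stream, 320 Mb/s available
# in total while catching up after an interruption.
STEADY_MBPS = 200
CAPACITY_MBPS = 320


def tb_per_day(rate_mbps):
    """Decimal terabytes moved in 24 hours at a sustained rate in Mb/s."""
    bytes_per_second = rate_mbps * 1e6 / 8
    return bytes_per_second * 86400 / 1e12


def catch_up_hours(outage_hours, steady=STEADY_MBPS, capacity=CAPACITY_MBPS):
    """Hours needed to clear an outage backlog while new data keeps arriving."""
    spare = capacity - steady
    if spare <= 0:
        raise ValueError("no spare capacity: the backlog never clears")
    backlog = outage_hours * steady      # backlog in (Mb/s * hours)
    return backlog / spare


if __name__ == "__main__":
    print(tb_per_day(STEADY_MBPS), "TB/day at the steady rate")
    for outage in (1, 6, 24):
        print(outage, "h outage ->", catch_up_hours(outage), "h to catch up")
```

With only 120Mb/s of headroom, every hour of outage costs about 1.7 hours of catch-up, so a full day of downtime takes well over a day to recover.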

Availability

– More than five years, probably ten years of operation

– 24/7, 365

– Must maintain full performance for backbone data system even with subsystem failures

Slide 15

In practice - Bandwidth-Delay product

There are simply a lot of bytes in flight in the cable - this is the Bandwidth-Delay (BD) product

The problem with the TCP protocol is that it needs a buffer size >= 2 * bandwidth * delay, and the actual size is adaptive

For example, 200Mb/s and 0.1s delay → 2 * 2.5MB = 5MB, and you need about twice that for adaptation.

– Standard Linux buffer size is 64K!

Plus you can run into congestion control limits – designed to share traffic fairly!
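The buffer arithmetic on this slide can be written out as a short sketch. It takes the slide's rule (buffer >= 2 * bandwidth * delay) at face value, treats the 0.1s figure as the one-way delay, and assumes "64K" means 64 KiB; the function names are illustrative.

```python
def buffer_bytes(bandwidth_bps, one_way_delay_s):
    """Buffer needed to keep the path full: 2 * bandwidth * delay, in bytes."""
    return 2 * bandwidth_bps * one_way_delay_s / 8


def max_throughput_bps(buffer_size_bytes, one_way_delay_s):
    """Best-case rate of one TCP session whose buffer is the limiting factor.

    At most one buffer's worth of data can be unacknowledged per round trip.
    """
    round_trip_s = 2 * one_way_delay_s
    return buffer_size_bytes * 8 / round_trip_s


if __name__ == "__main__":
    needed = buffer_bytes(200e6, 0.1)
    print(needed / 1e6, "MB needed for 200 Mb/s at 0.1 s delay")
    # Assumption: "64K" on the slide is 64 KiB = 65536 bytes.
    limit = max_throughput_bps(64 * 1024, 0.1)
    print(round(limit / 1e6, 2), "Mb/s ceiling with a 64K buffer")
```

On these numbers a single session with a 64K buffer tops out at a few Mb/s, nearly two orders of magnitude short of the 200Mb/s target.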

Slide 16

In practice - BD product

Fixes …

– Use an improved scp (HPN-scp)

– Use multiple sessions

– Use another protocol

– Use a tool which combines these (e.g. GridFTP)

We use multiple sessions in user space

– Raw bandwidth tests used many more than needed

– Production system has a tool which interfaces with the data system
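As an illustration of the multiple-sessions approach, the sketch below estimates how many parallel sessions are needed and spreads files across them. It is not the ROB production tool, whose scheduling the slides do not describe; `sessions_needed`, `assign_files` and the greedy largest-first policy are hypothetical. The per-session rate of about 2.6Mb/s assumes a 64 KiB buffer over a 0.2s round trip (64 KiB * 8 / 0.2s).

```python
import math


def sessions_needed(target_bps, per_session_bps):
    """Smallest number of parallel sessions that reaches the target rate."""
    return math.ceil(target_bps / per_session_bps)


def assign_files(file_sizes, n_sessions):
    """Greedy split: hand each file, largest first, to the least-loaded session.

    file_sizes maps a file name to its size; returns (sessions, loads), where
    sessions[i] is the list of names for session i and loads[i] its total size.
    """
    sessions = [[] for _ in range(n_sessions)]
    loads = [0] * n_sessions
    by_size = sorted(file_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        i = loads.index(min(loads))      # least-loaded session so far
        sessions[i].append(name)
        loads[i] += size
    return sessions, loads


if __name__ == "__main__":
    # Hypothetical per-session ceiling: 64 KiB buffer over a 0.2 s round trip.
    per_session = 64 * 1024 * 8 / 0.2
    print(sessions_needed(200e6, per_session), "sessions for 200 Mb/s")
    files = {"img_a": 32, "img_b": 32, "img_c": 32, "img_d": 32}  # sizes in MB
    print(assign_files(files, 2))
```

Because SDO images are all about the same size, even this simple split keeps the sessions evenly loaded; larger buffers per session would reduce the session count accordingly.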

Slide 17

In practice - routing

Check it - you might be surprised

– Different networks quite legitimately behave differently

What didn't get noticed with e-mail and web pages can still be a problem

– The odd few minutes of delay for e-mail doesn't get noticed

– Low speed at 2am probably won't get noticed

Slide 18

In practice - reliability

Can't use terms like guarantee – things will go wrong

Can't be qualitative – this is way beyond normal hardware reliability

– You still need quality, duplication, spares and conservative ratings

Must get quantitative – failure analysis and point of failure identification

– Time to repair (night shifts!) is critical

Must be able to detect failures

– A single failure will not show up in system performance
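One common way to "get quantitative" is the textbook steady-state availability calculation, sketched below. This is not taken from the slides: the formulas assume independent failures, and the MTBF and repair times in the example are invented purely to show why time to repair and duplication matter.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time a component is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series(*avail):
    """Chain of components that must all work (each is a point of failure)."""
    up = 1.0
    for a in avail:
        up *= a
    return up


def parallel(*avail):
    """Redundant components: the group is down only if every one is down.

    Assumes failures are independent, which real systems only approximate.
    """
    down = 1.0
    for a in avail:
        down *= 1.0 - a
    return 1.0 - down


def downtime_hours_per_year(a):
    """Expected hours of downtime per year at availability a."""
    return (1.0 - a) * 365 * 24


if __name__ == "__main__":
    # Hypothetical component: fails every 10 000 h on average.
    slow = availability(10_000, 48)   # repaired after a weekend
    fast = availability(10_000, 4)    # repaired within a shift
    for label, a in (("48 h repair", slow), ("4 h repair", fast),
                     ("48 h repair, duplicated", parallel(slow, slow))):
        print(label, round(downtime_hours_per_year(a), 3), "h/year down")
```

The duplicated case only delivers its benefit if the first failure is detected and repaired before the second component fails, which is why detection matters even when performance looks normal.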

Slide 19

In practice - reliability

This is how the Belnet connections at ROB deliver reliability

Single failures do not affect data flow, regardless of which HA node is active

You must check that no failure has occurred

Slide 20

In practice - last mile

It's here you will have the most problems

– Both ends will need work

– Firewalls

– Routers

– Just where is the cable, really?

– A server is not quite as good as the manufacturer said

Again, situations which might have gone unnoticed will make themselves known

But you are right there to fix them...

Slide 21

Where it's at

The data network is running

The data transfer system is being tested

The system is being documented

So ... it's looking good

Slide 22

Further reading

SDO at the ROB http://wissdom.oma.be

Belnet http://www.belnet.be

SDO http://sdo.gsfc.nasa.gov

ESnet Network Performance Knowledge Base http://fasterdata.es.net

High Performance Enabled SSH/SCP http://www.psc.edu/networking/projects/hpn-ssh

Slide 23

The ROB SDO data system

Thanks for your interest and success with your projects