Jean-Yves Nief, CC-IN2P3, Lyon

First Latin American EELA Workshop April 24th – 26th, 2006

Data distribution and aggregation over geographically distant sites.


Talk overview.

• Introduction: big science, big data, big problem.
• SRB: an example of a mature data management tool.
• Data management and distribution @ CC-IN2P3: a few examples in various fields (HEP, astrophysics, biomedical applications) using SRB.
• Data management elsewhere: some interesting data management applications in various areas.
• Pitfalls and challenges: having chosen the right architecture for your project is not the end of the game.
• Prospects.

Introduction.


The present situation.

• Large amounts of data produced by scientific projects.
• Order of magnitude right now: 100 TB, ~ PB, millions of records.
• In many fields:
  – High Energy Physics (SLAC, Fermilab, CERN etc.).
  – Astrophysics (simulation projects: Enzo, Terascale Supernova Initiative…, observational data: Eros, MACHO, 2MASS, USNO-B, SDSS, IVOA …).
  – Earth sciences (Terashake, Terra …).
  – Biology / Biomedical research (BIRN …).


Prospects for the future.

• Hard to tell but already some indications.
• Some examples (next decade):
  – DOE Genomics, GTL program.
  – Digital libraries for the US administration (NARA).
• Order of magnitude: ~ EB, trillions of records!
Amount of data and information exploding. Wider variety of actors: not only big science!
• Also true for the networking (next slide: source ESnet).

Science Areas | Today End2End Throughput | 5 years End2End Throughput | 5-10 Years End2End Throughput | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB/20 sec. burst) | N x 1000 Gb/s | time critical throughput
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s + QoS for control channel | high throughput and steering
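The throughput figures quoted in parentheses in the table (for example 1 TBy/week next to 0.013 Gb/s) can be checked with a short conversion. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gb = 10^9 bits); the function name is illustrative and not taken from the ESnet source.

```python
def volume_rate_to_gbps(terabytes: float, seconds: float) -> float:
    """Average rate in Gb/s needed to move `terabytes` within `seconds`.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 Gb = 10**9 bits.
    """
    bits = terabytes * 1e12 * 8
    return bits / seconds / 1e9


DAY = 24 * 3600   # seconds in a day
WEEK = 7 * DAY    # seconds in a week

print(f"1 TB/week = {volume_rate_to_gbps(1, WEEK):.3f} Gb/s")
print(f"1 TB/day  = {volume_rate_to_gbps(1, DAY):.3f} Gb/s")
```

With these assumptions 1 TB/week comes out at about 0.013 Gb/s, as in the table, and 1 TB/day at about 0.093 Gb/s, close to the 0.091 Gb/s quoted (the small difference presumably comes from the unit convention used by the source).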


Living in a digital world.

• Lots of science or digital library projects involving collaborators / users geographically spread.
• Large computing needs (both CPU and storage).
• Need for data backup.
• And / or need for data close to the users (replicas over different sites).
• Need for collaborative tools to exchange data.
Federate geographically distributed computing facilities.


The dawn of cyberinfrastructure (I).

• What is that?
« An infrastructure based on grids and on application-specific software, tools, and data repositories that support research in a particular discipline. »
• Why is it needed?
  – Need to handle heterogeneous hardware.
  – Need to handle heterogeneous OS.
  – Need to handle heterogeneous storage devices.
  – Need to handle various preservation policies across the distributed environment.


The dawn of cyberinfrastructure (II).

• Virtualization of the storage.
• Necessary in order to develop client applications transparent to the technology evolution of the underlying storage systems.
• Virtual organization:
  – Access rights.
  – Groups, domains handling: policies for data sharing.
  – Preservation policies.


Requirements (I).

Infrastructure independence:
• Data virtualization:
  – Management of name spaces independently of the storage repositories.
  – Support for access operations independently of the storage repositories.

Authentication:
  – Certificate: GSI etc.
  – Challenge-response mechanism: no password sent over the network.
  – Encrypted password.
  – Ticket: valid for a given amount of time to access the virtual organization.
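To make the challenge-response idea concrete, here is a minimal sketch in which the server sends a random challenge and the client answers with a keyed hash of it, so the password itself never crosses the network. It is a generic HMAC-based illustration, not SRB's actual protocol; all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Server side: generate a fresh random challenge for each login attempt."""
    return secrets.token_bytes(32)


def respond(challenge: bytes, password: str) -> bytes:
    """Client side: prove knowledge of the password without sending it."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).digest()


def verify(challenge: bytes, response: bytes, stored_password: str) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = respond(challenge, stored_password)
    return hmac.compare_digest(expected, response)


challenge = make_challenge()
print(verify(challenge, respond(challenge, "s3cret"), "s3cret"))  # matching password
print(verify(challenge, respond(challenge, "guess"), "s3cret"))   # wrong password
```

Because each challenge is random and used once, a captured response cannot simply be replayed for a later login.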


Requirements (II).

Data ownership / Authorization:
  – Management of the files' ownership across multiple sites (partial or total decoupling between each site's organization and the virtual organization).
  – Access Control Lists valid for the entire virtual organization across the physical domain (group, user etc. levels).


Requirements (III).

Data operations:
  – File access:
    • Open, close, read, write, stat…
    • Audit, versions, pinning, checksums, synchronize etc.
    • Parallel I/O, firewall interactions.
  – Latency management:
    • Bulk operations: register, load, unload, delete etc.
    • Remote procedures: replicate, aggregate, file parsing, I/O requests (FITS, DICOM, HDF5 …).
  – Metadata management:
    • Annotations, metadata/auditing queries, interface with various information systems (schema extension of the core system).

SRB: Storage Resource Broker.



What’s SRB?

• Storage Resource Broker: developed by SDSC (San Diego).
• Provides a uniform interface to heterogeneous storage systems (disk, tape, databases) for data distributed over multiple sites.
• Collaborative tool to share files.
• Who is using SRB?
  – Biology, biomedical applications (e.g. BIRN).
  – Astrophysics, Earth sciences (e.g. NASA).
  – Digital libraries (e.g. NARA).
• Used worldwide: USA, Europe, Asia, Australia.


SRB architecture.

• 1 zone:
  – 1 SRB/MetaCatalog (MCAT) server: contains the list of files, physical resources, registered users, etc.
  – several SRB servers to access the data at their physical location.

[Figure: one zone spanning Site 1, Site 2 and Site 3, with SRB servers and one SRB MCAT server. An application asking for test1.txt connects to site 2; arrows numbered (1) to (4) trace the request and the delivery of test1.txt.]
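The role of the MetaCatalog in this architecture can be pictured with a toy model: the server contacted by the application asks the catalogue where a logical file name is physically stored, then the data is served from that location. This is only a sketch and not SRB code; ToyCatalog, Replica and fetch are hypothetical names, and the site labels and physical path are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str            # site holding a physical copy (hypothetical label)
    physical_path: str   # path on that site's storage (hypothetical)


class ToyCatalog:
    """Stand-in for a metadata catalogue: logical name -> physical replicas."""

    def __init__(self) -> None:
        self._replicas: dict[str, list[Replica]] = {}

    def register(self, logical_name: str, replica: Replica) -> None:
        self._replicas.setdefault(logical_name, []).append(replica)

    def locate(self, logical_name: str) -> list[Replica]:
        if logical_name not in self._replicas:
            raise FileNotFoundError(logical_name)
        return list(self._replicas[logical_name])


def fetch(catalog: ToyCatalog, logical_name: str, contacted_site: str) -> str:
    """Describe the route of a request: contacted server, catalogue lookup, data source."""
    replica = catalog.locate(logical_name)[0]  # take the first registered replica
    return f"{contacted_site} -> catalog -> {replica.site}:{replica.physical_path}"


catalog = ToyCatalog()
catalog.register("/home/nief.ccin2p3/test1.txt", Replica("site3", "/data/test1.txt"))
print(fetch(catalog, "/home/nief.ccin2p3/test1.txt", contacted_site="site2"))
```

The point of the design is that the application only ever handles the logical name; which site actually holds the bytes is resolved by the catalogue.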


Some SRB features.

• Files organized in a logical name space with directories, subdirectories.

> /home/nief.ccin2p3 # dir
/home/nief.ccin2p3 evs_g_isPhysicsEvents_aod004051 # on tape @ CC-IN2P3
test1.txt # on disk @ Merida

• Handling replicas.
• Search for the files based on their attributes instead of their physical name and location (site, storage type: disk, tape, databases).
Search by metadata « attached » to the files.
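Searching by attributes rather than by physical name can be illustrated with a small in-memory catalogue. The sketch below is not the SRB query interface; the attribute names (site, storage) and the search function are assumptions chosen to mirror the two example files on the slide.

```python
# Hypothetical metadata attached to the two files shown on the slide.
CATALOG: dict[str, dict[str, str]] = {
    "evs_g_isPhysicsEvents_aod004051": {"site": "CC-IN2P3", "storage": "tape"},
    "test1.txt": {"site": "Merida", "storage": "disk"},
}


def search(catalog: dict[str, dict[str, str]], **criteria: str) -> list[str]:
    """Return the logical names whose metadata match every given criterion."""
    return sorted(
        name
        for name, attributes in catalog.items()
        if all(attributes.get(key) == value for key, value in criteria.items())
    )


print(search(CATALOG, storage="disk"))
print(search(CATALOG, site="CC-IN2P3", storage="tape"))
```

The caller never needs to know the directory or the server; the query is expressed purely in terms of what the file is.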


Users and ACLs management.

• Users belong to:
  – 1 zone (e.g. IN2P3, Venezuela …).
  – 1 domain (e.g. ccin2p3, Merida, Caracas).
  – 1 or several groups.
• ACL on files and directories.
• Tickets:
  – Rights given to temporary users for a limited amount of time.
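A ticket of this kind amounts to an access right with an expiry time. The following minimal sketch shows the idea; the Ticket class and the helper functions are hypothetical and do not reproduce SRB's ticket format.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ticket:
    path: str          # object the ticket grants access to
    expires_at: float  # expiry time, in seconds since the epoch


def issue_ticket(path: str, lifetime_s: float, now: Optional[float] = None) -> Ticket:
    """Create a ticket valid for `lifetime_s` seconds from `now`."""
    now = time.time() if now is None else now
    return Ticket(path=path, expires_at=now + lifetime_s)


def ticket_allows(ticket: Ticket, path: str, now: Optional[float] = None) -> bool:
    """A ticket is honoured only for its own path and only before it expires."""
    now = time.time() if now is None else now
    return path == ticket.path and now < ticket.expires_at


ticket = issue_ticket("/home/nief.ccin2p3/test1.txt", lifetime_s=3600)
print(ticket_allows(ticket, "/home/nief.ccin2p3/test1.txt"))
```

Passing `now` explicitly makes the expiry logic easy to test without waiting for real time to elapse.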


Storage.

• Mass Storage System (MSS):
  • interface provided for HPSS, Castor and many other MSS.
  • small files management (containers).
MSS usage (tapes etc.) transparent for the end user.
Logical resources: set of physical resources.
  • resource1: Unix file system @ IAP
  • resource2: HPSS file system @ CC-IN2P3
  • resource3: Unix file system @ Merida
• Able to put a file in the 3 resources in one shot:
> Sput -S logical-res test1.txt <SRB filename>
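The effect of writing to a logical resource, one command producing a copy on every physical resource behind it, can be mimicked locally. In this sketch three temporary directories stand in for the three physical resources; it only illustrates the fan-out and says nothing about how SRB implements it. The function names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def put_to_logical_resource(source: Path, physical_dirs: list[Path], name: str) -> list[Path]:
    """Copy `source` into every directory that makes up the logical resource."""
    copies = []
    for directory in physical_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / name
        shutil.copyfile(source, target)
        copies.append(target)
    return copies


def demo() -> int:
    """Store one file on three stand-in resources; return the number of intact replicas."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "test1.txt"
        source.write_text("hello SRB\n")
        # Local directories standing in for the three physical resources.
        resources = [root / "resource1", root / "resource2", root / "resource3"]
        copies = put_to_logical_resource(source, resources, "test1.txt")
        return sum(1 for copy in copies if copy.read_text() == "hello SRB\n")


print(demo())
```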


Databases.

• Access to databases through SRB:
  – Security: SRB server = proxy server. Database can be shielded from the outer world, control on the requests submitted to the database server.
  – Duplication: very simple copy from a database at one site to another one (e.g. copy of tables from an Oracle db in Lyon to a MySQL db at site X in one shot).
• Schema extension:
  – Possibility to link the SRB-MCAT with some other databases (search on SRB objects based on attributes stored in another db).


Interfaces, portability.

• Interfaces:
  – Binary commands (Scommands).
  – APIs: C, Java, Perl, Python.
  – Web interface (mySRB).
  – GUI client for Windows (inQ).
• Portability:
  – Linux, Windows, Mac OS, Solaris and many more…
• Databases:
  – Oracle, DB2, Sybase, PostgreSQL, Informix, MySQL…

Data management and distribution @ CC-IN2P3: examples using SRB.


Who is using SRB @ CC-IN2P3?

(On the slide, projects in pre-production are shown in green.)
• High Energy Physics:
  – BaBar (SLAC, Stanford).
  – CMOS (International Linear Collider R&D).
  – Calice (International Linear Collider R&D).
• Astroparticle:
  – Edelweiss (Modane, France).
  – Pierre Auger Observatory (Argentina).
• Astrophysics:
  – SuperNovae Factory (Hawaii).
• Biomedical applications:
  – Neuroscience research.


BaBar, SLAC & CC-IN2P3.

• BaBar: High Energy Physics experiment close to Stanford (California).
• SLAC and CC-IN2P3 were the first sites opened to BaBar collaborators for data analysis.
• Both held complete copies of the data (Objectivity).
• Now only SLAC holds a complete copy of the data.
• Natural candidates for testing and deployment of grid middleware.
• Data should be available within 24 to 48 hours.
• SRB: chosen for data distribution of hundreds of TBs of data.


SRB BaBar architecture.

[Figure: 2 zones (SLAC + Lyon). CC-IN2P3 (Lyon) and SLAC (Stanford, CA) each have their own SRB MCAT server and SRB servers in front of their mass storage (HPSS/Lyon, HPSS/SLAC). Arrows numbered (1), (2), (3) mark the transfer steps.]


Extra details (BaBar).

• Hardware:
  – SUN servers (Solaris 5.8, 5.9): NetraT 1405, V440.
• Software:
  – Oracle 10g for the SLAC MCAT.
  – Oracle 9i for the Lyon MCAT (migration to 10g foreseen).
• MCATs synchronization: only users and physical resources.
• Comparison of the MCATs contents to transfer the data.
• Steps (1), (2), (3) multithreaded under client control: very little latency.
• Advantage:
  – External client can pick up data from SLAC or Lyon without interacting with the other site.
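Comparing the two catalogues to decide what to transfer, then running the transfers in parallel, can be sketched as a set difference followed by a thread pool. This is an illustration of the idea only, not the BaBar tooling: both function names are hypothetical and `transfer_one` stands for whatever routine actually moves one file.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def missing_at_destination(source_names: Iterable[str], destination_names: Iterable[str]) -> list[str]:
    """Files registered in the source catalogue but not yet in the destination one."""
    return sorted(set(source_names) - set(destination_names))


def transfer_all(names: list[str], transfer_one: Callable[[str], str], workers: int = 4) -> list[str]:
    """Run the transfers concurrently; results come back in the order of `names`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transfer_one, names))


slac_catalog = ["run1.root", "run2.root", "run3.root"]   # made-up file names
lyon_catalog = ["run2.root"]

todo = missing_at_destination(slac_catalog, lyon_catalog)
print(transfer_all(todo, lambda name: f"copied {name}"))
```

Keeping the comparison separate from the transfer makes the procedure restartable: rerunning it after a failure only picks up the files that are still missing.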


Overall assessment for BaBar.

• A lot of time saved for developing applications thanks to the SRB.
• Transparent access to data:
  – Very useful in a hybrid environment (disk, tape).
  – Easy to scale the service (adding new servers on the fly).
  – Client applications do not depend on changes of physical location.
• Fully automated procedure.
• Easy for SLAC to recover corrupted data.
• 270 TB (460,000 files) shipped to Lyon.
• Up to 3 TB / day from tape to tape (minimum latency).
• Going to 5 TB / day soon.
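A quick back-of-the-envelope calculation puts these numbers in perspective. The sketch below uses only the figures quoted on the slide and assumes decimal units (1 TB = 10^6 MB); since 3 TB/day is quoted as a peak rate, the durations are lower bounds rather than measured values.

```python
TOTAL_TB = 270        # volume shipped to Lyon (from the slide)
FILE_COUNT = 460_000  # number of files shipped (from the slide)


def average_file_size_mb(total_tb: float, file_count: int) -> float:
    """Mean file size in MB, assuming 1 TB = 10**6 MB."""
    return total_tb * 1e6 / file_count


def days_to_ship(total_tb: float, tb_per_day: float) -> float:
    """Days needed to move the whole volume at a constant daily rate."""
    return total_tb / tb_per_day


print(f"average file size: {average_file_size_mb(TOTAL_TB, FILE_COUNT):.0f} MB")
print(f"at 3 TB/day: {days_to_ship(TOTAL_TB, 3):.0f} days")
print(f"at 5 TB/day: {days_to_ship(TOTAL_TB, 5):.0f} days")
```

This gives an average file size of roughly 587 MB, and at least 90 days of continuous transfer at 3 TB/day, or 54 days at 5 TB/day, to move the full 270 TB.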


[Figure: ESnet traffic between pairs of sites, with a 1 Terabyte/day scale mark. Site labels, in slide order: Fermilab (US), CERN; SLAC (US), IN2P3 (FR); SLAC (US), INFN Padva (IT); Fermilab (US), U. Chicago (US); CEBAF (US), IN2P3 (FR); INFN Padva (IT), SLAC (US); U. Toronto (CA), Fermilab (US); Helmholtz-Karlsruhe (DE), SLAC (US); DOE Lab, DOE Lab; DOE Lab, DOE Lab; SLAC (US), JANET (UK); Fermilab (US), JANET (UK); Argonne (US), Level3 (US); Argonne, SURFnet (NL); IN2P3 (FR), SLAC (US); Fermilab (US), INFN Padva (IT).]

ESNET Traffic with one server on both sides (April 2004).

Neuroscience research (P. Calvat).

[Figure: DICOM data path. MRI scanner: Siemens MAGNETOM Sonata Maestro Class 1.5 T (Lyon hospital). Acquisition console: Siemens Celsius Xeon (Windows NT). Export PC: Dell PowerEdge 800, offering FTP, file sharing and DICOM access. The links between the elements are labelled DICOM.]


Neuroscience research (II).

• Goal: make SRB invisible to the end user.
• More than 500,000 files registered.
• Now interfaced within the MATLAB environment:
  – Data pushed where the CPUs are (CC-IN2P3, ENS Lyon).
• ~ 1.5 FTE for 3 months…
• Next step:
  – Ever growing community (a few TBs / year): Strasbourg hospital to join the project (maybe Marseille, St Etienne…).
  – Goal: join the BIRN network (US biomedical network).

SuperNovae Factory.

• Telescope data stored into the SRB, processed in Lyon (almost online).
• Collaborative tool + backup (files exchanged between French and US users).

[Figure: Hawaii telescope sending a few GBs / day to the SRB at CC-IN2P3 (HPSS/Lyon); a second SRB with HPSS/NERSC at Berkeley (project).]

Data management elsewhere: a few examples.


Neuroscience: BIRN (I).

• BIRN = Biomedical Informatics Research Network.
• Brain imagery (human, animals: mice, apes):
  – fMRI etc.
• Data sharing and exchange of experimental data for each lab and project.


Neuroscience: BIRN (II).

• BIRN Coordination Center in San Diego:
  – 1 rack (SRB server, database etc.) on each site.
  – Administration centralized from the BIRN-CC: 24/7.
  – Sharing software, APIs…
  – 15 million files registered (16 TB), 360 users: file search on metadata over the entire sample (impressive!).
• Johns Hopkins Hospital: « done more in 6 months than in 18 years ».
• BIRN: 30 people at the first meeting (2001), 115 in Feb. 2005, more than 200 now: a success.
• Some sites already starting in Europe: Edinburgh, Manchester.
• Hoping for a French site in the near future.


ROADNet (UCSD).

• Real-time Observatories, Applications, and Data-Management Network.
• The problem:
  – Integrated real-time management of large, distributed, heterogeneous data streams from sensor networks.
  – Sensors: Seismometers, Accelerometers, Displacement, Barometric pressure, Temperature, Wind Speed, Wind Direction, Infrasound, Hydroacoustic, Differential Pressure Gauges, Strain, Solar Insolation, pH, Electric Current, Electric Potential, Dilution of oxygen, Still Camera Images, Codar.
  – Multidisciplinary project:
    • Seismology.
    • Oceanography.
    • Hydrology.
    • Meteorology.
    • Etc.


ROADNet (UCSD).

[Figure: ROADNet data flow, with labels DATASCOPE and Archives/Processing/Review.]

It’s a grid for online studies, handling data streams.

(ORB = Online Ring Buffer)

Pitfalls and challenges.


Potential pitfalls.

• To build a successful environment for data management and distribution over many sites:
  – Good coordination and communication between the site administrators: « social » factor.
  – Manpower: expertise needed in several areas (network, sys. admin. and database administration).
  – Working in different time zones does not make things easy.
  – Development of monitoring tools.
  – Automatic recovery of the services in case of problems: decreases service downtime.


Hardware requirements.

• Network:
  – % packet loss must be low.
  – High latency network (Round Trip Time > 100 ms): potential show stopper.
  – Duplication of information services (databases) should be considered (e.g. Belle grid extending in Australia, Japan, South Korea).
• Server hardware:
  – Disk arrays quality: data corruption etc.
  – Data duplication can be a show stopper in terms of budget.
  – Database servers scaled correctly.
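One way to see why a round trip time above 100 ms can be a show stopper is the window limit of a single TCP stream: at most one window of data can be in flight per round trip. The slides do not name the transport, so treating the transfer as a single TCP stream is an assumption made here for illustration; the function names are hypothetical.

```python
def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP stream: one window of data per round trip, in Mb/s."""
    return window_bytes * 8 / rtt_seconds / 1e6


def window_needed_bytes(target_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return target_bps * rtt_seconds / 8


print(f"{max_throughput_mbps(64 * 1024, 0.100):.2f} Mb/s with a 64 KiB window at 100 ms RTT")
print(f"{window_needed_bytes(1e9, 0.100) / 1e6:.1f} MB in flight to sustain 1 Gb/s at 100 ms RTT")
```

Under these assumptions a 64 KiB window caps a stream at about 5 Mb/s over a 100 ms path, while filling a 1 Gb/s link would need about 12.5 MB in flight, which is why long-distance transfers typically rely on large windows or several parallel streams.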


Other requirements.

• Data integrity (checksum).
• Backup policy in order to prevent data loss.
• Scalability of the middleware.
• Middleware must be multi OS.
• Fault tolerance of the system.
• Compatibility of the client application version as a function of the middleware evolution: prevent tough and painful migration to newer versions.
• Middleware must be as transparent as possible to hardware, databases etc. evolution.
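The data integrity requirement is usually met by comparing checksums of the original and of the replica after each copy. The sketch below shows the principle with Python's standard hashlib; the choice of SHA-256 and the helper names are assumptions for illustration, not what any particular middleware uses.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, target: Path) -> bool:
    """Copy a file, then confirm that the replica has the same checksum."""
    shutil.copyfile(source, target)
    return file_checksum(source) == file_checksum(target)


def demo() -> tuple[bool, bool]:
    """Return (clean copy verified, simulated corruption detected)."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "original.dat"
        target = Path(tmp) / "replica.dat"
        source.write_bytes(b"abc" * 1000)
        intact = copy_and_verify(source, target)
        target.write_bytes(b"abd" * 1000)  # simulate a corrupted replica
        corruption_detected = file_checksum(source) != file_checksum(target)
        return intact, corruption_detected


print(demo())
```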


Challenges.

• Is a grid environment always the solution?
• Not sure!
• Cost in terms of:
  – Hardware.
  – Networking.
  – Manpower (more duplicated sites, more data, more admins).
can be prohibitive.

Prospects.


Summary and outlook (I).

• Middleware needed for efficient data management over multiple sites.
• Scalability might be an issue in the future for the information systems (databases) linked to this middleware:
  – Inflation of metadata.
  – Inflation of files.
Web services: not sure that it should not be at the centre of data distribution.
• Economic and manpower costs often neglected.


Summary and outlook (II).

• SRB: a very good candidate.
Is there a real competitor at the moment?
• RODS (Rule Oriented Data management System):
  – Replacement of SRB (open source).
  – Compatible with SRB (SRB client application could connect to a RODS server).
  – SDSC leading the project.
  – CC-IN2P3, one of a few partners going to be involved in the first step.


Acknowledgement.

Many thanks to:
  – Reagan Moore and his team (SDSC, USA).
  – Adil Hasan (CCLRC-RAL, UK).
  – Wilko Kroeger (SLAC, USA).
  – Pascal Calvat (CC-IN2P3, France).

BaBar: http://www.slac.stanford.edu/BFROOT/
Belle: http://belle.kek.jp/
BIRN: http://www.nbirn.net/
CC-IN2P3: http://cc.in2p3.fr/
ESnet: http://www.es.net/
ROADNet: http://eqinfo.ucsd.edu/projects/roadnet/index.html
SRB: http://www.sdsc.edu/srb/index.php/Main_Page