Data Intensive Astronomy Group Talk II ICRAR Con 4 September 2015 Chen Wu.

Page 1:

Data Intensive Astronomy Group Talk II

ICRAR Con

4 September 2015

Chen Wu

Page 2:

Agenda

– MWA data system

– MWA data usage

– MWA storage modelling

– GLEAM archive

– GLEAM VO usage

– In-archive processing

Page 3:

MWA Data System

Page 4:

MWA Archive

Page 5:

MWA Archive

Page 6:

MWA ingest and retrieval

24 Jan 2015

5 Aug 2014

Page 7:

Data access by region

14 million successful retrievals

Page 8:

MWA usage

Page 9:

Distribution of file freshness and staleness

Page 10:

Data storage @ MWA LTA Pawsey

[Architecture diagram of the MWA LTA @ Pawsey. Labels: 10 Gbps; Stage; Release; Archive; Fast Disk Storage; Bulk Disk Storage; Tape Library; DM-GET; DM-PUT; FE Nodes; CXFS; API; POSIX; Hierarchical Storage Management (DMF); Tape Library x2 (CSIRO: library + cables + optics); x32; x32; ScienceDB; M&CDB]

Page 11:

Staging time

AGE_WEIGHT = constant + multiplier*<file_age_in_day>
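The formula above can be illustrated with a minimal sketch. The function name, the example values for the constant and multiplier, and the idea of ranking files by weight are assumptions made for illustration; the slide gives only the linear form.

```python
def age_weight(file_age_in_days, constant, multiplier):
    """Linear age weight as written on the slide:
    AGE_WEIGHT = constant + multiplier * <file_age_in_day>."""
    return constant + multiplier * file_age_in_days


# Hypothetical values, not taken from the MWA configuration.
CONSTANT = 1.0
MULTIPLIER = 0.5

# File ages in days (illustrative names only).
ages = {"obs_a.fits": 2, "obs_b.fits": 30, "obs_c.fits": 10}

# Rank files from highest to lowest weight; with a positive
# multiplier this is simply oldest first.
ranked = sorted(
    ages,
    key=lambda name: age_weight(ages[name], CONSTANT, MULTIPLIER),
    reverse=True,
)
```

With a positive multiplier the weight grows with file age, so older files rank first; how that ranking feeds into staging and release decisions depends on the storage policy in use.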

Page 12:

Storage performance modelling

“simulation” using the MWA data access stream (25 million successful requests)

(a, b, c, d, c, a): the reuse distance of the second access to a is 3 (the distinct files b, c and d are accessed in between)

(a, b, c, d, c, a): a is staged from the tape
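The reuse-distance idea can be sketched in a few lines. This is an illustration only: it assumes the disk cache behaves like an LRU cache that holds a fixed number of files, which the slide does not state, and the function names are hypothetical.

```python
from collections import OrderedDict


def reuse_distances(stream):
    """For each access, count the distinct items seen since the
    previous access to the same item (None for a first access)."""
    last_seen = {}
    distances = []
    for i, item in enumerate(stream):
        if item in last_seen:
            between = stream[last_seen[item] + 1:i]
            distances.append(len(set(between)))
        else:
            distances.append(None)
        last_seen[item] = i
    return distances


def simulate_lru(stream, capacity):
    """Replay the access stream against an LRU disk cache holding
    `capacity` files; return the accesses that miss the cache and
    would therefore have to be staged from tape."""
    cache = OrderedDict()
    staged = []
    for item in stream:
        if item in cache:
            cache.move_to_end(item)        # hit: mark as most recent
        else:
            staged.append(item)            # miss: stage from tape
            cache[item] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return staged


stream = list("abcdca")
print(reuse_distances(stream))   # [None, None, None, None, 1, 3]
print(simulate_lru(stream, 3))   # ['a', 'b', 'c', 'd', 'a']
print(simulate_lru(stream, 4))   # ['a', 'b', 'c', 'd']
```

Under the LRU assumption, an access hits the disk cache only when its reuse distance is smaller than the cache capacity. With room for three files the final access to a has to be staged from tape, while with room for four it is served from disk.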

Page 13:

GLEAM

IVOA Interface

GLEAM VO Server

WebInterface

GLEAM Archive Store 04

GLEAM Archive Store 06

NGAS Client

Over 800,000 images

20,000 MeasurementSets

220 TB

Page 14:

GLEAM usage on all-sky view

All sky view in Aladin Lite!

Page 15:

In-archive processing

Some “real” requirements from both MWA and GLEAM:

– Interactive processing

• Cutout and regridding, NGAS Tasks

– Batch (re-)processing: process all files currently in the archive that satisfy some conditions, e.g.

• Compress all visibility files that (1) belong to the EoR project and (2) were observed last Friday (MWA)

• Rescale flux of all snapshot images of GLEAM Phase 1 that are ingested in the past two weeks

• Make movies from images formed in DEC -26 strip scans

• Re-index all WCS headers of images ingested since last November

– Incremental processing - Asynchronously, continuously, and selectively processing "newly" ingested files

• After a snapshot image tar is ingested, decompress it, and for each FITS image, compute its sky coverage, and update VO database indexes accordingly

• As soon as a 32 MHz image is ingested, if its Robustness is 0, send a copy to RRI in India before transferring it to RDSI

– NGAS Job Framework

• In the same spirit as MapReduce

• File Object Container or DROP
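The batch and incremental modes listed above can be sketched as follows. This is not the NGAS Job Framework API: every class, function and field name here is hypothetical, and the tasks are stand-ins that only return a label.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedFile:
    # Hypothetical minimal metadata record for an archived file.
    file_id: str
    project: str
    observed: date


def batch_process(files, condition, task):
    """Batch mode: apply `task` to every file already in the
    archive that satisfies `condition`."""
    return [task(f) for f in files if condition(f)]


class IngestHooks:
    """Incremental mode: rules checked each time a file is ingested."""

    def __init__(self):
        self._rules = []

    def register(self, condition, task):
        self._rules.append((condition, task))

    def on_ingest(self, f):
        # Run every task whose condition matches the new file.
        return [task(f) for condition, task in self._rules if condition(f)]


archive = [
    ArchivedFile("vis_001", "EoR", date(2015, 8, 28)),
    ArchivedFile("vis_002", "GLEAM", date(2015, 8, 28)),
    ArchivedFile("vis_003", "EoR", date(2015, 8, 21)),
]

# Batch: "compress" all EoR files observed on one given day.
compressed = batch_process(
    archive,
    lambda f: f.project == "EoR" and f.observed == date(2015, 8, 28),
    lambda f: f"compressed:{f.file_id}",
)

# Incremental: index every newly ingested GLEAM file.
hooks = IngestHooks()
hooks.register(lambda f: f.project == "GLEAM",
               lambda f: f"indexed:{f.file_id}")
result = hooks.on_ingest(ArchivedFile("img_900", "GLEAM", date(2015, 9, 1)))
```

Both modes share the same shape, a condition that selects files and a task applied to each match, which is one way to read the MapReduce comparison. A real framework would also have to distribute the tasks across archive nodes, which this sketch leaves out.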

Page 16:

Data re-processing Web UI

Page 17:

Conclusion

– MWA data system

– MWA data usage

– MWA storage modelling

– GLEAM archive

– GLEAM VO usage

– In-archive processing

Page 18:

Thank you!

Q & A