Post on 21-Jan-2016
Data Intensive Astronomy Group Talk II
ICRAR Con
4 September 2015
Chen Wu
2
Agenda
– MWA data system
– MWA data usage
– MWA storage modelling
– GLEAM archive
– GLEAM VO usage
– In-archive processing
3
MWA Data System
4
MWA Archive
5
MWA Archive
6
MWA ingest and retrieval
24 Jan 2015
5 Aug 2014
7
Data access by region
14 million successful retrievals
8
MWA usage
9
Distribution of file freshness and staleness
Data storage @ MWA LTA Pawsey
10
[Diagram: MWA LTA @ Pawsey. Components shown: FE Nodes, Fast Disk Storage, Bulk Disk Storage, CXFS (API and POSIX access), Hierarchical Storage Management (DMF), Tape Library x2 (CSIRO - library + cables + optics), ScienceDB, M&CDB. Link labels: 10 Gbps, Stage, Release, Archive, DM-GET, DM-PUT, x32 (two occurrences).]
11
Staging time
AGE_WEIGHT = constant + multiplier*<file_age_in_day>
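As a worked illustration of the formula above, here is a minimal Python sketch. The constant, the multiplier, and the file names are made-up values, not the settings used at Pawsey, and ranking files by descending weight is an assumption about how the weight might be used.

```python
def age_weight(file_age_in_days, constant, multiplier):
    """Linear age weight: constant + multiplier * file age in days."""
    return constant + multiplier * file_age_in_days


# Hypothetical files and ages (in days); not real MWA observations.
file_ages = {"obs_old.fits": 120.0, "obs_new.fits": 2.0}

# Hypothetical policy parameters.
CONSTANT = -1.0
MULTIPLIER = 0.5

weights = {
    name: age_weight(age, CONSTANT, MULTIPLIER) for name, age in file_ages.items()
}

# Assumption: files with a larger weight are considered first.
ranked = sorted(weights, key=weights.get, reverse=True)
```

With these values the 120-day-old file gets weight 59.0 and the 2-day-old file gets weight 0.0, so the older file ranks first.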
12
Storage performance modelling
“simulation” using the MWA data access stream (25 million successful requests)
(a, b, c, d, c, a) 3 is the reuse distance of a: three distinct files (b, c, d) are accessed between its two accesses
(a, b, c, d, c, a) a is staged from the tape
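The reuse-distance idea on this slide can be sketched in a few lines of Python. This is not the simulator used for the MWA study. It assumes an LRU-managed disk cache that holds a fixed number of equally sized files, and that every file starts on tape. The function names are hypothetical.

```python
from collections import OrderedDict


def reuse_distances(stream):
    """For each access, count the distinct files touched since the previous
    access to the same file (None for a first access)."""
    last_seen = {}
    distances = []
    for i, item in enumerate(stream):
        if item in last_seen:
            between = set(stream[last_seen[item] + 1 : i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_seen[item] = i
    return distances


def count_tape_stages(stream, disk_slots):
    """Replay the stream against an LRU disk cache with `disk_slots` slots
    and count the accesses that have to be staged from tape."""
    cache = OrderedDict()
    stages = 0
    for item in stream:
        if item in cache:
            cache.move_to_end(item)  # disk hit: mark as most recently used
        else:
            stages += 1  # disk miss: stage from tape
            cache[item] = True
            if len(cache) > disk_slots:
                cache.popitem(last=False)  # evict the least recently used file
    return stages


stream = ["a", "b", "c", "d", "c", "a"]
distances = reuse_distances(stream)
stages_small = count_tape_stages(stream, disk_slots=3)
stages_large = count_tape_stages(stream, disk_slots=4)
```

For the example stream, the second access to a has reuse distance 3. With three disk slots, a has already been evicted and must be staged from tape again (5 stages in total). With four slots it is still on disk (4 stages).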
13
GLEAM
IVOA Interface
GLEAM VO Server
Web Interface
GLEAM Archive Store 04
GLEAM Archive Store 06
NGAS Client
Over 800,000 images
20,000 MeasurementSets
220 TB
14
GLEAM usage on all-sky view
All sky view in Aladin Lite!
15
In-archive processing
Some “real” requirements from both MWA and GLEAM:
– Interactive processing
• Cutout and regridding, NGAS Tasks
– Batch (re-)processing: process all files currently in the archive that satisfy some conditions, e.g.
• Compress all visibility files that (1) belong to the EoR project and (2) were observed last Friday (MWA)
• Rescale the flux of all GLEAM Phase 1 snapshot images ingested in the past two weeks
• Make movies from images formed in DEC -26 strip scans
• Re-index all WCS headers of images ingested since last November
– Incremental processing: asynchronously, continuously, and selectively process "newly" ingested files
• After a snapshot image tar is ingested, decompress it and, for each FITS image, compute its sky coverage and update the VO database indexes accordingly
• As soon as a 32 MHz image is ingested, if its Robustness is 0, send a copy to RRI in India before transferring it to RDSI
– NGAS Job Framework
• In the same spirit as MapReduce
• File Object Container or DROP
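The batch and incremental patterns listed above can be illustrated with a small Python sketch. It does not use the NGAS API. `ArchiveFile`, `run_batch`, `IncrementalProcessor`, the task functions, and the sample records are all hypothetical, and the tasks only return labels instead of touching real files.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchiveFile:
    """Hypothetical archive record; real NGAS metadata is richer."""
    file_id: str
    project: str
    observed: date


def run_batch(files, condition, task):
    """Batch pattern: apply `task` to every archived file matching `condition`."""
    return [task(f) for f in files if condition(f)]


class IncrementalProcessor:
    """Incremental pattern: run registered tasks on each newly ingested file."""

    def __init__(self):
        self._rules = []

    def on_ingest(self, condition, task):
        self._rules.append((condition, task))

    def ingest(self, f):
        return [task(f) for condition, task in self._rules if condition(f)]


def compress(f):
    # Placeholder task: a real one would compress the visibility file.
    return "compressed:" + f.file_id


def index_sky_coverage(f):
    # Placeholder task: a real one would update the VO database indexes.
    return "indexed:" + f.file_id


target_day = date(2015, 8, 28)  # hypothetical observation date
archive = [
    ArchiveFile("vis_001", "EoR", date(2015, 8, 28)),
    ArchiveFile("vis_002", "EoR", date(2015, 8, 27)),
    ArchiveFile("img_003", "GLEAM", date(2015, 8, 28)),
]

# Batch: compress all EoR files observed on the target day.
batch_result = run_batch(
    archive,
    lambda f: f.project == "EoR" and f.observed == target_day,
    compress,
)

# Incremental: index GLEAM files as they are ingested.
processor = IncrementalProcessor()
processor.on_ingest(lambda f: f.project == "GLEAM", index_sky_coverage)
gleam_result = processor.ingest(archive[2])
eor_result = processor.ingest(archive[0])
```

Keeping the selection condition separate from the task means the same task can be reused in batch runs over existing files and in ingest-time rules.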
16
Data re-processing Web UI
17
Conclusion
– MWA data system
– MWA data usage
– MWA storage modelling
– GLEAM archive
– GLEAM VO usage
– In-archive processing
18
Thank you!
Q & A