John Cunniffe
Dunsink Observatory
Dublin Institute for Advanced Studies
Evert Meurs (Dunsink Observatory)
Aaron Golden (NUI Galway)
Aus VO 18/11/03
Efficient X-ray Data Mining
2
Once you make doing science with your VO service easy,
everyone will want to use your server.
Analogous to oversubscribed observatory time
- how do users successfully ‘compete’ for query time?
Query modelling in a proposal?
Need data simulators/previewers to run the query on,
and/or a data subset for a test run.
3
Future X-ray missions
Current Missions - XMM/Chandra/RXTE
- download data (typ. few GB/pointing)
- processed on local machine
XEUS, Constellation-X, Astro-E2, etc
- very large data sets (few 100 GB/pointing)
- online data processing
proposed framework involves users submitting web-based requests for processing pipelines
- derived data products very important
source catalogues, images, spectra, lightcurves, etc
4
Efficient X-ray Data Mining
Efficient -
Don’t want to reprocess the data archive unless really needed
– maximise use of metadata
X-ray -
Data processing pipelines more complicated (than e.g. optical)
Treatment of faint sources/sky background statistically complex
Instrument response complex
(not exclusive to X-ray)
Data Mining -
Interested in the sources found in the data but also in the context (i.e. why we found them in that selection)
Not simply interested in finding objects through cone searches and stopping there.
5
Science Use Case
Interested in variable/transient X-ray objects
short-term: e.g. flare stars (~1 dataset)
long-term: e.g. variability of normal/active galaxies (multi-dataset)
Current approach:
• use http-get scripts to HEASARC - create cross-correlated source cats.
• where known objects are not present in a catalogue
– retrieve original dataset & calculate upper flux limit (expensive)
N.B. if the source catalogue was generated from the whole data archive then we may need to re-analyse a significant fraction of it.
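The cross-correlation step in the current approach can be sketched as a positional match. This is a minimal illustration only, not the scripts used by the authors: the 30 arcsec match radius, the function names and the toy coordinates are all assumptions. Objects that come back unmatched are the ones that would need an upper flux limit from the original dataset.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))

def cross_match(known_objects, xray_sources, radius_arcsec=30.0):
    """For each known object, find the nearest X-ray source within the radius.

    Returns (matches, unmatched): a dict name -> source id, and a list of
    names with no counterpart. The default radius is an illustrative assumption.
    """
    radius_deg = radius_arcsec / 3600.0
    matches, unmatched = {}, []
    for name, (ra, dec) in known_objects.items():
        best = None  # (source id, separation in degrees)
        for src_id, (sra, sdec) in xray_sources.items():
            sep = angular_separation_deg(ra, dec, sra, sdec)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (src_id, sep)
        if best is None:
            unmatched.append(name)
        else:
            matches[name] = best[0]
    return matches, unmatched

# Toy positions (RA, dec in degrees), invented for illustration.
known = {"star_A": (10.000, -5.000), "galaxy_B": (200.000, 45.000)}
xray = {"X1": (10.002, -5.001), "X2": (150.0, 20.0)}
matches, unmatched = cross_match(known, xray)
```

The brute-force double loop is fine for small lists; a large catalogue would need a spatial index.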
To understand space density/flaring rate/etc of populations in the catalogues we need to know the volume of space covered by archive:
area coverage (RA, dec)
temporal coverage (t1,t2,...,ti)
spectral (Energy)
flux limit
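As a rough illustration of how area coverage and flux limit combine into a volume, the sketch below sums the volume out to which a source of given luminosity is detectable in each sky cell. It assumes Euclidean space, isotropic emission and no absorption, and it ignores temporal coverage; the function names and the cell representation are assumptions made for this example.

```python
import math

def max_distance_cm(luminosity_erg_s, flux_limit_cgs):
    """Distance at which a source of this luminosity drops to the flux limit."""
    return math.sqrt(luminosity_erg_s / (4.0 * math.pi * flux_limit_cgs))

def survey_volume_cm3(cells, luminosity_erg_s):
    """Sum the cone volumes (Omega / 3) * d_max^3 over all sky cells.

    cells: iterable of (solid_angle_sr, flux_limit_erg_cm2_s) pairs.
    """
    total = 0.0
    for solid_angle_sr, flux_limit in cells:
        d_max = max_distance_cm(luminosity_erg_s, flux_limit)
        total += solid_angle_sr / 3.0 * d_max ** 3
    return total

# Two hypothetical cells: half the sky is four times less sensitive.
cells = [(2.0 * math.pi, 1e-13), (2.0 * math.pi, 4e-13)]
volume = survey_volume_cm3(cells, luminosity_erg_s=1e30)
```

A flux limit four times higher halves the limiting distance and so cuts the volume in that cell by a factor of eight, which is why a non-uniform sensitivity map matters for space densities.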
6
ROSAT All-Sky Survey
Duration: 1990 June - 1991 Jan
E = 0.1 - 2.4 keV
RASS-BSC (Bright Source Catalogue)
RASS-FSC (Faint Source Catalogue)
Selection Criteria      BSC           FSC
Count Rate              > 0.05/sec    BSC
Probability (MaxLik)    15 (~5σ)      7 (~3σ)
N(photons)              15            6
Accepted Sources        18,811        105,924
NB: Catalogues have non-uniform sky coverage &
sensitivity.
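The selection criteria in the table can be written down as a small classifier. This is a sketch under stated assumptions: the thresholds are read as "at least" the tabulated values, and the FSC is taken to hold detections that pass its cuts but fail the BSC cuts. The class and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Detection:
    count_rate: float   # counts per second
    likelihood: float   # maximum-likelihood detection value
    n_photons: int      # number of source photons

def classify(det: Detection) -> Optional[str]:
    """Return "BSC", "FSC" or None using the thresholds from the table.

    Treating the thresholds as inclusive (>=) is an assumption.
    """
    if det.count_rate > 0.05 and det.likelihood >= 15 and det.n_photons >= 15:
        return "BSC"
    if det.likelihood >= 7 and det.n_photons >= 6:
        return "FSC"
    return None

bright = classify(Detection(count_rate=0.10, likelihood=20.0, n_photons=30))
faint = classify(Detection(count_rate=0.01, likelihood=8.0, n_photons=7))
rejected = classify(Detection(count_rate=0.01, likelihood=5.0, n_photons=3))
```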
7
Regions with different sensitivity included in
same source catalogues.
c.f. XMM-Serendipitous Source Cat
(created from pointed mode observations with different exposure
times & instrument modes)
Need a good coverage/sensitivity model of the data archive to understand volume of space
contained in source catalogue.
[Figure: 66 binned image of RASS data set; Survey depth]
8
Model Method 1: Upper Limit predictor
Combine:
Instrument model (ARFs, PSFs, modes, ...)
Exposure time .... (0-30 ksec)
NH information, .…
… source spectral model, ....
create a high resolution flux limit map of the RASS sky …. ….. in progress.
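A very reduced sketch of the predictor idea: turn an exposure map into a flux limit map using a minimum photon count and an energy conversion factor (ECF). Everything here is an assumption for illustration. The real instrument model (ARFs, PSFs, modes), the background and NH are all folded into a single hypothetical ECF value, and the function names are invented.

```python
import math

def predicted_flux_limit(exposure_s, n_min_photons, ecf_erg_cm2_per_count,
                         min_count_rate=0.0):
    """Flux limit for one sky cell, ignoring background.

    A source is taken as detectable if it yields at least n_min_photons in the
    exposure and exceeds min_count_rate. Cells with no exposure have no limit.
    """
    if exposure_s <= 0:
        return math.inf
    rate_limit = max(n_min_photons / exposure_s, min_count_rate)
    return rate_limit * ecf_erg_cm2_per_count

def flux_limit_map(exposure_map_s, n_min_photons, ecf_erg_cm2_per_count):
    """Apply the per-cell predictor to a 2-D exposure map (list of rows)."""
    return [
        [predicted_flux_limit(t, n_min_photons, ecf_erg_cm2_per_count) for t in row]
        for row in exposure_map_s
    ]

# Hypothetical 2x2 exposure map in seconds and a placeholder ECF.
exposure_map = [[0.0, 300.0], [600.0, 30000.0]]
limits = flux_limit_map(exposure_map, n_min_photons=6, ecf_erg_cm2_per_count=1e-11)
```

In practice the ECF would vary across the sky with NH and with the assumed source spectral model, so it would be a map rather than a constant.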
9
Model Method 2: Upper limit flux tabulation
Reprocess the data archive and determine the upper limit statistics from the photon data directly
… combine with ….
NH information, .…
… source spectral model, ....
create a high resolution flux limit map of the RASS sky …. ….. in progress.
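For the tabulation approach, an upper limit has to be derived from the photon counts at each position. The sketch below uses a classical Poisson upper limit, which is one common choice and not necessarily the statistic used for the RASS work; the function names and the energy conversion factor (ECF) argument are assumptions.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def source_count_upper_limit(n_observed, background, confidence=0.95):
    """Smallest source mean s >= 0 with P(N <= n_observed | s + background) <= 1 - confidence."""
    alpha = 1.0 - confidence
    if poisson_cdf(n_observed, background) <= alpha:
        return 0.0  # background alone already makes the observed count unlikely
    lo, hi = 0.0, 1.0
    while poisson_cdf(n_observed, hi + background) > alpha:
        hi *= 2.0  # expand until the limit is bracketed
    for _ in range(100):  # bisection
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_observed, mid + background) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def flux_upper_limit(n_observed, background, exposure_s, ecf_erg_cm2_per_count,
                     confidence=0.95):
    """Convert the count upper limit to a flux using a hypothetical ECF."""
    counts = source_count_upper_limit(n_observed, background, confidence)
    return counts / exposure_s * ecf_erg_cm2_per_count

limit_counts = source_count_upper_limit(n_observed=0, background=0.0)
```

With zero observed counts and no background, the 95% limit is -ln(0.05), about 3 counts, which gives a quick sanity check on the routine.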
10
Results in a sensitivity map of the RASS sky
- adds usefulness to the source catalogue
Doing this with RASS is straightforward (though not quick) as the total data archive is a few 10s of GB.
For future observatories this will have to be done on the archive curator’s server.
11
The role of Archive/Source Catalogue Metadata
Data Archive
X-ray photon lists/ancillary instrument data
Computationally expensive to reprocess
Selection Criteria
Source Catalogue
How should contents (not parameters) of a source catalogue best be described in the metadata?
- why are the sources in it - in it?
- describe the selection criteria
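One way to picture selection criteria carried in the metadata is a small structured record. The sketch below uses the RASS values quoted earlier in the talk; the field names and layout are invented for illustration and do not follow any VO standard. Reading the thresholds as minimum values is also an assumption.

```python
import json

# Hypothetical metadata record; field names are illustrative only.
rass_metadata = {
    "survey": "ROSAT All-Sky Survey",
    "duration": "1990 June - 1991 Jan",
    "energy_range_keV": [0.1, 2.4],
    "catalogues": {
        "RASS-BSC": {
            "selection": {
                "min_count_rate_per_s": 0.05,
                "min_likelihood": 15,
                "min_photons": 15,
            },
            "accepted_sources": 18811,
        },
        "RASS-FSC": {
            "selection": {"min_likelihood": 7, "min_photons": 6},
            "accepted_sources": 105924,
        },
    },
    # Placeholder for a reference to an externally held flux limit map.
    "coverage": {"sensitivity_map": None},
}

def metadata_size_kbytes(record):
    """Size of the record when serialised as UTF-8 JSON, in Kbytes."""
    return len(json.dumps(record).encode("utf-8")) / 1024.0

size_kb = metadata_size_kbytes(rass_metadata)
```

A record like this stays well under a Kbyte; the size question only becomes serious once sensitivity maps or preview data are embedded rather than referenced.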
12
Flux limit maps, limiting magnitude calculators,
observation simulators …..
VO Data Model?
“These are an integral part of the sensitivity/coverage description”
Enhance the metadata (face larger metadata)
Theory?
“This is really telescope simulation”
Build separate model/simulator
13
Other wavebands
Similar challenges in other wavebands.
Complex coverage and sensitivity descriptions plus catalogue selection criteria.
How many brown dwarfs are there?
In general, how much data description should go in the metadata and how much should be left in secondary resources?
14
Final Questions
How big (Kbytes) should data archive metadata be?
– Should it include preview data (e.g. ‘large’ FITS files)?
– Should selection criteria be described in the metadata (or simply a reference to the original publication)?
– Provide partially reduced or preview data as externally held addendum to the metadata?
• Much bigger than standard metadata
• Much smaller than whole archive
– What other tools are needed to allow astronomers to
• assess usefulness of,
• justify to Time Allocation Committees
large proposals/queries in a VO context?