Semantic Grid + Data Federation US National Virtual Observatory Roy Williams California Institute of...
Semantic Grid + Data Federation
US National Virtual Observatory
Roy Williams
California Institute of Technology
NVO co-director
What is NVO?
– Standard protocols, standard data types
• XML transfer protocol (VOTable)
• Resource description (VOResource etc)
• Publish/discover to federated registry (OAI)
• Semantic Types (UCD)
• Services: Cone search, Simple Image Access
– Computing with big data on the Grid
• Database Crossmatch
• Image Federation: Atlases
First NVO Discovery
Database Fuzzy Join
2MASS versus SDSS cross-identification, with
- j_m as the 2MASS magnitude and
- I_mtotn as the SDSS magnitude
2MASS: j_m <= 15
SDSS: I_mtotn <= 18
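A fuzzy join of this kind can be sketched in a few lines. This is only an illustration, not the NVO crossmatch service: the record layout, the 2-arcsecond match radius, and the brute-force pairing are assumptions made for the example, and the magnitude limits are read as "at most 15" for j_m and "at most 18" for I_mtotn.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))

def crossmatch(twomass, sdss, match_radius_deg=2.0 / 3600):
    """Brute-force fuzzy join: pair sources closer than match_radius_deg.

    The default radius (2 arcseconds) is an assumed tolerance.
    """
    bright_2mass = [s for s in twomass if s["j_m"] <= 15]
    bright_sdss = [s for s in sdss if s["I_mtotn"] <= 18]
    matches = []
    for a in bright_2mass:
        for b in bright_sdss:
            sep = angular_sep_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= match_radius_deg:
                matches.append((a["id"], b["id"]))
    return matches

# Made-up sources, for illustration only.
twomass = [
    {"id": "T1", "ra": 180.0, "dec": 0.0, "j_m": 14.2},
    {"id": "T2", "ra": 181.0, "dec": 0.5, "j_m": 14.9},
]
sdss = [
    {"id": "S1", "ra": 180.0002, "dec": 0.0001, "I_mtotn": 17.5},
    {"id": "S2", "ra": 200.0, "dec": 10.0, "I_mtotn": 16.0},
]
print(crossmatch(twomass, sdss))
```

A billion-source crossmatch would need a spatial index rather than the nested loop shown here; the loop only makes the join condition explicit.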
Billion Source Cross-Identification: A Computational Challenge
SDSS unmatched
2MASS matched
SDSS matched
2MASS unmatched
Crossmatch Services
SDSS database
2MASS database
query
query
Crossmatch service
query
scientific knowledge!
NVO protocols
First NVO Discovery
Database crossmatch of two massive databases creates new science
“The sum is greater than the parts”
Semantic Grid
Cone Search
• First VO standard service
• Input: RA, DEC, SR must be present
– decimal degrees J2000
• Output: VOTable of sky-located data records
– must have columns with UCDs: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID_MAIN
Request: RA=300 DEC=25 SR=0.1
Response: ID RA DEC x y z
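The request side of a cone search can be sketched as follows. The base URL is a placeholder and the helper names are hypothetical; only the three parameter names (RA, DEC, SR, in decimal degrees) come from the slide. The second helper shows the geometric test a service would apply to each record.

```python
import math
from urllib.parse import urlencode

def cone_search_url(base_url, ra, dec, sr):
    """Append the three required parameters (decimal degrees, J2000)."""
    return base_url + urlencode({"RA": ra, "DEC": dec, "SR": sr})

def in_cone(ra, dec, ra0, dec0, sr):
    """True if (ra, dec) lies within sr degrees of the cone center (ra0, dec0)."""
    ra, dec, ra0, dec0 = map(math.radians, (ra, dec, ra0, dec0))
    cos_sep = (math.sin(dec) * math.sin(dec0)
               + math.cos(dec) * math.cos(dec0) * math.cos(ra - ra0))
    cos_sep = max(-1.0, min(1.0, cos_sep))  # guard against rounding
    return math.degrees(math.acos(cos_sep)) <= sr

# "http://example.org/cone?" is a placeholder, not a real service.
url = cone_search_url("http://example.org/cone?", 300, 25, 0.1)
print(url)
print(in_cone(300.05, 25.0, 300, 25, 0.1))
```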
Cone Search Registry
Request: HTTP GET of shape: URLbase RA=200&DEC=20&SR=2
Response: VOTable of shape: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID
A collection of services that have the same shape
Cone Search + Density Probe
Cone Search
Density Probe
baseURL
Spacing
Search radius
interoperating NVO-compliant services!
Federation of Multiple Services
NVO Image Protocol: SIAP
• Specify box by position and size
• SIAP server returns relevant images
• Footprint
• Logical Name
• URL
Can choose:
standard URL: http://.......
SRB URL: srb://nvo.npaci.edu/…..
Simple Image Access Service
• Query is sky region
• May query on image type, image geometry
• Response is VOTable of images
• Each has WCS (geometry) parameters
• Plus a URL to fetch the image
• Designed for
• Set of pointed observations (eg Hubble)
• Wide-area survey (eg Sloan)
• Image service
– Mosaicking
– Reprojection
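A sky-region query of this kind might be assembled as below. The POS, SIZE, and FORMAT parameter names follow the Simple Image Access protocol as published; the base URL and the helper name are placeholders for illustration.

```python
from urllib.parse import urlencode

def siap_query_url(base_url, ra, dec, size_deg, image_format=None):
    """Build a Simple Image Access query for a box centered on (ra, dec).

    ra, dec, and size_deg are in decimal degrees.
    """
    params = {"POS": f"{ra},{dec}", "SIZE": size_deg}
    if image_format is not None:
        params["FORMAT"] = image_format
    # Keep "," and "/" readable instead of percent-encoding them.
    return base_url + urlencode(params, safe=",/")

# "http://example.org/siap?" is a placeholder, not a real service.
siap_url = siap_query_url("http://example.org/siap?", 83.63, 22.01, 0.2, "image/fits")
print(siap_url)
```

The response would be a VOTable with one row per image; each row carries the WCS parameters and the URL used to fetch that image.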
Data Inventory Service
• What data covers a position in the sky?
Registry
OAI Publish
Registry
OAI Query
Registry
OAI Publish
DIS
1 2 3 4
Caltech
NCSA
JHU/STScI
Goddard
Data Inventory Service
Request is a cone on the sky
Data Inventory Service
Relevant Images and Catalogs
NVSS Image
ROSAT catalog
Image Federation
VO Registry
VO Registry
Schemas & Service Types
VOResource
ID ivo://me.com/file123
Query service
R R
Portals, Tools & Services
Databases, Grid, Virtual Data
md server for ivo://
VOView, Fill-in forms, Visualization, Reports
Publishing
OAI
Publish service
Aladin, OASIS
DIS
What is in the Registry?
• Answer: “Entities”
• It has a global identifier ivo://…….
– Must be resolved by authority
• It has “VOViews”
– Queries return these
• …..and that’s all!
3 Views of an Entity
Zoo-keeper metadata:
<diet>carrots</diet>
<excrement>yes</excrement>
<fencing>strong</fencing>
Transportation metadata:
<weight>4000 kg</weight>
<poisonous>no</poisonous>
<claws>no</claws>
<food>carrots</food>
<waste-mgmt>heavy</waste-mgmt>
Zoo-manager metadata:
<popularity>9</popularity>
<visitors>2500 per day</visitors>
<feeding>carrots</feeding>
“entity”
VOResource
A mandatory form plus other supporting forms
Schemas and Service Types
• VOResource
– Entity description form
• Organization, project, data collection, service
• Has ivo:// identifier
• VORegion
– sky coverage form (α/δ/λ)
• VOTable
– star catalog, image list, other tables
• OAI
– Registry harvesting
– Distributed virtual registry
• CONE
– Request-response for catalog
• SIAP
– Request-response for images
When can I publish my own schema to VO?
Dublin Core Metadata
Title A name given to the resource.
Creator An entity primarily responsible for making the content of the resource.
Subject A topic of the content of the resource.
Description An account of the content of the resource.
Publisher An entity responsible for making the resource available.
Contributor An entity responsible for making contributions to the content of the resource.
Date A date of an event in the lifecycle of the resource.
Type The nature or genre of the content of the resource.
Format The physical or digital manifestation of the resource.
Identifier An unambiguous reference to the resource within a given context.
Source A reference to a resource from which the present resource is derived.
Language A language of the intellectual content of the resource.
Relation A reference to a related resource.
Coverage The extent or scope of the content of the resource.
Rights Information about rights held in and over the resource.
Curation data for “any human creation”
Dublin Core
Dublin Core is how the VO will interoperate with libraries of the world
A global metadata standard
Prototype Registry
Organization
Data Collection
Project
Service
SIA service
VOViews
VOResource view
Dublin Core view
OAI: Open Archives Initiative Harvesting Protocol
OAI is popular
– Ask your University librarian
Distributed Comprehensive Registry
– Harvesting
Different views for different purposes
– Six blind men and the elephant
OAI Harvesting Protocol
6 magic verbs of OAI
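The slide does not list the verbs; the six defined by the OAI-PMH specification are Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord. A harvesting request is an HTTP query carrying one of them, roughly as sketched here. The registry URL and the helper name are placeholders.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)

def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL; reject anything that is not one of the six verbs."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI verb: {verb}")
    return base_url + "?" + urlencode({"verb": verb, **arguments})

# "http://example.org/oai" is a placeholder registry endpoint.
harvest_url = oai_request_url("http://example.org/oai", "ListRecords",
                              metadataPrefix="oai_dc")
print(harvest_url)
```

Here `oai_dc` is the Dublin Core metadata format that OAI-PMH repositories are required to support, which is one way a registry could expose its Dublin Core view.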
VO Identifiers
ivo://mydomain.com / mySkySurvey # file00037.fits
• URI form
• Still in flux
Authority ID
• Registered with IVOA
• Must correspond to a registry
Resource ID
• Created by Authority
• Resolved by registry
Record ID
• Not known to registry
Delimiters: / and #
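Splitting an identifier into its three parts can be sketched with the standard URL parser. The slide notes the format was still in flux, so this assumes the authority / resource # record layout of the example above; the function name is hypothetical.

```python
from urllib.parse import urlsplit

def parse_ivo_identifier(identifier):
    """Split ivo://authority/resource#record into its three parts."""
    parts = urlsplit(identifier)
    if parts.scheme != "ivo":
        raise ValueError("not an ivo:// identifier")
    return {
        "authority": parts.netloc,             # registered with IVOA
        "resource": parts.path.lstrip("/"),    # resolved by the registry
        "record": parts.fragment or None,      # not known to the registry
    }

print(parse_ivo_identifier("ivo://mydomain.com/mySkySurvey#file00037.fits"))
```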
Image Federation
Multispectral Imagery
Crab Nebula. 3 channels: X-ray in blue, optical in green, and radio in red.
Moffett Field, California. 224 channels from 400 nm to 2500 nm
Image Federation
detection
Stacking allows detection of faint sources. A 1-sigma detection in each of many bands becomes a 3-sigma detection.
Images of the same galaxy taken several days apart are automatically subtracted from one another, and remaining bright spots may be supernova candidates. (NEAT project)
Image subtraction allows detection of narrow-line features that are not also wide-band (eg Hα but not R-band)
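The stacking claim follows from the usual square-root rule: for an equally weighted stack of N bands with independent noise of equal variance, the signal-to-noise ratio grows as the square root of N, so nine 1-sigma bands give a 3-sigma detection. A minimal sketch under those assumptions:

```python
import math

def stacked_snr(per_band_snr, n_bands):
    """SNR of an equally weighted stack, assuming independent, equal-variance noise."""
    return per_band_snr * math.sqrt(n_bands)

def bands_needed(per_band_snr, target_snr):
    """Smallest number of bands whose stack reaches target_snr."""
    return math.ceil((target_snr / per_band_snr) ** 2)

print(stacked_snr(1.0, 9))     # 3.0
print(bands_needed(1.0, 3.0))  # 9
```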
Principal Components
SDSS (5 channel) SDSS+2MASS (8 channel)
Mosaicking and Federation
Every astronomical image has a different projection
• different pointing of the telescope
• We want to mosaic different images
• We want to federate different information
Compute intensive: flux in each pixel is carefully distributed into a new pixel grid
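The idea of distributing flux into a new pixel grid can be shown in one dimension. This is a simplified sketch, not the Montage algorithm: real reprojection works on the sphere in two dimensions, whereas here each input pixel's flux is shared among output pixels in proportion to their overlap along a line. The function name and the edge-list representation are assumptions.

```python
def rebin_flux(flux, old_edges, new_edges):
    """Redistribute per-pixel flux onto a new 1-D grid, conserving total flux.

    flux[i] is the flux in the pixel spanning old_edges[i]..old_edges[i + 1].
    """
    out = [0.0] * (len(new_edges) - 1)
    for i, f in enumerate(flux):
        lo, hi = old_edges[i], old_edges[i + 1]
        width = hi - lo
        for j in range(len(out)):
            overlap = min(hi, new_edges[j + 1]) - max(lo, new_edges[j])
            if overlap > 0:
                # The share of this pixel's flux equals the overlap fraction.
                out[j] += f * overlap / width
    return out

flux = [1.0, 2.0, 3.0, 4.0]
old_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
print(rebin_flux(flux, old_edges, [0.0, 2.0, 4.0]))
print(rebin_flux(flux, old_edges, [0.0, 1.5, 4.0]))
```

Flux is conserved whenever the new grid covers the old one; the cost of checking every input pixel against the output grid is what makes the full-sky version compute intensive.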
Mosaicking
Federation
Infrared map
X-ray map today
X-ray map last year
Atlasmaker
Uses Montage, YourSky
Project
Project Estimate & correct Background
Co-Add
Data
Chart
David Hockney, Pearblossom Highway, 1986
Images and Charts
Image
• Big data
Chart
• Map: sphere → plane
• FITS-WCS header
• small data
An atlas is a collection of charts
Hyperatlas is an attempt to standardize atlases
Hyperatlas
Standard naming for atlases and vcharts
TM-5-SIN-20
Vchart TM-5-SIN-20-1589
Standard Scales: scale s means 2^(20-s) arcseconds per pixel
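The naming and scale conventions can be sketched as below. Two assumptions are made: the scale rule is read as 2 to the power (20 - s) arcseconds per pixel, so that scale 20 is 1 arcsecond per pixel, and a chart name is read as layout, projection, scale, then chart number. The function names are hypothetical.

```python
def arcsec_per_pixel(scale):
    """Pixel size for a standard scale, read as 2**(20 - scale) arcseconds."""
    return 2.0 ** (20 - scale)

def parse_chart_name(name):
    """Split a name such as TM-5-SIN-20-1589 into its assumed fields."""
    layout, layout_n, projection, scale, chart = name.split("-")
    return {
        "layout": f"{layout}-{layout_n}",  # e.g. the TM-5 layout
        "projection": projection,          # e.g. SIN or TAN
        "scale": int(scale),
        "chart": int(chart),               # assumed to be the chart number
    }

info = parse_chart_name("TM-5-SIN-20-1589")
print(info)
print(arcsec_per_pixel(info["scale"]))  # 1.0
```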
SIN projection
TAN projection
TM-5 layout
HV-4 layout
Standard Projections
StandardLayout
Parallel Atlasmaker
MPI Parallelism
• ~2% serial work (Amdahl)
• Projection is parallel
• All nodes share filespace
Making a single Image
Making an Atlas of 1736 Images
Teragrid Distributed
• Federated Scheduling wanted
• SRB as Virtual Data Catalog
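The consequence of roughly 2% serial work can be worked out from Amdahl's law, speedup = 1 / (s + (1 - s) / N) for serial fraction s on N nodes. The node counts below are illustrative, not figures from the talk.

```python
def amdahl_speedup(serial_fraction, n_nodes):
    """Amdahl's law: speedup on n_nodes when serial_fraction cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_nodes)

def max_speedup(serial_fraction):
    """Limit of the speedup as the number of nodes grows without bound."""
    return 1.0 / serial_fraction

for n in (8, 64, 512):
    print(n, round(amdahl_speedup(0.02, n), 1))
print("limit", round(max_speedup(0.02), 1))
```

With 2% serial work the speedup can never exceed about 50, however many nodes are added, which is why the parallel projection step dominates the design.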
Atlasmaker Architecture
NVO/IVO
NED
Sloan
DPOSS
FIRST
[2MASS]
NVO Protocol
making atlas pages
scale
reproject
compress
sky index
Virtual Data System
YourSky
VirtualSky
Oasis
VIEW Bus
federation
data mining
Hyperatlas service
SIAP services
Atlasmaker Virtual Data System
Metadata repositories: federated by OAI
Data repositories: federated by SRB
Compute resources: federated by TG/IPG
User request
Request manager
Mosaicked data is on file
2a. Mosaicked data is not on file
2b. Get raw data from NVO resources
2c. Compute on TG/IPG
2d. Store result & return result
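The request manager's decision is a compute-on-miss cache, which can be sketched as follows. The function names, the dictionary used as the store, and the stand-in fetch and compute callables are all hypothetical; in the architecture above they would correspond to SRB storage, NVO image services, and TG/IPG compute resources.

```python
def handle_request(region, store, fetch_raw, compute_mosaic):
    """Return the mosaic for region, computing and storing it only if needed."""
    if region in store:            # mosaicked data is on file
        return store[region]
    raw = fetch_raw(region)        # 2b: get raw data from NVO resources
    mosaic = compute_mosaic(raw)   # 2c: compute (on TG/IPG in the real system)
    store[region] = mosaic         # 2d: store result
    return mosaic                  # 2d: return result

# Stand-ins that count how often the expensive steps run.
calls = {"fetch": 0, "compute": 0}

def fetch_raw(region):
    calls["fetch"] += 1
    return [f"raw image for {region}"]

def compute_mosaic(raw):
    calls["compute"] += 1
    return {"mosaic_of": raw}

store = {}
first = handle_request("TM-5-SIN-20-1589", store, fetch_raw, compute_mosaic)
second = handle_request("TM-5-SIN-20-1589", store, fetch_raw, compute_mosaic)
print(first is second, calls)
```

The second request is served from the store, so the raw data is fetched and mosaicked only once.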
Atlasmaker stack
Mosaicking(executables)
Atlasmaker(script)
Hyperatlas(service)
NVO Image Access(service)
SRB(service)
webMontage YourSky
Virtual Data System-- Chimera?
Charts and Pages
Chart – a frame for specific data
Page – an organization for data
The virtual disk is 400,000 pixels wide
SIN projection
Background Correction
Uncorrected Corrected
Montage Background Correction
Project pixels to output chart
Fit ramps on overlap regions
Fit ramps on projected images
Subtract from Pixel values
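The ramp-fitting step can be illustrated in one dimension with two overlapping strips. This is a deliberately reduced sketch, not Montage itself: it treats the first strip as the reference, fits a straight-line ramp to the difference on the overlap by least squares, and subtracts that ramp from the second strip. All names and the sample values are made up for the example.

```python
def fit_ramp(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def correct_background(reference, image):
    """Subtract from image the ramp fitted to (image - reference) on their overlap.

    Both arguments map pixel position -> value on the common output grid.
    """
    overlap = sorted(set(reference) & set(image))
    diffs = [image[x] - reference[x] for x in overlap]
    intercept, slope = fit_ramp(overlap, diffs)
    return {x: v - (intercept + slope * x) for x, v in image.items()}

# Reference strip covers x = 0..5; second strip covers x = 3..8
# and carries an extra background ramp of 2 + 0.5 * x.
reference = {x: 100.0 + x for x in range(0, 6)}
image = {x: 100.0 + x + 2.0 + 0.5 * x for x in range(3, 9)}
corrected = correct_background(reference, image)
print(corrected)
```

With many images there is no single reference, so the ramps for all overlapping pairs have to be reconciled together; the sketch only shows the fit-and-subtract step for one pair.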