8 September 2006, NVO Summer School 2006 - Aspen
Publishing and Resource Discovery with Registries
Ray Plante, Gretchen Greene
THE US NATIONAL VIRTUAL OBSERVATORY
All about Registries
• Overview of the Registry Framework
• Publishing to the NVO
• VOResource: Resource Metadata in XML
• IVOA Standard Registry Interface
• Exercise: query registry using standard interface
• Exercise: register resources in a registry
The role of Resource Registries
• Used to discover and locate resources—data and services—that can be used in a VO application
• Resource: anything that is describable and identifiable.
  – Besides data and services: organizations, projects, software, …
  – Presently concerned with simple set of resource types
• Registry: a list of resource descriptions
  – Expressed as structured metadata to enable automated processing and searching
An Overview of Data Discovery
• You can search the main NVO registry to find resources based on descriptive criteria
• NVO Registries are “coarse-grained”
  – You can find organizations, archives, catalogs
  – Won’t find images, celestial objects, table records
• Registry framework contains multiple registries:
  – searchable registries
  – publishing registries
Registry Framework
[Figure, built up over six slides. Labels: Data Centers; VO Projects; Specialized Portals & Services; Local Publishing Registry (two); Local Searchable Registry; Full Searchable Registry (two); Client Applications. Arrow labels added in successive builds: harvest (pull), cross-harvest, selective harvesting, search queries.]
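The "harvest (pull)" and "selective harvesting" labels in the Registry Framework diagram can be illustrated with a toy in-memory model. This is only a sketch: the class names, the `list_records` and `harvest` methods, and the timestamp-based notion of selectivity are assumptions made for illustration, not the actual registry software or protocol.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Record:
    identifier: str      # IVOA identifier of the resource
    title: str
    updated: datetime    # when the description was last changed


@dataclass
class PublishingRegistry:
    """Hypothetical publishing registry: holds records and lets others pull them."""
    records: list = field(default_factory=list)

    def list_records(self, since=None):
        # Selective harvesting (assumed here to mean "by date"):
        # return only records changed after `since`.
        return [r for r in self.records if since is None or r.updated > since]


@dataclass
class SearchableRegistry:
    """Hypothetical searchable registry: pulls from publishers on its own schedule."""
    records: dict = field(default_factory=dict)
    last_harvest: dict = field(default_factory=dict)

    def harvest(self, name, publisher, now):
        # Pull model: the searchable registry initiates the exchange.
        pulled = publisher.list_records(since=self.last_harvest.get(name))
        for r in pulled:
            self.records[r.identifier] = r   # a newer copy replaces the old one
        self.last_harvest[name] = now
        return len(pulled)


adil = PublishingRegistry(
    [Record("ivo://adil.ncsa/targeted/SIA", "ADIL SIA", datetime(2004, 4, 5))]
)
full = SearchableRegistry()
first = full.harvest("adil", adil, now=datetime(2006, 9, 1))    # full pull
second = full.harvest("adil", adil, now=datetime(2006, 9, 8))   # nothing new
```

In this sketch, cross-harvesting would simply be one searchable registry calling `harvest` on another one that also exposes `list_records`.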
NVO Public Registries
Registry                    URL                                             Searchable?  Publishing?
STScI/JHU NVO Registry      http://nvo.stsci.edu/voregistry/                Yes          Yes
Caltech Carnivore           http://nvo.caltech.edu:8080/carnivore/          Yes          Yes
NCSA Registration Portal    http://nvo.ncsa.uiuc.edu/nvoregistration.html   No           Yes
Private Publishing Registries (only support harvesting protocol)
• HEASARC
• CDS
Overview of Publishing
• Resources are published if one can use NVO facilities to find them.
• How to Publish to the NVO http://us-vo.org/pub/files/PublishHowto.html
  – Multiple layers of publishing
    • Starts with registry description of resource
    • Data Access Services
    Incremental exposure for incremental effort
  – Who are you? How you publish depends on what you want to publish.
    • An individual with a small data collection
    • An archive center
    • Someone with a cool service
Small collections: VO-ready Repositories
• Repositories that allow users to deposit data to share with the community
  – Guarantee long-term storage, availability
• Automatic support for VO publishing mechanisms
  – Entries into NVO Registry
  – Support for standard services: Cone Search, SIA, SSA, SkyNode
• Currently available Repositories
  – Images: NCSA Astronomy Digital Image Library http://adil.ncsa.uiuc.edu/
  – Spectra: Spectrum Service for the VO http://voservices.net/spectrum/
• Part of an emerging data-preservation effort
  – Focusing on processed products associated with published results
  – Collaboration between NVO, journal publishers, and the library community
  – Goals:
    • data publishing integrated into the journal publishing process
    • data stored in distributed repositories run by academic libraries
    • fully VO compliant
Persistent Archives: Tools for Federation
• Registering your resources with a VO publishing registry
  – Enter description into registration form at one of the available NVO registries:
    • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    • NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html
    • Caltech Carnivore: http://nvo.caltech.edu:8080/carnivore/
  – If you have a large number of resources to register, you can run your own registry on your own site
    • Caltech Carnivore: http://nvo.caltech.edu:8080/carnivore/
Caution: Construction Ahead
• IVOA Standard Registry Interface
  – Will come on-line this fall world-wide
  – As part of this upgrade, NVO will unify publishing interfaces
    • It won’t matter which NVO registry you register with
    • Improved support for all types of resources
  – Will affect how users express advanced, constraint-based searches.
  – In general, this presentation describes the registries and publishing in terms of the new standards
• Your feedback is valuable!
  – Publishing GUI
  – The publishing process
  – Client interfaces
Persistent Archives: Tools for Federation
• What can/should you register?
  – Should: your Organization
    • Declares yourself as a publisher with an ID
  – Should: your Data Collection
    • Users at least know how to access it via a Browser
  – Can: your existing services
    • Browser-based services: e.g. search page
    • Traditional CGI services
    • Web Services
The next level…
• Implement and register one or more standard services
  – Cone Search
  – Simple Image Access
  – Simple Spectral Access*
  – SkyNode
*newest service standard
Cool Services: Integrating with the VO
1. Register your service at a registry
   • Currently…
     • Can register a generic Web Service
     • If service doesn’t fall into supported categories, register it as a generic Service
   • Improved support for non-standard services coming
   • Feel free to let us know where the forms are inadequate
2. Integrate support for standard VO formats, schemas
   • FITS and VOTable
   • Standard Data Model schemas (emerging)
     • VOResource, Space-time Coordinates, Spectra
3. Implement Standard Support Interface
   • a standard in development for: Self-description, tracking health and usage
A word about Identifiers…
• IVOA Identifier: a globally-unique URI identifying a resource
Ex: ivo://adil.ncsa/targeted/SIA
• Required as part of a registered resource description
• As publisher, you control what it looks like
• Two components:
  – Authority ID: e.g. adil.ncsa
    Defines a namespace for identifiers
    Owned by a single publishing organization
  – Resource Key: e.g. targeted/SIA
    Name for the resource unique within the namespace
    Encourage re-use of local identifiers
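As a minimal sketch of the two-component structure, the following splits an identifier into its authority ID and resource key. The function and type names are hypothetical; the code does not validate which characters are allowed, and it assumes an identifier with no resource key (authority only) is acceptable.

```python
from typing import NamedTuple


class IvoId(NamedTuple):
    authority_id: str   # namespace owned by one publishing organization
    resource_key: str   # name unique within that namespace (may be empty)


def parse_ivo_id(identifier: str) -> IvoId:
    """Split an ivo:// URI into its authority ID and resource key."""
    prefix = "ivo://"
    if not identifier.startswith(prefix):
        raise ValueError(f"not an IVOA identifier: {identifier!r}")
    # Everything up to the first "/" is the authority; the rest is the key.
    authority, _, key = identifier[len(prefix):].partition("/")
    if not authority:
        raise ValueError(f"missing authority ID: {identifier!r}")
    return IvoId(authority, key)


example = parse_ivo_id("ivo://adil.ncsa/targeted/SIA")
# example.authority_id is "adil.ncsa"; example.resource_key is "targeted/SIA"
```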
Resource Metadata: XML Schema
• Classes of Resources
  – Generic Resource
  – Extensions: e.g. Organisation, DataCollection, Service, DataService, CatalogService, Registry, …
• Organized into separate schemas:
  – Core resource metadata: VOResource
  – Various extension schemas containing specific types
• Capable of describing…
  – Data centers, research organizations, missions, observatories
  – Data collections, archives
  – VO standard services: Cone Search, Simple Image Access, Simple Spectral Access, SkyNode
  – Existing Browser/CGI-based services
  – Web Services
Resource Metadata: Services
• Service resource records extend the core by adding capability metadata
  – capability = the interfaces/protocols and behavior supported by the service
  – Each standard protocol is considered a different capability
    • A service can support several capabilities (e.g. ConeSearch and SkyNode)
  – There are associated standard capability metadata extensions for standard protocols
    • For Simple Image Access, can state…
      – Maximum number of records returned
      – Maximum query region
      – Whether returned images are cutouts or static images…
• Capability metadata includes a description of the service interface
  – All interface descriptions include a service or access URL
  – For Web Services, access URL is usually sufficient
  – For REST-like interfaces, the inputs can be described in more detail.
• Capability model allows description of support for different versions of protocol standards
Sample Resource Description: adilsia.xml
<Resource xsi:type="vs:CatalogService" created="2003-06-23T19:02:32"
          updated="2004-04-05T17:07:22" … namespace definitions >
  <validationLevel validatedBy="ivo://nvo.ncsa/registry">2</validationLevel>
  <title>NCSA Astronomy Digital Image Library Simple Image Access</title>
  <shortName>ADIL</shortName>
  <identifier>ivo://adil.ncsa/targeted/SIA</identifier>
  <curation>
    <publisher ivo-id="ivo://rai.ncsa/RAI">NCSA Radio Astronomy Imaging</publisher>
    <creator>
      <name>contributing authors</name>
      <logo>http://adil.ncsa.uiuc.edu/images/adilfooter.gif</logo>
    </creator>
    <date>2002-01-01</date>
    <contact>
      <name>Dr. Raymond Plante</name>
      <email>adil@ncsa.uiuc.edu</email>
    </contact>
  </curation>
  <content>
    <description>This allows searching for ADIL images via the SIA protocol.</description>
    <referenceURL>http://adil.ncsa.uiuc.edu/</referenceURL>
    <type>Archive</type>
    <contentLevel>University</contentLevel>
    <contentLevel>Research</contentLevel>
  </content>
Callouts (added one per slide build):
• xsi:type on Resource: the specific class of resource
• validationLevel: Metadata quality rating
• title, shortName, identifier: Identity Metadata: what we call it
• curation: Curation Metadata: who is responsible
• content: Content Metadata: what it contains
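To show how a client might read the identity, curation, and content metadata out of such a record, here is a small sketch using Python's standard `xml.etree.ElementTree`. The record is a trimmed copy of the sample; because the slide elides the namespace definitions, only the standard `xsi` namespace is declared and all elements are assumed to be unqualified. The `summarize` function is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, used for the xsi:type attribute.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Trimmed copy of the sample record (assumption: unqualified element names).
RECORD = """\
<Resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="vs:CatalogService" updated="2004-04-05T17:07:22">
  <validationLevel validatedBy="ivo://nvo.ncsa/registry">2</validationLevel>
  <title> NCSA Astronomy Digital Image Library Simple Image Access </title>
  <shortName>ADIL</shortName>
  <identifier>ivo://adil.ncsa/targeted/SIA</identifier>
  <curation>
    <publisher ivo-id="ivo://rai.ncsa/RAI"> NCSA Radio Astronomy Imaging </publisher>
  </curation>
  <content>
    <contentLevel>University</contentLevel>
    <contentLevel>Research</contentLevel>
  </content>
</Resource>
"""


def summarize(xml_text):
    """Collect a few identity, curation, and content fields into a dict."""
    root = ET.fromstring(xml_text)
    return {
        "class": root.get(f"{{{XSI}}}type"),                  # class of resource
        "validation": int(root.findtext("validationLevel")),  # quality rating
        "title": root.findtext("title").strip(),
        "identifier": root.findtext("identifier"),
        "publisher_id": root.find("curation/publisher").get("ivo-id"),
        "levels": [e.text for e in root.findall("content/contentLevel")],
    }


summary = summarize(RECORD)
```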
Sample Resource Description: adilsia.xml (continued)
  <capability xsi:type="sia:SimpleImageAccess" standardID="ivo://ivoa.net/std/SIA">
    <validationLevel validatedBy="ivo://…">2</validationLevel>
    <interface xsi:type="vs:ParamHTTP" role="std">
      <accessURL use="base">http://adil.ncsa.uiuc.edu/cgi-bin/voimquery?survey=f&amp;</accessURL>
      <queryType>GET</queryType>
      <resultType>application/xml+votable</resultType>
    </interface>
    <imageServiceType>Pointed</imageServiceType>
    <maxQueryRegionSize>
      <long>360.0</long>
      <lat>180.0</lat>
    </maxQueryRegionSize>
    ...
  </capability>
  <coverage>
    <stc:STCResourceProfile>
      <stc:AstroCoordSystem id="UTC-ICRS-TOPO" xlink:href="ivo://STClib/CoordSys#UTC-ICRS-TOPO" xlink:type="simple"/>
      <stc:AstroCoordArea coord_system_id="UTC-ICRS-TOPO">
        <stc:AllSky/>
      </stc:AstroCoordArea>
    </stc:STCResourceProfile>
    <waveband>Radio</waveband>
    <waveband>Millimeter</waveband>
    <waveband>Infrared</waveband>
    <waveband>Optical</waveband>
    <waveband>UV</waveband>
  </coverage>
</Resource>
Callouts (added one per slide build):
• capability: Capability Metadata: what it can do
• xsi:type on capability: the specific class of capability
• coverage: Coverage Metadata: how the data covers the sky, time, & frequency
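A client that wants to call the service needs the access URL of the right capability. The sketch below, again using `xml.etree.ElementTree`, looks up capabilities by their `standardID` and lists the wavebands from the coverage block. It works on a trimmed record with the STC portion left out and unqualified element names assumed; `access_urls` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample; the ampersand is escaped so the XML is well-formed.
RECORD = """\
<Resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="vs:CatalogService">
  <capability xsi:type="sia:SimpleImageAccess" standardID="ivo://ivoa.net/std/SIA">
    <interface xsi:type="vs:ParamHTTP" role="std">
      <accessURL use="base">http://adil.ncsa.uiuc.edu/cgi-bin/voimquery?survey=f&amp;</accessURL>
      <queryType>GET</queryType>
    </interface>
    <imageServiceType>Pointed</imageServiceType>
  </capability>
  <coverage>
    <waveband>Radio</waveband>
    <waveband>Optical</waveband>
  </coverage>
</Resource>
"""


def access_urls(root, standard_id):
    """Return the access URLs of every capability declaring `standard_id`."""
    urls = []
    for cap in root.findall("capability"):
        if cap.get("standardID") == standard_id:
            urls += [u.text.strip() for u in cap.findall("interface/accessURL")]
    return urls


root = ET.fromstring(RECORD)
sia_urls = access_urls(root, "ivo://ivoa.net/std/SIA")
wavebands = [w.text for w in root.findall("coverage/waveband")]
```

Because a service can carry several capabilities, the lookup returns a list rather than a single URL.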
IVOA Standard Registry Interface
• Harvesting Interface
  – Used by registries to exchange resource descriptions.
  – Defined as a profile on the Open Archives Initiative (OAI) harvesting standard
• Search Interface
  – How client applications discover resources
  – 5 operations:
    • getIdentity: returns VOResource description of registry
    • getResource: returns the VOResource description for a given identifier
    • keywordSearch: returns all descriptions that contain words from a given set
    • search: returns all descriptions that match a set of specific constraints
    • xquerySearch: (optional) XQuery-based searching
  – For end users, many of the details of these operations may be hidden behind user-oriented tools and libraries
    • Possible exception: expressing constraint-based searches
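Since the harvesting interface is a profile on the OAI standard, a harvest request is an ordinary OAI-PMH HTTP query. The sketch below only builds such a URL and does not contact any server. The endpoint is a made-up placeholder, and the metadata prefix `ivo_vor` is an assumption about what the IVOA profile uses; check the registry interface specification for the values a given registry accepts.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, since=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since is not None:
        params["from"] = since       # only records changed on or after this date
    if set_spec is not None:
        params["set"] = set_spec     # restrict the harvest to one OAI set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and assumed metadata prefix.
url = list_records_url(
    "http://registry.example.org/oai", "ivo_vor", since="2006-09-01"
)
```

The `from` argument is what makes an incremental pull possible: a harvester remembers when it last ran and asks only for newer records.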
Advanced, constraint-based searching
• Placing constraints on values of specific metadata
  – Expressed as an ADQL “where” clause
      e.g. title like '%Deep Field%' or shortName='HDF'
  – A field or “column” name is expressed as a simple XPath to the element being constrained
    • Relative to the root Resource element
    • Composed of /s and element names only; [ ] predicates and special characters (*, ., .., //) are not allowed
    • Must point to a primitive value, e.g. an element that contains a string
    • Can point to an attribute by preceding its name with @
      (curation/publisher/@ivo-id='ivo://ned.ipac' or curation/publisher like '%IPAC%') and content/contentLevel='Research' and capability/validationLevel >= 3
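To make the "simple XPath" rule concrete, the sketch below resolves such a path against a record and returns the values a constraint would be compared with. It is an illustration of the path rules only, not how any registry evaluates ADQL. The `values_at` helper is hypothetical, element names are assumed unqualified, and prefixed attributes such as `@xsi:type` are deliberately out of scope.

```python
import re
import xml.etree.ElementTree as ET

# A step is a plain name: no predicates, wildcards, "." or ".." allowed.
_NAME = re.compile(r"[A-Za-z_][\w-]*")


def values_at(root, path):
    """Return every primitive value found at a simple path below `root`."""
    steps = path.split("/")
    attr = None
    if steps[-1].startswith("@"):          # a trailing @name selects an attribute
        attr = steps.pop()[1:]
    names = steps + ([attr] if attr is not None else [])
    if not all(_NAME.fullmatch(n) for n in names):
        raise ValueError(f"not a simple path: {path!r}")
    nodes = [root]                          # paths are relative to Resource
    for step in steps:
        nodes = [child for node in nodes for child in node if child.tag == step]
    if attr is not None:
        return [n.get(attr) for n in nodes if n.get(attr) is not None]
    return [(n.text or "").strip() for n in nodes]


record = ET.fromstring(
    "<Resource>"
    "<curation><publisher ivo-id='ivo://rai.ncsa/RAI'>"
    " NCSA Radio Astronomy Imaging </publisher></curation>"
    "<content><contentLevel>University</contentLevel>"
    "<contentLevel>Research</contentLevel></content>"
    "</Resource>"
)
publisher_ids = values_at(record, "curation/publisher/@ivo-id")
levels = values_at(record, "content/contentLevel")
is_research = "Research" in levels   # content/contentLevel='Research'
```

A path can match several elements (two contentLevel values here), so a constraint is naturally read as "any matching value satisfies it"; that reading is an assumption of this sketch.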
Advanced, constraint-based searching
• Searching on xsi:type
  – Avoid including prefix label in xsi:type constraint
      @xsi:type like '%:CatalogService'
      capability/@xsi:type like '%SimpleImageAccess'
      @xsi:type like '%Service'
        – Matches Service, DataService, & CatalogService
• Selecting based on coverage
  – It is generally not useful to apply ADQL constraints to Space-Time Coordinate (STC) metadata
    • e.g. anything under stc:STCResourceProfile
    • STC descriptions are complex and not sufficiently unique
  – Emerging footprint services will facilitate selection based on coverage
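The effect of the `like` patterns on xsi:type values can be checked with a small SQL-style matcher. This is a sketch with assumptions: matching is treated as case-sensitive, `%` and `_` are the only wildcards, and the `vr:` prefixes in the sample values are made up for illustration (only `vs:CatalogService` appears in the sample record).

```python
import re


def like(value, pattern):
    """SQL-style LIKE: % matches any run of characters, _ exactly one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))   # everything else matches literally
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


# Hypothetical xsi:type values; the prefix depends on the record's namespaces.
types = ["vr:Service", "vs:DataService", "vs:CatalogService", "vr:Organisation"]
services = [t for t in types if like(t, "%Service")]
catalogs = [t for t in types if like(t, "%:CatalogService")]
```

Here `services` picks up all three service classes regardless of prefix, while `catalogs` contains only the CatalogService entry, which is why leaving the prefix out of the constraint is the safer choice.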