26 October 2004, ADASS 2004 - Pasadena
Publishing and Resource Discovery with Registries
Ray Plante
THE US NATIONAL VIRTUAL OBSERVATORY
Kevin Benson, Sebastien Derriere, Pierre Fernique, Matthew Graham, Gretchen Greene,
Bob Hanisch, Paul Harrison, Martin Hill, Jeongin Lee, Gerard Lemson,
Tony Linde, Tom McGlynn, Wil O’Mullane, Keith Noddle, Ramon Williamson
Visit the NVO Demo Booth
Summary (2003)
• We built a working prototype registry system to support an end-user VO service
  – Distributed publishing and searchable components
  – Encoded descriptions using emerging VO XML standard schemas
  – OAI harvesting standard deployed easily
  – Used to discover Cone Search and SIA services
• What’s next: interoperable registries IVOA-wide
  – Stabilize XML metadata standard
  – Standardize registry interfaces
Summary (2004)
• We built a working production registry system to support end-user VO services
  – DataScope: discovers Cone Search, Simple Image Access services
  – OpenSkyQuery Portal: discovers OpenSkyNodes
• What’s next: interoperable registries IVOA-wide
  – Stabilize XML metadata standard
  – Standardize registry interfaces
  => IVOA: frozen working draft standard for January ’05 releases
Registries 2004
• Review of Registry architecture
• Resource Metadata Model
• IVOA Registry Interface Standard
  – Harvesting
  – Searching
• The NVO Publishing Process
• Searching for Resources
• Curation Issues
The role of Resource Registries
• Used to discover and locate resources (data and services) that can be used in a VO application
• Resource: anything that is describable and identifiable
  – Besides data and services: organizations, projects, software, …
  – Presently concerned with a simple set of resource types
• Registry: a list of resource descriptions
  – Expressed as structured metadata to enable automated processing and searching
Selected Requirements
• Allow user to select resources that are likely to pertain to a scientific question
• Select resources based on characteristics…
  – Type of resource: catalogs, image archives, EPO, services
  – Coverage in space, time, and frequency
  – Where data comes from, who curates it
• Dynamic: resources will come and go
• Distributed: should not depend on a single point of failure or single view of the VO
• Preserve the data providers’ control over their data
  – Curators control what gets registered, content, updates
  – Allow integration with existing resource management
• Allow extension to new types of resources
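The "select resources based on characteristics" requirement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Description` fields, the waveband labels, and the `ivo://example.org/...` identifiers are hypothetical and are not taken from any VO schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Description:
    """A toy resource description (hypothetical fields, not a VO schema)."""
    identifier: str
    resource_type: str           # e.g. "catalog", "image archive", "service"
    wavebands: Tuple[str, ...]   # coarse frequency coverage
    curator: str                 # who curates the resource


def select(descriptions: List[Description],
           resource_type: Optional[str] = None,
           waveband: Optional[str] = None) -> List[Description]:
    """Return the descriptions that satisfy every constraint supplied."""
    matches = []
    for d in descriptions:
        if resource_type is not None and d.resource_type != resource_type:
            continue
        if waveband is not None and waveband not in d.wavebands:
            continue
        matches.append(d)
    return matches


descriptions = [
    Description("ivo://example.org/xcat", "catalog", ("x-ray",), "Example Center A"),
    Description("ivo://example.org/oimg", "image archive", ("optical",), "Example Center B"),
    Description("ivo://example.org/ocat", "catalog", ("optical", "infrared"), "Example Center B"),
]

optical_catalogs = select(descriptions, resource_type="catalog", waveband="optical")
```

Because each description is structured rather than free text, adding a further constraint (for example on curator) is one more comparison in `select`.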
IVOA Registry Working Group (RWG)
IVOA = International Virtual Observatory Alliance
• Common, global approach to registries
• Towards a standard framework
  – Registry Model
  – Resource Identifiers
  – Metadata schemas
  – Registry Interface
• Distributed model for registries
Registry Model
[Figure, built up over slides 8 to 14: Data Centers, VO Projects, and Specialized Portals & Services, shown with Local Publishing Registries, a Local Searchable Registry, and Full Searchable Registries. Labels added step by step: harvest (pull), replicate, selective harvesting, and finally Client Applications sending search queries.]
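The pull-style harvesting in the Registry Model can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class names, the integer change counter used in place of real datestamps, and the `accept` filter used to mimic selective harvesting are all hypothetical and do not describe how any NVO registry is implemented.

```python
class PublishingRegistry:
    """Holds the descriptions a provider chooses to publish."""

    def __init__(self, name):
        self.name = name
        self.clock = 0        # stand-in for a real datestamp
        self._records = {}    # identifier -> (stamp, description)

    def publish(self, identifier, description):
        self.clock += 1
        self._records[identifier] = (self.clock, description)

    def changed_since(self, stamp):
        """Records added or updated after the given stamp."""
        return {i: rec for i, rec in self._records.items() if rec[0] > stamp}


class SearchableRegistry:
    """Pulls records from publishers; `accept` models selective harvesting."""

    def __init__(self, accept=lambda description: True):
        self.accept = accept
        self.records = {}
        self._last_seen = {}  # publisher name -> stamp of last harvest

    def harvest(self, publisher):
        since = self._last_seen.get(publisher.name, 0)
        pulled = 0
        for identifier, (_, description) in publisher.changed_since(since).items():
            if self.accept(description):
                self.records[identifier] = description
                pulled += 1
        self._last_seen[publisher.name] = publisher.clock
        return pulled


publisher = PublishingRegistry("example-datacenter")
publisher.publish("ivo://example.org/cone", {"type": "ConeSearch"})
publisher.publish("ivo://example.org/org", {"type": "Organisation"})

full = SearchableRegistry()                                    # takes everything
local = SearchableRegistry(lambda d: d["type"] == "ConeSearch")  # selective

first_pull = full.harvest(publisher)
second_pull = full.harvest(publisher)   # nothing changed since the first pull
local_pull = local.harvest(publisher)
```

The publisher never pushes: each searchable registry decides when to harvest and what to keep, which is what leaves the data provider in control of its own records.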
Registries in Use: DataScope
[Figure, built up over slides 15 to 17: Local Publishing Registries (labels: Caltech, NCSA, HEASARC, CDS) and Full Searchable Registries (labels: JHU/STScI, AstroGrid), linked by harvest (pull). DataScope (DS) searches a searchable registry for services; Data Providers offer Cone Search and Simple Image Access services.]
Registries in Use
• Registries in the NVO are currently operating and functional
  – DataScope: discovers Cone Search, Simple Image Access (SIA) services
  – OpenSkyQuery Portal: discovers OpenSkyNodes
  – CDS Aladin/GLU (Pierre Fernique):
    • Harvests Cone Search and SIA services
    • Converts them into GLU dictionary records
    • Accessible directly by the Aladin image and catalog viewer
• AstroGrid Registry: foundation for building workflows
  – Portal uses descriptions to stitch services together
  – (Previous talk by Keith Noddle)
• Cross-project harvesting
  – NVO, AstroGrid, AVO (Vizier, GLU)
• Registries are at the leading edge of VO development
Resource Metadata Model
[Figure, built up over slides 19 to 23.
IVOA Recommendation: Resource Metadata.
IVOA Working Draft: VOResource, the core metadata as XML, defining Resource, Organisation, and Service.
Extension schemas and the types they add:
  – VORegistry: Authority, Registry
  – VODataService: DataCollection, SkyService, TabularSkyService
  – SIA: SimpleImageAccess
  – ConeSearch: ConeSearch
  – VOCEA: CEAApplication, CEAService]
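The layering of a core model plus extension schemas can be mimicked with class inheritance. The sketch below is a loose analogy only: the type names come from the slide, but every field is hypothetical, and the choice to derive `ConeSearch` from `TabularSkyService` is an assumption about how the layers stack rather than something the slide states.

```python
from dataclasses import dataclass


@dataclass
class Resource:                      # core (VOResource)
    identifier: str
    title: str


@dataclass
class Organisation(Resource):        # core (VOResource)
    pass


@dataclass
class Service(Resource):             # core (VOResource)
    access_url: str = ""             # hypothetical field


@dataclass
class SkyService(Service):           # extension (VODataService)
    wavebands: tuple = ()            # hypothetical field


@dataclass
class TabularSkyService(SkyService): # extension (VODataService)
    columns: tuple = ()              # hypothetical field


@dataclass
class ConeSearch(TabularSkyService): # extension (ConeSearch); base is assumed
    max_radius_deg: float = 0.0      # hypothetical field


def summarize(resource: Resource) -> str:
    """Works on any resource type, including ones added by later extensions."""
    return "%s: %s (%s)" % (type(resource).__name__, resource.title, resource.identifier)


cone = ConeSearch("ivo://example.org/cone", "Example Cone Search",
                  access_url="http://example.org/cone?", max_radius_deg=1.0)
org = Organisation("ivo://example.org/org", "Example Observatory")
lines = [summarize(r) for r in (cone, org)]
```

Code written against the core types keeps working when a new extension type appears, which is the point of an extensible model.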
IVOA Working Draft: Registry Interface (RI) Standard
Kevin Benson (AstroGrid), Editor
• Harvesting: delivering resource descriptions from publishers to searchable registries
  – Adoption of Open Archives Initiative (OAI) standard: Protocol for Metadata Harvesting, http://www.openarchives.org/
  – RI defines application of OAI to VO resource records
    • Plug in VOResource as metadata format
  – Optional SOAP version to augment HTTP GET standard
• Searching
  – Returns XML VOResource records
  – Keyword search
  – Advanced search
    • Uses the Astronomical Dataset Query Language (ADQL)
    • Refer to metadata items via a simplified XPath
  – Easily mapped to either SQL for an RDBMS implementation or XQuery for an XML DB implementation
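A harvester's side of the OAI protocol can be sketched as follows. The `verb`, `metadataPrefix`, and `from` parameters and the response namespace belong to OAI-PMH itself; the base URL, the `ivo_vor` prefix value, and the sample response are assumptions made for illustration, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_records_url(base_url, metadata_prefix, from_date=None):
    """Build an OAI-PMH ListRecords request (HTTP GET form)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # only records changed since this date
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Extract the header identifiers from a ListRecords response."""
    root = ET.fromstring(response_xml)
    path = "/".join("{%s}%s" % (OAI_NS, tag)
                    for tag in ("ListRecords", "record", "header", "identifier"))
    return [element.text for element in root.findall(path)]


# Hypothetical endpoint and prefix; a real registry publishes its own values.
url = list_records_url("http://registry.example.org/oai", "ivo_vor", "2004-10-01")

# Hand-written sample response, trimmed to the headers only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>ivo://example.org/cone</identifier>
      <datestamp>2004-10-01</datestamp>
    </header></record>
    <record><header>
      <identifier>ivo://example.org/sia</identifier>
      <datestamp>2004-10-02</datestamp>
    </header></record>
  </ListRecords>
</OAI-PMH>"""

identifiers = record_identifiers(SAMPLE)
```

In a real response each record also carries a metadata element holding the resource description, and long result lists are paged with a resumption token; both are omitted here for brevity.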
Publishing to the NVO
http://www.us-vo.org/publish.cfm
• Resources are published if one can use VO facilities to find them.
• Multiple layers of publishing
  – Starts with registry description of resource
  – Data Access Services
  Incremental exposure for incremental effort
• Who are you? How you publish depends on what you want to publish.
  – An individual with a small data collection
  – An archive center
  – Someone with a cool service
    • Extinction Correction Service
      – Developed by C. Miller, K. S. Krughoff
      – In one day of the NVO Summer School using VO tools
Small collections: VO-ready Repositories
• Repositories that allow users to deposit data to share with community
  – Guarantee long-term storage, availability
• Automatic support for VO publishing mechanisms
  – Entries into NVO Registry
  – Support for standard services: Cone Search, SIA, SSA, SkyNode
• Currently available Repositories
  – Images: NCSA Astronomy Digital Image Library, http://adil.ncsa.uiuc.edu/
  – Spectra: Spectrum Services for the VO, http://voservices.net/spectrum/
• More public repositories are expected to emerge. Check NVO website (http://us-vo.org/) for latest.
Persistent Archives: Tools for Federation
• Registering your resources with a public VO publishing registry
[Figure: screenshots labelled STScI Registry and NCSA Registry, with callouts "Choose resource type" and "Edit Form".]
Persistent Archives: Tools for Federation
• Registering your resources with a VO publishing registry
  – Enter description into registration form at one of the available NVO registries:
    • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    • NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/
  – If you have a large number of resources to register, you can run your own registry on your own site
    • NCSA VORegistry-in-a-Box: http://nvo.ncsa.uiuc.edu/VO/software/
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/
Persistent Archives: Tools for Federation
• What can/should you register?
  – Should: your Organization
    • Declares you as a publisher with an ID
  – Should: your Collection
    • Users at least know how to access it via a browser
  – Can: your existing services
    • Browser-based services: e.g. search page
    • Traditional CGI services
    • Web Services
The next level…
• Implement and register one or more standard services
  – Cone Search
  – Simple Image Access
  – SkyNode*
  – Simple Spectral Access*
  *standard still in development
• NVO Summer School software package: server-side templates and toolkits, http://www.us-vo.org/summer-school/
Cool Services: Integrating with the VO
1. Register your service at a registry
2. Integrate support for standard VO formats, schemas
   • FITS and VOTable
     Enable integration with existing tools & visualizers
   • Standard Data Model schemas (emerging)
     • VOResource, Space-time Coordinates, Spectra
     Enable integration with other services using these models
3. Implement Standard Support Interface
   • A standard in development for: self-description, tracking health and usage
Searching the Registry
• Use a searchable registry to find data and services
  – NVO has two searchable registries available:
    • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/
• Two types of searches:
  – Simple keyword-based search
  – Advanced search
    • STScI/JHU: SQL-based
    • Caltech: XQuery-based
• Currently working on user-oriented improvements to interactive interface (G. Greene & W. O’Mullane @ STScI)
  – Help with advanced searches
  – Improved organization of returned results
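Why one advanced-search constraint can serve both an SQL-based and an XQuery-based registry is easier to see in a sketch. This is not either registry's actual translation: the `resources` table, the path-to-column naming rule, the `Curation/Publisher` path, and the `//Resource[...]` expression shape are all assumptions made for illustration.

```python
import re
import sqlite3

# Accept only plain element paths such as "Curation/Publisher".
_PATH = re.compile(r"^[A-Za-z]+(/[A-Za-z]+)*$")


def to_sql(path, value):
    """Map a 'path contains value' constraint to a parameterized SQL query."""
    if not _PATH.match(path):
        raise ValueError("unsupported path: %r" % path)
    column = path.replace("/", "_").lower()   # hypothetical naming rule
    query = "SELECT identifier FROM resources WHERE %s LIKE ?" % column
    return query, ("%" + value + "%",)


def to_xquery(path, value):
    """Map the same constraint to an XPath/XQuery expression."""
    if not _PATH.match(path):
        raise ValueError("unsupported path: %r" % path)
    literal = value.replace('"', '""')         # escape quotes in the literal
    return '//Resource[contains(%s, "%s")]' % (path, literal)


# Exercise the SQL form against a throwaway in-memory table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE resources (identifier TEXT, curation_publisher TEXT)")
db.executemany("INSERT INTO resources VALUES (?, ?)",
               [("ivo://example.org/a", "NCSA"), ("ivo://example.org/b", "CDS")])

sql, params = to_sql("Curation/Publisher", "NCSA")
rows = db.execute(sql, params).fetchall()
xquery = to_xquery("Curation/Publisher", "NCSA")
```

The path is validated before it is spliced into the query text, and the search value is always passed as a bound parameter, so user input never becomes SQL.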
Accessing the Registry from Applications
• Custom Web Service interfaces available
  – Keyword and advanced search functions
  – Currently used by DataScope and SkyPortal
• IVOA Standard Web Service interface
  – Full support targeted for January 2005 roll-out
  – Beta support available from Caltech Carnivore
• Available Java client software
  – Currently available via NVO Summer School software distribution
    • Zip file: http://chart.stsci.edu/twiki/bin/view/Main/Software
    • HowTos: http://chart.stsci.edu/twiki/bin/view/Main/NVOSummerSchoolCourseNotes
  – Includes:
    • Client library for IVOA Standard search interface
    • Sample client code for both custom and standard interfaces
Curation Issues
• NVO Registries now contain over 3000 records
  Lots of problematic metadata:
  – Missing information, incorrect usage, truncated values
  – Duplicates, deprecated records, missing resources
  – Broken/non-compliant services
• People need to assume responsibility for curation
  – Software can help, but is not sufficient
  – Role of Registry administrator?
A practical approach to Curation
• Proposal: “VerificationLevel” tag attached to resource descriptions by a registry curator
  – 3 levels:
    • Unverified
    • Verified by software
    • Verified by human curator
  – Tag exposed to users/apps: e.g. select only highly verified resources
  – Tag is specific to a registry; can be overridden when harvested by another registry
• Software verification
  – NCSA: building a suite of software verifiers
  – Can be incorporated directly into registries, either locally or by calling a remote web service
  – First example: Cone Search Verifier, http://nvo.ncsa.uiuc.edu/services/csvalidate.html
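The proposed tag could behave roughly as in the sketch below. The three level names follow the proposal; the numeric ordering, the dictionary-based record, and the function names are assumptions made for illustration, since the proposal does not fix an encoding.

```python
from enum import IntEnum


class VerificationLevel(IntEnum):
    # Ordering is an assumption: higher means more trusted.
    UNVERIFIED = 0
    VERIFIED_BY_SOFTWARE = 1
    VERIFIED_BY_CURATOR = 2


def select_verified(records, minimum):
    """Keep only records at or above the requested verification level."""
    return [r for r in records if r["verification_level"] >= minimum]


def harvest(record, override_level=None):
    """Copy a record into another registry, optionally re-tagging it there.

    The tag is specific to a registry, so the harvesting registry may
    replace it without touching the original record.
    """
    copy = dict(record)
    if override_level is not None:
        copy["verification_level"] = override_level
    return copy


source_records = [
    {"identifier": "ivo://example.org/a",
     "verification_level": VerificationLevel.VERIFIED_BY_CURATOR},
    {"identifier": "ivo://example.org/b",
     "verification_level": VerificationLevel.VERIFIED_BY_SOFTWARE},
    {"identifier": "ivo://example.org/c",
     "verification_level": VerificationLevel.UNVERIFIED},
]

# The harvesting registry has not checked anything itself yet.
harvested = [harvest(r, VerificationLevel.UNVERIFIED) for r in source_records]

trusted_at_source = select_verified(source_records,
                                    VerificationLevel.VERIFIED_BY_SOFTWARE)
trusted_after_harvest = select_verified(harvested,
                                        VerificationLevel.VERIFIED_BY_SOFTWARE)
```

An application that asks only for highly verified resources would pass `VerificationLevel.VERIFIED_BY_CURATOR` as the minimum.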
Summary 2004
• NVO is operating production registries
  – Serving end-user applications
  – Greater emphasis on user interfaces
  – Registry searches easily integrated into applications
  – Full release of latest improvements by January 2005
• Interoperable exchange between IVOA registries
• Extensible Resource Metadata model
• IVOA Registry Interface Standard is emerging
What’s next: shift from development to curation
• Finalize RI standard
• Address curation issues
• No talk on registries next year