Registration and Harvest IIB Presentation May 1, 2014
Registration and Harvest
IIB Presentation, May 1, 2014
1
GEOSS and Registration
• The ‘pledge’ of data or services to GEO requires the act of registration
• Registration collects basic information about the resource and its context (e.g. related SBAs, availability, extent) called “metadata”
• Data or system owners can register metadata with GEOSS:
  – If you have never described your dataset or service with metadata, use the form at geossregistries.info
  – If you have a catalog of metadata that is not registered with data.gov, register the catalog with geossregistries.info
2
You may be registered already
• You don’t need to do anything if:
  – You have created metadata for your dataset or service in an agency, national, or professional catalog (e.g. INSPIRE, data.gov.uk) and it is already registered with GEOSS
  – The data or service resource is already described with a current DIF or SERF in the CEOS IDN
• But…
  – Be sure to include any search tags (e.g. geossdatacore, EO Vocabulary) in your metadata, and provide context for any URLs in the metadata as to format or protocol
3
Proposed Changes
• General concern about the complexity of registration and the fact that items can be registered either in CSR or the DAB
• DAB functionality now includes support for harvest of remote collections as well as distributed search
• GEOSS Clearinghouse function can be deprecated
• A single registration facility is required to pledge (self-descriptive) metadata, catalogs, or services for processing/integration with the DAB – along with a ‘last resort’ catalog to collect full metadata by form
4
Process
• Consolidated Requirements for the GCI were prepared in advance of selection of a GEO Web Portal. Included:
  – Common requirements across components to enable interoperability
  – Component-specific requirements
  – Executable tests to verify items that could be tested for assessment
• UML diagrams were developed to describe the use cases, required information classes, and interactions between components and beyond to access of pledged resources
• DAB is just now being included in the architecture of GCI, along with requirements, and its interaction with the rest
5
Data Publishing in GEOSS
[Diagram. Labels shown: Data Systems; Metadata Editor / Manager; Structured Metadata; National or Community Metadata Catalogs; Data.gov; Agency catalog; IDN; Component and Service Registry; geossregistries.info; form; Standards Registry (SIR); share type codes; Options; also registered; CSW, opensearch; GEOSS DAB*; GEOSS Portal; geoportal.org; Client Software; Users (browser); Links to access data; URLs to data, maps, web services (HTTP)]
*GEOSS Discovery and Access Broker
1. Create metadata
2. Save in catalog
3. Indexed by DAB
4. Search via Portal
5. Link via metadata
GEOSS Current Architecture
[Diagram. Labels shown: Resource Providers; Member and PO Resources; Agency Services and SW Applications: Web services, Applications, Community Portals; Official Catalogs; Centralized EO Inventory Clients (GENESI*, CWIC*, FedEO*); GEOSS Common Infrastructure; Component & Service Registry (CSR); Clearinghouse (Catalog); Discovery and Access Broker* (DAB); GEO Web Portal; GEOSS User; Monitoring Services; GEOSS Registries: Standards & Interoperability, User Requirements, Best Practices Wiki, Semantic. Flow labels: Register; Access; harvest; metadata links to; real-time search]
GEOSS Proposed Architecture
[Diagram. Labels shown: Resource Providers; Member and PO Resources; Agency Assets; Official Catalogs; Centralized EO Inventory Clients (GENESI*, CWIC*, FedEO*); GEOSS Common Infrastructure; Discovery and Access Broker* (DAB); GEO Web Portal; GEOSS User; Monitoring Services; Users; Collections; Data; Metadata; GEOSS Registries: Standards & Interoperability, User Requirements, Best Practices Wiki, Semantic. Flow labels: Register; Access; harvest or search; metadata links to; real-time search; Create and manage; access data]
Contacts and Systems Registration
• Contact management is required, capturing the affiliation of individuals to their immediate organizations and also with GEO Members and Participating Organizations
• GEO DAB requires a minimum of configuration information to effect search and access functions on pledged Discovery Assets – metadata, catalogs, and self-descriptive services
• New focus on registering and configuring these Discovery Assets that provide access to structured metadata as a means to find data, services, applications, models, documents, websites, etc.
• This fundamentally simplifies the function and scope, like a “Yellow Pages”
10
Contact information
• Originally envisioned in the architecture was a mechanism to capture and manage GEO Member and Participating Organization (M&PO) information
• Suggest the ability to also manage individual user information and associations with GEO M&PO (Could we use OpenID to manage email-based identities?)
• Users in this system would then log in to register Discovery Assets; pledged resources would be tracked by organization, to be maintained for internal reporting
• Do not want to manage significant Personally Identifiable Information (PII) that is not already managed elsewhere
• Could still allow non-affiliated registrations…
11
Discovery Assets
• Discovery Assets that the DAB can configure include URLs to:
  – Individual metadata files
  – Collections or series of metadata files
  – Metadata catalog services (CSW, opensearch, OAI-PMH)
  – Service endpoints (OGC WxS, OPeNDAP, RESTful)
  – Potentially also websites/pages that are self-descriptive using microcodes or Schema.org tags
• Entries hosted in the Contributor’s Catalog
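
As a rough illustration of what "configuring" a catalog or service URL might involve, the sketch below builds the standard self-description request for a few of the protocols listed above (CSW and OGC WxS GetCapabilities, OAI-PMH Identify). The function name, the protocol table, and the example.org endpoints are assumptions made for illustration; nothing here is taken from the DAB itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters that ask a service to describe itself.
# Hypothetical lookup table; a real broker would support more protocols.
PROBE_PARAMS = {
    "CSW": {"service": "CSW", "version": "2.0.2", "request": "GetCapabilities"},
    "WMS": {"service": "WMS", "request": "GetCapabilities"},
    "WFS": {"service": "WFS", "request": "GetCapabilities"},
    "WCS": {"service": "WCS", "request": "GetCapabilities"},
    "OAI-PMH": {"verb": "Identify"},
}


def probe_url(endpoint: str, protocol: str) -> str:
    """Return the URL that requests the endpoint's self-description."""
    params = PROBE_PARAMS[protocol.upper()]  # KeyError if protocol is unknown
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))  # keep any existing query parameters
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(probe_url("https://catalog.example.org/csw", "CSW"))
    print(probe_url("https://repo.example.org/oai", "OAI-PMH"))
```

Individual metadata files and self-descriptive web pages need no such request; their URL is fetched directly.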
12
Discovery Asset Registration
• Only a few fields of information need to be collected:
  – URL to the resource
  – Description of the resource
  – Protocol used (HTTP, REST, OGC WxS, OPeNDAP, FTP)
  – Format of the resource (file type: e.g. XML, json, KML, GeoRSS, netCDF, etc.), directory
  – Schema reference of resource, if applicable
  – Contact email
  – (Organization link, GEO MP link, date-time stamp possible)
• These will allow for configuration of harvest/access in DAB
• Reference to standard protocol and format standards should be drawn from a published codelist in the SIR (OSGeo work)
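
The field list above could be captured in a small record type. The following is a minimal sketch, assuming a Python dataclass; the class name, field names, and the two hard-coded sets are hypothetical stand-ins (the slide says the real protocol and format values should come from the SIR codelist).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional
from urllib.parse import urlsplit

# Stand-ins for the published SIR codelist (assumption, not the real list).
PROTOCOLS = {"HTTP", "REST", "OGC WxS", "OPeNDAP", "FTP"}
FORMATS = {"XML", "JSON", "KML", "GeoRSS", "netCDF"}


@dataclass
class DiscoveryAsset:
    url: str
    description: str
    protocol: str
    format: str
    contact_email: str
    schema_ref: Optional[str] = None    # only if applicable
    organization: Optional[str] = None  # optional organization link
    registered_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def problems(self) -> List[str]:
        """Return a list of validation problems; empty means acceptable."""
        issues = []
        parts = urlsplit(self.url)
        if parts.scheme not in {"http", "https", "ftp"} or not parts.netloc:
            issues.append("url must be an absolute http(s) or ftp URL")
        if not self.description.strip():
            issues.append("description is required")
        if self.protocol not in PROTOCOLS:
            issues.append(f"unknown protocol: {self.protocol}")
        if self.format not in FORMATS:
            issues.append(f"unknown format: {self.format}")
        if "@" not in self.contact_email:
            issues.append("contact_email does not look like an email address")
        return issues
```

Returning a list of problems rather than raising on the first one lets a registration form report every issue to the publisher at once.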
13
Contributor’s Catalog
• Currently in CSR there is the ability to form-enter metadata when there is no community, national, or organizational catalog otherwise available
• Suggest the establishment of a form-based ISO metadata editor and a catalog for the GCI to host these records and also the registration of Discovery Assets
• This would also enable metadata management for few, individual records by end-user publishers
• This would need to be added to the architecture (green bubble) diagram and UML models.
• This would be realized through a single form/catalog to hold full records and Discovery Assets
14
15
Roles and responsibilities
• Articulate DAB functions in the Consolidated Requirements
• New core registration and catalog function proposed to be designed jointly between USGS, GMU, and CNR with review and collaboration with the GCI Providers (IN-03)
• Propose to eventually deploy at the GEOSec for them to administer GEO M&PO enrollment and host the Contributor’s Catalog
• Registrations will trigger notifications to the DAB operators
• Need a mechanism to track registrations, their configuration within the DAB, and notification to publishers
• May need to create or use an external OpenID Provider (like Google or Facebook Connect) to support single sign-on
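
One possible shape for the tracking mechanism mentioned above is sketched here: each registration notifies the DAB operators, and marking it as configured notifies the publisher. All class, method, and status names are hypothetical, and the `notify` callback stands in for whatever email or messaging service would actually be used.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Status(Enum):
    REGISTERED = "registered"
    CONFIGURED = "configured in DAB"


@dataclass
class Registration:
    asset_url: str
    publisher_email: str
    status: Status = Status.REGISTERED


class Tracker:
    """Tracks registrations and sends notifications (illustrative only)."""

    def __init__(self, notify: Callable[[str, str], None]) -> None:
        self.notify = notify  # called as notify(recipient, message)
        self.registrations: Dict[str, Registration] = {}

    def register(self, asset_url: str, publisher_email: str,
                 operator_email: str) -> Registration:
        reg = Registration(asset_url, publisher_email)
        self.registrations[asset_url] = reg
        # A new registration triggers a notification to the DAB operators.
        self.notify(operator_email, f"New registration: {asset_url}")
        return reg

    def mark_configured(self, asset_url: str) -> None:
        reg = self.registrations[asset_url]
        reg.status = Status.CONFIGURED
        # Once configured in the DAB, the publisher is told.
        self.notify(reg.publisher_email, f"{asset_url} is configured in the DAB")


if __name__ == "__main__":
    tracker = Tracker(lambda to, msg: print(f"to {to}: {msg}"))
    tracker.register("https://catalog.example.org/csw",
                     "publisher@example.org", "operators@example.org")
    tracker.mark_configured("https://catalog.example.org/csw")
```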
16
Proposed GEOSS Workflow
17
• Register user: Relative to GEO M&PO, agencies, use email address as key identifier
• Describe: Create description of online data, service, or other resource using formal but simple metadata
• Publish: Contribute descriptions to a Community Catalog in a country, COI, or international (e.g. CEOS IDN) – or Contributor’s Catalog for DAB harvest
• Discover: Search for information via Portal UI or catalog API (opensearch or CSW)
• Access: Access information through described APIs / URLs on the resource
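
For the Discover step, a client using an opensearch catalog API typically fills in a URL template published by the catalog. The sketch below shows that substitution under the OpenSearch 1.1 template convention, where `{name}` is required and `{name?}` is optional. The template URL and the function name are made up for illustration and do not describe the actual GEOSS Portal or DAB endpoint.

```python
import re
from urllib.parse import quote

# Matches {name} and {name?}; the trailing "?" marks an optional parameter.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def fill_template(template: str, values: dict) -> str:
    """Substitute values into an OpenSearch-style URL template."""

    def substitute(match: "re.Match") -> str:
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")  # percent-encode the value
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


# Hypothetical template, as a catalog might publish in its description document.
TEMPLATE = (
    "https://portal.example.org/opensearch"
    "?q={searchTerms}&count={count?}&bbox={geo:box?}"
)

if __name__ == "__main__":
    print(fill_template(TEMPLATE, {
        "searchTerms": "sea surface temperature",
        "geo:box": "-10,35,5,45",  # west,south,east,north
    }))
```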
Issues
• GEOSS Portal and DAB developments have been done outside the existing Task framework, stimulated by GEOSec exigencies
• DAB has not been introduced and integrated with the existing GCI architecture
• Roles and responsibilities for NextGen capabilities have not been clarified
18
Next Steps
• Use the GCI Providers calls and membership of IN-03 to document the GCI architecture to include the DAB and nextGen registry
• Simplify and approve an updated Consolidated Requirements doc and reflect on the architecture and relation to overall architecture
• Deploy updated GCI Components to streamline registration (nomination to DAB) and collect metadata as a catalog of last resort
• Develop training materials to promote proper use of GEOSS (let’s get beyond even mentioning GCI)
19