Discovery of parameterised data access services through hierarchical classification schemes.
The Problems…
• Specific term (e.g. a species name) and services that support queries
  – OBIS has all fauna: how do you find it if looking for “whales”?
  – Can’t classify with all 50K names
• General term (“fauna”) and many services
  – e.g. 100 map layers with whale species distribution maps
Problem restated
• Types of service supported
  – Records
  – Specimens
  – Distribution models
  – Management zones
  – Survey effort
• Sparsely populated (in general)
User wants…
• Samples of services against type
  – link to more
• Type = named data access query template (DAQT)?
  – Think so for phase 1: propose this as a single solution (minimise metadata, maximise consistency)
  – Try stuffing the data in!
  – Review in phase 2
• Implication: don’t know type in advance!
  – Advanced search: browse DAQT by topic
Granularity of DAQT
• WMS = map
• Or
  – Distribution map
  – Sampling map
  – Tracking map
  – Corridors
  – Management zones
• Orthogonal to topic?
Requirements
• Finding services that may contain the data content being searched for
• Avoiding false positives
  – Exclude services that cannot possibly contain the data
  – Allow “no records” where meaningful
• Easily find key general services in masses of specific services
• Clear, predictable rules for managing metadata
Multi-dimensional classifications
• Classify services with three sets of associations/metadata slots:
  – parameterType
  – contentDescriptor
  – contentClassifier
• Search strategies:
  – Specific then general
• Harvest terms, don’t search hierarchy at runtime
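
As a minimal sketch of the three metadata slots, a service record could be modelled as below. The `ServiceRecord` class and its field names are illustrative assumptions, not part of any registry API; the OBIS values follow the examples given in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    """Hypothetical registry entry carrying the three classification slots."""
    name: str
    # Type of ontology term the service accepts as a query parameter.
    parameter_type: str
    # Terms for which the service provides semantic coverage.
    content_descriptor: frozenset = frozenset()
    # More general terms the service may be discovered by (harvested ahead of time).
    content_classifier: frozenset = frozenset()


obis = ServiceRecord(
    name="OBIS",
    parameter_type="speciesTaxonomy:species",
    content_descriptor=frozenset({"fauna"}),
)
print(obis)
```

Keeping the slots as sets reflects that each may be multivalued; a real registry would store them in whatever association model it already supports.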
parameterType
• This identifies the “type” of term from the classification ontology that can be used to parameterise the service interface
  – e.g. OBIS has parameterType=speciesTaxonomy:species
• Does this get used in discovery?
  – Yes, because it tells you that, for a given discovery target, this service may be used if it contains the right content
contentDescriptor
• Contains a set of taxonomy terms for which the service provides “semantic coverage”
• e.g. Whale migration DB has contentDescriptor=speciesTaxonomy:order:cetacea
• Declares what class of content it may be searched for, not how to search
• Can be multivalued: if not all sub-terms are represented, classify by those that are, e.g. Mammalia, Reptilia
Search Strategy 1

| Action | Example | Result |
| --- | --- | --- |
| Enter phrase | “blue whale” | targetTermType=Species; targetTermValue=Balaenoptera musculus; Classifier=class:mammalia; Classifier=family:balaenopteridae |
| Search: parameterType=targetTermType AND contentDescriptor in [targetTermValue, Classifiers] | parameterType=Species AND contentDescriptor in (class:mammalia, family:balaenopteridae etc.) | OBIS (pt=Species, cD=fauna); Whale Distributions DB (pt=Species, cD=cetacea); Blue Whale Sanctuary (pt=Species, cD=species:XX) |
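
One way to read Search Strategy 1 is as a filter over a list of service records, sketched below. The registry contents, the term strings and the function name `search_strategy_1` are all assumptions made for illustration; the two extra services exist only to show a content mismatch and a parameterType mismatch.

```python
def search_strategy_1(services, target_term_type, target_term_value, classifiers):
    """Return names of services whose parameterType equals the target type and
    whose contentDescriptor contains the target value or any classifier."""
    wanted = {target_term_value, *classifiers}
    return [
        s["name"]
        for s in services
        if s["parameterType"] == target_term_type
        and wanted & set(s["contentDescriptor"])
    ]


# Illustrative registry; the term strings are assumptions.
SERVICES = [
    {"name": "OBIS", "parameterType": "Species",
     "contentDescriptor": ["kingdom:animalia"]},
    {"name": "Whale Distributions DB", "parameterType": "Species",
     "contentDescriptor": ["order:cetacea"]},
    {"name": "Reptile Atlas", "parameterType": "Species",
     "contentDescriptor": ["class:reptilia"]},
    {"name": "Whale Zones WMS", "parameterType": "ManagementZone",
     "contentDescriptor": ["order:cetacea"]},
]

hits = search_strategy_1(
    SERVICES,
    target_term_type="Species",
    target_term_value="species:blue whale",
    classifiers=["kingdom:animalia", "class:mammalia", "order:cetacea"],
)
print(hits)
```

The classifiers are passed in rather than looked up, matching the idea that they are populated from a name service before the search runs.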
Good?
• Can discover specific services and more general ones from a search phrase
• Can auto-populate classifier search terms from name service
• Can’t find specific services from a more general term…
contentClassifier
• Describes a specific data set against the more general terms it may be discovered by
  – e.g. Blue Whale Sanctuary can be discovered looking for “whales”
    • Actually searching Order=cetacea
• Fully populated hierarchy
  – Automated generation
  – Easier search
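
The automated generation of a fully populated contentClassifier could look roughly like the sketch below, which walks a child-to-parent map once at registration time so that no hierarchy search is needed at query time. The `PARENT` map is a hypothetical, deliberately shortened branch of a taxonomy, and `harvest_classifiers` is an illustrative name.

```python
# Hypothetical child -> parent links for one (shortened) branch of a taxonomy.
PARENT = {
    "species:blue whale": "order:cetacea",
    "order:cetacea": "class:mammalia",
    "class:mammalia": "kingdom:animalia",
}


def harvest_classifiers(term, parent=PARENT):
    """Return the term followed by all of its ancestors, nearest first."""
    chain = [term]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


classifiers = harvest_classifiers("species:blue whale")
print(classifiers)
```

The harvested list would be stored in the service's contentClassifier slot, so a later search for the general term is a plain membership test.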
Search Strategy 2

| Action | Example | Result |
| --- | --- | --- |
| Enter phrase | “blue whale” | targetTermType=Species; targetTermValue=Balaenoptera musculus; Classifier=class:mammalia; Classifier=family:balaenopteridae |
| Search: contentClassifier in [targetTermValue, Classifiers] | contentClassifier in (class:mammalia, family:balaenopteridae etc.) | OBIS (pt=Species, cD=fauna); Whale Distributions DB (pt=Species, cD=cetacea); Blue Whale Sanctuary (pt=Species, cD=species:XX) |
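
Search Strategy 2 can be sketched the same way, this time matching only on contentClassifier. The example below assumes each service already carries a fully populated classifier list; the registry contents and the name `search_strategy_2` are illustrative. It shows a general term (“whales”, i.e. order Cetacea) finding a species-specific service.

```python
def search_strategy_2(services, search_terms):
    """Return names of services whose contentClassifier contains any search term."""
    wanted = set(search_terms)
    return [s["name"] for s in services if wanted & set(s["contentClassifier"])]


# Illustrative registry with pre-harvested classifier lists (assumed values).
SERVICES = [
    {"name": "Blue Whale Sanctuary",
     "contentClassifier": ["species:blue whale", "order:cetacea",
                           "class:mammalia", "kingdom:animalia"]},
    {"name": "Whale Distributions DB",
     "contentClassifier": ["order:cetacea", "class:mammalia", "kingdom:animalia"]},
    {"name": "Reptile Atlas",
     "contentClassifier": ["class:reptilia", "kingdom:animalia"]},
]

whale_hits = search_strategy_2(SERVICES, ["order:cetacea"])
print(whale_hits)
```

Searching on a very general term such as `kingdom:animalia` would return every service in this registry, which is the large-hit-count concern with this strategy.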
Issues
• Potentially large number of hits against general term
• But how do you know until you try?
  – MaxResults strategy good enough…
  – But don’t want to lose more specific results
• The closer the parameterType is to the search term (in hierarchy) the more relevant?
• Parameterised service reflects governance?
Search workflow
• Search strategies 1 and 2 are quite different
  – Can they be reconciled?
• Search workflow:
  – Search (type 1)
  – Wider search if results not found
  – Search type 2
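
One possible reading of this workflow is a chain of fallbacks, sketched below. The slide does not say whether the “wider search” reuses strategy 1 with broader terms, so that step is an assumption, as are the function names and the toy indexes standing in for the two strategies.

```python
def search_workflow(search_1, search_2, specific_terms, wider_terms):
    """Try strategy 1, then strategy 1 with wider terms, then strategy 2."""
    hits = search_1(specific_terms)
    if hits:
        return hits
    hits = search_1(wider_terms)  # assumed meaning of "wider search"
    if hits:
        return hits
    return search_2(wider_terms)


def lookup(index):
    """Build a toy search function over a term -> service names index."""
    return lambda terms: [name for t in terms for name in index.get(t, [])]


# Toy stand-ins for the two strategies (assumed data).
index_1 = {"order:cetacea": ["Whale Distributions DB"]}
index_2 = {"order:cetacea": ["Blue Whale Sanctuary"]}

found = search_workflow(
    lookup(index_1), lookup(index_2),
    specific_terms=["species:blue whale"],
    wider_terms=["order:cetacea"],
)
print(found)
```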
“Brute force” alternative
• Search 1 ∪ Search 2
  – 1 or 2 calls?
• Max results
• Ordering: Search 1 > Search 2
  – Or max for each?
• Related services: may be the key?
  – Two different models for species distribution
  – Archive vs current boundaries
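
The two capping options raised above can be compared with a small sketch. Both functions are hypothetical helpers operating on result lists that have already been fetched; they place Search 1 hits ahead of Search 2 hits and drop duplicates.

```python
def union_search(hits_1, hits_2, max_results):
    """Union with Search 1 ordered first, then one overall cap."""
    merged = list(dict.fromkeys([*hits_1, *hits_2]))  # keeps first occurrence, in order
    return merged[:max_results]


def union_search_capped_each(hits_1, hits_2, max_each):
    """Union with a separate cap applied to each strategy before merging."""
    return list(dict.fromkeys([*hits_1[:max_each], *hits_2[:max_each]]))


overall = union_search(["A", "B"], ["B", "C"], max_results=2)
per_strategy = union_search_capped_each(["A", "B"], ["B", "C"], max_each=1)
print(overall, per_strategy)
```

With a single overall cap, a long Search 1 result list can crowd out Search 2 entirely; capping each list separately guarantees both strategies are represented, at the cost of a less predictable total.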