Data discovery from a digital library perspective
Data discovery from a digital library perspective
Greg Janée, Darren Hardy
UC Santa Barbara
2
Outline
• Questions
  – grappling with granularity
  – struggling with search
  – dithering over distribution
  – pondering process
• Integrating search with access
3
Granularity
institution (NASA)
data center (GSFC)
program (MODIS)
product (sea surface temperature)
resolution (1km)
space
time
granule
datum
type
organization
4
Approaches I
• ADL
  – uniform object (metadata) representation
  – flat list of collections (=containers)
  – possible extensions:
    • collections as first-order objects
    • nested containers
• THREDDS
  – hierarchical “collection” datasets
  – “coherent” datasets (=aggregation server?)
  – “direct” datasets
5
Approaches II
• Granularity on the Web...
  – webpage
  – multi-page document
  – website
• ...and sidestepping it
  – uniform representation (webpage)
  – page linking
  – visible, decomposable identifiers (URLs)
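A minimal sketch of what "decomposable identifiers" buys you: given one URL, the enclosing levels can be recovered by trimming path segments. The function name `ancestors` and the example URL are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit


def ancestors(url):
    """Return the URL itself followed by each enclosing level, up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    base = f"{parts.scheme}://{parts.netloc}"
    levels = []
    # Trim one trailing path segment at a time: page -> directory -> ... -> site.
    for depth in range(len(segments), -1, -1):
        path = "/".join(segments[:depth])
        levels.append(f"{base}/{path}" if path else f"{base}/")
    return levels


# Hypothetical identifier; the path layout is an assumption, not a real service.
levels = ancestors("http://example.org/modis/sst/1km/granule42.hdf")
for level in levels:
    print(level)
```

No catalog lookup is needed to move between levels; the hierarchy is visible in the identifier itself.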
6
Flattening granularity
• Use heuristics to return “best” match
[Diagram labels: dataset; inherit descriptive metadata; aggregate intrinsic metadata]
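One possible reading of the diagram, sketched under assumptions: descriptive metadata is inherited downward from enclosing levels, and intrinsic metadata (here, only a time range) is aggregated upward from contained items. The `Node` class and its field names are hypothetical, not ADL's or THREDDS's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    descriptive: dict = field(default_factory=dict)  # added metadata, e.g. keywords
    time_range: Optional[tuple] = None                # intrinsic: (first_year, last_year)
    children: list = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def inherited_descriptive(self):
        """Descriptive metadata flows down; a node's own values override its ancestors'."""
        merged = self.parent.inherited_descriptive() if self.parent else {}
        merged.update(self.descriptive)
        return merged

    def aggregated_time_range(self):
        """Intrinsic metadata flows up: the span covering this node and all descendants."""
        ranges = [c.aggregated_time_range() for c in self.children]
        if self.time_range:
            ranges.append(self.time_range)
        ranges = [r for r in ranges if r]
        if not ranges:
            return None
        return (min(r[0] for r in ranges), max(r[1] for r in ranges))


product = Node("sea surface temperature", descriptive={"program": "MODIS"})
g1 = product.add(Node("granule-a", time_range=(2001, 2002)))
g2 = product.add(Node("granule-b", descriptive={"resolution": "1km"}, time_range=(2003, 2005)))

print(g2.inherited_descriptive())       # granule sees the product's description too
print(product.aggregated_time_range())  # product's extent is derived from its granules
```

With both directions filled in, every level carries a full metadata record, so a search can match at any level and a heuristic can pick which one to return.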
7
Search
• Type
  – text, numeric, space, time, ...
• Source
  – data itself
  – intrinsic metadata
  – added (usually descriptive) metadata
  – 3rd party
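A small sketch of a query that mixes constraint types (text, space, time) over metadata records. The `Record` fields, the overlap tests, and the sample records are assumptions made for illustration; the bounding-box test ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    bbox: tuple   # (west, south, east, north) in degrees
    years: tuple  # (first_year, last_year)


def overlaps_bbox(a, b):
    # Two boxes overlap when they overlap in both longitude and latitude.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def overlaps_years(a, b):
    return a[0] <= b[1] and b[0] <= a[1]


def search(records, text=None, bbox=None, years=None):
    """Return records satisfying every constraint that was supplied."""
    hits = []
    for r in records:
        if text and text.lower() not in r.title.lower():
            continue
        if bbox and not overlaps_bbox(r.bbox, bbox):
            continue
        if years and not overlaps_years(r.years, years):
            continue
        hits.append(r)
    return hits


records = [
    Record("MODIS sea surface temperature", (-180, -90, 180, 90), (2000, 2006)),
    Record("Santa Barbara Channel bathymetry", (-120.5, 33.8, -119.0, 34.6), (1998, 1998)),
]

hits = search(records, text="temperature", bbox=(-121, 33, -119, 35), years=(2005, 2005))
print([r.title for r in hits])
```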
8
Distribution
• Centralized system
  – e.g., Google, ECHO
  – SPOF; requires resources
• Peer-to-peer
  – e.g., BRICKS, built on P-GRID
  – MPOF; requires commitment
• ADL: incomplete peer-to-peer
9
A “textbook” search process
• Classic process (Lancaster 1979)
  – Information need
  – Stated request
  – Selection of database
  – Search strategy
  – Search in database
  – Screening of output
• Web search: about the same 25 years later
10
What’s the real process?
• Irrational search (Pharo & Järvelin 2006)
  – Textbook search processes insufficient
  – Disjointed incrementalism theory
    • Many smaller steps
    • Learning during a search
    • Subjective & dynamic information needs over time
• What’s the ideal for earth science data users?
  – How do you inform choices during search?
  – How do you formulate a search, and what’s the context?
  – When is enough enough?
11
Integrating search with access
• File menu
  – Open...
  – Search library...
  – Close
  – Quit
• Query results returned as a THREDDS catalog?
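A rough sketch of the idea in the last bullet: wrapping search hits in a THREDDS-style catalog so that a client that already browses catalogs can open results directly. This emits only a simplified subset (nested `dataset` elements with `name` and `urlPath`); a usable catalog would also declare a `service`, which is omitted here. The function name, the hit list, and the paths are hypothetical.

```python
import xml.etree.ElementTree as ET

# THREDDS InvCatalog 1.0 namespace.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"


def results_to_catalog(query, hits):
    """Serialize (name, url_path) search hits as a minimal catalog document."""
    ET.register_namespace("", THREDDS_NS)  # write the namespace as the default xmlns
    catalog = ET.Element(f"{{{THREDDS_NS}}}catalog", {"name": f"Search results: {query}"})
    # One container dataset for the query, one child dataset per hit.
    top = ET.SubElement(catalog, f"{{{THREDDS_NS}}}dataset", {"name": query})
    for name, url_path in hits:
        ET.SubElement(top, f"{{{THREDDS_NS}}}dataset", {"name": name, "urlPath": url_path})
    return ET.tostring(catalog, encoding="unicode")


# Hypothetical hits; names and paths are placeholders.
xml_text = results_to_catalog(
    "sea surface temperature",
    [("granule-a", "sst/1km/granule-a.hdf"), ("granule-b", "sst/1km/granule-b.hdf")],
)
print(xml_text)
```

From the client's point of view, "Search library..." then behaves like "Open..." on a catalog that happens to have been generated by a query.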
12
We’re funded to do this!