1
Stanford InterLib Technologies
Hector Garcia-Molina and the Stanford DigLib Team
2
Stanford Digital Libraries Team
Faculty:
– Dan Boneh, Hector Garcia-Molina, Terry Winograd
Research Scientist:
– Andreas Paepcke
Librarians:
– Vicky Reich, Rebecca Wesley
Partners:
– InterLib Partners, ACM, Dialog, Hitachi, IBM, Intel, Microsoft, NASA Ames Library, Stanford Libraries, SUL HighWire Press, Xerox
3
Barriers to Effective DLs
Economic Concerns
Information Loss
Information Overload
Service Heterogeneity
Physical Barriers
4
Thrusts
Economic Concerns
Information Loss
Information Overload
Service Heterogeneity
Physical Barriers
• Interoperability
• Value Filtering
• Mobile Access
• IP Infrastructure
• Archival Repository
5
DL Interoperability Challenges
• Growing number of players, formats, countries, ...
• Repositories
• Services
• Dynamic artifacts
• Reliability
Digital Libraries
6
DL Interoperability Challenges
Solution: InfoBus, InterServ
InfoBus Example
Folio, Dialog, DigiCash, F.V.
FolioProxy, DialogProxy, DigiCashProxy, F.V.Proxy
DLite, Gloss, QueryTrans, MetaData, U-Pai, Contracts
Q: Find Ti distributed (W) systems
InfoBus Example
Q: Find Ti distributed (W) systems
Suggested: Folio, Dialog
InfoBus Example
Q: Find Ti distributed (W) systems
Q’: Find Ti distributed AND systems
Query Translation
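The translation step above can be sketched as follows. This is only an illustration, not the InfoBus translator itself: it assumes that `(W)` is an adjacency operator the target source does not support, so the query is relaxed to `AND` and the stricter condition is re-checked on the returned titles. The function names and the post-filtering step are assumptions made for this sketch.

```python
import re


def relax_proximity(query: str) -> str:
    """Replace each '(W)' adjacency operator with the broader 'AND'."""
    return re.sub(r"\s*\(W\)\s*", " AND ", query)


def adjacent_in_order(title: str, first: str, second: str) -> bool:
    """Post-filter (assumed): True if `first` is immediately followed by `second`."""
    words = re.findall(r"\w+", title.lower())
    pairs = zip(words, words[1:])
    return any(a == first.lower() and b == second.lower() for a, b in pairs)


translated = relax_proximity("Find Ti distributed (W) systems")

# The relaxed query may return extra titles; keep only those that still
# satisfy the original adjacency condition.
titles = ["Distributed Systems Design", "Systems for Distributed Teams"]
kept = [t for t in titles if adjacent_in_order(t, "distributed", "systems")]
```

The relaxed query returns a superset of the wanted results, which is why the sketch filters afterwards.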
InfoBus Example
Q: Find Ti distributed (W) systems
Pay per View
11
InterServ
InfoBus
Perpetual Activity
Services
Dynamic Artifacts
InfoBus Pro
“Maturity”
“Sophistication”
13
Perpetual Activity Service
P.A.S.
Service
UserRequest
register
state & plans
restart service, use alternate
restore state, try alternatives
check
check
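A minimal sketch of the register/check/recover cycle named on this slide, assuming a single in-process monitor. The class, the `probe` callable, and the ordered list of recovery `plans` are hypothetical names chosen for illustration; the slide does not define an API.

```python
from typing import Callable, Dict, List, NamedTuple


class Registration(NamedTuple):
    state: dict                          # state to restore after a failure
    probe: Callable[[], bool]            # returns True while the activity is healthy
    plans: List[Callable[[dict], bool]]  # ordered alternatives; True means recovered


class PerpetualActivityService:
    """Hypothetical monitor: activities register state and plans, then get checked."""

    def __init__(self) -> None:
        self._registry: Dict[str, Registration] = {}

    def register(self, name: str, state: dict,
                 probe: Callable[[], bool],
                 plans: List[Callable[[dict], bool]]) -> None:
        self._registry[name] = Registration(dict(state), probe, list(plans))

    def check(self, name: str) -> str:
        reg = self._registry[name]
        if reg.probe():
            return "ok"
        # Activity is down: hand a copy of the saved state to each plan in turn
        # (restart first, then alternates) until one succeeds.
        for index, plan in enumerate(reg.plans):
            if plan(dict(reg.state)):
                return f"recovered via plan {index}"
        return "unrecovered"
```

For example, a search activity might register its pending query as state, with "restart the same service" as plan 0 and "use an alternate service" as plan 1.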
14
SDLIP
Simple Digital Library Interoperability Protocol
Goal: get InterLib (and DLI2) to interoperate!!
15
Search Protocol: Initial Goals
• Trivial to implement!
• Works over CORBA/COM, DASL/HTTP
• Use XML
• Does not prescribe query format
• Does not prescribe result format
  (But lets you say what you’re using)
• Small footprint (Desktop/Laptop/PDA)
• Allows for stateful or stateless operation
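To make the "does not prescribe the format, but lets you say what you're using" goal concrete, here is a small sketch of an XML request that tags its query and result formats. The element and attribute names (`searchRequest`, `query`, `results`, `format`) and the format labels are invented for illustration; they are not taken from the SDLIP specification.

```python
import xml.etree.ElementTree as ET


def build_search_request(query: str, query_format: str, result_format: str) -> str:
    """Build a hypothetical XML search request that names its own formats."""
    root = ET.Element("searchRequest")
    query_el = ET.SubElement(root, "query", {"format": query_format})
    query_el.text = query  # the query string itself is passed through untouched
    ET.SubElement(root, "results", {"format": result_format})
    return ET.tostring(root, encoding="unicode")


request_xml = build_search_request(
    "Find Ti distributed (W) systems", "dialog", "dublin-core"
)
```

A receiver can read the two `format` attributes and decide whether it understands the payload, without the protocol itself having to define a query language.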
16
Interface Consists of Four Components
Information Client
Delivery Interface
InterLib Wrapper
Result Access Interface
Source Metadata Interface
Search Interface
18
SDLIP Status
Design Meeting June 22, 1999
• Client & Server Toolkits Available
• Extensive Documentation
• See http://www-diglib.Stanford.EDU/~testbed/doc2/SDLIP/
19
Current SDLIP Sources
Some Web sources
– People Lookup: www.switchboard.com
– AltaVista
– IMDB (movies)
NCSTRL services: www.ncstrl.org
– Dienst compliant services, e.g., CoRR?
Z39.50 servers
– e.g., Library of Congress
Stanford WebBase
CDL
– e.g., MELVYL gateway
DASL-compliant servers
20
Existing Clients
Java
– command line
– applet
C++
– Palm Pilot
TCL (Ray Larson)
DASL-compliant clients
27
Value Filtering Challenges
• Collection of Value Information
• Scalability
• Privacy of Value Information
• Understanding Page Rank
• Searching Non-Text Objects
• Combining Value Information
• HCI Aspects
28
WebBase Goals
• Manage very large collections of Web pages
• Enable large-scale Web-related research
• Locally provide a significant portion of the Web
• Efficient wide-area Web data distribution
29
Challenges
Huge information space
– Wide area distribution
– URL space (to remember while crawling)
– Web content (to store)
Limited resources
– Disk
– Time
– Memory
– Bandwidth
– Server administrator tolerance
Continuous evolution
– More pages
– Pages change/disappear
– Mirror sites installed
– Keeping data “fresh”
Crawling issues
– Data ‘fiefdoms’: firewalls; access permissions; load controls
– Overhead per site: DNS lookups; processing robots.txt
– Parallelization
– Ability to interrupt & restart
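Two of the crawling issues listed above, remembering the URL space and being able to interrupt and restart, plus robots.txt processing, can be sketched with the Python standard library. This is a toy illustration and not the WebBase crawler; the `Frontier` class, the JSON checkpoint format, and the `allowed` helper are assumptions made for the example.

```python
import json
from collections import deque
from urllib.robotparser import RobotFileParser


class Frontier:
    """URL frontier that remembers seen URLs and can be checkpointed."""

    def __init__(self, seeds=()):
        self.seen = set()
        self.queue = deque()
        for url in seeds:
            self.add(url)

    def add(self, url: str) -> None:
        if url not in self.seen:  # never enqueue the same URL twice
            self.seen.add(url)
            self.queue.append(url)

    def next_url(self):
        return self.queue.popleft() if self.queue else None

    def checkpoint(self) -> str:
        """Serialize the crawl state so an interrupted crawl can resume."""
        return json.dumps({"seen": sorted(self.seen), "queue": list(self.queue)})

    @classmethod
    def restore(cls, blob: str) -> "Frontier":
        data = json.loads(blob)
        frontier = cls()
        frontier.seen = set(data["seen"])
        frontier.queue = deque(data["queue"])
        return frontier


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Check a URL against the text of a site's robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

In a real crawler the robots.txt text would be fetched once per site and cached, which is part of the per-site overhead the slide mentions.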
WebBase Architecture
[Diagram components: WWW, Web Crawlers, Repository, Multicast Engine, Feature Repository, Retrieval Indexes, WebBase API, Clients]
32
Mobile Access Challenges
• Limited Resources
• Transitions Between Devices
• Exploiting Context
Solutions:
• Power Browsing
• Information Tiles
• Information Paging
34
Power Browsing
Techniques
• Show only text headers
• Show URLs, anchors, titles
• Order URLs by page rank
• Summarize text
• Summarize set of pages
• Low-resolution pictures
• Display “relevant” text
• ...
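Two of the techniques above, showing only text headers and ordering URLs by page rank, can be sketched with Python's standard HTML parser. This is a rough illustration, not the Stanford Power Browser; the `OutlineParser` and `power_view` names are hypothetical, and the rank scores are assumed to be supplied by some other component.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect header text and (href, anchor text) pairs, ignoring body text."""

    HEADERS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.headers = []
        self.links = []
        self._header = None  # list of text pieces while inside a header
        self._link = None    # (href, text pieces) while inside an anchor

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADERS:
            self._header = []
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._link = (href, [])

    def handle_data(self, data):
        if self._header is not None:
            self._header.append(data)
        if self._link is not None:
            self._link[1].append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADERS and self._header is not None:
            self.headers.append("".join(self._header).strip())
            self._header = None
        elif tag == "a" and self._link is not None:
            href, parts = self._link
            self.links.append((href, "".join(parts).strip()))
            self._link = None


def power_view(html: str, rank: dict):
    """Return (headers, links), with links ordered by descending rank score."""
    parser = OutlineParser()
    parser.feed(html)
    ordered = sorted(parser.links, key=lambda link: rank.get(link[0], 0.0),
                     reverse=True)
    return parser.headers, ordered
```

On a small screen, the returned headers and ranked links would stand in for the full page until the user asks for more detail.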
39
IP Management Challenges
• Heterogeneity
• Complexity of Interactions
• Varied Information Appliances
• Mobile Access
• Security/Privacy
40
Fundamental Problem
Safeguards (security, privacy, authentication, payment, non-repudiation, ...) are an afterthought
“Spaghetti” code for safeguards
Experience at Stanford:
• InterPay, CommPacts, Copy Detection
• Goal was interoperability
• Correctness, complexity were problems
41
Example: Simple Pay Per View
patron → library: view(docId, account, amt)
library → bank: transfer(amt, account, libAccount)
42
Example: Simple Payment
Goals
• Do not want others to see data
• Do not want library to see account number
• Need receipt from bank
patron → library: view(docId, account, amt)
library → bank: transfer(amt, account, libAccount)
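A rough sketch of the view/transfer exchange with two of the goals bolted on by hand: the library sees only an opaque token instead of the account number, and the bank returns a verifiable receipt. The token scheme, the HMAC receipt, and all class and method names are assumptions made for illustration. Protecting the data from third parties (the first goal) would need transport encryption, which is left out here.

```python
import hashlib
import hmac
import secrets


class Bank:
    def __init__(self, secret: bytes):
        self._secret = secret
        self.balances = {}
        self._tokens = {}  # opaque token -> real account number

    def issue_token(self, account: str) -> str:
        """Give the patron a token so the library never sees the account."""
        token = secrets.token_hex(8)
        self._tokens[token] = account
        return token

    def _sign(self, amt: int, token: str, lib_account: str) -> str:
        message = f"{token}:{lib_account}:{amt}".encode()
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()

    def transfer(self, amt: int, token: str, lib_account: str) -> str:
        account = self._tokens[token]
        if self.balances.get(account, 0) < amt:
            raise ValueError("insufficient funds")
        self.balances[account] -= amt
        self.balances[lib_account] = self.balances.get(lib_account, 0) + amt
        return self._sign(amt, token, lib_account)  # receipt from the bank

    def verify(self, receipt: str, amt: int, token: str, lib_account: str) -> bool:
        return hmac.compare_digest(receipt, self._sign(amt, token, lib_account))


class Library:
    def __init__(self, bank: Bank, lib_account: str, documents: dict):
        self.bank, self.lib_account, self.documents = bank, lib_account, documents

    def view(self, doc_id: str, token: str, amt: int):
        if doc_id not in self.documents:  # do not charge for a missing document
            raise KeyError(doc_id)
        receipt = self.bank.transfer(amt, token, self.lib_account)
        return self.documents[doc_id], receipt
```

Even in this toy form the safeguard logic is already interleaved with the payment logic, which hints at how such hand-written code becomes hard to get right.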
43
Example: Simple Payment
Result: A Mess!!
44
Declarative Safeguards for DLs
• Safeguards built in at system design time
• Declare goals, not mechanisms
– Players, data, ...
– Who can see what, who can do what, ...
(Note: access information can also be protected)
Declarative Infrastructure
SecureDLs
Components: IP Mgmt, Wallets, ...
45
Solution
Extended Interface Definition Language
– CORBA or D-COM like
Example:
class artRecord {
    authorized(policy) setOwner(encrypted string ownerName,
                                encrypted(bank) int price,
                                picture pic);
    …
}
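The "declare goals, not mechanisms" idea can be approximated in Python by attaching a policy to each field and letting one generic routine enforce it. This sketch swaps in visibility filtering where the slide's IDL uses encryption annotations, and the reader sets shown here are invented for illustration; it is not the extended IDL itself.

```python
from dataclasses import dataclass, field, fields


def guarded(readers):
    """Declare which parties may read a field (policy only, no mechanism)."""
    return field(metadata={"readers": frozenset(readers)})


@dataclass
class ArtRecord:
    # Hypothetical policies loosely modeled on the artRecord example.
    owner_name: str = guarded({"owner", "library"})
    price: int = guarded({"bank"})
    pic: str = guarded({"owner", "library", "bank", "public"})


def view_for(record, party: str) -> dict:
    """Generic enforcement: return only the fields `party` is allowed to see."""
    return {
        f.name: getattr(record, f.name)
        for f in fields(record)
        if party in f.metadata["readers"]
    }


record = ArtRecord("Ana", 100, "pic.png")
bank_view = view_for(record, "bank")
```

The point of the design is that changing who may see the price means editing one declaration, not hunting through hand-written safeguard code.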
46
Declarative Safeguards for DLs
Declarative Infrastructure
SecureDLs
Components: IP Mgmt, Wallets, ...
47
Information Preservation Challenges
Preserving the Bits
– Evolving hardware
– Evolving software
– Evolving organizations
Preserving the Meaning
48
Stanford Archival Repository
Object Identifier Signature
No Deletions (never ever!)
handle
set set
new version?
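One way to read the labels on this slide is as a write-once store in which an object's identifier is a signature of its content and a handle names the set of its versions. The sketch below follows that reading; it is an assumption rather than the repository's actual design, and SHA-256 is used only as a convenient stand-in for whatever signature function the repository uses.

```python
import hashlib


class ArchivalStore:
    """Write-once store: identifiers are content signatures, nothing is deleted."""

    def __init__(self):
        self._objects = {}   # signature -> content bytes
        self._versions = {}  # handle -> list of signatures, oldest first

    def put(self, handle: str, content: bytes) -> str:
        signature = hashlib.sha256(content).hexdigest()
        self._objects.setdefault(signature, content)  # identical content stored once
        versions = self._versions.setdefault(handle, [])
        if signature not in versions:
            versions.append(signature)  # a change creates a new version
        return signature

    def get(self, signature: str) -> bytes:
        return self._objects[signature]

    def versions(self, handle: str) -> list:
        return list(self._versions.get(handle, []))

    def delete(self, signature: str) -> None:
        raise PermissionError("no deletions (never ever!)")
```

Because the identifier is derived from the content, any later corruption of the stored bytes can be detected by recomputing the signature.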
49
Repository Layers
Identity
Object Store
Complex Objects
Reliability
Indexing, Naming
Intellectual Property
52
Archiving the Web - Our Solution
File System
Web Server
InfoMonitor
users
Archival Repository