New Approaches To Resource Discovery In The UK HE Community
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath, BA2 7AY
Email: [email protected]
URL: http://www.ukoln.ac.uk/
UKOLN is funded by Resource: The Council for Museums, Archives and Libraries, the Joint Information Systems Committee (JISC) of the Higher and Further Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
Aims of Talk:
• Review approaches taken by UK HE community
• Overview of eLib phase 3 projects and development of the DNER
• Discussion of architectural models, software development and funding regimes
Contents
Web Manager’s View
Issues:
• Software
• Server or site?
• File formats
• User interface
• Administrator’s interface
Librarian’s View
Issues:
• Web-enabled OPAC
• Integration with other OPACs
• Cross-searching or union catalogue
• Z39.50
• Metadata
• Identifiers
Other approaches
eLib Phase 3: Hybrid Libraries and Clumps
DNER: Distributed National Electronic Resource
Other Initiatives: EU & US projects
Which To Choose?
• Alkaline (Vestris)
• AltaVista - Search Intranet
• ASTAWare SearchKey
• atomz Search (remote)
• BooleanSearch
• BBDBot
• BRS/Search (Dataware)
• Compass Server (Netscape)
• Cybotics
• DataWare BRS/Search
• DocFather (formerly SiteSearch)
• dtSearch Web
• Excalibur RetrievalWare
• EWS (Excite)
• Excerpt (Obsolete)
• Extense
• FAST Search Server
• Findex (code library)
• Folio siteDirector
• FreeFind (remote)
• Fulcrum
• Glimpse
• Harvest
• ht://Dig
• ICE
• iHound (ICATT)
• Index Search (Xavatoria)
• Index Server (Microsoft)
• IndexMySite (remote)
• Infoseek - Ultraseek
• Intermediate Search
• intraSearch (remote)
• I-Search
• Isearch
• ITMS
• Isys:web
• Java Applets
• JHLSearch
• JObjects QuestAgent
• Lycos / InMagic
• Magnifi Enterprise Server
• Matt's SimpleSearch
• Microsoft Index Server
• Microsoft Site Server
• MiniSearch (remote)
• MondoSearch
• Muscat
• NetResults (now SearchKey Plus)
• Netscape - Compass Server
• OpenText - LiveLink
• Perl Scripts
• Perlfect Search
• Phantom (Maxum)
• PicoSearch (remote)
• Etc.
Software from <http://searchtools.com/tools/tools.html>
Which to choose? What software may be obsolete? What does remote mean?
Can choose by reading reviews, web sites, etc., or by looking at usage in the community
Findings: UK HE Web Sites
Main findings of 3 surveys:
Software    Nos. in Jul 1999    Nos. in Mar 2000
ht://Dig    25                  32
eXcite      19                  17
Microsoft   12                  15
Harvest     8                   6
Ultraseek   7                   9
Other       29                  34
None        60                  50
Totals      160                 163
[Bar chart: Nos. of sites using each search engine]
• Article published in Ariadne issue 21 - <http://www.ariadne.ac.uk/issue21/webwatch/>
• Results (including update on survey) available from: <http://www.ukoln.ac.uk/web-focus/surveys/uk-he-search-engines/>
Software    Nos. in Aug 2000
ht://Dig    42
eXcite      9
Microsoft   18
Harvest     3
Ultraseek   11
Other       31
None        44
Totals      163
[Bar chart: Nos. of sites using ht://Dig, eXcite, Microsoft, Harvest, Other, None]
Popular Product: ht://Dig
ht://Dig:
• Now used at 42 (up from 25 then 32) UK HEIs
• Freely available
• Own domain with well-designed web site
• Robot to index multiple servers
See <http://www.htdig.org/>
Oxford Case Study:
• 131 servers
• 438,500 resources
• Indexes MS Office, PDF, etc. files (external parser)
Issue: Web community not interested in non-Web resources?
National Search Engines
ACDC (Academic Directory)
• (Unfunded) pilot of index of ac.uk domain based on distributed approach using Harvest
• Set up in March 1996
• Lack of development effort resulted in degraded service (e.g. indexer not aware of JavaScript code)
http://acdc.hensa.ac.uk/
Issues:
• Problems with volunteer effort once enthusiasm wanes
• Lack of user involvement can limit acceptance
• Lack of funding body involvement can mean lessons learnt are lost
Institutional Developments
Maestro robot (Dundee):
• Indexes Scottish resources
• Individual or all sites
• Volunteer effort
• Interesting application for OS/2
North East Universities (UNIS4NE):
• Appearance of cross-searching
• Actually interface to HotBot / AltaVista
eLib Subject Gateways
SOSIG is an example of a subject gateway initially funded by eLib
SOSIG provides access to manually catalogued resources in the Social Sciences
Involvement with the Social Science community has helped acceptance
ROADS
ROADS software used to support several gateways
Key features:
• Open source
• Support for whois++
• Momentum behind software meant:
– Uptake in other communities
– Additional developments (e.g. ROADS/Z39.50 gateway)
But: Whois++ standard failed to take off
Approaches Taken By Hybrid Libraries Projects
Let’s look at some of the approaches taken by some of the eLib Phase 3 Hybrid Libraries projects which help users find electronic and "real world" resources:
Agora:
• Use of Z39.50 and Collection Level Descriptions
• Working with a commercial software vendor
Headline:
• Provision of a personalised interface
• An open source approach
BUILDER:
• Searching across Hybrid Library Web sites
• Authenticated access to exam papers
• Making use of locally available applications
Agora (1)
In the Agora Hybrid Library the user can choose a Landscape
Agora (2)
The landscape may be a collection of resources; individual collections can be selected
Agora (3)
Collections are defined using the Collection Level Description agreed by eLib projects
Agora (4)
Results from local collections are usually returned first
Agora (5)
The results can be viewed directly or requested using ILL
Agora (6)
The results are retrieved simultaneously
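The behaviour described in Agora (4) and Agora (6), searching several collections at once and listing local results first, could be sketched roughly as follows. This is an illustration only and not Agora's implementation: the collection names, the `search_collection` stub and its canned records are all hypothetical, and a real system would send Z39.50 or HTTP queries to each target.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical collections: (name, is_local, canned records). A real system
# would query each target over Z39.50 or HTTP instead of using canned data.
COLLECTIONS = [
    ("Remote union catalogue", False, ["Japan's economy", "Emerging markets"]),
    ("Local OPAC", True, ["Markets in Japan"]),
    ("Remote image archive", False, []),
]


def search_collection(collection, query):
    """Stub search: return the canned records that contain the query term."""
    name, is_local, records = collection
    hits = [r for r in records if query.lower() in r.lower()]
    return {"collection": name, "local": is_local, "hits": hits}


def cross_search(query, collections=COLLECTIONS):
    """Query every collection concurrently, then list local results first."""
    with ThreadPoolExecutor(max_workers=len(collections)) as pool:
        results = list(pool.map(lambda c: search_collection(c, query), collections))
    # sorted() is stable, so collections keep their relative order within
    # the local group and within the remote group.
    return sorted(results, key=lambda r: not r["local"])


if __name__ == "__main__":
    for result in cross_search("japan"):
        print(result["collection"], result["hits"])
```

A thread pool is used here because each search mostly waits on a remote server; the sort step is what puts the local collection at the top of the list.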
Agora (7)
Results from AltaVista obtained using “HTML-scraping” technique
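"HTML-scraping" means pulling results out of the HTML page intended for human readers rather than from a structured interface. A minimal sketch of the idea is given below, assuming an invented results page in which each hit is an `<a class="result">` link; this markup is hypothetical and is not AltaVista's actual output.

```python
from html.parser import HTMLParser

# Invented sample page; real search-engine markup differs and changes
# without notice, which is the main weakness of scraping.
SAMPLE_PAGE = """
<html><body>
<a href="/help">Help</a>
<dl>
<dt><a class="result" href="http://www.ukoln.ac.uk/">UKOLN</a></dt>
<dt><a class="result" href="http://www.ariadne.ac.uk/">Ariadne</a></dt>
</dl>
</body></html>
"""


class ResultScraper(HTMLParser):
    """Collect (title, url) pairs from <a class="result"> elements."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._current_url = None
        self._current_text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("class") == "result":
            self._current_url = attributes.get("href")
            self._current_text = []

    def handle_data(self, data):
        # Only keep text that appears inside a result link.
        if self._current_url is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_url is not None:
            title = "".join(self._current_text).strip()
            self.results.append((title, self._current_url))
            self._current_url = None


def scrape_results(page):
    """Return the list of (title, url) hits found in an HTML results page."""
    scraper = ResultScraper()
    scraper.feed(page)
    return scraper.results
```

The scraper depends entirely on the page layout, so any redesign of the remote site can silently break it.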
Headline (1)
Headline’s PIE (Personal Information Environment) provides a personalised interface to Hybrid Libraries resources.
Here is the default information landscape for Pete (an Economics UG student)
http://www.headline.ac.uk/publications/pie/Pete'sPage1.html
Headline (2)
Pete selects the All Resources link
This gives a list of all the Library resources and services that Pete is entitled to use
Headline (3)
Pete adds the Economic Systems Research journal to his list of resources
Headline (4)
Pete now clicks on the Customise option near the top of the window
He can now add the journal to his resources for This Week’s Essay
Headline (5)
Pete now carries out additional research
He selects collections of interest and then searches for “Japan and emerging markets”
Headline (6)
Pete expands the results for Unicorn …
Headline (7)
… and then views a map showing the physical location
This illustrates how Headline supports access to physical objects as well as digital resources.
Headline (8)
Finally Pete expands the results from Decomate
These are PDF documents which can be viewed directly
BUILDER (1)
BUILDER (Birmingham University Integrated Library Development and Electronic Resource) provides a number of hybrid library demonstrators
The Microsoft SiteServer indexer is used to index across other Hybrid Libraries (and Clumps) projects
Notice branding of the results
Authentication is provided using the Novell NDS which provides access to the institutional network
Issues
The different approaches to software development:
• Make use of (and work with) commercial products:
– Benefit from market-tested products
– More realistic awareness of commercial acceptance
– Relationships may be difficult
– May be sucked into use of proprietary solutions
• Develop open source software and use complementary open source products:
– Flexibility in adopting emerging new standards
– Requires technical expertise to develop and maintain
– Management resistance, esp. if fails to gain momentum
• Pragmatic approach in using existing tools:
– Makes use of existing tools and expertise
– Can quickly develop prototypes which can help gain support for services
– May be architecturally flawed and make use of proprietary solutions
Tools (1)
A variety of open source tools are being developed within the community.
DC-dot, developed by UKOLN, can be used to assist the creation of Dublin Core metadata.
The metadata can be generated in various formats such as HTML and RDF.
http://www.ukoln.ac.uk/metadata/dcdot/
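To make the HTML format concrete, here is a minimal sketch of rendering Dublin Core elements as HTML `<meta>` tags, following the common `DC.element` naming convention with a `schema.DC` link. It is not DC-dot's code and its output is not claimed to match DC-dot exactly; the function name `dublin_core_meta` and the example record are illustrative assumptions.

```python
from html import escape

# Namespace URI of the Dublin Core element set, used in the schema link.
DC_SCHEMA = "http://purl.org/dc/elements/1.1/"


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags."""
    lines = ['<link rel="schema.DC" href="%s">' % DC_SCHEMA]
    for element, values in record.items():
        # Accept a single string or a list of repeated values.
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append(
                '<meta name="DC.%s" content="%s">'
                % (element, escape(value, quote=True))
            )
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative record describing this talk.
    example = {
        "title": "New Approaches To Resource Discovery In The UK HE Community",
        "creator": "Brian Kelly",
        "format": "text/html",
    }
    print(dublin_core_meta(example))
```

The generated lines are intended to be pasted into the `<head>` of a page, where a robot or indexing tool can pick them up.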
Tools (2)
UKOLN has also developed a tool for creating collection level descriptions to support projects funded by RSLP (Research Support Libraries Programme), another HE funded programme
http://www.ukoln.ac.uk/metadata/rslp/tool/
From Hybrid Libraries to the DNER
Hybrid Libraries projects are addressing:
• Need for users to find a variety of resources
• Need to gain experience from projects
The DNER:
• Distributed National Electronic Resource
• Building on Hybrid Libraries project experiences
• Focus on services rather than projects
• Aims to provide seamless access to quality resources
• Is developing a standards-based architectural framework
DNER Architecture
Areas of interest include:
• Collection descriptions
• User profiles
• Identifiers
Emphasis on interoperability through use of standards
Work currently in progress
Currently...
[Diagram: End user; separate Web interfaces; Local content, National content, International content]
Currently...
[Diagram: End user; Collection Description (e.g. Agora); User Profile (e.g. Headline); Authentication (Athens); separate Web interfaces; Local content, National content, International content]
Future...
[Diagram: End user; User profile; Collection description; Authentication (Sparta); Web interfaces; Content]
Future...
[Diagram: End user; User profile; Collection description; Authentication (Sparta); Portal (subject portal or institutional portal or MLE or ...); Content]
Sharing content
How do ‘portals’ and content servers interact?
Technologies currently being investigated:
• HTTP
• Z39.50 - Bath Profile
• OAI - Open Archives Initiative
• RSS - Rich Site Summary / RDF Site Summary
Open Archives Initiative
OAI Metadata Harvesting Framework:
• Simple mechanism for sharing metadata records
• Records shared over HTTP...
• ... as XML
• Client can ask metadata server for
– all records
– all records modified in last ‘n’ days
– info about databases, formats, etc.
See <http://www.openarchives.org/>
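The three kinds of request listed above can be sketched as plain HTTP GET URLs. Note the assumptions: the verb and parameter names (`ListRecords`, `Identify`, `ListMetadataFormats`, `from`, `metadataPrefix`) are those of the later OAI-PMH specification rather than anything stated on the slide, and the server address is hypothetical. The sketch only builds the URLs and makes no network calls.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Hypothetical metadata server address, used only for illustration.
BASE_URL = "http://archive.example.ac.uk/oai"


def oai_request(verb, **params):
    """Build the GET URL for one harvesting request."""
    query = [("verb", verb)] + sorted(params.items())
    return BASE_URL + "?" + urlencode(query)


def all_records():
    """Ask for all records, as unqualified Dublin Core."""
    return oai_request("ListRecords", metadataPrefix="oai_dc")


def records_modified_since(days, today=None):
    """Ask for the records modified in the last `days` days."""
    today = today or date.today()
    since = today - timedelta(days=days)
    # "from" is a Python keyword, so it is passed via a dict.
    return oai_request(
        "ListRecords", **{"from": since.isoformat(), "metadataPrefix": "oai_dc"}
    )


def server_info():
    """Ask for information about the repository and its metadata formats."""
    return [oai_request("Identify"), oai_request("ListMetadataFormats")]
```

The server replies to each URL with an XML document, which the harvesting client then parses and stores locally.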
RSS
RSS (Rich Site Summary):
• XML application for syndicated news feeds
• Pointers and simple descriptions of news items (not the items themselves)
• Being transitioned to more generic RDF/XML application (RSS 1.0)
• No querying - just regular ‘gathering’ of RSS file
• See <http://rssxpress.ukoln.ac.uk/>
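As an illustration of ‘gathering’, the sketch below parses an RSS file and lists the pointers it contains. The sample feed is invented and uses the simple namespace-free RSS 0.91 layout; an RSS 1.0 (RDF) file would need namespace-aware parsing, which is not shown. The function name `gather_items` is an assumption for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample feed in the simple, namespace-free RSS 0.91 style.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://www.example.ac.uk/news/</link>
    <description>Hypothetical departmental news</description>
    <item>
      <title>New subject gateway launched</title>
      <link>http://www.example.ac.uk/news/gateway.html</link>
      <description>A short summary of the announcement.</description>
    </item>
    <item>
      <title>Workshop on metadata</title>
      <link>http://www.example.ac.uk/news/workshop.html</link>
    </item>
  </channel>
</rss>
"""


def gather_items(feed_xml):
    """Return title, link and description for each item in the channel."""
    channel = ET.fromstring(feed_xml).find("channel")
    items = []
    for item in channel.findall("item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                # The description is optional, so fall back to an empty string.
                "description": item.findtext("description", default=""),
            }
        )
    return items
```

A portal would fetch the file on a schedule and re-run this step, since the format offers no way to query the source for particular items.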
Future... Z, OAI, RSS
[Diagram: End user; User profiles; Collection description; Authentication (Sparta); Portal or MLE or ...; Content; protocols shown: HTTP, Z39.50, OAI, RSS]
Content Identification
Need to persistently identify stuff to:
• Enable lecturers to embed it into learning resources
• Enable students to embed it into multimedia essays
• Enable people to cite it
... so let’s look at a current example (from VADS)
Content Example
http://vads.ahds.ac.uk/ixbin/hixclient?_IXDB_=vads&_IXSPFX_=t&_MREF_=3392&_IXSR_=ea1&_IXSP_=0&_IXSS_=%2524%2brec%2bvads%2band%2bseaside%2band%2b%2528%2528Basic%2bDesign%2bCollection%2bin%2btitle_vads_collection%2529%2bor%2b%2528Halliwell%2bCollection%2bin%2btitle_vads_collection%2529%2bor%2b%2528Imperial%2bWar%2bMuseum%2bConcise%2bArt%2bCollection%2bin%2btitle_vads_collection%2529%2bor%2b%2528London%2bCollege%2bof%2bFashion%2bCollege%2bArchive%2bin%2btitle_vads_collection%2529%2529%2bsort%2btitle%2b%3d%252e%26_IXDB_%3dvads&_IXRECNUM=3392&_IXASEARCH=&SUBMIT-BUTTON=DISPLAY
Content example - the URL
It would be nicer if the content URL were something like:
http://vads.ahds.ac.uk/id=137234-849783
http://dx.doi.org/10.3456/1096493
Identifiers
Could use URLs, PURLs, DOIs, ... but...
• URLs are locators not identifiers
• DOIs and PURLs resolved centrally
• All resolve to same thing irrespective of who/where the user is e.g.
– 10.1045/october2000-granger always resolves to US version even though D-Lib mirrored in UK
– http://purl.org/dc always resolves to US version even though DC pages mirrored in UK
DOI and PURL are resolved using a US resolver
Identifiers
Need some way to encode:
• identifier
• citation
in such a way that resolution happens in the context of:
• The location of the end user
• The access rights of the end user
This can be achieved with OpenURL and SFX
See <http://www.sfxit.com/> for further information
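The idea of context-sensitive resolution can be sketched as follows: the same identifier and citation are encoded as a query string, but the link is addressed to the resolver of the user's own institution. The resolver addresses and institution keys are hypothetical, and the query keys (`id`, `genre`, `title`, `date`) are assumed from the early OpenURL draft syntax and should be checked against the specification. Access rights are not modelled here.

```python
from urllib.parse import urlencode

# Hypothetical institutional resolvers; each site would run its own.
RESOLVERS = {
    "bath": "http://resolver.bath.example/sfx",
    "dundee": "http://resolver.dundee.example/sfx",
}
# Fallback when the user's institution is unknown (also hypothetical).
DEFAULT_RESOLVER = "http://resolver.example.org/sfx"


def open_url(institution, citation, identifier=None):
    """Encode an identifier and citation for the user's own resolver."""
    base = RESOLVERS.get(institution, DEFAULT_RESOLVER)
    params = []
    if identifier:
        params.append(("id", identifier))
    # Sort the citation keys so the same citation always yields the same link.
    params.extend(sorted(citation.items()))
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    citation = {"genre": "article", "title": "D-Lib Magazine", "date": "2000"}
    doi = "doi:10.1045/october2000-granger"
    print(open_url("bath", citation, identifier=doi))
    print(open_url("dundee", citation, identifier=doi))
```

Both links carry an identical description of the article; only the resolver differs, so each institution can direct its users to the copy (for example a UK mirror) that suits them.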
Development of Standards
As well as designing an architecture to support interoperability based on open standards there is a need to be involved in standards development work:
Warwick Framework
• A framework for metadata applications, which informed W3C’s RDF work
Dublin Core
• eLib community has been actively involved with Dublin Core development
Bath Profile
• Bath Profile for Z39.50 defines core attributes for library applications
What’s Happening Elsewhere?
A number of EU-funded projects and joint UK/US projects are involved in related activities, including:
Renardus
• EU project to develop an academic subject gateway service for Europe
SCHEMAS
• SCHEMAS provides a forum for metadata schema designers involved in EU-funded projects and national initiatives
IMESH
• Joint JISC/NSF funded project to develop a configurable, reusable and extensible toolkit for subject gateway providers
Renardus
Renardus:
• Will build a pilot European broker service offering subject-based access to collections of information to support learning, teaching & research using Z39.50
• An open source approach – e.g. making use of Zebra (www.indexdata.dk)
http://www.renardus.org/
SCHEMAS
To support EU projects SCHEMAS will:
• Monitor metadata developments
• Organise workshops
• Provide a registry of schemas
The use of RDF to store schemas in a machine-readable way is being investigated
Will make use of commercial software (EOR from OCLC)
http://www.schemas-forum.org/
IMESH
A joint JISC/NSF funded project
Will develop open source tools for use by developers of subject gateways
http://www.desire.org/html/subjectgateways/community/imesh/
Conclusions
This talk has provided examples of new approaches to resource discovery within the UK Higher Education community
A number of case studies have been looked at and the following issues addressed:
• Standards
• Approaches to software development
• The funding regime
Standards
There is:
• Awareness of the importance of standards
• Some involvement in development of standards (e.g. Dublin Core) and community agreements (e.g. collection level descriptions)
Key standards:
• XML
• Dublin Core
• Z39.50: political backing and backing by library community, but less enthusiasm from software developers
• RDF: some enthusiasts, used in some projects, but also sceptics (too complex, lack of widespread support)
• DOIs, OpenURLs, etc: Interest by early adopters
• Authentication (digital signatures, etc): difficult
• User profiles: early days
Software Development
There are a variety of approaches to software development:
• Development of Open Source software
• Use of commercial software / joint projects with commercial software vendors, etc.
The pros and cons of these approaches are well known
There is probably no best single approach applicable for all
Interoperability through use of open standards is the key – let’s be agnostic over this argument
Funding Regimes
• Volunteer effort by enthusiasts can be useful (cf. the Web in 1993) but this approach has limitations
• Large scale programmes, such as eLib, can result in significant developments
• The transition from projects to services is essential – and may be difficult
• Building on national initiatives through international collaboration will provide fresh insights and address unforeseen interoperability issues
Question Time
Any questions?
Acknowledgements: Thanks to Andy Powell, Leona Carpenter, Rachel Heery and my other colleagues in UKOLN and members of eLib Hybrid Libraries projects for their help with this presentation