Commercial Online Databases and the Internet
OSS '99 Global Information Forum
May 24, 1999
Anne Caputo
Dow Jones Interactive Publishing
Traditional Search Services Challenge the Web
The Internet Searchoff
• September 1997-February 1998
• Susan Feldman, DATASEARCH
Goal
Compare searching traditional online services with the World Wide Web
• Effectiveness in finding information
• When to use which one
• Strengths of each approach
Searchoff Ground Rules
• Be a trained, experienced searcher
• Use a real question from a client
• Search either Dialog or Dow Jones Interactive
• Relevance rank the results
• Rank the top 30 retrieved documents on a scale of 1 to 5
Subjects Searched
• Business: 38%
• Technology: 18%
• Medicine/Pharmaceuticals: 14%
• Science: 10%
• Humanities: 8%
• Engineering: 6%
• Other: 6%
Web Search Engines Used
• Alta Vista: 45%
• Hotbot: 20%
• Excite: 14%
• Infoseek: 14%
• Lycos: 5%
• Webferret: 2%
Internet Search-Off Results
• Web totals: 1143 relevance points, 515 documents
• DIALOG/Dow Jones totals: 1400 relevance points, 484 documents
Searching time
Total minutes searching time:
• DIALOG/Dow Jones: 594 minutes
• WWW search engines: 1230 minutes
Plus formatting time
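One way to relate the time figures to the results is relevance points earned per minute of searching. The sketch below is illustrative only: it assumes the chart totals read as 1400 relevance points for DIALOG/Dow Jones and 1143 for the Web, and the "points per minute" metric is an assumption of this example, not a measure reported in the study. Formatting time is not included because no figure is given for it.

```python
# Totals as read from the slides. Pairing 1400 with DIALOG/Dow Jones and
# 1143 with the Web is this example's reading of the results chart.
totals = {
    "DIALOG/Dow Jones": {"relevance_points": 1400, "minutes": 594},
    "WWW search engines": {"relevance_points": 1143, "minutes": 1230},
}


def points_per_minute(entry):
    """Relevance points per minute of searching (illustrative metric)."""
    return entry["relevance_points"] / entry["minutes"]


# Round to two decimals for display.
rates = {name: round(points_per_minute(e), 2) for name, e in totals.items()}

for name, rate in rates.items():
    print(f"{name}: {rate} relevance points per minute")
```

On these assumptions the traditional services return roughly two and a half times as many relevance points per minute as the Web engines.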
Searching Assumptions: traditional search engines
• Information exists on the subject
• The information is high quality
• The information is current
• The information is expensive
• To find it, we need expertise and training to know how and where to search
• It will be a surprise if we can't find something
Searching Assumptions: World Wide Web
• There MIGHT be information on the topic
• Quality and timeliness are unpredictable
• The information is free
• There's no telling how the search engine works
• Searching requires no skill
• Searching requires no training
• It will be a surprise if we find something
Retrieved Documents by Relevance
Number of documents at each relevance rank (1 = less relevant, 5 = more relevant):

                    Ranked 1  Ranked 2  Ranked 3  Ranked 4  Ranked 5
Web                      306        38        34        26       111
DIALOG/Dow Jones         147        52       108        60       117
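The per-rank counts in this chart can be tied back to the overall totals. The sketch below assumes that "relevance points" means each retrieved document contributes its 1-to-5 rank; the slides do not state the formula, but under that assumption the counts reproduce the reported totals exactly. The assignment of counts to the Web and DIALOG/Dow Jones series is this example's reading of the chart.

```python
# Documents per relevance rank, as read from the chart (1 = less relevant,
# 5 = more relevant). The series assignment is an assumption of this example.
counts_by_rank = {
    "Web": {1: 306, 2: 38, 3: 34, 4: 26, 5: 111},
    "DIALOG/Dow Jones": {1: 147, 2: 52, 3: 108, 4: 60, 5: 117},
}


def summarize(rank_counts):
    """Return (documents, relevance points, mean rank) for one service."""
    documents = sum(rank_counts.values())
    # Assumed scoring: each document is worth its rank in points.
    points = sum(rank * n for rank, n in rank_counts.items())
    return documents, points, points / documents


summary = {name: summarize(c) for name, c in counts_by_rank.items()}

for name, (documents, points, mean_rank) in summary.items():
    print(f"{name}: {documents} documents, {points} points, "
          f"mean rank {mean_rank:.2f}")
```

The Web retrieved slightly more documents, but most of them sit at rank 1, which is why its point total is lower.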
Conclusion
DIALOG training has influenced an entire generation of searchers: we automatically shift into Boolean
Digression:
Nested Boolean searches don’t take advantage of the strong points of Web search engines
Statistical search engines search a whole territory. Boolean engines search for a point in that territory
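The point-versus-territory contrast can be sketched with a toy example. The documents, the query, and the scoring rule below are all hypothetical; real statistical engines use far more elaborate weighting, so this is only a minimal illustration of the idea.

```python
# Three made-up documents standing in for a collection.
docs = {
    "d1": "online database pricing for business research",
    "d2": "web search engine ranking and relevance",
    "d3": "business research on the web using a search engine",
}


def tokenize(text):
    """Split text into lowercase words."""
    return text.lower().split()


def boolean_and(query_terms, docs):
    """Boolean AND: keep only documents containing every term (a 'point')."""
    return [doc_id for doc_id, text in docs.items()
            if all(term in tokenize(text) for term in query_terms)]


def ranked(query_terms, docs):
    """Toy statistical ranking: score by query-term occurrences, best first."""
    scores = {}
    for doc_id, text in docs.items():
        words = tokenize(text)
        score = sum(words.count(term) for term in query_terms)
        if score:
            scores[doc_id] = score
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


query = ["business", "research", "web", "pricing"]
print("Boolean AND:", boolean_and(query, docs))
print("Ranked:     ", ranked(query, docs))
```

With this query, Boolean AND returns nothing because no single document contains all four terms, while the ranked approach still surfaces every partially matching document in order of overlap.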
Web Strategies
• Map the territory: Use your searching skills to create lists of related terms
• Omit Boolean operators; let the search engine work without interference
• Put the most important and most rare words first
• Use MORE LIKE THIS to improve results
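The "most rare words first" advice can be sketched as a sort on how common each term is. The frequency figures below are invented for illustration, and `rarest_first` is a hypothetical helper; a searcher would normally judge rarity and importance from experience rather than from a table.

```python
# Hypothetical counts of how many documents contain each term.
doc_frequency = {
    "information": 9000,
    "retrieval": 1200,
    "boolean": 300,
    "webferret": 4,
}


def rarest_first(terms, doc_frequency):
    """Order query terms from rarest to most common.

    Terms missing from the table are treated as rarest (frequency 0).
    """
    return sorted(terms, key=lambda term: doc_frequency.get(term, 0))


query = "information retrieval webferret boolean".split()
ordered = rarest_first(query, doc_frequency)
print(" ".join(ordered))
```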
Web Strategies
• Use phrases when possible to eliminate irrelevant materials
• Ignore the useless hits and pursue the good ones
• Don't worry about finding six million documents. Just look at the top 30
• Rephrase the search
• Move to another search engine if you don't find anything
Conclusions: traditional search services
Strengths
• Predictable archives
• Chemical Engineering
• Electrical Engineering
• History and background on companies
• History and historical figures
• Market reports, industry reports
Conclusions: traditional search services
• Current drug studies (authoritative)
• Industry newsletters and journals
• Financial industry coverage
• Scholarly journal articles
• High quality information
• Quick searches when you know the information is likely to be there
Conclusions: The Web
• Pictures and illustrations
• Some conference coverage and papers
• Product information comes from company
• Small companies – products/background
• Medical statistics (current)
• If you know where to find the information
Conclusions: use both
To supplement each other for:
• Standards
• Articles on topics of general interest
• Popular subjects
• Organizations
• Directory information
• Reviews/evaluations/how-to information
• Government regulations and other agency information
• Competitive intelligence
• Obscure topics
• Clues for finding information on and offline
Conclusions: general
• Time is money. Free information that takes too long to find and format is expensive information
• The Web is a new tool. We need to learn to use both online sources well
• Vary strategies and approach to take advantage of each medium