Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

  • date post

    19-Dec-2015

Transcript of Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Page 1: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Best Web Directories and Search Engines

Order Out of Chaos on the World Wide Web

Page 2: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

How People Search on the Web

• Input URLs, surf links
• Subject directories
• Search engines
• Metasearch engines

Page 3: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Web Directories

• Small, selective databases
• Created by humans, not machines
• Editors select and place sites into categories for easy retrieval
• User browses categories and links to sites

Page 4: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Why Use Directories?

• Identify quality, major sites
• Get overview, general information on topic
• Serendipity in discovery as result of manipulating a smaller, more focused file

Page 5: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

High Quality Directories

• Librarian’s Index to the Internet
• Infomine
• Academic Info
• WWW Virtual Library

Page 6: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Top Directories – Less Selective

• Yahoo!: 1,800,000+
• Open Directory: 2,600,000+
• LookSmart: 2,500,000+

HyperResearch Guide

Page 7: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

How Directories Work

• Browse subject categories
• Funnel: category to topic, web site to page
  • Health
    • Fitness
      • Yoga
        • Most popular sites
          • yogaclass.com
            • http://www.yogaclass.com/
• Yahoo!, LookSmart, Open Directory
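The funnel on this slide can be pictured as a walk down a category tree. The sketch below is only an illustration: `DIRECTORY`, `browse`, and `subcategories` are hypothetical names, and the tree holds just the one Health, Fitness, Yoga path shown on the slide, not any real directory's data.

```python
# Hypothetical miniature directory: nested categories ending in a list of sites.
DIRECTORY = {
    "Health": {
        "Fitness": {
            "Yoga": ["http://www.yogaclass.com/"],
        },
    },
}


def browse(directory, path):
    """Follow a list of category names down the tree and return what is there."""
    node = directory
    for category in path:
        node = node[category]  # raises KeyError if the category does not exist
    return node


def subcategories(directory, path):
    """List the choices a user would see at one level of the funnel."""
    node = browse(directory, path)
    # A dict holds further categories; a list holds the sites themselves.
    return sorted(node) if isinstance(node, dict) else list(node)


print(subcategories(DIRECTORY, []))                       # ['Health']
print(subcategories(DIRECTORY, ["Health"]))               # ['Fitness']
print(browse(DIRECTORY, ["Health", "Fitness", "Yoga"]))   # ['http://www.yogaclass.com/']
```

Each step narrows the choices, which is why browsing works well for broad topics and poorly when no category matches.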

Page 8: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Directory Search Boxes

• When to use Yahoo Search
  • Subject categories don’t match topic
  • Want broad search of Web
• Why results are different
  • Directory searches only Yahoo’s selected sites
  • Search box combines Yahoo directory sites and full Web search results (from engine)

Page 9: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Top Web Directories

• Yahoo!: 1,800,000+
• Open Directory: 2,600,000+
• LookSmart: 2,500,000+

HyperResearch Guide

Page 10: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Web Search Engines

Page 11: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

What Are Search Engines

• Software
  • Captures web sites, pages
  • Indexes full-text of web page
  • Provides interface to search web pages
• Database
  • Large, billions of pages (unlike directories)
  • Computer built (robots, spiders)
  • No selectivity, no evaluation

Page 12: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Why Use Search Engines?

• Have already identified major sites from directory
• Could find very little in directory
• Want everything, comprehensive information on a topic
• Note: need to judge quality of sites since engines are NOT selective

Page 13: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

How Search Engines Work

• Spiders comb, “capture” web pages
• Software builds database
• Words from web pages “indexed”
• Search interface finds words on pages
• Engine ranks, describes results
• How engines and directories differ

Page 14: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Spiders Comb, Capture Web Pages

• Software decides which web pages to collect
• Spiders check for updated pages
• Spiders remove dead sites
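A toy version of a spider helps make these three points concrete. The sketch below makes no network requests: `WEB` is a hypothetical in-memory stand-in for the Web, and a URL that is linked to but absent from it plays the part of a dead site. Real spiders are far more elaborate; this only shows the follow-links-and-record idea.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> list of outgoing links.
# "gone.example" is linked to but has no entry, so it counts as dead.
WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["a.example", "gone.example"],
    "c.example": [],
}


def crawl(web, seed):
    """Breadth-first crawl from seed; return (captured pages, dead links)."""
    captured, dead = {}, set()
    queue, seen = deque([seed]), {seed}
    while queue:
        url = queue.popleft()
        if url not in web:
            dead.add(url)  # nothing to capture; note it for removal
            continue
        captured[url] = web[url]
        for link in web[url]:
            if link not in seen:  # visit each URL at most once
                seen.add(link)
                queue.append(link)
    return captured, dead


captured, dead = crawl(WEB, "a.example")
print(sorted(captured))  # ['a.example', 'b.example', 'c.example']
print(dead)              # {'gone.example'}
```

Running the crawl again later and comparing the captured pages with the previous run is one simple way a spider could notice updates.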

Page 15: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Spider Software Builds Database

• Current web size: over 15 billion pages
• No engine’s database covers it all
  • Google covers 29% (4.3 billion+)
  • AlltheWeb covers 21% (3.2 billion+)
  • HotBot covers 20% (3 billion+)
  • Teoma covers 10% (1.5 billion)
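The percentages follow from dividing each database size by the estimated web size. The short check below assumes a web of exactly 15 billion pages (the slide says "over") and takes AlltheWeb's database as 3.2 billion pages, the figure given on the Top Search Engines slide; the names `WEB_SIZE`, `ENGINES`, and `coverage_percent` are illustrative.

```python
WEB_SIZE = 15_000_000_000  # assumed exact; the slide says "over 15 billion pages"

# Database sizes in pages. AlltheWeb is taken as 3.2 billion, matching the
# "Top Search Engines" slide.
ENGINES = {
    "Google": 4_300_000_000,
    "AlltheWeb": 3_200_000_000,
    "HotBot": 3_000_000_000,
    "Teoma": 1_500_000_000,
}


def coverage_percent(pages, web_size=WEB_SIZE):
    """Share of the web held in one engine's database, as a whole percent."""
    return round(100 * pages / web_size)


coverage = {name: coverage_percent(pages) for name, pages in ENGINES.items()}
for name, pct in coverage.items():
    print(f"{name}: {pct}%")
# Google: 29%
# AlltheWeb: 21%
# HotBot: 20%
# Teoma: 10%
```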

Page 16: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Words from Web Pages “Indexed”

• “Index” is list of words in database linked to words in Web pages
• Some engines index full text in document
• Some index part of text
  • First 100 words in document
  • Words in abstract, or title of document
• How an engine indexes affects search results
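To see why the indexing choice matters, here is a minimal sketch of an index that maps each word to the pages containing it. The two pages, the `build_index` function, and its `max_words` parameter are invented for illustration, and a cutoff of 3 words stands in for the slide's "first 100 words" so the effect shows up on tiny documents.

```python
import re
from collections import defaultdict

# Hypothetical two-page collection used only for illustration.
PAGES = {
    "page1": "Yoga classes for beginners and advanced students",
    "page2": "Fitness tips including yoga and running",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages, max_words=None):
    """Map each word to the set of pages containing it.

    max_words=None indexes the full text; an integer indexes only the
    first max_words words of each page.
    """
    index = defaultdict(set)
    for page_id, text in pages.items():
        words = tokenize(text)
        if max_words is not None:
            words = words[:max_words]
        for word in words:
            index[word].add(page_id)
    return index


full = build_index(PAGES)
partial = build_index(PAGES, max_words=3)
print(sorted(full["yoga"]))     # ['page1', 'page2']
print(sorted(partial["yoga"]))  # ['page1']
```

The same query finds two pages in the full-text index and only one in the partial index, because "yoga" is the fourth word of the second page.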

Page 17: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Search Interface Finds Web Pages

• Provides keyword search box
• Offers simple or advanced searching
• Offers search options to affect results:
  • Most assume AND between words: Russian mafia
  • Most accept “quotes” to search a PHRASE: “Russian mafia”
  • Most allow FIELD searches: ti:Russian mafia

AlltheWeb
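The three search options differ in how strictly they match. The sketch below runs the slide's "Russian mafia" example against three made-up documents; `DOCS` and the `search_and`, `search_phrase`, and `search_title` functions are hypothetical and do not reproduce any engine's actual syntax or behavior.

```python
import re

# Hypothetical documents, each with a title field and a body field.
DOCS = {
    1: {"title": "Russian mafia history", "body": "Organized crime in Russia."},
    2: {"title": "Travel notes", "body": "The mafia film was shown to a Russian audience."},
    3: {"title": "Crime report", "body": "The Russian mafia expanded abroad."},
}


def words(text):
    return re.findall(r"[a-z]+", text.lower())


def full_text(doc):
    return words(doc["title"] + " " + doc["body"])


def search_and(query):
    """Implicit AND: every query word must appear somewhere in the document."""
    terms = words(query)
    return [i for i, d in DOCS.items() if all(t in full_text(d) for t in terms)]


def search_phrase(phrase):
    """PHRASE: the words must appear next to each other, in order."""
    terms = words(phrase)
    n = len(terms)
    hits = []
    for i, d in DOCS.items():
        w = full_text(d)
        if any(w[k:k + n] == terms for k in range(len(w) - n + 1)):
            hits.append(i)
    return hits


def search_title(query):
    """FIELD search: every query word must appear in the title."""
    terms = words(query)
    return [i for i, d in DOCS.items() if all(t in words(d["title"]) for t in terms)]


print(search_and("Russian mafia"))     # [1, 2, 3]
print(search_phrase("Russian mafia"))  # [1, 3]
print(search_title("Russian mafia"))   # [1]
```

Document 2 mentions both words but never together, so it drops out of the phrase search; only document 1 has both words in its title.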

Page 18: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Engine Ranks, Describes Results

• Software lists most “relevant” items first
  • Word popularity: word repetitions, location
  • Site popularity – visitations of web site
  • Link popularity – how often link cited
• Results described
  • Few words to a paragraph
  • Sometimes stars, other indicators of relevancy
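One way to picture relevance ranking is as a score that adds up these signals. The sketch below combines word repetitions, word location (a bonus for the title), and link popularity; the pages, the `TITLE_BONUS` and `LINK_WEIGHT` values, and the formula itself are assumptions made up for illustration, and site popularity (visits) is left out for brevity.

```python
import re

# Hypothetical pages: text plus a count of other pages linking to each one.
PAGES = {
    "a": {"title": "Yoga basics", "body": "Yoga poses and yoga breathing.", "inbound_links": 2},
    "b": {"title": "Fitness news", "body": "A short note on yoga.", "inbound_links": 10},
    "c": {"title": "Gardening", "body": "Soil and seeds.", "inbound_links": 50},
}

# Illustrative weights only, not taken from any real engine.
TITLE_BONUS = 3.0
LINK_WEIGHT = 0.1


def words(text):
    return re.findall(r"[a-z]+", text.lower())


def score(page, term):
    """Combine word repetitions, word location, and link popularity."""
    body_hits = words(page["body"]).count(term)
    title_hits = words(page["title"]).count(term)
    if body_hits + title_hits == 0:
        return 0.0  # pages that never mention the term are not results
    return body_hits + TITLE_BONUS * title_hits + LINK_WEIGHT * page["inbound_links"]


def rank(pages, term):
    """Return matching page ids, highest score first."""
    scored = [(score(page, term), page_id) for page_id, page in pages.items()]
    return [page_id for s, page_id in sorted(scored, reverse=True) if s > 0]


print(rank(PAGES, "yoga"))  # ['a', 'b']
```

Page "c" has the most inbound links but never mentions the term, so it is not listed; changing the weights changes the order, which is one reason different engines return different results.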

Page 19: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

How Engines and Directories Differ

• Computers vs people
  • Engine spiders not editors select documents
• Quantity vs quality
  • Engines big: want all, accept anything
  • Directories small: want “best” “important”
• Technology vs human judgment
  • Engine software ranks, no human evaluation

Page 20: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Top Search Engines

• Google: 4.2 billion+
• AlltheWeb: 3.2 billion+
• HotBot (Inktomi): 3 billion+
• Teoma: 1.5 billion+

HyperResearch Guide

Page 21: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Metasearch Engines

Page 22: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Metasearch Engines

• Technologies that search several search engines at the same time
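As a rough sketch of the idea, a metasearch tool sends one query to several engines at once and merges what comes back. Here the "engines" are stand-in functions returning fixed lists rather than real services, and `metasearch` is a hypothetical name; real tools must also translate the query into each engine's own syntax.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-in "engines": plain functions returning ranked URLs for any query.
def engine_one(query):
    return ["http://a.example/", "http://b.example/"]


def engine_two(query):
    return ["http://b.example/", "http://c.example/"]


def metasearch(query, engines):
    """Query every engine concurrently and merge results, dropping duplicates."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # pool.map returns the result lists in the same order as `engines`.
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


results = metasearch("yoga", [engine_one, engine_two])
print(results)  # ['http://a.example/', 'http://b.example/', 'http://c.example/']
```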

Page 23: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Pros

• Increase results when search engines produce little
• Save time by searching several engines at once
• Show results of several engines on one page

Page 24: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Cons

• Retrieve too many hits
• Retrieve less relevant results
• Do not individualize search syntax for all engines they search
  • Do not know whether to use “and” or AND, +, or “or” OR; cannot interpret phrase, title search, etc.
• Exclude certain large engines like Google

Page 25: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Top Metasearch Engines

• Dogpile: Refines results, covers major engines
• Vivisimo: Categorizes results, narrows topics
• Ez2find: Includes most major engines

Page 26: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

A Few Words About the Web and Search Engines

Page 27: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

What’s In Search Engines?

• Business, commercial information
• Organizational publications
• Government resources
• Some magazine, newspaper articles
• Some scholarly information
• Teaching materials, unpublished articles
• Books, articles whose copyright expired

Page 28: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

What’s Not in Search Engines

• Books under copyright
  • Most fiction, non-fiction in existence
• Journal, magazine, newspaper articles
  • Most current and past research
• Reference materials
  • Recent, quality, expensive encyclopedias, handbooks, business advisory services, etc.
• In short
  • Bulk of human knowledge and research

Page 29: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Search Tips

• Check “advanced” search and options
• Learn about AND, OR, ANY, ALL, PHRASE
• Know how to search in titles, URLs
• Spell it right
• Switch engines, get different results
• Keep up to date about search engines
  • Newspapers and magazines
  • Library web sites

Page 30: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Evaluating Web Sites

• Accuracy
  • Is information reliable?
  • What does URL tell you (.com, .org, .gov, .edu)?
• Authority
  • Author’s credentials? Address, email given?
• Content and Currency
  • Purpose of site: inform, sell, propagandize? Date?
• Documentation
  • Are sources given, footnotes?
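For the URL question under Accuracy, the domain ending can be read off mechanically. The sketch below uses Python's standard `urllib.parse`; the `TLD_HINTS` table and `domain_hint` function are hypothetical, and the labels are rough conventions only, since a domain ending does not by itself establish reliability.

```python
from urllib.parse import urlparse

# Rough conventions for the four endings named on the slide (an assumption,
# not a guarantee about who actually runs a site).
TLD_HINTS = {
    "com": "commercial",
    "org": "organization",
    "gov": "government",
    "edu": "educational institution",
}


def domain_hint(url):
    """Return a rough label for the kind of site, based on the domain ending."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]  # text after the last dot in the host name
    return TLD_HINTS.get(tld, "unknown")


print(domain_hint("http://www.yogaclass.com/"))  # commercial
```

The hint is only a starting point; the Authority, Content and Currency, and Documentation questions still need a human reader.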

Page 31: Best Web Directories and Search Engines Order Out of Chaos on the World Wide Web.

Find and Evaluate

• Use Google and find Website titled: The Burmese Mountain Dog
• Evaluate this site for
  • Accuracy
  • Authority
  • Content and Currency
  • Documentation
• Is it a trustworthy Web site?