Link Checking: A Path to Quality Web Sites
Paul Barron
[email protected]
540-286-8025
Library Manager
University of Mary Washington
College of Graduate and Professional Studies
Copyright © 2005 Paul Barron
All Rights Reserved
Copies may be made for educational use only.
Link Checking 2
Presentation Objectives
Demonstrate Link Checking
Text – Yahoo
Graphical – Touchgraph Google Browser (Related Pages)
Ranking Thumbshots
Link Checking 3
Presentation Objectives
Demonstrate Link Checking’s Effectiveness as a:
Search Technique
Tool to Evaluate Web Information
Demonstrate Finding Animated Images with
Picsearch.com
Link Checking 4
Achieving the Objectives
Link Checking
Why am I doing this?
Review the research
Refining the link check with:
Boolean expressions
Top Level Domain and Country Code Domain limiters
Why not Google!
Link Checking 5
The Web and Research
“The Web moved from the periphery of a good researcher's awareness in 1998 to the very center of it in 2004.”
“The only certainty is that we're going to need help finding anything for a long time yet to come.”
Behind the Rise of Google Lies the Rise in Internet Credibility VERLYN KLINKENBORG
The New York Times
http://www.nytimes.com/2004/02/07/opinion/27FRI4.html
Link Checking 6
Size of the Surface Web
“Our [IBM] research labs project that internet-accessible data is increasing at an annual rate of 300%.”
Doug Elix, Senior Vice President and Group Executive for IBM Global Services
http://www.worldcongress2002.org (Day 2)
Number of Surface Web Documents: 13 Billion
http://www.oclcpica.org/content/53/pdf/5yearinformationformattrends.pdf
Pages Added per Day to the Surface Web: 7.3 Million
Cyveillance Web Study
http://www.cyveillance.com/web/us/corporate/white_papers.htm
Link Checking 7
Link Checking – The Research
“A Web page author links
to the best and most popular
pages within the same
category. This creates a
small Web between pages
with similar topics.”
“Growing and Navigating the Small World Web by Local Content”
Filippo Menczer
Proceedings of the National Academy of Sciences, October 2002
Link Checking 8
Links Analogous to Citations
Study - Examine links to research-oriented websites; determine if links are analogous to citations
Results – In 57% of the links, the reason for linking was … to amplify the content of the source page
Conclusion – Links to research-oriented sites are analogous to citations.
Web Links as Analogues of Citations
Information Research, Vol. 9 No. 4, July 2004
http://informationr.net/ir/9-4/paper188.html#Kim
Link Checking 9
The Power of Citation Linking
“… [R]eferences cited by authors (which) have become the primary links in publishers' digital databases. The greatest advancements in linking have been the links to cited and citing references, the technical counterparts … of referring to other works.”
“Linking on Steroids” PÉTER JACSÓ
Information Today, Vol. 21 No. 7, July/August 2004
Link Checking 10
Link Checking: Why do it?
Quality sites link to other quality
sites.
Link popularity search engines
Effective search technique
Indication of web site credibility
(Sometimes!)
Link Checking 11
Link Checking Exercise
http://valley.vcdh.virginia.edu/
Link Checking 12
Query Standardization
Why?
“Standardization yields predictable results.”
Preparation for searching proprietary databases like Factiva
How will queries be standardized?
Phrases are enclosed in “quotation marks.”
UPPER CASE Boolean operator format
Reinforce an understanding of their function as an operator
Some search engines, like Exalead (http://www.exalead.com), require UPPER CASE
Link Checking 13
Query Standardization
How will queries be standardized?
Every segment of a query is joined by an operator.
Complex Boolean expressions are nested (enclosed within parentheses).
Example
link:http://lii.org AND (phishing OR “identity theft”) AND site:org
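The standardization rules can be sketched as a small Python helper that assembles the example query. This is only an illustration: the function names `quote_phrase`, `nest_or`, and `build_query` are hypothetical and do not belong to any search engine's API.

```python
def quote_phrase(term: str) -> str:
    """Enclose multi-word phrases in quotation marks; leave single words alone."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def nest_or(*terms: str) -> str:
    """Join alternatives with an UPPER CASE OR and nest them in parentheses."""
    return "(" + " OR ".join(quote_phrase(t) for t in terms) + ")"


def build_query(*segments: str) -> str:
    """Join every segment of the query with an UPPER CASE AND."""
    return " AND ".join(segments)


query = build_query(
    "link:http://lii.org",
    nest_or("phishing", "identity theft"),
    "site:org",
)
print(query)
```

Building the string from segments makes it hard to forget an operator between two parts of the query.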
Link Checking 14
AND
AND – Both of the search terms are present in the Web documents.
Link Checking 15
OR
OR – At least one of the search terms is present in the Web documents.
Link Checking 16
AND NOT or NOT
AND NOT / NOT – The first search term is present and the second search term is absent in the Web documents.
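One way to picture the three operators is as set operations on the pages that contain each term. The sketch below uses Python sets; the page identifiers are invented for illustration and do not come from a real search.

```python
# Hypothetical page identifiers, invented for illustration only.
pages_with_phishing = {"page1", "page2", "page3"}
pages_with_identity_theft = {"page2", "page3", "page4"}

# AND: both terms are present (set intersection).
both = pages_with_phishing & pages_with_identity_theft

# OR: at least one of the terms is present (set union).
either = pages_with_phishing | pages_with_identity_theft

# AND NOT: the first term is present and the second is absent (set difference).
first_only = pages_with_phishing - pages_with_identity_theft

print(sorted(both))
print(sorted(either))
print(sorted(first_only))
```

AND narrows a search, OR broadens it, and AND NOT removes pages from the first set.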
Link Checking 17
Top Level Domains (TLD)
Purpose TLD
Commercial .com
Educational .edu
Government (State & Local) .us
Government .gov
Military .mil
Network .net
Organization .org
For TLD statistics, see The Verisign Domain Report, http://www.verisign.com/nds/naming/domainbrief/
Link Checking 18
Top Level Domains (TLD)
Purpose TLD
Air-transport Industry .aero
Businesses .biz
Cooperatives .coop
Unrestricted Use .info
Museums .museum
For Registration by Individuals .name
Accountants, Lawyers, and Physicians .pro
Link Checking 19
Country Top Level Domains
Canada .ca
France .fr
Germany .de
Italy .it
Japan .jp
United Kingdom .uk
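The domain tables can be turned into a simple lookup that reports what the last label of a web address suggests about the site. This is a minimal sketch using Python's standard `urllib.parse`; the names `TLD_PURPOSE`, `top_level_domain`, and `purpose_of` are hypothetical, and only some of the domains from the tables are included.

```python
from urllib.parse import urlparse

# Purposes copied from the slides' tables; the dictionary name is hypothetical.
TLD_PURPOSE = {
    "com": "Commercial",
    "edu": "Educational",
    "us": "Government (State & Local)",
    "gov": "Government",
    "mil": "Military",
    "net": "Network",
    "org": "Organization",
    "ca": "Canada",
    "fr": "France",
    "de": "Germany",
    "it": "Italy",
    "jp": "Japan",
    "uk": "United Kingdom",
}


def top_level_domain(url: str) -> str:
    """Return the last label of the host name, e.g. 'edu'.

    The URL must include the scheme (http://), otherwise no host is found.
    """
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def purpose_of(url: str) -> str:
    """Look up the purpose of the URL's top level domain, or 'unknown'."""
    return TLD_PURPOSE.get(top_level_domain(url), "unknown")


print(purpose_of("http://valley.vcdh.virginia.edu/"))
print(purpose_of("http://www.wto.org"))
```

A domain is only a hint about a site's purpose, so the lookup supports evaluation rather than replacing it.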
Link Checking 20
Country Codes
http://www.cia.gov/cia/publications/factbook/appendix/appendix-d.html
Link Checking 21
Yahoo Search Template
http://www.search.yahoo.com
Link Checking 22
I want to use Google! Google is Gog!
Link Checking 23
Link Check w/Boolean Expression
NOTE
Google’s link syntax does not mix (well) with other limiters.
Link Checking 24
Simple Link Check
NOTE
The Yahoo link check syntax must include the http://.
link:http://valley.vcdh.virginia.edu
Link Checking 25
Simple Link Check
NOTE
Among the first four results are .edu and .com sites.
NOTE
The George Mason University History Matters site is also on ALA’s 2004 list of Best Free Reference Web Sites. “Quality sites link to quality sites.”
Link Checking 26
Where is the link?
Link Checking 27
Top Level Domain-limited Check
NOTE
Both the AND domain: and the AND site: syntax will work.
link:http://valley.vcdh.virginia.edu AND site:edu
Link Checking 28
Top Level Domain-limited Check
QUESTION
Is there a syntax that will exclude the virginia.edu
sites from the results?
NOTE
All of the results are .edu sites.
Link Checking 29
Excluding Sites within a Domain
link:http://valley.vcdh.virginia.edu
AND site:edu NOT site:virginia.edu
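The effect of the two limiters can be imitated on a list of addresses. The sketch below keeps pages whose host is in .edu and drops those within virginia.edu. It is an assumption about how the limiters behave, not Yahoo's implementation; `matches_site` is a hypothetical helper, and the page addresses are invented.

```python
from urllib.parse import urlparse


def matches_site(url: str, domain: str) -> bool:
    """True if the URL's host is the domain itself or ends with '.' + domain."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


# Hypothetical linking pages, invented for illustration.
linking_pages = [
    "http://www.lib.virginia.edu/resources.html",
    "http://history.example.edu/civilwar/links.html",
    "http://www.example.com/lessons.html",
]

# Equivalent of: AND site:edu NOT site:virginia.edu
kept = [
    url
    for url in linking_pages
    if matches_site(url, "edu") and not matches_site(url, "virginia.edu")
]
print(kept)
```

Excluding the site's own parent domain removes self-links, so the remaining results show who else considers the site worth linking to.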
Link Checking 30
Excluding Sites within a Domain
NOTE
The number of results dropped to 307 after excluding
the virginia.edu sites.
RECOMMENDATION
If the site description has the words: links, references,
resources, sites, webliography,
or websites, review it!
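The recommendation can be expressed as a simple keyword test on a result's description. This is a sketch only: `worth_reviewing` is a hypothetical name, and the sample descriptions are the two quoted elsewhere in these slides plus one invented counterexample.

```python
import re

# Words from the recommendation: descriptions containing any of them deserve a look.
LINK_LIST_WORDS = ("links", "references", "resources", "sites",
                   "webliography", "websites")

# Match any of the words as a whole word, ignoring case.
_PATTERN = re.compile(r"\b(" + "|".join(LINK_LIST_WORDS) + r")\b", re.IGNORECASE)


def worth_reviewing(description: str) -> bool:
    """True if the site description mentions a link-list keyword."""
    return _PATTERN.search(description) is not None


print(worth_reviewing("Best of History lesson plans and resources"))
print(worth_reviewing("TOP SOCIAL STUDIES SITES."))
print(worth_reviewing("A soldier's diary from 1862"))  # invented counterexample
```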
Link Checking 31
Link Check w/Boolean Expression
QUESTION
Why did this search fail; what did I
forget?
Link Checking 32
Link Check w/Boolean Expression
link:http://valley.vcdh.virginia.edu
AND “lesson plans”
Link Checking 33
Link Check w/Boolean Expression
NOTE
“lesson plans” is boldfaced KWIC; one site is from a
k-12 school in … ?
Link Checking 34
Complex Nested Check
link:http://valley.vcdh.virginia.edu AND
(“u.s. history” AND “military history”)
Link Checking 35
Complex Nested Check
Link Checking 36
Complex (un)Nested Check
link:http://valley.vcdh.virginia.edu AND “u.s. history” OR “military history”
Link Checking 37
Complex Nested Check
link:http://valley.vcdh.virginia.edu AND (“u.s. history” OR “military history”)
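Why nesting matters can be shown with plain Boolean logic. The sketch assumes the search engine gives AND higher precedence than OR, as Python does with `and` and `or`; the slides do not state Yahoo's precedence rules, so treat this as an illustration of the risk rather than a description of the engine.

```python
def unnested(links_to_site: bool, has_us_history: bool,
             has_military_history: bool) -> bool:
    # link AND "u.s. history" OR "military history"
    # With AND binding tighter than OR, this reads as
    # (link AND "u.s. history") OR "military history".
    return links_to_site and has_us_history or has_military_history


def nested(links_to_site: bool, has_us_history: bool,
           has_military_history: bool) -> bool:
    # link AND ("u.s. history" OR "military history")
    return links_to_site and (has_us_history or has_military_history)


# A hypothetical page about military history that does NOT link to the site.
page = dict(links_to_site=False, has_us_history=False, has_military_history=True)

print(unnested(**page))  # True: the page slips into the results
print(nested(**page))    # False: the link requirement is enforced
```

Under that assumption, the unnested query returns any page that mentions "military history", whether or not it links to the site being checked.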
Link Checking 38
Domain-limited Check w/Boolean Expression
REMEMBER
Both the AND domain: and the AND site: syntax will work.
link:http://valley.vcdh.virginia.edu AND “lesson plans” AND domain:edu
Link Checking 39
Domain-limited Check w/Boolean
Expression
NOTE
“lesson plans” is boldfaced KWIC in the .edu sites.
NOTE
Site descriptions state, “Best of History lesson plans and resources” and “TOP
SOCIAL STUDIES SITES.”
Link Checking 40
Domain-limited Check w/Boolean
Expression
REMEMBER
There are very good educational resources
on .com sites.
link:http://valley.vcdh.virginia.edu AND “lesson plans” AND domain:com
Link Checking 41
Domain-limited Check w/Boolean
Expression
http://www.kn.pacbell.com/wired/bluewebn
Link Checking 42
URL-limited Check w/Boolean Expression
link:http://valley.vcdh.virginia.edu AND “lesson plans” AND inurl:k12
Link Checking 43
URL-limited Check w/Boolean
Expression
RECOMMENDATION
When searching for material for a specific grade, use: “elementary
school” or “middle school” or “high school” or “college prep.”
NOTE
All the results are from k-12 schools.
Link Checking 44
State-limited Check
link:http://www.kn.pacbell.com/wired/bluewebn AND ("teacher resources" AND
"middle school") AND inurl:k12.va
Link Checking 45
State-limited Check
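The difference between the inurl:k12 and inurl:k12.va limiters can be imitated with a substring test on each address. This is a rough sketch: a real inurl: operator may match on URL tokens rather than raw substrings, `inurl` is a hypothetical helper, and the addresses are invented.

```python
def inurl(url: str, fragment: str) -> bool:
    """True if the fragment appears anywhere in the URL (case-insensitive)."""
    return fragment.lower() in url.lower()


# Hypothetical addresses, invented for illustration.
urls = [
    "http://www.example.k12.va.us/teachers/resources.html",
    "http://www.example.k12.ca.us/links.html",
    "http://www.example.edu/k12/outreach.html",
]

any_k12 = [u for u in urls if inurl(u, "k12")]          # every k-12 style address
virginia_k12 = [u for u in urls if inurl(u, "k12.va")]  # Virginia schools only

print(len(any_k12))
print(virginia_k12)
```

The longer the fragment, the narrower the result set: adding the state abbreviation limits a k-12 search to one state's schools.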
Link Checking 46
Country-limited Link Check
NOTE
Countries like Moldova sell the use of their top level domain.
For instance, .md is purchased for medical-related sites.
link:http://www.kn.pacbell.com/wired/bluewebn AND site:uk
Link Checking 47
Country-limited Link Check
NOTE
In the UK, .ac = .edu.
Link Checking 48
Academic Sites Around the
World
NOTE
To locate other countries that use .ac for educational institutions, run the query:
inurl:ac
Link Checking 49
Graphical Display of Web Communities
http://www.touchgraph.com/TGGoogleBrowser.html
valley.vcdh.virginia.edu
Enter the Web address without the http://.
Link Checking 50
Touchgraph Display
Touchgraph fuzzy clusters the results into
web communities.
Link Checking 51
Touchgraph Display – Site Info
Link Checking 52
Loading Additional Sites
NOTE
Double click a site to retrieve surrounding links. “Loading” will appear, indicating which links are being fetched.
Link Checking 53
An Ocean of Information
Link Checking 54
Information Evaluation
“Our No. 1 story on the 'countdown' tonight: A five-year study just concluded at Indiana University suggesting that upon the birth of their first child, 100 percent of parents lose at least 12 IQ points, and the average loss is 20. The loss may not be reversible. It may be compounded for each child you have.”
MSNBC talk-show host Keith Olbermann's Sept. 7 Broadcast
Link Checking 55
Source of the Information
http://www.hoosiergazette.com/
Link Checking 56
Hoosier Gazette Disclaimer
“Hoosier Gazette articles
are …fictitious or
satirical [and] use
invented names. [U]se
of real names is
accidental. The reader
should suspend belief
for the sake of
enjoyment.”
Josh Whicker, a schoolteacher, makes
mischief by writing bogus stories on the
Hoosier Gazette.
Link Checking 57
Verifying Site Credibility
http://www.gatt.org
Link Checking 58
Verifying Site Credibility
http://www.wto.org
What domain-limited check might verify the
real World Trade Organization site?
Link Checking 59
Verifying Site Credibility
Link Checking 60
Verifying Site Credibility
Link Checking 61
Search Engine Overlap
“No search engine indexes more than about 16% of the web.”
Accessibility of Information on the Web
STEVE LAWRENCE AND C. LEE GILES
Nature 400, 107 (08 July 1999)
Link Checking 62
Search Template
http://ranking.thumbshots.com/
Link Checking 63
Results Overlap (or Lack of …)
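The overlap between two engines' results can be estimated by comparing the sets of addresses each one returns. The sketch below is not how Ranking Thumbshots works; it uses a simple shared-over-combined ratio (the Jaccard index) as a stand-in measure, with a hypothetical function name and invented result lists.

```python
def overlap(results_a, results_b):
    """Return the shared URLs and their share of all distinct URLs returned."""
    a, b = set(results_a), set(results_b)
    shared = a & b
    combined = a | b
    share = len(shared) / len(combined) if combined else 0.0
    return shared, share


# Hypothetical result lists from two search engines, invented for illustration.
engine_a = ["http://a.example/1", "http://a.example/2", "http://a.example/3"]
engine_b = ["http://a.example/3", "http://a.example/4"]

shared, share = overlap(engine_a, engine_b)
print(sorted(shared))
print(f"{share:.0%} of the distinct results appear in both engines")
```

A low share is a reason to run the same link check in more than one search engine.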
Link Checking 64
picsearch Advanced Search
NOTE
Animated images can ONLY be found in Advanced Search.
Click on the image.
http://www.picsearch.com
Link Checking 65
picsearch Returns
NOTE
Click on the “Image URL” to view only the animated image.
REMINDER
Let the image load completely so that you can see the animation.
Copyright law applies!
Link Checking 66
Let’s summarize!
Tell me again why I am doing link checks.
Quality sites link to other quality sites.
Sites link to amplify the content of the source page.
Link checking is analogous to citation searching.
Link checking may be an indication of website credibility.
Link Checking 67
Summation
Evaluate, evaluate, and evaluate.
Bookmark, bookmark, and bookmark.
Amazon’s A-9 (www.a9.com)
Backflip (www.backflip.com)
FURL (www.furl.net)
Portaportal (www.portaportal.com)
Collaborate, share, collaborate.
Link Checking 68
For more information see …
“Link Checking — A Path
to Quality Web Sites”
MultiMedia & Internet@Schools
Volume 12, Number 1, January/February 2005, Page 12