google 201 ACPCUG 04122004 - elmayorportaldegerencia.comPD... · Fantasyland, is Pirates of the...

Important Information

• This presentation was created by Patrick Crispen.

• You are free to reuse this presentation provided that you
  – Do not make any money from this presentation.
  – Give credit where credit is due.

Google 201: “Advanced Googology”

a presentation by Patrick Douglas Crispen

Our Goals

• Learn how Google really works.
• Discover some Google secrets no one ever tells you.
• Play around with some of Google’s advanced search operators.
• Find out where to get more Google-related help and information.
• DO ALL OF THIS IN ENGLISH!

PART ONE: How Google REALLY Works

Or, at least, how I think Google really works.

One Word of Warning

• For obvious reasons, the folks at Google would rather the Wizard of Oz stay behind the curtain, so to speak.

• So, what you are about to see on the next few slides are just plain guesses on my part.

• And, my guesses are probably completely wrong! But they’re ‘pretty’. And that’s all that matters.

Another Word of Warning

• I also need to warn you that my guesses use a little bit of algebra, but I promise it is simple algebra.
  – Well, there is one intimidating-looking equation, but we’ll get to that in a bit.

• Just remember that, in this case, X > Y > Z, and there can be different values for each variable (X1 > X2 > … > Xn).

• I’ve lost you already, haven’t I?

How Google Works - Phrases

• When you search for multiple keywords, Google first searches for all of your keywords as a phrase. I think.

• So, if your keywords are disney fantasyland pirates, any pages on which those words appear as a phrase receive a score of X.

Image source: Google

Source: Google Hacks, p. 21

How Google Works - Adjacency

• Google then measures the adjacency between your keywords and gives those pages a score of Y.

• What does this mean in English? Well …

Image source: Google

Source: Google Hacks, p. 21

How Adjacency Works

A page that says

“My favorite Disney attraction, outside of Fantasyland, is Pirates of the Caribbean”

will receive a higher adjacency score than a page that says

“Walt Disney was both a genius and a taskmaster. The team at WDI spent many sleepless nights designing Fantasyland. But nothing could compare to the amount of Imagineering work required to create Pirates of the Caribbean.”
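
To make the adjacency idea concrete, here is a minimal Python sketch. It is a guess in the same spirit as the rest of Part One, not Google's actual method: it finds the shortest run of words that contains every keyword and treats a shorter run as "more adjacent." The function name `adjacency_span` and the shortest-window measure are assumptions made up for illustration.

```python
import re

def adjacency_span(text, keywords):
    """Length in words of the shortest window containing every keyword.

    Returns None if any keyword is missing. A smaller span means the keywords
    sit closer together. Illustrative measure only, not Google's.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    wanted = {k.lower() for k in keywords}
    last_seen = {}   # keyword -> index of its most recent occurrence
    best = None
    for i, word in enumerate(words):
        if word in wanted:
            last_seen[word] = i
            if len(last_seen) == len(wanted):
                # Shortest window ending here starts at the oldest "last seen".
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best

close = ("My favorite Disney attraction, outside of Fantasyland, "
         "is Pirates of the Caribbean")
far = ("Walt Disney was both a genius and a taskmaster. The team at WDI "
       "spent many sleepless nights designing Fantasyland. But nothing could "
       "compare to the amount of Imagineering work required to create "
       "Pirates of the Caribbean.")
keywords = ["disney", "fantasyland", "pirates"]

print(adjacency_span(close, keywords), adjacency_span(far, keywords))
```

The first page squeezes all three keywords into a much shorter window than the second, which is the intuition behind giving it the higher adjacency score.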

How Google Works - Weights

• Then, Google measures the number of times your keywords appear on the page (the keywords’ “weights”) and gives those pages a score of Z.

• A page that has the word disney four times, fantasyland three times, and pirates seven times would receive a higher weights score than a page that only has those words once.

Source: Google Hacks, p. 21
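
Counting keyword "weights" is easy to sketch in Python. The helper name `keyword_weights` is hypothetical, and simply counting occurrences is an assumption about what "weights" means here.

```python
import re
from collections import Counter

def keyword_weights(text, keywords):
    """Count how many times each keyword appears in the text (case-insensitive)."""
    counts = Counter(re.findall(r"[a-z0-9']+", text.lower()))
    return {k: counts[k.lower()] for k in keywords}

# A made-up page with disney x4, fantasyland x3, pirates x7, as on the slide.
page = "disney " * 4 + "fantasyland " * 3 + "pirates " * 7
weights = keyword_weights(page, ["disney", "fantasyland", "pirates"])
print(weights)
```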

You Still With Me?

Putting it All Together

• Google takes
  – The phrase hits (the Xs),
  – The adjacency hits (the Ys),
  – The weights hits (the Zs), and
  – About 100 other secret variables

• Throws out everything but the top 2,000
• Multiplies each remaining page’s individual score by its “PageRank”
• And, finally, displays the top 1,000 in order.
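
The steps above can be sketched in a few lines of Python. Everything here is hypothetical: the dictionary keys (`x`, `y`, `z`, `pagerank`), the function name `rank_pages`, and especially the choice to combine the three scores with a plain sum, since the slides do not say how the scores are combined.

```python
def rank_pages(pages, keep=2000, show=1000):
    """pages: list of dicts with hypothetical keys 'url', 'x', 'y', 'z', 'pagerank'."""
    # Combine phrase, adjacency, and weights scores (a plain sum is an assumption).
    scored = [(p["x"] + p["y"] + p["z"], p) for p in pages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    survivors = scored[:keep]                      # throw out all but the top `keep`
    final = [(score * p["pagerank"], p["url"]) for score, p in survivors]
    final.sort(reverse=True)                       # highest final score first
    return [url for _, url in final[:show]]

pages = [
    {"url": "a.example", "x": 3, "y": 2, "z": 1, "pagerank": 0.2},
    {"url": "b.example", "x": 1, "y": 1, "z": 1, "pagerank": 0.9},
    {"url": "c.example", "x": 0, "y": 1, "z": 0, "pagerank": 5.0},
]
print(rank_pages(pages, keep=2, show=2))
```

With `keep=2`, the page with the huge PageRank never gets a chance: it is thrown out before PageRank is applied, which is the point of doing the cut first.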

PageRank?

• There is a premise in higher education that the importance of a research paper can be judged by the number of citations the paper has from other research papers.

• Google simply applies this premise to the Web: the importance of a Web page can be judged by the number of hyperlinks pointing to it … from other pages.

• Or, to put it mathematically [brace yourself – the next slide contains the intimidating-looking equation I warned you about] …

Source: Google Hacks, p. 294

The PageRank Algorithm

$$PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)$$

Where

• PR(A) is the PageRank of Page A

• PR(T1) is the PageRank of page T1

• C(T1) is the number of outgoing links from the page T1

• d is a damping factor in the range of 0 < d < 1, usually set to 0.85

Source: Google Hacks, p. 295

You Can Start Breathing Again

• I promise there are no more equations in this presentation.

• I just wanted to show you that the PageRank of a Web page is built from the PageRanks of all the pages linking to it, each divided by the number of links on that linking page, and then added together.
  – A page with a lot of (incoming) links to it is deemed to be more important than a page with only a few links to it.
  – A link from a page with few (outgoing) links to other pages counts for more than a link from a page with links to lots of other pages.

Source: Google Hacks, p. 295
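
For the curious, here is a minimal Python sketch that applies the formula PR(A) = (1 - d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn)) over and over until the numbers settle down. The three-page "web," the starting value of 1.0, and the fixed count of 50 passes are all assumptions chosen for illustration; this is not a description of how Google actually computes it.

```python
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {p: 1.0 for p in pages}          # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page passes on its PageRank divided by its link count.
            incoming = sum(pr[src] / len(targets)
                           for src, targets in links.items()
                           if page in targets)
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Hypothetical tiny web: A links to B and C, B links to C, C links to A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
print(ranks)
```

Page C ends up on top because it collects a share from A and everything B has to give, while B only gets half of A's vote.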

PART ONE: Summary

• Google first searches for your keywords as a phrase and gives those hits a score of X.

• Google then searches for keyword adjacency and gives those hits a score of Y.

• Google then looks for keyword weights and gives those hits a score of Z.

• Google combines the Xs, the Ys, the Zs, and a whole bunch of unknown variables, and then weeds out all but the top 2,000 scores.

• Finally, Google takes the top 2,000 scores, multiplies each by their respective PageRank, and displays the top 1,000.

• I think.

PART TWO: More Stuff No One Tells You

Google’s shocking secrets revealed!

Google’s Boolean Default is AND

But there are ways to get around that.

Boolean Default is AND

• If you search for more than one keyword at a time, Google will automatically search for pages that contain ALL of your keywords.

• A search for disney fantasyland pirates is the same as searching for disney AND fantasyland AND pirates

• But, if you try to use AND on your own, Google yells at you.

Source: http://www.google.com/help/basics.html

“PHRASES”

• To search for phrases, just put your phrase in quotes.

• For example, disney fantasyland “pirates of the caribbean”
  – This would show you all the pages in Google’s index that contain the word disney AND the word fantasyland AND the phrase pirates of the caribbean (without the quotes)

• By the way, while this search is technically perfect, my choice of keywords contains a (deliberate) factual mistake. Can you spot it?

Source: http://www.google.com/help/refinesearch.html

Arr, There She Blows!

• Pirates of the Caribbean isn’t in Fantasyland; it’s in Adventureland in Orlando and New Orleans Square in Anaheim.

• So searching for disney AND fantasyland AND “pirates of the caribbean” probably isn’t a good idea.

Image source: http://www.balgavy.at/

Boolean OR

• Sometimes the default AND gets in the way. That’s where OR comes in.

• The Boolean operator OR is always in all CAPS and goes between keywords.

• For example, an improvement over our earlier search would be disney fantasyland OR “pirates of the caribbean”
  – This would show you all the pages in Google’s index that contain the word disney AND either the word fantasyland OR the phrase pirates of the caribbean (without the quotes)

Source: http://www.google.com/help/refinesearch.html

Three Ways to OR at Google

• Just type OR between keywords
  – disney fantasyland OR “pirates of the caribbean”

• Put your OR statement in parentheses
  – disney (fantasyland OR “pirates of the caribbean”)

• Use the | (“pipe”) character in place of the word OR
  – disney (fantasyland | “pirates of the caribbean”)

• All three methods yield the exact same results.

Source: Google Hacks, p. 3
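
What all three forms ask for can be written as a tiny Python check: every required term must be present, plus at least one of the OR alternatives. The function `page_matches` is hypothetical, and the plain substring test is a deliberate simplification (a real search engine matches whole words).

```python
def page_matches(text, required, any_of):
    """True if text contains every term in `required` and at least one in `any_of`."""
    lowered = text.lower()
    has_required = all(term.lower() in lowered for term in required)
    has_alternative = any(term.lower() in lowered for term in any_of)
    return has_required and has_alternative

# disney (fantasyland OR "pirates of the caribbean")
required = ["disney"]
any_of = ["fantasyland", "pirates of the caribbean"]

print(page_matches("Disney's Pirates of the Caribbean ride", required, any_of))
print(page_matches("Disney opened Tomorrowland in 1955", required, any_of))
```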

OR, She Blows!

• Just remember, Google’s Boolean default is AND

• Sometimes the default AND gets in the way. That’s where OR comes in.

Image source: http://www.phil-sears.com/

Capitalization Does NOT Matter

The old AltaVista trick of typing your keywords in lower case is no longer necessary.

How Insensitive!

• Google is not case sensitive.
• So, the following searches all yield exactly the same results:
  – disney fantasyland pirates
  – Disney Fantasyland Pirates
  – DISNEY FANTASYLAND PIRATES
  – DiSnEy FaNtAsYlAnD pIrAtEs

Source: http://www.google.com/help/basics.html

Google Has a Hard Limit of 10 Keywords

Bet you didn’t know THAT!

Source: Google Hacks, p. 19

Google’s 10 Word Limit

• Google won’t accept more than 10 keywords at a time.

• Any keyword past 10 is simply ignored.

• How can you get around this limit? Well, first you need to remember that …

Source: Google Hacks, p. 19

Google Ignores a BUNCH of Common Words

Words to avoid

Stop Words

To enhance the speed and relevancy of your Web search, Google routinely and automatically ignores common words and characters known as “stop words.”

Source: http://www.google.com/press/guide/reviewguide_7.html

Stop, _ _ Name _ Love

• This is certainly not a canonical list, but here are 28 stop words I know about.

• a, about, an, and, are, as, at, be, by, from, how, i, in, is, it, of, on, or, that, the, this, to, we, what, when, where, which, with

• You can force Google to search for a stop word by putting a + in front of it (for example pirates +of +the caribbean)

Source: 10/23/02 post by Bill Todd to news:google.public.support.general
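
Here is a minimal Python sketch of the stop-word behavior just described, using the 28 words listed above. The function name `effective_keywords` is hypothetical, and the logic is only a model of what the slide says: drop stop words unless a + forces them to stay.

```python
STOP_WORDS = {
    "a", "about", "an", "and", "are", "as", "at", "be", "by", "from", "how",
    "i", "in", "is", "it", "of", "on", "or", "that", "the", "this", "to",
    "we", "what", "when", "where", "which", "with",
}

def effective_keywords(query):
    """Return the keywords that survive stop-word removal."""
    kept = []
    for token in query.lower().split():
        if token.startswith("+"):
            kept.append(token[1:])   # forced with +: keep even if it is a stop word
        elif token not in STOP_WORDS:
            kept.append(token)
    return kept

print(effective_keywords("pirates of the caribbean"))
print(effective_keywords("pirates +of +the caribbean"))
```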

Dealing with the 10 Word Limit

• Omit the stop words in your search terms and you’ll probably never run into the 10 word limit.

• Another way around the limit is to use wildcards.

Image source: http://www.alloyd.com/

Google DOES Support Wildcard Searches … Sort Of.

When you wish upon a *.

Wildcards

• Wildcards are characters, usually asterisks (*), that represent other characters.

• For example, some search engines support a technique called “stemming”
  – With stemming, you search for something like pirate* and the search engine shows you all the pages in its database that contain variants of the word pirate: pirates, pirated, etc.

• But, did you notice I said … “some search engines?”

Google and Wildcards

• Google doesn’t support stemming.
• Rather, Google offers full-word wildcards.

• For example, if you search Google for it’s +a * world, Google shows you all of the pages in its database that contain the phrase “it’s a small world” … and “it’s a nano world” … and “it’s a Linux world” … and so on.

Source: Google Hacks, p. 37

it’s +a * world

• The + before a is required because a is a stop word and would otherwise be ignored.

• Most of the hits are phrases because that’s what Google looks for first.

• Oh, and I defy you to get that song out of your head!

Image source: http://themeparksource.com/

Wildcards and the Word Limit

• Remember when I said that one way to get around the 10 word limit was to use wildcards?

• Google doesn’t count wildcards toward the limit.

• For example, Google thinks that though * mountains divide * * oceans * wide it's * small world after all is exactly 10 words long.

Source: Google Hacks, p. 19
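
You can check that count yourself with a few lines of Python. The helper `counted_words` is hypothetical; it simply skips every standalone * when counting, which is how the slide says Google treats wildcards.

```python
def counted_words(query):
    """Return the words that count toward the limit (wildcards are skipped)."""
    return [token for token in query.split() if token != "*"]

query = "though * mountains divide * * oceans * wide it's * small world after all"
words = counted_words(query)
print(len(words), "counted words out of", len(query.split()), "tokens")
```

The query has 15 tokens in all, but only 10 of them count.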

The Order of Your Keywords Matters

A me life for pirate’s?

How Google Works

• When you conduct a search at Google, it searches for
  – Phrases, then
  – Adjacency, then
  – Weights.

• Because Google searches for phrases first, the order of your keywords matters.

Image source: Google

Source: Google Hacks, p. 20-22

For Example

A search for disney fantasyland pirates yields the same number of hits as a search for fantasyland disney pirates, but the order of those hits, especially the first 10, is noticeably different.

PART TWO: Summary

• Google’s Boolean default is AND.
• Capitalization does not matter.
• Google has a hard limit of 10 keywords.
• Google ignores a BUNCH of common words.
• Google does support wildcard searches … sort of.
• The order of your keywords matters.

PART THREE: Advanced Search Operators

Beyond plusses, minuses, ANDs, ORs, quotes, and *s

How Google Finds New Pages

• Google has special programs called spiders (a.k.a. “Google bots”) that constantly search the Internet looking for new or updated Web pages.

• When a spider finds a new or updated page, it reads that entire page, reports back to Google, and then visits all of the other pages to which that new page links.

Image source: http://www.disobey.com/

“Paging Miss Muffet”

• When the spider reports back to Google, it doesn’t just tell Google the new or updated page’s URL.

• The spider also sends Google a complete copy of the entire Web page: HTML, text, images, etc.

• Google then adds that page and all of its content to Google’s cache.

So What?

• When you search Google, you’re actually searching Google’s cache of Web pages.

• And because of this, you can search for more than text or phrases in the body of a Web page.

• Google has some secret, advanced search operators that let you search specific parts of Web pages or specific types of information.

Source: Google Hacks, p. 5

Advanced Operators

Query modifiers
• daterange:
• filetype:
• inanchor:
• intext:
• intitle:
• inurl:
• site:

Alternative query types
• cache:
• link:
• related:
• info:

Other information needs
• phonebook:
• stocks:
• define:
• Google Calculator

Query Modifiers

Stuff you can add to the end of your regular searches

daterange:

• daterange: limits your search to a particular date or range of dates that a page was indexed by Google.

• daterange: only works with Julian dates, so you’ll need to find a Julian date converter online.

• The Julian date must be an integer (no decimals).

Source: Google Hacks, p. 6

daterange:start-stop

pirates daterange:2452401-2452766
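
If you would rather not hunt for a converter, a few lines of Python will do the arithmetic. This sketch assumes that the "Julian date" daterange: wants is the integer Julian day number; the helper names and the constant `JDN_OFFSET` are mine, not Google's.

```python
from datetime import date

# Offset between Python's date.toordinal() (day 1 = 0001-01-01) and the
# Julian day number for the same calendar day.
JDN_OFFSET = 1721425

def to_julian_day(d):
    """Convert a datetime.date to an integer Julian day number."""
    return d.toordinal() + JDN_OFFSET

def from_julian_day(jdn):
    """Convert an integer Julian day number back to a datetime.date."""
    return date.fromordinal(jdn - JDN_OFFSET)

def daterange_query(keywords, start, stop):
    """Build a query string in the daterange:start-stop form shown above."""
    return f"{keywords} daterange:{to_julian_day(start)}-{to_julian_day(stop)}"

print(from_julian_day(2452401), from_julian_day(2452766))
print(daterange_query("pirates", date(2002, 5, 6), date(2003, 5, 6)))
```

Run backwards, the two numbers in the example above come out as May 6, 2002 and May 6, 2003, so that search covers one year.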

filetype:

• filetype: restricts your results to files ending in ".doc" (or .xls, .ppt, etc.), and shows you only files created with the corresponding program.

• There can be no space between filetype: and the file extension.

• The “dot” in the file extension (.doc) is optional.

Source: http://www.google.com/help/faq_filetypes.html

Google’s Official Filetypes

• Adobe Portable Document Format (pdf)
• Adobe PostScript (ps)
• Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)
• Lotus WordPro (lwp)
• MacWrite (mw)
• Microsoft Excel (xls)
• Microsoft PowerPoint (ppt)
• Microsoft Word (doc)
• Microsoft Works (wks, wps, wdb)
• Microsoft Write (wri)
• Rich Text Format (rtf)
• Text (ans, txt)

Source: http://www.google.com/help/faq_filetypes.html

filetype:extension

pirates filetype:pdf
pirates -filetype:pdf

inanchor:

• Using inanchor: restricts the results to text in a page’s link anchors.

• There can be no space between inanchor: and the following word.

• You can also search for phrases. Just put your phrase in quotes.

Source: http://www.google.com/help/operators.html

Link Anchor Text?

…
<body>
<p>Pirates of the Caribbean opened March 18, 1967.</p>
<p>Please <a href="guestbook.html">sign our guestbook</a></p>
</body>
…

inanchor:terms

inanchor:guestbook
pirates -inanchor:“walt disney”
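
To see exactly which words count as link anchor text, here is a minimal Python sketch that pulls the anchor text out of a page using the standard library's `html.parser`. The class name `AnchorTextParser` is hypothetical, and the sample page is the guestbook snippet from the "Link Anchor Text?" slide.

```python
from html.parser import HTMLParser

class AnchorTextParser(HTMLParser):
    """Collect the text found between <a ...> and </a> tags."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            self.anchors.append("")      # start a new anchor-text entry

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_anchor:
            self.anchors[-1] += data

page = ('<body><p>Pirates of the Caribbean opened March 18, 1967.</p>'
        '<p>Please <a href="guestbook.html">sign our guestbook</a></p></body>')
parser = AnchorTextParser()
parser.feed(page)
anchor_texts = parser.anchors
print(anchor_texts)
```

Only "sign our guestbook" is anchor text, so inanchor:guestbook could match this page while inanchor:pirates could not.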

intext:

• intext: ignores link text, URLs, and titles, and only searches body text.

• intext: helps you find query words that are too common in URLs and links.

• There can be no space between intext: and the following word.

• You can also search for phrases. Just put your phrase in quotes.

Source: Google Hacks, p. 5

intext:terms

intext:disney
pirates -intext:“disney.com”

intitle:

• Using intitle: restricts the results to documents containing a particular word in their titles.

• There can be no space between intitle: and the following word.

• You can also search for phrases. Just put your phrase in quotes.

Source: http://www.google.com/help/operators.html

Title?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Pirates of the Caribbean</title>
</head>
<body> ...

intitle:terms

intitle:pirates
pirates -intitle:“walt disney”

A Quick Question

What would happen if I searched for intitle:walt disney (without the quotes)?

• Google would look for every page with the word walt in its title AND the word disney somewhere in its body.

• Remember, the quotes are kind of important if you want to search for phrases using intitle:
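
The difference the quotes make can be modeled in a short Python sketch. It follows the rule as this slide describes it (an intitle: term must be in the title, any other term must be in the body). The function `matches` is hypothetical, the substring test is a simplification, and `shlex.split` is used only because it keeps a quoted phrase together as one token.

```python
import shlex

def matches(title, body, query):
    """Model of the slide's rule: intitle: terms hit the title, others hit the body."""
    title, body = title.lower(), body.lower()
    for token in shlex.split(query.lower()):
        if token.startswith("intitle:"):
            if token[len("intitle:"):] not in title:
                return False
        elif token not in body:
            return False
    return True

title = "Walt Whitman: Selected Poems"
body = "One reviewer compared the collection to a Disney film."

print(matches(title, body, "intitle:walt disney"))      # walt in title, disney in body
print(matches(title, body, 'intitle:"walt disney"'))    # phrase must be in the title
```

Note that the sketch needs plain straight quotes ("), the kind you actually type into a search box.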

inurl:

• Using inurl: restricts the results to documents containing a particular word in their URLs.

• There can be no space between inurl: and the following word.

Source: http://www.google.com/help/operators.html

URL?

A URL is a uniform resource locator, a string that uses a standard syntax to identify an access protocol, location, and identifier for a file or other Internet resource.
  – http://www.disney.com/
  – http://www.google.com/
  – ftp://wuarchive.wustl.edu/
  – news:google.public.support.general

Source: http://search400.techtarget.com/newsItem/0,289139,sid3_gci850,00.html
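
Python's standard `urllib.parse.urlsplit` breaks a URL into those same pieces, which makes the "access protocol, location, and identifier" idea easy to see. This is just an illustration using the four URLs above; it says nothing about how Google itself parses URLs.

```python
from urllib.parse import urlsplit

urls = [
    "http://www.disney.com/",
    "http://www.google.com/",
    "ftp://wuarchive.wustl.edu/",
    "news:google.public.support.general",
]

# (protocol, location, identifier) for each URL
parts = [(urlsplit(u).scheme, urlsplit(u).netloc, urlsplit(u).path) for u in urls]
for scheme, netloc, path in parts:
    print(f"protocol={scheme!r} location={netloc!r} identifier={path!r}")
```

The news: URL has no location at all; the newsgroup name is the identifier.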

inurl:term

inurl:disney
pirates -inurl:disney

site:

• Using site: restricts the results to those websites in a domain.

• There can be no space between site: and the domain.

Source: http://www.google.com/help/operators.html

site:domain

pirates site:disney.com

Using site:

• You use site: in conjunction with another search term or phrase.
  – pirates site:disney.com

• You can also use site: to exclude sites.
  – pirates -site:disney.com

• You can use site: to exclude or include entire domains (and, like with filetype:, the dot is optional).
  – pirates -site:com
  – pirates site:edu
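
If you ever want to build these searches in a script, here is a minimal Python sketch. The helper names are hypothetical, and the `https://www.google.com/search?q=` address is an assumption on my part (it is not given anywhere in these slides). The only real work is URL-encoding the query, which the standard `urllib.parse.urlencode` handles.

```python
from urllib.parse import urlencode

def site_query(keywords, domain, exclude=False):
    """Build a query such as 'pirates site:disney.com' or 'pirates -site:com'."""
    prefix = "-" if exclude else ""       # plain hyphen, no space before site:
    return f"{keywords} {prefix}site:{domain}"

def google_search_url(query):
    """Turn a query string into a search URL (address and 'q' are assumptions)."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = site_query("pirates", "disney.com", exclude=True)
print(query)
print(google_search_url(query))
```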

Alternative Query Types

Stuff you can use, if you want to search without using any keywords

cache:

• Using cache: shows the version of a web page that Google has in its cache.

• There can be no space between cache: and the URL.

• You can use cache: in conjunction with a keyword or phrase, but few do.

Source: http://www.google.com/help/operators.html

cache:URL

cache:disney.com

link:

• Using link: restricts the results to those web pages that have links to the specified URL.

• There can be no space between link: and the URL.

Source: http://www.google.com/help/operators.html

link:URL

link:disney.com

related:

• Using related: lists web pages that are "similar" to a specified web page.

• There can be no space between related: and the URL.

Source: http://www.google.com/help/operators.html

related:URL

related:disney.com

info:

• Using info: presents some information that Google has about a particular web page.

• There can be no space between info: and the URL.

Source: http://www.google.com/help/operators.html

info:URL

info:disney.com
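
The four alternative query types share one shape: an operator, a colon, and a URL with no space in between. A minimal sketch of that rule in Python, assuming a hypothetical helper named `url_query` (it only builds the text you would type; it does not contact Google):

```python
URL_OPERATORS = ("cache", "link", "related", "info")


def url_query(operator, url):
    """Return an operator:URL query such as 'link:disney.com' (hypothetical helper)."""
    if operator not in URL_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    target = url.strip()
    if not target or any(ch.isspace() for ch in target):
        raise ValueError("the URL must be non-empty and contain no spaces")
    # Join with a bare colon: no space between the operator and the URL.
    return f"{operator}:{target}"


if __name__ == "__main__":
    for op in URL_OPERATORS:
        print(url_query(op, "disney.com"))
```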

Other Information Needs

Did you know that Google can look up phone numbers, stock quotes, dictionary definitions, and … even the answers to math problems?

phonebook:

• There are actually three different Google phonebook operators.

• Using phonebook: searches the entire Google phonebook.

• Using rphonebook: searches residential listings only.

• Using bphonebook: searches business listings only.

Source: http://www.google.com/help/operators.html

How to Use the Phonebook

• first name (or first initial), last name, city (state is optional)

• first name (or first initial), last name, state

• first name (or first initial), last name, area code

• first name (or first initial), last name, zip code

• phone number, including area code

• last name, city, state

• last name, zip code

phonebook:Data

phonebook:disneyland ca

phonebook:(714) 956-6425

stocks:

• If you begin a query with stocks:, Google will treat the rest of the query terms as stock ticker symbols and will link to a Yahoo Finance page showing stock information for those symbols.

• Go crazy with the spaces – Google ignores them!

Source: http://www.google.com/help/operators.html

stocks:Symbol1 Symbol2 …

stocks: msft

stocks: aapl intc msft macr
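
Because the spacing after stocks: does not matter, a query can be assembled from any list of ticker symbols. The Python sketch below shows one way to do that; `stocks_query` is a hypothetical helper name, and it only builds the query text.

```python
def stocks_query(symbols):
    """Build a stocks: query from ticker symbols (hypothetical helper)."""
    # Accept either a list of symbols or one whitespace-separated string.
    if isinstance(symbols, str):
        symbols = symbols.split()
    cleaned = [s.strip() for s in symbols if s.strip()]
    if not cleaned:
        raise ValueError("at least one ticker symbol is required")
    # Extra spaces are harmless, so a single space between symbols is enough.
    return "stocks: " + " ".join(cleaned)


if __name__ == "__main__":
    print(stocks_query("msft"))
    print(stocks_query(["aapl", "intc", "msft", "macr"]))
```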

define:

• If you begin a query with define: Google will display definitions for the word or phrase that follows, if definitions are available.

• There can be no space between define: and the word or phrase you wish to define.

• You don’t need quotes around your phrases.

Source: http://www.google.com/help/features.html#definitions

define:term

define:pirate

define:barbary coast

Google Calculator

• Simply key in what you'd like Google to compute (like 2+2) and then hit enter.

• Google’s Calculator can solve math problems involving basic arithmetic, more complicated math, units of measure and conversions, and physical constants.

Source: http://www.google.com/help/features.html#calculator

3+44

56*78

1.21 GW / 88 mph

100 miles in kilometers

sine(30 degrees)

G*(6e24 kg)/(4000 miles)^2

0x7d3 in roman numerals

For instructions on how to use the Google Calculator, see http://www.google.com/help/calculator.html
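
To see what three of the sample queries are asking for, the Python sketch below works them out by hand. It is only a check on the arithmetic, not a description of how Google's Calculator is implemented; the constant `MILES_TO_KM` and the helper `to_roman` are names made up for this example.

```python
import math

MILES_TO_KM = 1.609344  # exact by definition of the international mile


def to_roman(number):
    """Convert a positive integer (1 to 3999) to Roman numerals."""
    if not 0 < number < 4000:
        raise ValueError("supported range is 1 to 3999")
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    result = []
    for value, symbol in numerals:
        # Take as many copies of this numeral as fit, then carry the remainder.
        count, number = divmod(number, value)
        result.append(symbol * count)
    return "".join(result)


if __name__ == "__main__":
    print(100 * MILES_TO_KM)           # 100 miles in kilometers
    print(math.sin(math.radians(30)))  # sine(30 degrees)
    print(to_roman(0x7d3))             # 0x7d3 in roman numerals
```

Here 0x7d3 is the hexadecimal way of writing 2003, so the last line prints MMIII.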

PART THREE: Advanced Operators

SUMMARY

Query modifiers
• daterange:
• filetype:
• inanchor:
• intext:
• intitle:
• inurl:
• site:

Alternative query types
• cache:
• link:
• related:
• info:

Other information needs
• phonebook:
• stocks:
• define:
• Google Calculator

The Last Part: Google Resources

Where to get more information

http://www.google.com/help/

• Google Help Central

• Free guides and FAQs that tell you about Web searching in general and Google’s features in particular.

Google Support Newsgroup

• Google has a free Usenet newsgroup: google.public.support.general

• You may be able to access this newsgroup through your Usenet reader.

Google Support Newsgroup

• You can also search for the google.public.support.general newsgroup at news.google.com.

• The easiest way to access the newsgroup is to just click on the “user support discussion forum” link at the top of the Google Help Central page.

Google Hacks

• Google Hacks by Calishain and Dornfest

• US$24.95 (ISBN 0596004478)

• This is an extremely advanced book written for Perl programmers, NOT you and me.

• But I still highly recommend it.

Image source: Amazon.com

Our Goals

• Learn how Google really works.

• Discover some Google secrets no one ever tells you.

• Play around with some of Google’s advanced search operators.

• Find out where to get more Google-related help and information.

• DO ALL OF THIS IN ENGLISH!

Fair Use Disclaimer

This presentation was created following the Fair Use Guidelines for Educational Multimedia.

Certain materials are included under the Fair Use exemption of the U.S. Copyright Law. Further use of these materials and this presentation is restricted.

GOOGLE 201:

‘Advanced Googolgy’

a presentation by Patrick Douglas Crispen