Content, Keywords, and Duplicate Content


Transcript of Content, Keywords, and Duplicate Content

Page 1: Content, Keywords, and Duplicate Content

Content, Keywords, and Duplicate Content

Page 2: Content, Keywords, and Duplicate Content

Two major roles for Content

1. Spider food
– Search engines will find key phrases in your content
– Increases your chances of ranking for those phrases
– Provides exposure to the long tail of search
2. Attract links
– People link to great stuff, so …
– Give them something to link to!

Page 3: Content, Keywords, and Duplicate Content

Spider Food Example

• Page Title: New York Used Cars
• Content: “We offer used cars in Buffalo, Syracuse, Albany, White Plains, New York City, and more!”
• Creates the possibility of matching for “Albany Used Cars”
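The matching idea on this slide can be sketched in a few lines of Python. This is only an illustration, assuming a crude "every query word appears somewhere on the page" test; real search engines are far more sophisticated, and the function names here are hypothetical.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately crude stand-in for real indexing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def page_covers_query(title: str, content: str, query: str) -> bool:
    """True when every query word appears somewhere in the title or the body."""
    return tokens(query) <= (tokens(title) | tokens(content))

title = "New York Used Cars"
content = ("We offer used cars in Buffalo, Syracuse, Albany, "
           "White Plains, New York City, and more!")

# "Albany" comes from the body text, "Used Cars" from the title and body.
print(page_covers_query(title, content, "Albany Used Cars"))
# A city that is never mentioned cannot be matched this way.
print(page_covers_query(title, content, "Rochester Used Cars"))
```

Listing the cities in the body is what makes the first query matchable at all; the page title alone would not cover it.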

Page 4: Content, Keywords, and Duplicate Content

Content is for Users Too

• Can’t forget that!
• Optimizing for search engines is fine, but don’t scare users away from your site
• Also, people don’t link to sites with crappy content
• We will talk more about links later!

Page 5: Content, Keywords, and Duplicate Content

Keyword Research
• Tools for brainstorming a seed list
– Google Suggest (now integrated into Google search)
– Yahoo Assist
• Tools to check popularity of keyword searches
– Wordtracker
– Trellian’s Keyword Discovery
– Google’s Keyword Suggestion Tool
– Google Trends
– Google Insights for Search

Page 6: Content, Keywords, and Duplicate Content

Google Suggest
• Originally a separate testing-lab beta; rolled into web search in August 2008.
• Search volume can be inferred from the order of the suggestions, but no quantifiable value is given.

Page 7: Content, Keywords, and Duplicate Content

Google Suggest

• Pros
– Free!
– Data is from Google search data
– Provides live suggestions as you type
• Cons
– No quantifiable data
– Based on typing order

Page 8: Content, Keywords, and Duplicate Content

Yahoo Assist
• Part of Yahoo Search
• Kicks in when there is a delay while entering search phrases
• The typed word can match any part of the search phrase

Page 9: Content, Keywords, and Duplicate Content

Yahoo Assist

• Pros
– Free!
– Data is from Yahoo search data
– Provides live suggestions as you type
– Typed phrases can match anywhere within the suggestions
• Cons
– No quantifiable data
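The difference between suggestions "based on typing order" and suggestions that "match anywhere" can be illustrated with a small sketch. It assumes, purely for illustration, that the first behaves like a prefix match and the second like a substring match; neither function reflects how Google or Yahoo actually implement their features, and the phrase list is made up.

```python
def prefix_suggestions(typed: str, phrases: list[str]) -> list[str]:
    """Suggest only phrases that start with what has been typed so far."""
    typed = typed.lower()
    return [p for p in phrases if p.lower().startswith(typed)]

def anywhere_suggestions(typed: str, phrases: list[str]) -> list[str]:
    """Suggest phrases that contain the typed text at any position."""
    typed = typed.lower()
    return [p for p in phrases if typed in p.lower()]

# Hypothetical phrase list, not real search data.
phrases = ["used cars albany", "albany used cars", "cheap used cars"]

print(prefix_suggestions("used", phrases))
print(anywhere_suggestions("used", phrases))
```

For seed-list brainstorming, the "match anywhere" style surfaces modifiers that come before the typed word, which a prefix-only tool would miss.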

Page 10: Content, Keywords, and Duplicate Content

Wordtracker
• Enter keywords & search phrases to be expanded upon.
• Build out a project with relevant terms.
• Use for brainstorming as well as drilling down into specific phrases.
• Obtain quantifiable search numbers.
Free version: freekeywords.wordtracker.com

Page 11: Content, Keywords, and Duplicate Content

Wordtracker
• Pros
– Based on the last 130 days’ worth of searches
– Singular vs. plural, misspellings, verb tenses all separated out
– Advanced functionality: keyword “projects”, import data into Excel, synonyms, …
• Cons
– Full product requires subscription fee ($59/month or $329/year)
– Data is from a small sample of Internet searches (from the minor search engines Dogpile and MetaCrawler).
– Contains bogus data from automated searches
– No historical archives

Page 12: Content, Keywords, and Duplicate Content

Keyword Discovery
• Similar features to Wordtracker.
• Trend graphs provide a visual that goes beyond total searches.
• Various settings to refine data.
• Note: plural setting only pluralizes the last word.
Free version: www.keyworddiscovery.com/search.html

Page 13: Content, Keywords, and Duplicate Content

Keyword Discovery
• Pros
– Full year of historical archives
– Data is from a larger sample of Internet searches
– Singular vs. plural, misspellings, verb tenses all separated out
– Can segment by country
– Advanced functionality: keyword “projects”, import data into Excel, synonyms, …
• Cons
– Access to the historical data requires subscription fee ($69.95/month or $599.40/year).
– Contains bogus data from automated searches

Page 14: Content, Keywords, and Duplicate Content

Google AdWords Keyword Tool
• Enter lists of terms.
• Pull terms from a web page.
• Search volume
– Switch to Exact match
– Show the Search Volume Trends column.
Free version: adwords.google.com/select/KeywordToolExternal

Page 15: Content, Keywords, and Duplicate Content

Google AdWords Keyword Tool
• Pros
– Free!
– Accessing within Google AdWords yields more features
– Data is from a large sample of Internet searches (from Google)
– Singular vs. plural, misspellings, verb tenses
– Can segment by country (within AdWords)
– Synonyms
– Monthly & average search volumes
• Cons
– Numbers are approximations

Page 16: Content, Keywords, and Duplicate Content

Google Trends
• Provides a graphical, relative search volume comparison.
• Enter up to 5 search terms.
• Shows related news.
• Sign in to get relative ranking.
www.google.com/trends
What’s the busiest time of year for chocolate?

Page 17: Content, Keywords, and Duplicate Content

Chocolate

Page 18: Content, Keywords, and Duplicate Content

Google Trends
• Pros
– Free!
– Signing into Google account provides additional detail & features
– Data is from a large sample of Internet searches (from Google)
– Shows related news searches
– Can segment by region or sub-region
– Filter by time frame
– Spot seasonal trends
• Cons
– Numbers are purely relative to the query set
– No way to export
– Only preset data filtering
– Limited to broad, popular search phrases

Page 19: Content, Keywords, and Duplicate Content

Google Insights for Search
• Similar to Google Trends
• Additional unique features
– Compare against a category
– Geographic search volume maps
– Provides a relative index measure against all searches performed on Google over time.
www.google.com/insights/search/

Page 20: Content, Keywords, and Duplicate Content

Easy to Drill Into Regional Data

Page 21: Content, Keywords, and Duplicate Content

Google Insights for Search
• Pros
– Free!
– Signing into Google account provides additional detail & features
– Data is from a large sample of Internet searches (from Google)
– Shows related news searches
– Can segment by region & subregion
– Filter by time frame, even custom date ranges
– Export as CSV
• Cons
– Numbers are a normalized index
– Limited to broad, popular search phrases
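Why a normalized index is listed as a con can be seen in a small sketch. It assumes, for illustration only, that the index scales each series so its peak equals 100; the slides do not spell out the actual formula, and the monthly volumes below are invented.

```python
def normalized_index(volumes: list[int]) -> list[int]:
    """Scale a series so its largest value becomes 100 (assumed scheme)."""
    peak = max(volumes)
    if peak == 0:
        return [0 for _ in volumes]
    return [round(100 * v / peak) for v in volumes]

# Invented monthly search volumes for two keywords of very different size.
big_keyword = [1200, 1500, 3000, 900]
small_keyword = [12, 15, 30, 9]

print(normalized_index(big_keyword))
print(normalized_index(small_keyword))
```

Both series produce the same index values, so the index shows the shape of demand over time but says nothing about absolute search volume.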

Page 22: Content, Keywords, and Duplicate Content

Competitiveness
• Competition for that keyword should also be considered
– Calculate KEI Score (Keyword Effectiveness Indicator) = ratio of searches over number of pages in search results.
– The higher the KEI Score, the more attractive the keyword is to target (assuming it’s relevant to your business).
– Perform advanced searches to determine difficulty
  • “digital cameras”
  • intitle:“digital cameras”
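The KEI ratio defined above is simple enough to compute directly. The sketch below follows the slide's definition (searches divided by the number of pages in the search results); the keyword names and all numbers are hypothetical, and some keyword tools use a different formula under the same name.

```python
def kei_score(searches: float, competing_pages: float) -> float:
    """KEI as defined on the slide: searches / pages in the search results."""
    if competing_pages <= 0:
        raise ValueError("competing_pages must be positive")
    return searches / competing_pages

# Hypothetical figures: (monthly searches, pages returned for the phrase).
candidates = {
    "digital cameras": (5000, 2_000_000),
    "waterproof digital cameras": (800, 40_000),
}

# Rank candidates from most to least attractive by KEI.
ranked = sorted(candidates, key=lambda k: kei_score(*candidates[k]), reverse=True)
for keyword in ranked:
    print(keyword, kei_score(*candidates[keyword]))
```

With these made-up numbers the narrower phrase scores higher despite having fewer searches, because far fewer pages compete for it.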

Page 23: Content, Keywords, and Duplicate Content

Duplicate Content

Page 24: Content, Keywords, and Duplicate Content

Syndicating Exact Copies = BAD

NYTimes.com Syndicate-NYTimes.com

Page 25: Content, Keywords, and Duplicate Content

What if your search results looked like this?

Page 26: Content, Keywords, and Duplicate Content

How Search Engines Prevent It

• Identify Duplicate Content
• Pick One as a Winner
• Ignore the Rest
• Helps prevent poor quality search experiences such as that shown on the prior slide

Page 27: Content, Keywords, and Duplicate Content

Picking the Winner

• Where Google first saw the content
• Trust in the domain
• Best link graph
• Do copies link back to the original?
• Does it look like it was scraped?
• Only if it’s close, PageRank

Page 28: Content, Keywords, and Duplicate Content

Detecting Duplicate Content
• Navigation / Templates ignored
• Simple Text comparisons
• Shingles (will illustrate in a moment)
• Factor out word substitution
• Content does not have to match exactly to be duplicate
– What % makes a duplicate?
– Not published by search engines, changes over time

Page 29: Content, Keywords, and Duplicate Content

Word Substitution

• Our San Diego pizzas are the finest available. We provide high quality San Diego pizzas to …

• Our San Jose pizzas are the finest available. We provide high quality San Jose pizzas to …

• This is still duplicate content!
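One way to picture "factoring out word substitution" is to replace the swapped terms with a placeholder before comparing. The sketch below is only an assumption about how such a check might work, not a description of any search engine's method; the place list and function name are hypothetical.

```python
import re

# Hypothetical list of interchangeable terms to factor out before comparing.
PLACES = ["San Diego", "San Jose"]

def factor_out_places(text: str, places: list[str] = PLACES) -> str:
    """Replace every known place name with a neutral placeholder."""
    pattern = "|".join(re.escape(p) for p in places)
    return re.sub(pattern, "<PLACE>", text)

page_a = ("Our San Diego pizzas are the finest available. "
          "We provide high quality San Diego pizzas to")
page_b = ("Our San Jose pizzas are the finest available. "
          "We provide high quality San Jose pizzas to")

# After factoring out the city names, the two pages are textually identical.
print(factor_out_places(page_a) == factor_out_places(page_b))
```

Swapping a city name into a template therefore does nothing to make the pages distinct once the substitution is normalized away.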

Page 30: Content, Keywords, and Duplicate Content

Reordering Content Does not Help

Page 31: Content, Keywords, and Duplicate Content

This is the Same Page!

Page 32: Content, Keywords, and Duplicate Content

Text Based Example

The clueless fool decided to make the best of it by going to find …

However, he kept bumping his head into the door, which led to a severe lowering of his already limited intellect …

Page 33: Content, Keywords, and Duplicate Content

And this is the Same Content!

However, he kept bumping his head into the door, which led to a severe lowering of his already limited intellect …

The clueless fool decided to make the best of it by going to find …
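Shingles explain why reordering does not help. A minimal sketch, assuming word-level shingles of length 3 compared with Jaccard similarity (both are illustrative choices, since search engines do not publish their parameters):

```python
import re

def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """All overlapping runs of k consecutive words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: set, b: set) -> float:
    """Shared shingles divided by all distinct shingles in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

p1 = "The clueless fool decided to make the best of it by going to find"
p2 = ("However, he kept bumping his head into the door, which led to "
      "a severe lowering of his already limited intellect")

original = p1 + " " + p2
reordered = p2 + " " + p1

# Only the few shingles that span the paragraph boundary differ.
print(jaccard(shingles(original), shingles(reordered)))
```

Because each paragraph keeps its internal word order, nearly every shingle survives the swap and the similarity stays high.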

Page 34: Content, Keywords, and Duplicate Content

How Dupe Content is Created
• Faceted Navigation (sort by price, sort by color …)
• Session IDs on URLs ?id=890889089
• Different URLs resolving to the same content:
– http://www.domain.com/content17
– http://domain.com/content17
– https://www.domain.com/content17
• Affiliate programs:
– http://www.domain.com/content17?afid=stone
• Syndicating content
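The URL variants listed above can be collapsed to a single form with a small normalizer. This is a sketch under stated assumptions: the preferred scheme and host are arbitrary choices, and treating `id` and `afid` as parameters that never change the content is an assumption taken from the slide's examples.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed to carry no content: the session and affiliate parameters above.
IGNORED_PARAMS = {"id", "afid"}

def canonical_url(url: str,
                  preferred_scheme: str = "http",
                  preferred_host: str = "www.domain.com") -> str:
    """Map scheme, www/non-www, and tracking-parameter variants to one URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        host = preferred_host
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in IGNORED_PARAMS]
    return urlunsplit((preferred_scheme, host, parts.path, urlencode(kept), ""))

variants = [
    "http://www.domain.com/content17",
    "http://domain.com/content17",
    "https://www.domain.com/content17",
    "http://www.domain.com/content17?afid=stone",
]
print({canonical_url(u) for u in variants})
```

All four variants reduce to one URL, which is the version a redirect or canonical tag would point to.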

Page 35: Content, Keywords, and Duplicate Content

Costs of Duplicate Content
• Wasted Crawl Budget
– Search engines limit how much they will crawl a site during a given day
– Time spent crawling dupes is wasted
– Valuable (non-dupe) pages may not get crawled
• Wasted PageRank / link juice
– Links to dupes provide no value

Page 36: Content, Keywords, and Duplicate Content

Solutions
• In priority order:
1. Simply eliminate it
2. 301 redirect un-needed copies to preferred version
3. Canonical tag
   • <link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
4. Use robots.txt to prevent crawling of the dupes
• If syndicating content
1. Syndicate content not published on your site
2. Differentiate content via added content (e.g. UGC)
3. Include a link back to the original article
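For the robots.txt option, it helps to check that a rule blocks the duplicates without blocking the preferred page. The sketch below uses Python's standard `urllib.robotparser`; the `/print/` directory is a hypothetical location for duplicate copies, not something taken from the slides.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: duplicate printer-friendly copies live under /print/.
robots_txt = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

def crawlable(url: str) -> bool:
    """Whether a generic crawler may fetch the URL under the rules above."""
    return parser.can_fetch("*", url)

# The duplicate copy is blocked; the preferred version stays crawlable.
print(crawlable("http://www.example.com/print/product.php?item=swedish-fish"))
print(crawlable("http://www.example.com/product.php?item=swedish-fish"))
```

Running a check like this before deploying a rule guards against accidentally disallowing the page you want indexed.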

Page 37: Content, Keywords, and Duplicate Content

Watch for “Pseudo-Dupes”

• This is content on your site which competes for the same keywords

• Once again, the search engines will pick one, and ignore the others!

Page 38: Content, Keywords, and Duplicate Content

Duplicate Title Tags
• Check for duplication
– Use special queries with Google to find duplication.
– Over 9,000 duplicates of this title alone … what does it say to Google?
• Purely duplicate titles
• Canonicalization
• Parameters & URL bloat
• site:domain.com intitle:"title"
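If you already have a crawl of your own site, duplicate titles can also be found offline. A minimal sketch, assuming the crawl is available as a mapping of URL to title; the URLs and titles below are invented.

```python
from collections import defaultdict

def duplicate_titles(pages: dict[str, str]) -> dict[str, list[str]]:
    """Group URLs by normalized title and keep titles used more than once."""
    by_title: dict[str, list[str]] = defaultdict(list)
    for url, title in pages.items():
        by_title[title.strip().lower()].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

# Invented crawl data: URL -> contents of the <title> tag.
pages = {
    "http://www.domain.com/content17": "Widgets for Sale",
    "http://www.domain.com/content17?afid=stone": "Widgets for Sale",
    "http://www.domain.com/content18": "Widget Repair Guide",
}

for title, urls in duplicate_titles(pages).items():
    print(title, len(urls), urls)
```

Titles that show up under several URLs usually point back to the canonicalization and parameter issues listed on this slide.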

Page 39: Content, Keywords, and Duplicate Content

Google Helps Find Duplication

• Google’s Webmaster Central
– Alerted to duplicate titles

Page 40: Content, Keywords, and Duplicate Content

Thank You!
Eric [email protected]
@stonetemple
(508) 485-7751

http://www.stonetemple.com/blog
http://searchengineland.com/author/eric-enge
http://searchenginewatch.com/sew_author_fullarchive&author=3624376
http://www.seomoz.org/users/view/18040
http://www.instantetraining.com/
http://artofseobook.com