Content, Keywords, and Duplicate Content

Post on 21-Jan-2016

Transcript of Content, Keywords, and Duplicate Content

Content, Keywords, and Duplicate Content

Two major roles for Content

1. Spider food
– Search engines will find key phrases in your content
– Increases your chances of ranking for those phrases
– Provides exposure to the long tail of search
2. Attract links
– People link to great stuff, so …
– Give them something to link to!

Spider Food Example

• Page Title: New York Used Cars
• Content: “We offer used cars in Buffalo, Syracuse, Albany, White Plains, New York City, and more!”
• Creates the possibility of matching for “Albany Used Cars”
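As a rough illustration of why this works, the sketch below treats a page as a possible match when every query word appears somewhere in its title or content. This is a deliberate oversimplification (real search engines weigh many more signals), and `tokenize` and `page_matches` are hypothetical helpers, not any search engine's API.

```python
import re

def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def page_matches(query, title, content):
    """True when every query word occurs in the title or the content."""
    page_words = tokenize(title) | tokenize(content)
    return tokenize(query) <= page_words

title = "New York Used Cars"
content = ("We offer used cars in Buffalo, Syracuse, Albany, "
           "White Plains, New York City, and more!")

# "Albany" comes from the content, "Used Cars" from the title.
print(page_matches("Albany Used Cars", title, content))
# "Boston" appears nowhere on the page, so there is nothing to match.
print(page_matches("Boston Used Cars", title, content))
```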

Content is for Users Too

• Can’t forget that!
• Optimizing for search engines is fine, but don’t scare users away from your site
• Also, people don’t link to sites with crappy content

• We will talk more about links later!

Keyword Research

• Tools for brainstorming a seed list
– Google Suggest (now integrated into Google search)
– Yahoo Assist
• Tools to check popularity of keyword searches
– Wordtracker
– Trellian’s Keyword Discovery
– Google’s Keyword Suggestion Tool
– Google Trends
– Google Insights for Search

Google Suggest

• Originally a separate testing lab in beta; rolled into web search in August 2008.
• Search volume can be inferred from the order of the suggestions, but no quantifiable value is given.

Google Suggest

• Pros
– Free!
– Data is from Google search data
– Provides live suggestions as you type
• Cons
– No quantifiable data
– Based on typing order

Yahoo Assist

• Part of Yahoo Search
• Kicks in when there is a delay while entering search phrases
• Typed word can match any part of the search phrase

Yahoo Assist

• Pros
– Free!
– Data is from Yahoo search data
– Provides live suggestions as you type
– Typed phrases can match anywhere within the suggestions
• Cons
– No quantifiable data

Wordtracker

• Enter keywords and search phrases to be expanded upon.
• Build out a project with relevant terms.
• Use for brainstorming as well as drilling down into specific phrases.
• Obtain quantifiable search numbers.

Free version: freekeywords.wordtracker.com

Wordtracker

• Pros
– Based on the last 130 days’ worth of searches
– Singular vs. plural, misspellings, verb tenses all separated out
– Advanced functionality: keyword “projects”, import data into Excel, synonyms, …
• Cons
– Full product requires subscription fee ($59/month or $329/year)
– Data is from a small sample of Internet searches (from the minor search engines Dogpile and MetaCrawler)
– Contains bogus data from automated searches
– No historical archives

Keyword Discovery

• Similar features to Wordtracker.
• Trend graphs provide a visual that goes beyond total searches.
• Various settings to refine data.
• Note: plural setting only pluralizes the last word.

Free version: www.keyworddiscovery.com/search.html

Keyword Discovery

• Pros
– Full year of historical archives
– Data is from a larger sample of Internet searches
– Singular vs. plural, misspellings, verb tenses all separated out
– Can segment by country
– Advanced functionality: keyword “projects”, import data into Excel, synonyms, …
• Cons
– Access to the historical data requires subscription fee ($69.95/month or $599.40/year)
– Contains bogus data from automated searches

Google AdWords Keyword Tool

• Enter lists of terms.
• Pull terms from a web page.
• Search volume
– Switch to Exact match
– Show Search Volume Trends column.

Free version: adwords.google.com/select/KeywordToolExternal

Google AdWords Keyword Tool

• Pros
– Free!
– Accessing within Google AdWords yields more features
– Data is from a large sample of Internet searches (from Google)
– Singular vs. plural, misspellings, verb tenses
– Can segment by country (within AdWords)
– Synonyms
– Monthly & average search volumes
• Cons
– Numbers are approximations

Google Trends

• Provides a graphical, relative search volume comparison.
• Enter up to 5 search terms.
• Shows related news.
• Sign in to get relative ranking.

www.google.com/trends

What’s the busiest time of year for chocolate?

[Google Trends chart for the search term “Chocolate”]

Google Trends

• Pros
– Free!
– Signing into Google account provides additional detail & features
– Data is from a large sample of Internet searches (from Google)
– Shows related news searches
– Can segment by region or sub-region
– Filter by time frame
– Spot seasonal trends
• Cons
– Numbers are purely relative to the query set
– No way to export
– Only preset data filtering
– Limited to broad, popular search phrases

Google Insights for Search

• Similar to Google Trends
• Additional unique features
– Compare against a category
– Geographic search volume maps
– Provides a relative index measure against all searches performed on Google over time.

www.google.com/insights/search/

Easy to Drill Into Regional Data

Google Insights for Search

• Pros
– Free!
– Signing into Google account provides additional detail & features
– Data is from a large sample of Internet searches (from Google)
– Shows related news searches
– Can segment by region & subregion
– Filter by time frame, even custom date ranges
– Export as CSV
• Cons
– Numbers are a normalized index
– Limited to broad, popular search phrases

Competitiveness

• Competition for that keyword should also be considered
– Calculate KEI Score (Keyword Effectiveness Indicator) = ratio of searches over number of pages in search results.
– The higher the KEI Score, the more attractive the keyword is to target (assuming it’s relevant to your business).
– Perform advanced searches to determine difficulty
  • “digital cameras”
  • intitle:“digital cameras”
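The KEI ratio described above can be sketched in a few lines. The search volumes and result counts below are made-up numbers for illustration only, and `kei_score` is a hypothetical helper that follows the simple ratio given on the slide; other formulations may weight the terms differently.

```python
def kei_score(searches, pages_in_results):
    """KEI as defined above: searches divided by pages in the search results."""
    if pages_in_results <= 0:
        raise ValueError("pages_in_results must be positive")
    return searches / pages_in_results

# keyword: (searches, pages in search results) -- invented figures
candidates = {
    "digital cameras": (500_000, 90_000_000),
    "waterproof digital cameras": (12_000, 400_000),
}

# Highest KEI first: fewer searches can still win if competition is much lower.
ranked = sorted(candidates, key=lambda kw: kei_score(*candidates[kw]), reverse=True)
for keyword in ranked:
    print(keyword, kei_score(*candidates[keyword]))
```

With these invented figures the narrower phrase ranks first, because its competition shrinks faster than its search volume.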

Duplicate Content

Syndicating Exact Copies = BAD

NYTimes.com Syndicate-NYTimes.com

What if your search results looked like this?

How Search Engines Prevent It

• Identify Duplicate Content
• Pick One as a Winner
• Ignore the Rest

• Helps prevent poor quality search experiences such as that shown on the prior slide

Picking the Winner

• Where Google first saw the content
• Trust in the domain
• Best link graph
• Do copies link back to the original?
• Does it look like it was scraped?
• Only if it’s close, PageRank

Detecting Duplicate Content

• Navigation / Templates ignored
• Simple Text comparisons
• Shingles (will illustrate in a moment)
• Factor out word substitution
• Content does not have to match exactly to be duplicate
– What % makes a duplicate?
– Not published by search engines, changes over time
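One common way to compare pages with shingles is to break each text into overlapping runs of k words and measure how many runs the two texts share. The sketch below assumes 3-word shingles and Jaccard similarity; search engines do not publish their actual method or thresholds, so treat both choices, and the sample sentences, as illustrative assumptions.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

original = "the quick brown fox jumps over the lazy dog near the river bank"
near_copy = "the quick brown fox leaps over the lazy dog near the river bank"
unrelated = "search engines limit how much they will crawl a site each day"

sim_near = jaccard(shingles(original), shingles(near_copy))
sim_unrelated = jaccard(shingles(original), shingles(unrelated))
print(sim_near, sim_unrelated)
```

Changing one word only disturbs the few shingles that contain it, so a near copy still scores far higher than unrelated text.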

Word Substitution

• Our San Diego pizzas are the finest available. We provide high quality San Diego pizzas to …

• Our San Jose pizzas are the finest available. We provide high quality San Jose pizzas to …

• This is still duplicate content!
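To see why swapping a city name does not help, the following sketch compares the two pizza sentences word by word using Python's standard `difflib`. The 0.8 threshold is an arbitrary value chosen for this example, not a figure published by any search engine.

```python
import re
from difflib import SequenceMatcher

def words(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def word_similarity(a, b):
    """Similarity of two texts as word sequences, from 0.0 to 1.0."""
    return SequenceMatcher(None, words(a), words(b)).ratio()

san_diego = ("Our San Diego pizzas are the finest available. "
             "We provide high quality San Diego pizzas to")
san_jose = ("Our San Jose pizzas are the finest available. "
            "We provide high quality San Jose pizzas to")

DUPLICATE_THRESHOLD = 0.8  # arbitrary cut-off for this sketch

score = word_similarity(san_diego, san_jose)
print(score, score >= DUPLICATE_THRESHOLD)
```

Only two words out of sixteen differ, so the two versions remain almost identical as word sequences.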

Reordering Content Does not Help

This is the Same Page!

Text Based Example

The clueless fool decided to make the best of it by going to find …

However, he kept bumping his head into the door, which led to a severe lowering of his already limited intellect …

And this is the Same Content!

However, he kept bumping his head into the door, which led to a severe lowering of his already limited intellect …

The clueless fool decided to make the best of it by going to find …
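A minimal sketch of why reordering does not hide duplication: if each block of text is fingerprinted on its own and the fingerprints are compared as a set, the order of the blocks stops mattering. Splitting on blank lines and hashing with SHA-1 are assumptions made for this example, not a description of how any search engine works.

```python
import hashlib
import re

def block_fingerprints(page_text):
    """Hash each non-empty block after normalizing case and whitespace."""
    prints = set()
    for block in re.split(r"\n\s*\n", page_text):
        normalized = " ".join(block.lower().split())
        if normalized:
            prints.add(hashlib.sha1(normalized.encode("utf-8")).hexdigest())
    return prints

fool = "The clueless fool decided to make the best of it by going to find"
door = ("However, he kept bumping his head into the door, which led to "
        "a severe lowering of his already limited intellect")

page_a = fool + "\n\n" + door
page_b = door + "\n\n" + fool  # same blocks, opposite order

# Sets ignore order, so both pages produce the same fingerprints.
print(block_fingerprints(page_a) == block_fingerprints(page_b))
```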

How Dupe Content is Created

• Faceted Navigation (sort by price, sort by color …)
• Session IDs on URLs ?id=890889089
• Different URLs resolving to the same content:
– http://www.domain.com/content17
– http://domain.com/content17
– https://www.domain.com/content17
• Affiliate programs:
– http://www.domain.com/content17?afid=stone

• Syndicating content
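The URL variants listed above can be collapsed onto one preferred form before comparing or linking. This is a sketch under stated assumptions: the preferred form is taken to be `http://www.` plus the domain, and `id` and `afid` are assumed to be the only parameters that do not change the page content. A real site would need its own list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

PREFERRED_SCHEME = "http"        # assumed preferred scheme
IGNORED_PARAMS = {"id", "afid"}  # assumed session and affiliate parameters

def canonical_url(url):
    """Map URL variants that serve the same content onto one preferred URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    # Keep only the parameters that actually change the content.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in IGNORED_PARAMS]
    return urlunsplit((PREFERRED_SCHEME, host, parts.path, urlencode(kept), ""))

variants = [
    "http://www.domain.com/content17",
    "http://domain.com/content17",
    "https://www.domain.com/content17",
    "http://www.domain.com/content17?afid=stone",
    "http://www.domain.com/content17?id=890889089",
]
unique = {canonical_url(u) for u in variants}
print(unique)
```

All five variants reduce to a single URL, which is the one that redirects and canonical tags should point at.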

Costs of Duplicate Content

• Wasted Crawl Budget
– Search engines limit how much they will crawl a site during a given day
– Time spent crawling dupes is wasted
– Valuable (non-dupe) pages may not get crawled
• Wasted PageRank / link juice
– Links to dupes provide no value

Solutions

• In priority order:
1. Simply eliminate it
2. 301 redirect un-needed copies to preferred version
3. Canonical tag
  • <link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
4. Use robots.txt to prevent crawling of the dupes
• If syndicating content
1. Syndicate content not published on your site
2. Differentiate content via added content (e.g. UGC)
3. Include a link back to the original article
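Option 2 in the priority list, the 301 redirect, might look like the following WSGI middleware. It is a minimal sketch that assumes `www.example.com` over plain HTTP is the preferred version; `redirect_to_preferred` and `site` are hypothetical names, and most sites would configure this in the web server rather than in application code.

```python
PREFERRED_HOST = "www.example.com"  # assumed preferred hostname

def redirect_to_preferred(app):
    """WSGI middleware: 301 any request for another host to PREFERRED_HOST."""
    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        if host != PREFERRED_HOST:
            # Rebuild the same path and query on the preferred host.
            location = "http://" + PREFERRED_HOST + environ.get("PATH_INFO", "/")
            query = environ.get("QUERY_STRING", "")
            if query:
                location += "?" + query
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        return app(environ, start_response)
    return middleware

def site(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"product page"]

application = redirect_to_preferred(site)
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, application)` can serve `application`.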

Watch for “Pseudo-Dupes”

• This is content on your site which competes for the same keywords

• Once again, the search engines will pick one, and ignore the others!

Duplicate Title Tags

• Check for duplication
– Use special queries with Google to find duplication.
– Over 9,000 duplicates of this title alone … what does it say to Google?
• Purely duplicate titles
• Canonicalization
• Parameters & URL bloat
• site:domain.com intitle:"title"
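If you already have the HTML of your own pages, the same check can be run locally. The sketch below groups pages by their `<title>` text using only the standard library; the `pages` dictionary is invented sample data, and fetching the pages is left out.

```python
from collections import defaultdict
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def extract_title(html):
    """Return the page title with whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join(parser.title.split())

def duplicate_titles(pages):
    """Given {url: html}, return {title: [urls]} for titles used more than once."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html).lower()].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

pages = {
    "/cars?page=1": "<html><head><title>Used Cars</title></head></html>",
    "/cars?page=2": "<html><head><title>Used  Cars</title></head></html>",
    "/about": "<html><head><title>About Us</title></head></html>",
}
print(duplicate_titles(pages))
```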

Google Helps Find Duplication

• Google’s Webmaster Central
– Alerted to duplicate titles

Thank You!

Eric Enge
eenge@stonetemple.com
@stonetemple
(508) 485-7751

http://www.stonetemple.com/blog
http://searchengineland.com/author/eric-enge
http://searchenginewatch.com/sew_author_fullarchive&author=3624376
http://www.seomoz.org/users/view/18040
http://www.instantetraining.com/
http://artofseobook.com