Duplicate Content - SES London

Identifying, Removing & Preventing Duplicate Content @SamuelCrocker London | 20–24 Feb, 2012 | #seslondon

A look at common causes of duplicate content on websites and how to address them.

Transcript of Duplicate Content - SES London

Page 1: Duplicate Content - SES London

Identifying, Removing & Preventing Duplicate Content

@SamuelCrocker

London | 20–24 Feb, 2012 | #seslondon

Page 2: Duplicate Content - SES London

Introduction

• Personally responsible for delivery for enterprise level clients in travel, retail/ecommerce, restaurants, entertainment, etc.

• Dealt with some really large, really ugly sites suffering from a number of these issues

Page 3: Duplicate Content - SES London

Before we begin…

Page 4: Duplicate Content - SES London

Single Site Issues

Page 5: Duplicate Content - SES London

Old Architecture & Hoarders*

*The worst offenders

Page 6: Duplicate Content - SES London

The First Step is to Clean House

Page 7: Duplicate Content - SES London

Target Suspected Duplicate Content First

Unknown Properties

Known Properties

Page 8: Duplicate Content - SES London

An Idealistic Approach*

Step 1: Identify duplicate content

*In my experience a short- to mid-term “fix” only makes these issues worse and more expensive in the long term.

Page 10: Duplicate Content - SES London

Followed by the Delivery Part…

Step 1: Identify duplicate content

Step 2: Create a sensible I/A

Step 3: Create mapping and rewrite rules*

*You’re going to need a lot of time for this: which page currently ranks, which page converts better, etc.
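
A minimal sketch of what Step 3 could look like in practice, assuming the old-to-new mapping is held as a simple dictionary and the site runs on Apache with mod_alias (`Redirect 301`). The paths and the `build_redirect_rules` helper are hypothetical; deciding which page wins (ranking, conversions) still has to be done by hand first.

```python
def build_redirect_rules(mapping):
    """Turn an old-path -> new-path mapping into Apache 'Redirect 301' lines."""
    rules = []
    for old_path, new_path in sorted(mapping.items()):
        if old_path == new_path:
            continue  # the page keeps its URL, so no redirect is needed
        rules.append(f"Redirect 301 {old_path} {new_path}")
    return rules


# Hypothetical mapping: two duplicate product pages fold into one target URL.
url_mapping = {
    "/products/old-red-widget.html": "/widgets/red",
    "/products/red-widget-copy.html": "/widgets/red",
    "/widgets/blue": "/widgets/blue",
}

rules = build_redirect_rules(url_mapping)
for rule in rules:
    print(rule)
```

Keeping the mapping as data and generating the rules from it makes the list easier to review with the client before anything goes live.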

Page 11: Duplicate Content - SES London

Process is Important

Step 1: Identify duplicate content

Step 2: Create a sensible I/A

Step 3: Create mapping and rewrite rules

Step 4: Create a process and STICK TO IT*

*Obviously processes need to evolve, but don’t forget how quickly this can get out of control (again)

Page 12: Duplicate Content - SES London

That’s Cool Sam, But In The Real World We Have Budgets

“That’s true [client], but organic traffic already accounts for [x]% of your online revenue - and by my estimates, getting this problem sorted could save you [x]% in PPC spend, as well as £x per month in link building - and contribute an additional [£X] in revenue... It would also greatly improve usability.”

“Just out of curiosity, how much money did you spend on TV and Display last year?”

Page 13: Duplicate Content - SES London

“Legacy” & Other Orphan Pages

Ask for a full DB dump?

Dig through server logs?

Ask for a proper XML sitemap

External Links?

• Yes: 301 to most relevant page, update links as possible

• No: Lower Priority
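
One way to combine the server logs and the XML sitemap, sketched under assumptions: the logs are in Apache common/combined format, the sitemap follows the sitemaps.org schema, and any path that returns 200 in the logs but is missing from the sitemap is only a candidate orphan. The function names and sample data are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Captures the requested URL and the status code from a common-format log line.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def sitemap_paths(sitemap_xml):
    """Return the set of URL paths listed in an XML sitemap string."""
    root = ET.fromstring(sitemap_xml)
    return {
        urlparse(loc.text.strip()).path
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
    }


def requested_paths(log_lines):
    """Return the set of paths that were served with a 200 status."""
    paths = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            paths.add(urlparse(match.group(1)).path)  # drops the query string
    return paths


def orphan_candidates(log_lines, sitemap_xml):
    """Paths that get traffic but are not in the sitemap."""
    return sorted(requested_paths(log_lines) - sitemap_paths(sitemap_xml))


# Hypothetical sample data.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc></url>"
    "<url><loc>http://www.example.com/widgets/red</loc></url>"
    "</urlset>"
)
sample_log = [
    '203.0.113.5 - - [20/Feb/2012:10:00:00 +0000] "GET /widgets/red HTTP/1.1" 200 5120',
    '203.0.113.9 - - [20/Feb/2012:10:00:02 +0000] "GET /legacy/summer-sale-2009.html?ref=email HTTP/1.1" 200 2048',
    '203.0.113.9 - - [20/Feb/2012:10:00:03 +0000] "GET /missing HTTP/1.1" 404 312',
]

candidates = orphan_candidates(sample_log, sample_sitemap)
print(candidates)
```

Each candidate still needs the external-links check above before deciding between a 301 and a lower priority.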

Page 14: Duplicate Content - SES London

PPC Landing Pages

rel=canonical (if actually duplicate)

noindex, follow is also possible if the page is not a true duplicate
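
A small sketch of the two options above as they would appear in a landing page's `<head>`. The helper name and the example URL are hypothetical; the tags themselves are the standard `rel="canonical"` link element and the robots meta tag.

```python
from html import escape


def head_tag_for_landing_page(is_duplicate, canonical_url=None):
    """Return the <head> tag suggested for a PPC landing page."""
    if is_duplicate:
        if not canonical_url:
            raise ValueError("a duplicate page needs the URL of its canonical version")
        # Point search engines at the organic page this one duplicates.
        return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'
    # Not a true duplicate: keep it out of the index but let links be followed.
    return '<meta name="robots" content="noindex, follow">'


duplicate_tag = head_tag_for_landing_page(True, "http://www.example.com/car-hire")
unique_tag = head_tag_for_landing_page(False)
print(duplicate_tag)
print(unique_tag)
```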

Page 15: Duplicate Content - SES London

Templated Solutions

Page 16: Duplicate Content - SES London

Why Are You Doing it?

PPC Landing Pages?

Possible.

Plausible.

Not really.

Debatable.

Page 17: Duplicate Content - SES London

Why Are You Doing it?

SEO Benefit?

*I’m not saying you can’t rank this way anymore, I’m just saying you have to work a lot harder and it’s not a great user experience

Page 18: Duplicate Content - SES London

What You Should Be Doing

Focus on the high converting pages

Create (at a minimum) some unique content.

Focus on the high volume search queries

Create (at a minimum) some unique content.

And if you know you’re never going to have unique content for some pages – noindex.

Page 19: Duplicate Content - SES London

Product Descriptions

Page 20: Duplicate Content - SES London

Product Descriptions – Big(ish) Budget

Big Budget

• Hire loads of in-house copywriters

• Outsource quality UNIQUE content

• Email newsletters for reviews (UGC)

• Incentives for reviews (UGC)

• Video product reviews + transcripts

  • For every product

  • Video Sitemaps

Page 21: Duplicate Content - SES London

Product Descriptions – Smaller Budgets

Smaller Budget

• Fiverr

• Interns

• UGC

• Listen to Ralph

Page 22: Duplicate Content - SES London

Problems from Sites You Don’t Control

Page 23: Duplicate Content - SES London

Catch & Stop Your Competitors Scraping Your Content

Not Shady Options

1. Create a script to automatically set-up Google Alerts for the first couple sentences of everything you publish.

2. Snippet test your own content from time to time.

3. Use rel=canonical (only helps if the scraper copies the entire page source, which is infrequent).

4. Use rel=author on all internal links (can also be done in content) – makes more sense for bloggers.

5. Monitor server logs for traffic spikes and carefully block IPs

6. When all else fails, DMCA
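
Options 1 and 2 both start from the opening sentences of a published piece. The sketch below, with hypothetical helper names and sample text, extracts that snippet, turns it into an exact-match search query, and checks whether it appears in the text of another page. Creating the actual alert is left out, since no particular alerts API is assumed here.

```python
import re


def opening_snippet(text, sentences=2):
    """Return the first few sentences of an article, whitespace-normalised."""
    normalized = " ".join(text.split())
    parts = re.split(r"(?<=[.!?])\s+", normalized)
    return " ".join(parts[:sentences])


def exact_match_query(snippet):
    """Wrap the snippet in quotes for an exact-match search or alert."""
    return '"' + snippet.replace('"', "") + '"'


def appears_in(snippet, other_page_text):
    """Case- and whitespace-insensitive check for the snippet on another page."""
    def norm(value):
        return " ".join(value.lower().split())
    return norm(snippet) in norm(other_page_text)


# Hypothetical article and suspected scraper copy.
article = (
    "Duplicate content wastes crawl budget. It also splits link equity.  "
    "Fixing it takes planning."
)
scraped = "BUY NOW! duplicate content wastes crawl budget.\nIt also splits link equity. More spam."

snippet = opening_snippet(article)
print(exact_match_query(snippet))
print(appears_in(snippet, scraped))
```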

Page 24: Duplicate Content - SES London

Dealing with Relatively Innocent Image Theft

Oh look, link targets!

Page 25: Duplicate Content - SES London

Less Innocent Image Theft/Hotlinking?

Page 26: Duplicate Content - SES London

Mobile Sites

Page 27: Duplicate Content - SES London

The Single URL SEO Camp

Champions

“Mobile copies of websites seem to me to be more likely to cause duplicate content issues, technical challenges, waste engineering resources and draw away attention from real mobile opportunities than to earn slightly higher rankings in mobile searches.”

- Rand Fishkin, 2011

Page 28: Duplicate Content - SES London

The Mobile Site Camp

Champions

“In my experience, duplicate content doesn’t apply to the mobile paradigm.”

- Bryson Meunier, 2012

Source: http://bit.ly/xwHmeW

Page 29: Duplicate Content - SES London

A Mobile Specific Experience is Preferable

• A mobile specific experience is *usually* preferable based on the data I have seen (conversions, rankings, keyword targeting, etc.).

Source: Morgan Stanley

Page 30: Duplicate Content - SES London

BUT, a Separate Mobile Site Should Serve a Purpose

Local monthly searches

Term          Mobile    Desktop
Car hire      8100      90500
Car rental    4400      33100

Desktop: Car hire is 2.73x more searched than car rental

Mobile: Car hire is 1.8x more searched than car rental

Target both terms on homepage. Focus on car hire.

In which case duplicate content shouldn’t be an issue anyhow.
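
The ratios quoted above follow directly from the search volumes. A quick check, using only the figures from the slide (the dictionary layout and function name are just for illustration):

```python
# Local monthly search volumes from the slide.
searches = {
    "car hire": {"mobile": 8100, "desktop": 90500},
    "car rental": {"mobile": 4400, "desktop": 33100},
}


def hire_to_rental_ratio(device):
    """How many times more often 'car hire' is searched than 'car rental'."""
    return searches["car hire"][device] / searches["car rental"][device]


desktop_ratio = hire_to_rental_ratio("desktop")
mobile_ratio = hire_to_rental_ratio("mobile")
print(f"Desktop: {desktop_ratio:.2f}x, Mobile: {mobile_ratio:.2f}x")
```

The gap between the two terms is narrower on mobile than on desktop, which is the kind of difference that can justify targeting mobile users separately.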

Page 31: Duplicate Content - SES London

You Can’t Ignore Mobile, But Use Resource Wisely

• If you can achieve a better experience using style sheets and do not have the resource or need to target mobile users differently (with your content and/or UX) then a single URL solution will make your life easier.

Page 32: Duplicate Content - SES London

Problems With International/Multi-lingual Domains

Page 33: Duplicate Content - SES London

Translation & Other Issues from International Domains

Bigger Budget

• Localise – do not translate

• Never machine translate

• Be wary of your local market resources (check Wikipedia)

• ccTLDs for all markets (personal preference)

• Be very careful with your geotargeting settings in Webmaster Tools

• Unique servers/physical addresses (ideally) for each country

Page 34: Duplicate Content - SES London

Translation & Other Issues from International Domains

Smaller Budget

• Manual human translation preferred, even if direct translation.

• If you must auto-translate, block non-unique pages with robots.txt

• Make sure you try to adhere to one language per URL

• Hreflang: use with caution.

  • Perhaps most appropriate for GB and US situations.
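
For the GB and US case mentioned above, a sketch of the hreflang annotations each version of a page would carry in its `<head>`. The helper name and URLs are hypothetical; the `rel="alternate" hreflang="..."` link element is the standard markup.

```python
from html import escape


def hreflang_links(alternates):
    """Build one <link rel="alternate"> tag per language/region version."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in alternates.items()
    ]


# Hypothetical GB and US versions of the same page.
alternates = {
    "en-gb": "http://www.example.co.uk/car-hire",
    "en-us": "http://www.example.com/car-rental",
}

links = hreflang_links(alternates)
for link in links:
    print(link)
```

Both versions of the page should list the full set of alternates, including themselves, which is part of why this is easy to get wrong at scale.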

Page 35: Duplicate Content - SES London

Problems from Controlling Multiple Similar Sites in the Same Niche

Page 36: Duplicate Content - SES London

Multiple Brands/Sites, Same Niche?

Choose a brand voice and stick to it.

*It can be a hassle but tying this into everything you do can help ensure unique content.

Page 37: Duplicate Content - SES London

Multiple Brands/Sites, Same Niche?

Only speak to one audience at a time

Page 38: Duplicate Content - SES London

Multiple Brands/Sites, Same Niche?

Prioritise and go with what converts!

Page 39: Duplicate Content - SES London

Rapid Fire Survival Tips: Questions You Should Constantly be Asking

Page 40: Duplicate Content - SES London

Why does this property exist?

Who are we trying to target?

When do we want to send users here?

What terms tie into that user journey?

Page 41: Duplicate Content - SES London

Why did we choose this page/focus?

Know your strategy. Support it with numbers.

Page 42: Duplicate Content - SES London

Why are we joining this new platform?

What about problems on existing sites?

What resource would this require?

Where will this content come from?

What benefit do we expect?

Page 43: Duplicate Content - SES London

Finally: Is it worth rocking the boat?

Page 44: Duplicate Content - SES London

Image Credits

• Dolly: http://www.telegraph.co.uk/science/8169817/Dolly-the-Sheep-reborn-as-four-new-clones-created.html

• Overwhelmed: http://comerecommended.com/blog/2011/09/13/dealing-with-feeling-overwhelmed-at-work/

• Single: http://www.davidwygant.com/blog/being-single-means-no-bitching-allowed/7280/

• Hoarding: http://coverlaydown.com/wp/wp-content/uploads/2011/08/messy.jpg

• Clean Office: http://www.momoy.info/uploads/interior-design/June-09/syzygy-office-02.jpg

• Pandshake: http://www.genzel.ca/wp-content/uploads/2012/02/Stephen-Harper-best-pm-ever.jpg

• Two Camps: http://www.summarynewspaper.com/high-winds-hit-the-two-camps-on-mount-everest/1735.html

• Sumo: http://www.quicksprout.com/images/littlebig.jpg

• Scrapers: http://seomemes.com/post/6314149329/scrapers-gonna-scrape

• Graffiti: http://raymondpward.typepad.com/newlegalwriter/2009/08/graffiti.html

• Pick your battles: http://www.etsy.com/listing/62656583/pick-your-battles-red-8x10-screenprint
