Google Content Guidelines

Site title and description:

Google's generation of page titles and descriptions (or "snippets") is completely automated and takes into account both the content of a page as well as references to it that appear on the web. The goal of the snippet and title is to best represent and describe each result and explain how it relates to the user's query. The more information you give us, the better your search result snippet can be. With rich snippets, webmasters with sites containing structured content—such as review sites or business listings—can label their content to make it clear that each labeled piece of text represents a certain type of data: for example, a restaurant name, an address, or a rating. Learn more about how rich snippets can improve your site's listing in search results.

We use a number of different sources for this information, including descriptive information in the title and meta tags for each page. We may also use publicly available information—for instance, anchor text or listings from the Open Directory Project (DMOZ)—or create rich snippets based on markup on the page.

While we can't manually change titles or snippets for individual sites, we're always working to make them as relevant as possible. You can help improve the quality of the title and snippet displayed for your pages by following the general guidelines below.

* Create descriptive page titles

* Create good meta descriptions

* Prevent search engines from displaying DMOZ data in search results for your site

Create descriptive page titles

Titles are critical to giving users a quick insight into the content of a result and why it’s relevant to their query. A title is often the primary piece of information used to decide which result to click on, so it's important to use high-quality titles on your web pages.

Here are a few tips for managing your titles:

* As explained above, make sure every page on your site has a title specified in the <title> tag. If you’ve got a large site and are concerned you may have forgotten a title somewhere, the HTML suggestions page in Webmaster Tools lists missing or potentially problematic <title> tags on your site.

* Page titles should be descriptive and concise. Avoid vague descriptors like "Home" for your home page, or "Profile" for a specific person's profile. Also avoid unnecessarily long or verbose titles, which are likely to get truncated when they show up in the search results.

* Avoid keyword stuffing. It's sometimes helpful to have a few descriptive terms in the title, but there’s no reason to have the same words or phrases appear multiple times. A title like "Foobar, foo bar, foobars, foo bars" doesn't help the user, and this kind of keyword stuffing can make your results look spammy to Google and to users.

* Avoid repeated or boilerplate titles. It’s important to have distinct, descriptive titles for each page on your site. Titling every page on a commerce site "Cheap products for sale", for example, makes it impossible for users to distinguish one page from another. Long titles that vary by only a single piece of information ("boilerplate" titles) are also bad; for example, a standardized title like "<band name> - See videos, lyrics, posters, albums, reviews and concerts" contains a lot of uninformative text. One solution is to dynamically update the title to better reflect the actual content of the page: for example, include the words "video", "lyrics", etc., only if that particular page contains video or lyrics. Another option is to just use "<band name>" as a concise title and use the meta description (see below) to describe your site's content. The HTML suggestions page in Webmaster Tools lists any duplicate titles Google detected on your pages.

* Brand your titles, but concisely. The title of your site’s home page is a reasonable place to include some additional information about your site—for instance, "ExampleSocialSite, a place for people to meet and mingle." But displaying that text in the title of every single page on your site hurts readability and will look particularly repetitive if several pages from your site are returned for the same query. In this case, consider including just your site name at the beginning or end of each page title, separated from the rest of the title with a delimiter such as a hyphen, colon, or pipe, like this:

<title>ExampleSocialSite: Sign up for a new account.</title>

* Be careful about disallowing search engines from crawling your pages. Using the robots.txt protocol on your site can stop Google from crawling your pages, but it may not always prevent them from being indexed. For example, Google may index your page if we discover it by following a link from someone else's site. To display it in search results, Google will need to display a title of some kind and because we won't have access to any of your page content, we will rely on off-page content such as anchor text from other sites. (To truly block a URL from being indexed, you can use meta tags.)
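The title tips above (page-specific text, no boilerplate, concise branding) could be applied when titles are generated by a template. The following is a minimal sketch, not an official recipe: the function name `build_title`, the `sections` argument, and the 60-character cutoff are all assumptions made for illustration, since these guidelines do not state a length limit.

```python
from html import escape


def build_title(page_name, site_name, sections=(), delimiter=" - ", max_length=60):
    """Build a <title> element: page-specific text first, site name last."""
    # Mention only the sections this page actually contains, without repeats.
    seen = []
    for section in sections:
        if section and section not in seen:
            seen.append(section)

    text = page_name
    if seen:
        text += ": " + ", ".join(seen)
    candidate = text + delimiter + site_name

    # max_length is a hypothetical cutoff; if the title exceeds it,
    # fall back to the concise "<page name> - <site name>" form.
    if len(candidate) > max_length:
        candidate = page_name + delimiter + site_name

    return "<title>" + escape(candidate, quote=False) + "</title>"


print(build_title("Example Band", "ExampleMusicSite", ["videos", "lyrics"]))
```

A page with no videos or lyrics would simply pass an empty `sections` list and get the shorter title.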

If we’ve detected that a particular result has one of the above issues with its title, we may try to generate an improved title from anchors, on-page text, or other sources. However, sometimes even pages with well-formulated, concise, descriptive titles will end up with different titles in our search results to better indicate their relevance to the query. There’s a simple reason for this: the title tag as specified by a webmaster is limited to being static, fixed regardless of the query. Once we know the user’s query, we can often find alternative text from a page that better explains why that result is relevant. Using this alternative text as a title helps the user, and it also can help your site. Users are scanning for their query terms or other signs of relevance in the results, and a title that is tailored for the query can increase the chances that they will click through.

If you’re seeing your pages appear in the search results with modified titles, check whether your titles have one of the problems described above. If not, consider whether the alternate title is a better fit for the query. If you still think the original title would be better, let us know in our Webmaster Help Forum.

Create good meta descriptions

The description attribute within the <meta> tag is a good way to provide a concise, human-readable summary of each page’s content. Google will sometimes use the meta description of a page in search results snippets, if we think it gives users a more accurate description than would be possible purely from the on-page content. Accurate meta descriptions can help improve your clickthrough; here are some guidelines for properly using the meta description.

* Make sure that every page on your site has a meta description. The HTML suggestions page in Webmaster Tools lists pages where Google has detected missing or problematic meta descriptions.

* Differentiate the descriptions for different pages. Identical or similar descriptions on every page of a site aren't helpful when individual pages appear in the web results. In these cases we're less likely to display the boilerplate text. Wherever possible, create descriptions that accurately describe the specific page. Use site-level descriptions on the main home page or other aggregation pages, and use page-level descriptions everywhere else. If you don't have time to create a description for every single page, try to prioritize your content: At the very least, create a description for the critical URLs like your home page and popular pages.

* Include clearly tagged facts in the description. The meta description doesn't just have to be in sentence format; it's also a great place to include structured data about the page. For example, news or blog postings can list the author, date of publication, or byline information. This can give potential visitors very relevant information that might not be displayed in the snippet otherwise. Similarly, product pages might have the key bits of information—price, age, manufacturer—scattered throughout a page. A good meta description can bring all this data together. For example, the following meta description provides detailed information about a book.

<meta name="Description" content="Author: A.N. Author, Illustrator: P. Picture, Category: Books, Price: $17.99, Length: 784 pages">

In this example, information is clearly tagged and separated.

* Programmatically generate descriptions. For some sites, like news media sources, generating an accurate and unique description for each page is easy: since each article is hand-written, it takes minimal effort to also add a one-sentence description. For larger database-driven sites, like product aggregators, hand-written descriptions can be impossible. In the latter case, however, programmatic generation of the descriptions can be appropriate and is encouraged. Good descriptions are human-readable and diverse, as discussed under "Differentiate the descriptions for different pages" above. The page-specific data mentioned under "Include clearly tagged facts in the description" is a good candidate for programmatic generation. Keep in mind that meta descriptions composed of long strings of keywords don't give users a clear idea of the page's content, and are less likely to be displayed in place of a regular snippet.

* Use quality descriptions. Finally, make sure your descriptions are truly descriptive. Because the meta descriptions aren't displayed in the pages the user sees, it's easy to let this content slide. But high-quality descriptions can be displayed in Google's search results, and can go a long way to improving the quality and quantity of your search traffic.
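For a database-driven site, the "clearly tagged facts" style of description shown above lends itself to generation from a product record. The sketch below is one possible approach under stated assumptions: the helper name `build_meta_description` and the list-of-pairs input format are hypothetical, not part of any Google tool.

```python
from html import escape


def build_meta_description(fields):
    """Join labelled facts into one description tag, skipping empty values."""
    parts = [
        f"{label}: {value}"
        for label, value in fields
        if value not in (None, "")
    ]
    content = ", ".join(parts)
    # Escape the text so quotes or ampersands in the data cannot break the attribute.
    return f'<meta name="Description" content="{escape(content, quote=True)}">'


book = [
    ("Author", "A.N. Author"),
    ("Illustrator", "P. Picture"),
    ("Category", "Books"),
    ("Price", "$17.99"),
    ("Length", "784 pages"),
]
print(build_meta_description(book))
```

Because each page passes its own record, the descriptions differ from page to page, and missing fields are left out rather than padded with filler text.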

Prevent search engines from displaying DMOZ data in search results for your site

One source Google uses to generate snippets is the Open Directory Project. You can direct us not to use this as a source by adding a meta tag to your pages.

To prevent all search engines (that support the meta tag) from using this information for the page's description, use the following:

<meta name="robots" content="NOODP">

To specifically prevent Google from using this information for a page's description, use the following:

<meta name="googlebot" content="NOODP">

If you use the robots meta tag for other directives, you can combine those. For instance:

<meta name="googlebot" content="NOODP, nofollow">
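If the robots meta tag is produced by a template, the directives can be collected in a list and joined into a single tag. This is a small sketch only; the helper name `robots_meta` and its `crawler` parameter are assumptions made for illustration.

```python
def robots_meta(directives, crawler="robots"):
    """Combine directives into one meta tag, dropping case-insensitive repeats."""
    seen = set()
    ordered = []
    for directive in directives:
        cleaned = directive.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            ordered.append(cleaned)
    content = ", ".join(ordered)
    return f'<meta name="{crawler}" content="{content}">'


print(robots_meta(["NOODP", "nofollow"], crawler="googlebot"))
```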

Note that once you add this meta tag to your pages, it may take some time for changes to your snippets to appear in the index.

If you're concerned about content in your title or snippet, you may want to double-check that this content doesn't appear on your site. If it does, changing it may affect your Google snippet after we next crawl your site. If it doesn't, try searching Google.com for the title or snippet enclosed in quotation marks. This will display pages on the web that refer to your site using this text. If you contact these webmasters to request that they change their information about your site, any changes to their sites will be recognized by our crawler after we next crawl their pages.

Pagination

Sites paginate content in various ways. For example:

* News and/or publishing sites often divide a long article into several shorter pages.

* Retail sites may divide the list of items in a large product category into multiple pages.

* Discussion forums often break threads into sequential URLs.

If you paginate content on your site, and you want that content to appear in search results, we recommend one of the following three options.

* Do nothing. Paginated content is very common, and Google does a good job returning the most relevant results to users, regardless of whether content is divided into multiple pages.

* Specify a View All page. Searchers commonly prefer to view a whole article or category on a single page. Therefore, if we think this is what the searcher is looking for, we try to show the View All page in search results. You can also add a rel="canonical" link to the component pages to tell Google that the View All version is the version you want to appear in search results.

* Use rel="next" and rel="prev" links to indicate the relationship between component URLs. This markup provides a strong hint to Google that you would like us to treat these pages as a logical sequence, thus consolidating their linking properties and usually sending searchers to the first page.

Using rel="next" and rel="prev"

You can use the HTML attributes rel="next" and rel="prev" to indicate the relationship between individual URLs. Using these attributes is a strong hint to Google that you want us to treat these pages as a logical sequence.

Let's say you have content paginated into the following URLs:

http://www.example.com/article-part1.html
http://www.example.com/article-part2.html
http://www.example.com/article-part3.html
http://www.example.com/article-part4.html

1. In the <head> section of the first page (http://www.example.com/article-part1.html), add a link tag pointing to the next page in the sequence, like this:

<link rel="next" href="http://www.example.com/article-part2.html">

Because this is the first URL in the sequence, there’s no need to add markup for rel="prev".

2. On the second and third pages, add links pointing to the previous and next URLs in the sequence. For example, you could add the following to the second page of the sequence:

<link rel="prev" href="http://www.example.com/article-part1.html">
<link rel="next" href="http://www.example.com/article-part3.html">

3. On the final page of the sequence (http://www.example.com/article-part4.html), add a link pointing to the previous URL, like this:

<link rel="prev" href="http://www.example.com/article-part3.html">

Because this is the final URL in the sequence, there’s no need to add a rel="next" link.

Google treats rel="previous" as a syntactic variant of rel="prev". Values can be either relative or absolute URLs (as allowed by the <link> tag). If you include a <base> link in your document, relative paths will resolve according to the base URL.

Some things to note:

* rel="prev" and rel="next" act as hints to Google, not absolute directives.

* If a component page within a series includes parameters that don't change the page's content, such as session IDs, then the rel="prev" and rel="next" values should also contain the same parameters. This helps our linking process better match corresponding rel="prev" and rel="next" values. For example, the page http://www.example.com/article?story=abc&page=2&sessionid=123 should contain the following:

<link rel="prev" href="http://www.example.com/article?story=abc&page=1&sessionid=123" />
<link rel="next" href="http://www.example.com/article?story=abc&page=3&sessionid=123" />

* rel="next" and rel="prev" are orthogonal concepts to rel="canonical". You can include both declarations. For example, http://www.example.com/article?story=abc&page=2&sessionid=123 may contain:

<link rel="canonical" href="http://www.example.com/article?story=abc&page=2"/>
<link rel="prev" href="http://www.example.com/article?story=abc&page=1&sessionid=123" />
<link rel="next" href="http://www.example.com/article?story=abc&page=3&sessionid=123" />

Note: If Google finds mistakes in your implementation (for example, if an expected rel="prev" or rel="next" designation is missing), we'll continue to index the page(s), and rely on our own heuristics to understand your content.
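The sequence markup described in this section can be generated from an ordered list of component URLs. The following sketch is illustrative only: the helper names `without_params` and `head_links` are hypothetical, and it assumes that any parameter listed in `ignored` (such as `sessionid`) does not change the page content. It keeps those parameters in the rel="prev" and rel="next" values and drops them from the rel="canonical" value, as in the example above.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def without_params(url, ignored):
    """Drop the query parameters named in `ignored`, keeping the rest in order."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in ignored]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def head_links(urls, index, ignored=()):
    """Return the <link> tags for the page at position `index` in `urls`."""
    if not 0 <= index < len(urls):
        raise IndexError("index is outside the sequence")
    tags = []
    if ignored:
        # The canonical URL omits parameters that do not change the content.
        canonical = without_params(urls[index], ignored)
        tags.append(f'<link rel="canonical" href="{escape(canonical)}">')
    if index > 0:
        # Every page except the first points back to its predecessor.
        tags.append(f'<link rel="prev" href="{escape(urls[index - 1])}">')
    if index < len(urls) - 1:
        # Every page except the last points forward to its successor.
        tags.append(f'<link rel="next" href="{escape(urls[index + 1])}">')
    return tags


article = [f"http://www.example.com/article-part{n}.html" for n in range(1, 5)]
for tag in head_links(article, 1):
    print(tag)
```

Note that `escape` writes each `&` in an href as `&amp;`, which is the correctly escaped form of the same URL inside an HTML attribute.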

Meta Tags:

Meta tags are a great way for webmasters to provide search engines with information about their sites. Meta tags can be used to provide information to all sorts of clients, and each system processes only the meta tags it understands and ignores the rest. Meta tags are added to the <head> section of your HTML page and generally look like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<META NAME="Description" CONTENT="Author: A.N. Author, Illustrator: P. Picture, Category: Books, Price: £9.24, Length: 784 pages">
<META http-equiv="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="google-site-verification" CONTENT="+nxGUDJ4QpAZ5l9Bsjdi102tLVC21AIh5d1Nl23908vVuFHs34="/>
<title>Example Books - high-quality used books for children</title>
<META NAME="robots" CONTENT="noindex,nofollow">

About rel="Canonical"

What is a canonical page?

A canonical page is the preferred version of a set of pages with highly similar content.

Why specify a canonical page?

It's common for a site to have several pages listing the same set of products. For example, one page might display products sorted in alphabetical order, while other pages display the same products listed by price or by rating. For example:

http://www.example.com/product.php?item=swedish-fish&trackingid=1234567&sort=alpha&sessionid=5678asfasdfasfd
http://www.example.com/product.php?item=swedish-fish&trackingid=1234567&sort=price&sessionid=5678asfasdfasfd

If Google knows that these pages have the same content, we may index only one version for our search results. Our algorithms select the page we think best answers the user's query. Now, however, users can specify a canonical page to search engines by adding a <link> element with the attribute rel="canonical" to the <head> section of the non-canonical version of the page. Adding this link and attribute lets site owners identify sets of identical content and suggest to Google: "Of all these pages with identical content, this page is the most useful. Please prioritize it in search results."

How do I specify a canonical URL?

You can specify a canonical URL in two ways:

* Add a rel="canonical" link to the <head> section of the non-canonical version of each HTML page.

To specify a canonical link to the page http://www.example.com/product.php?item=swedish-fish, create a <link> element as follows:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>

Copy this link into the <head> section of all non-canonical versions of the page, such as http://www.example.com/product.php?item=swedish-fish&sort=price.

If you publish content on both http://www.example.com/product.php?item=swedish-fish and https://www.example.com/product.php?item=swedish-fish, you can specify the canonical version of the page. Create the <link> element:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>

Add this link to the <head> section of https://www.example.com/product.php?item=swedish-fish.

* Indicate the canonical version of a URL by responding with the Link rel="canonical" HTTP header. Adding rel="canonical" to the head section of a page is useful for HTML content, but it can't be used for PDFs and other file types indexed by Google Web Search. In these cases you can indicate a canonical URL by responding with the Link rel="canonical" HTTP header, like this (note that to use this option, you'll need to be able to configure your server):

Link: <http://www.example.com/downloads/white-paper.pdf>; rel="canonical"

Google currently supports these link header elements for Web Search only.

Is rel="canonical" a suggestion or a directive?

This new option lets site owners suggest the version of a page that Google should treat as canonical. Google will take this into account, in conjunction with other signals, when determining which URL sets contain identical content, and calculating the most relevant of these pages to display in search results.

Can the link be relative or absolute?

rel="canonical" can be used with relative or absolute links, but we recommend using absolute links to minimize potential confusion or difficulties. If your document specifies a base link, any relative links will be relative to that base link.

Must the content on a set of pages be similar to the content on the canonical version?

Yes. The rel="canonical" attribute should be used only to specify the preferred version of many pages with identical content (although minor differences, such as sort order, are okay).

For instance, if a site has a set of pages for the same model of dance shoe, each varying only by the color of the shoe pictured, it may make sense to set the page highlighting the most popular color as the canonical version so that Google may be more likely to show that page in search results. However, rel="canonical" would not be appropriate if that same site simply wanted a gel insole page to rank higher than the shoe page.

What happens if rel="canonical" points to a non-existent page? Or if more than one page in a set is specified as the canonical version?

We'll do our best to algorithmically determine an appropriate canonical page, just as we've done in the past.

Can Google follow a chain of rel="canonical" designations?

Yes, to some extent, but to ensure optimal canonicalization, we strongly recommend that you update links to point to a single canonical page.

Can rel="canonical" be used to suggest a canonical URL on a completely different domain?

There are situations where it's not easily possible to set up redirects. This could be the case when you need to migrate to a new domain name using a web server that cannot create server-side redirects. In this case, you can use the rel="canonical" link element to specify the exact URL of the domain preferred for indexing. While the rel="canonical" link element is seen as a hint and not an absolute directive, we do try to follow it where possible.

rel="nofollow"

"Nofollow" provides a way for webmasters to tell search engines "Don't follow links on this page" or "Don't follow this specific link."

Originally, the nofollow attribute appeared in the page-level meta tag, and instructed search engines not to follow (i.e., crawl) any outgoing links on the page. For example:

<meta name="robots" content="nofollow" />

Before nofollow was used on individual links, preventing robots from following individual links on a page required a great deal of effort (for example, redirecting the link to a URL blocked in robots.txt). That's why the nofollow attribute value of the rel attribute was created. This gives webmasters more granular control: instead of telling search engines and bots not to follow any links on the page, it lets you easily instruct robots not to crawl a specific link. For example:

<a href="signin.php" rel="nofollow">sign in</a>

How does Google handle nofollowed links?

In general, we don't follow them. This means that Google does not transfer PageRank or anchor text across these links. Essentially, using nofollow causes us to drop the target links from our overall graph of the web. However, the target pages may still appear in our index if other sites link to them without using nofollow, or if the URLs are submitted to Google in a Sitemap. Also, it's important to note that other search engines may handle nofollow in slightly different ways.

What are Google's policies and some specific examples of nofollow usage?

Here are some cases in which you might want to consider using nofollow:

* Untrusted content: If you can't or don't want to vouch for the content of pages you link to from your site (for example, untrusted user comments or guestbook entries), you should nofollow those links. This can discourage spammers from targeting your site, and will help keep your site from inadvertently passing PageRank to bad neighborhoods on the web. In particular, comment spammers may decide not to target a specific content management system or blog service if they can see that untrusted links in that service are nofollowed. If you want to recognize and reward trustworthy contributors, you could decide to automatically or manually remove the nofollow attribute on links posted by members or users who have consistently made high-quality contributions over time.

* Paid links: A site's ranking in Google search results is partly based on analysis of those sites that link to it. In order to prevent paid links from influencing search results and negatively impacting users, we urge webmasters to use nofollow on such links. Search engine guidelines require machine-readable disclosure of paid links in the same way that consumers online and offline appreciate disclosure of paid relationships (for example, a full-page newspaper ad may be headed by the word "Advertisement"). More information on Google's stance on paid links.

* Crawl prioritization: Search engine robots can't sign in or register as a member on your forum, so there's no reason to invite Googlebot to follow "register here" or "sign in" links. Using nofollow on these links enables Googlebot to crawl other pages you'd prefer to see in Google's index. However, a solid information architecture (intuitive navigation, user- and search-engine-friendly URLs, and so on) is likely to be a far more productive use of resources than focusing on crawl prioritization via nofollowed links.

How does nofollow work with the Social Graph API (rel="nofollow me")?

If you host user profiles and allow users to link to other profiles on the web, we encourage you to mark those links with the rel="me" microformat so that they can be made available through the Social Graph API. For example:

<a href="http://blog.example.com" rel="me">My blog</a>

However, because these links are user-generated and may sometimes point to untrusted pages, we recommend that these links be marked with nofollow. For example:

<a href="http://blog.example.com" rel="me nofollow">My blog</a>

With rel="me nofollow", Google will continue to treat the rel="nofollow" as expected for search purposes, such as not transferring PageRank. However, for the Social Graph API, we will count the rel="me" link even when included with a nofollow.

If you are able to verify ownership of a link using an identity technology such as OpenID or OAuth, however, you may choose to remove the nofollow from the link.

To prevent crawling of a rel="me nofollow" URL, you can use robots.txt. Standard robots.txt exclusion rules are respected by both Googlebot and the Social Graph API.
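A site that renders user-submitted links could apply the nofollow guidance above at the point where each anchor is written out. The sketch below is only one way to do it: the function `render_link` and its `trusted` and `profile` flags are hypothetical names, and how a site decides that a link is trusted (for example, a long-standing contributor or a verified owner) is left to the caller.

```python
from html import escape


def render_link(href, text, trusted=False, profile=False):
    """Render an <a> tag, adding rel="me" for profile links and nofollow for untrusted ones."""
    rel_values = []
    if profile:
        # Links to a user's other profiles are marked with rel="me".
        rel_values.append("me")
    if not trusted:
        # Untrusted links are nofollowed.
        rel_values.append("nofollow")
    rel_value = " ".join(rel_values)
    rel_attr = f' rel="{rel_value}"' if rel_value else ""
    return f'<a href="{escape(href, quote=True)}"{rel_attr}>{escape(text)}</a>'


print(render_link("http://blog.example.com", "My blog", profile=True))
```

Defaulting `trusted` to False means a link is nofollowed unless the caller explicitly vouches for it.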

Canonicalization

Many sites make the same HTML content or files available via different URLs. Say you have a clothing site and one of your top items is a green dress. The product page for the dress may be accessible through several different URLs, especially if you use session IDs or other parameters:

http://www.example.com/products/women/dresses
http://www.example.com/products?category=dresses&color=green&cruel=no
http://example.com/shop/index.php?product_id=32&highlight=green+dress&cat_id=1&sessionid=123&affid=431
http://example.com/dresses/cocktail?gclid=ABCD
http://www.example.com/dresses/greendress.html

To gain more control over how your URLs appear in search results, and to consolidate properties, such as link popularity, we recommend that you pick a canonical (preferred) URL as the preferred version of the page. You can indicate your preference to Google in a number of ways. We recommend them all, though none of them is required (if you don't indicate a canonical URL, we'll identify what we think is the best version).

* Set your preferred domain

* Specify the canonical link for each version of a page

* Use 301 redirects

* Indicate your canonical (preferred) URLs by including them in a Sitemap

* Indicate how you would like Google to handle dynamic parameters

* Specify a canonical link in your HTTP header

Set your preferred domain

Setting your preferred domain tells Google which version of your site's URL (http://www.example.com or http://example.com) you prefer.

If you set your preferred domain as http://example.com, we'll treat links to http://www.example.com exactly the same as links to your preferred domain.

To set the preferred domain for a site, click Configuration, and then click Settings. In the Preferred domain section, pick the option you prefer.

Specify the canonical link for each version of the page

If you want http://www.example.com/dresses/greendress.html to be the canonical URL for your listing, you can indicate this to search engines by adding a <link> element with the attribute rel="canonical" to the <head> section of the non-canonical pages. To do this, create a link as follows:

<link rel="canonical" href="http://www.example.com/dresses/greendress.html">

Add this extra information to the <head> section of non-canonical URLs.

http://example.com/dresses/greendress.html?gclid=ABCD
http://example.com/dresses/index.php?product_id=32&highlight=green+dress&cat_id=1&sessionid=123

This tells Google that these URLs all refer to the canonical page at http://www.example.com/dresses/greendress.html. Note: We recommend using a link with the attribute rel="canonical" to indicate your preferred URL, but we can't guarantee to follow that preference in all cases.

More information about rel="canonical".

Use 301 redirects

If a page can be reached in multiple ways—for instance, http://example.com/home, http://home.example.com, or http://www.example.com—it's a good idea to pick one of those URLs as your preferred (canonical) destination, and use 301 redirects to send traffic from the other URLs to your preferred URL. A server-side 301 redirect is the best way to ensure that users and search engines are directed to the correct page. The 301 status code means that a page has permanently moved to a new location.
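The redirect decision itself is simple to express in code, whatever server software applies it. The following is a minimal sketch, not a server configuration: it assumes http://www.example.com/ is the preferred URL, and the names `PREFERRED_URL`, `ALTERNATE_URLS`, `normalize`, and `redirect_for` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical choice of preferred URL and alternates, based on the example above.
PREFERRED_URL = "http://www.example.com/"
ALTERNATE_URLS = {"http://example.com/home", "http://home.example.com/"}


def normalize(url):
    """Lower-case the host and give an empty path a single slash; the query string is ignored."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}{parts.path or '/'}"


def redirect_for(url):
    """Return (status, headers) for an alternate URL, or None if no redirect is needed."""
    if normalize(url) in ALTERNATE_URLS:
        # 301 signals that the page has permanently moved to the preferred location.
        return 301, {"Location": PREFERRED_URL}
    return None


print(redirect_for("http://example.com/home"))
```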

To implement a 301 redirect for websites that are hosted on servers running Apache, you'll need access to your server's .htaccess file. (If you're not sure about your access or your server software, check with your web host.) For more information, consult the Apache .htaccess Tutorial and the Apache URL Rewriting Guide. If your site is hosted on a server running other software, check with your web host for more details, or check out this article.

Indicate your canonical (preferred) URLs by including them in a Sitemap

Pick a canonical (preferred) URL for each of your product pages, and tell us about your preference by submitting these canonical URLs in a Sitemap.

We don't guarantee that we'll use the URLs you submit in a Sitemap, but submitting one is a useful way to tell Google about the pages on your site you consider most important.

Indicate how you would like Google to handle dynamic parameters

Use Parameter Handling to tell Google about any parameters you would like ignored. Ignoring certain parameters can reduce duplicate content in Google's index, and make your site more crawlable. For example, if you specify that the parameter sessionid should be ignored, Google will consider http://www.example.com/dresses/green.htm?sessionid=273749 to be the same as http://www.example.com/dresses/green.htm.

Specify a canonical link in your HTTP header

If you can configure your server, you can use rel="canonical" HTTP headers to indicate the canonical URL for HTML documents and other files such as PDFs. Say your site makes the same PDF available via different URLs (for example, for tracking purposes), like this:

http://www.example.com/downloads/white-paper.pdf
http://www.example.com/downloads/partner-1/white-paper.pdf
http://www.example.com/downloads/partner-2/white-paper.pdf
http://www.example.com/downloads/partner-3/white-paper.pdf

In this case, you can use a rel="canonical" HTTP header to specify to Google the canonical URL for the PDF file, as follows:

Link: <http://www.example.com/downloads/white-paper.pdf>; rel="canonical"

Google currently supports these link header elements for Web Search only.
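If your server logic is written in Python, the header could be attached along the following lines. This is a minimal sketch, not a definitive implementation: canonical_link_header, headers_for, and the TRACKING_PATHS list are hypothetical names introduced for illustration, and how you register headers depends on your server or framework.

```python
# The one URL we want treated as canonical for the PDF.
CANONICAL_PDF = "http://www.example.com/downloads/white-paper.pdf"

# Paths that all serve the same PDF (taken from the example URLs above).
TRACKING_PATHS = {
    "/downloads/white-paper.pdf",
    "/downloads/partner-1/white-paper.pdf",
    "/downloads/partner-2/white-paper.pdf",
    "/downloads/partner-3/white-paper.pdf",
}


def canonical_link_header(canonical_url):
    """Build the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", '<{}>; rel="canonical"'.format(canonical_url))


def headers_for(path):
    """Return the response headers to send for a requested path."""
    headers = [("Content-Type", "application/pdf")]
    if path in TRACKING_PATHS:
        # Every variant points back to the same canonical URL.
        headers.append(canonical_link_header(CANONICAL_PDF))
    return headers


for name, value in headers_for("/downloads/partner-2/white-paper.pdf"):
    print("{}: {}".format(name, value))
```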

Author information in search results

Google is piloting the display of author information in search results to help users discover great content. Check out these sample queries: [steven levy google plus], [chlorine based life], [madonna], [britney spears], [google authorship], [david pogue nytimes], [pete wentz], [javascript inheritance].

[Figure: Google Web Search result snippet showing authorship information]

[Figure: Google News Search result showing authorship information]

If you want your authorship information to appear in search results for the content you create, you'll need a Google+ Profile with a good, recognizable headshot as your profile photo. Then, verify authorship of your content by associating it with your profile using either of the methods below. Google doesn't guarantee to show author information in Google Web Search or Google News results.

Option 1: Link your content to your Google+ profile using a verified email address.

Don't have an email address on the same domain as your content? Follow the instructions listed in Option 2 below.

1. Check that you have an email address (for example, [email protected]) on the same domain as your content (wired.com).
2. Make sure that each article or post you publish on that domain has a clear byline identifying you as the author (for example, "By Steven Levy" or "Author: Steven Levy").
3. Visit the Authorship page and submit your email address to Google. No matter how many articles or posts you publish on this domain, you only need to do this once. Your email will appear in the Contributor to section of your Google+ profile. If you want to keep your email private, change the visibility of your link.
4. To see what author data Google can extract from your page, use the structured data testing tool.

Option 2: Set up authorship by linking your content to your Google+ profile

1. Create a link to your Google+ profile from your webpage, like this:

<a href="[profile_url]?rel=author">Google</a>

Replace [profile_url] with your Google+ profile URL, like this:

<a href="https://plus.google.com/109412257237874861202?rel=author">Google</a>

Your link must contain the ?rel=author parameter. If it's missing, Google won't be able to associate your content with your Google+ profile.

2. Add a reciprocal link back from your profile to the site(s) you just updated.
   1. Edit the Contributor To section.
   2. In the dialog that appears, click Add custom link, and then enter the website URL.
   3. If you want, click the drop-down list to specify who can see the link.
   4. Click Save.
3. To see what author data Google can extract from your page, use the structured data testing tool.
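Because a missing or malformed ?rel=author parameter is easy to overlook, you may want to check your links before publishing. The Python sketch below is one way to do that using only the standard library; has_rel_author is a hypothetical helper name, and the check says nothing about how Google itself processes the link.

```python
from urllib.parse import parse_qs, urlsplit


def has_rel_author(href):
    """Return True if the link's query string contains rel=author."""
    query = parse_qs(urlsplit(href).query)
    return "author" in query.get("rel", [])


profile_link = "https://plus.google.com/109412257237874861202?rel=author"
plain_link = "https://plus.google.com/109412257237874861202"

print(has_rel_author(profile_link))
print(has_rel_author(plain_link))
```

Note that a stray space after the question mark changes the parameter name, so a link written that way would fail this check.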

If you don't want your authorship information to appear in search results, edit your profile (using http://plus.google.com/me/about/edit), and make sure the Profile discovery option Help others discover my profile in search results is unchecked.

Link your content to a Google+ profile using rel="author"

Linking your content to your Google profile is a two-step process. First, you add a link from your website or page to your Google profile. Second, you update your Google profile by adding a link back to your site.

1. Link your content to your Google profile

Create a link to your Google profile from your webpage, like this:

<a href="[profile_url]?rel=author">Google</a>

Replace [profile_url] with your Google Profile URL, like this:

<a href="https://plus.google.com/109412257237874861202?rel=author">Google</a>

Your link must contain the ?rel=author parameter. If it's missing, Google won't be able to associate your content with your Google profile.

2. Link to your content from your profile

The second step of the verification process is to add a reciprocal link back from your profile to the site(s) you just updated. To add links to your Google profile:

1. Sign in to your Google profile.
2. Click Edit profile.
3. Click the Contributor To section on the right (depending on how many photos you have, you may need to scroll to see this section), and then click Add custom link.
4. If you want, change the visibility of your link, and then click Save.

To check your markup and see what author data Google can extract from your page, use the structured data testing tool. The tool only looks at a single page, so for now, you'll need to check author pages and content pages separately to see if they are linking to each other correctly.

Automatically generated content

Automatically generated (or “auto-generated”) content is content that has been generated programmatically. Often this consists of paragraphs of random text that make no sense to the reader but may contain search keywords.

Some examples of auto-generated content include:

* Text translated by an automated tool without human review or curation before publishing
* Text generated through automated processes, such as Markov chains
* Text generated using automated synonymizing or obfuscation techniques
* Text generated from scraping Atom/RSS feeds or search results
* Stitching or combining content from different web pages without adding sufficient value

Sneaky redirects

Redirecting is the act of sending a visitor to a different URL than the one they initially requested. There are many good reasons to redirect one URL to another, for example when moving your site to a new address, or consolidating several pages into one.

However, some redirects are designed to deceive search engines or to display different content to human users than to search engines. It’s a violation of Google’s Webmaster Guidelines to use JavaScript, a meta refresh, or other technologies to redirect a user to a different page with the intent to show the user a different page than a search engine crawler sees. When a redirect is implemented in this way, a search engine may index the original page rather than following the redirect, whereas users are taken to the redirect target. Like cloaking, this practice is deceptive because it attempts to display different content to users and to Googlebot, and can take a visitor somewhere other than where they expected to go.

Using JavaScript to redirect users can be a legitimate practice. When examining JavaScript or other redirects to ensure your site adheres to our guidelines, consider the intent. For example, if you redirect users to an internal page once they’re logged in, you can use JavaScript to do so. Keep in mind that 301 redirects are best when moving your site, but you could use a JavaScript redirect if you don’t have access to your website’s server.
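For comparison, a server-side 301 redirect sends crawlers and users the same response, so both end up at the same destination. The Python sketch below shows the idea in its simplest form; the MOVED mapping, the example paths, and the respond function are hypothetical, and a real site would normally configure this in the web server rather than in application code.

```python
from http import HTTPStatus

# Hypothetical mapping from old paths to their new permanent URLs.
MOVED = {
    "/old-page.html": "http://www.example.com/new-page.html",
}


def respond(path):
    """Return (status, headers) for a request path.

    The result depends only on the path, never on who is asking,
    so a crawler and a human visitor get the same redirect.
    """
    if path in MOVED:
        return HTTPStatus.MOVED_PERMANENTLY, [("Location", MOVED[path])]
    return HTTPStatus.OK, []


status, headers = respond("/old-page.html")
print(int(status), headers)
```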

Doorway pages

Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase. In many cases, doorway pages are written to rank for a particular phrase and then funnel users to a single destination. Whether deployed across many domains or established within one domain, doorway pages tend to frustrate users.

Therefore, Google frowns on practices that are designed to manipulate search engines and deceive users by directing them to sites other than the one they selected, and that provide content solely for the benefit of search engines. Google may take action on doorway sites and other sites making use of these deceptive practices, including removing these sites from Google’s index.

Some examples of doorways include:

* Having multiple domain names targeted at specific regions or cities that funnel users to one page
* Templated pages made solely for affiliate linking
* Multiple pages on your site with similar content designed to rank for specific queries like city or state names

Scraped content

Some webmasters use content taken (“scraped”) from other, more reputable sites on the assumption that increasing the volume of pages on their site is a good long-term strategy regardless of the relevance or uniqueness of that content. Purely scraped content, even from high-quality sources, may not provide any added value to your users without additional useful services or content provided by your site; it may also constitute copyright infringement in some cases. It's worthwhile to take the time to create original content that sets your site apart. This will keep your visitors coming back and will provide more useful results for users searching on Google.

Some examples of scraping include:

* Sites that copy and republish content from other sites without adding any original content or value
* Sites that copy content from other sites, modify it slightly (for example, by substituting synonyms or using automated techniques), and republish it
* Sites that reproduce content feeds from other sites without providing some type of unique organization or benefit to the user

Creating pages with malicious behavior

Distributing content or software on your website that behaves in a way other than what a user expected is a violation of Google’s Webmaster Guidelines. This includes anything that manipulates content on the page in an unexpected way, or downloads or executes files on a user’s computer without their consent. Google not only aims to give its users the most relevant search results for their queries, but also to keep them safe on the web.

Some examples of malicious behavior include:

* Changing or manipulating the location of content on a page, so that when a user thinks they’re clicking on a particular link or button the click is actually registered by a different part of the page
* Injecting new ads or pop-ups on pages, or swapping out existing ads on a webpage with different ads; or promoting or installing software that does so
* Including unwanted files in a download that a user requested
* Installing malware, trojans, spyware, ads or viruses on a user’s computer
* Changing a user’s browser homepage or search preferences without the user’s informed consent

Hacked content

Hacked content is any content that is placed on your site without your permission due to vulnerabilities in your site’s security. In order to protect our users and to maintain the integrity of our search results, Google tries its best to keep hacked content out of our search results. Hacked content gives our users results that are not useful and can potentially install malicious content on their machines. We recommend that you keep your site secure, and clean up hacked content when you find it.

Some examples of hacking include:

* Injected content: When hackers gain access to your website, they may try to inject malicious content into existing pages on your site. This often takes the form of malicious JavaScript injected directly into the site, or into iframes.

* Added content: Sometimes, due to security flaws, hackers are able to add new pages to your site that contain spammy or malicious content. These pages are often meant to manipulate search engines. Your existing pages may not show signs of hacking, but these newly-created pages could harm your site’s visitors or your performance in search results.

* Hidden content: Hackers may also try to subtly manipulate existing pages on your site. Their goal is to add content to your site that search engines can see but which may be harder for you and your users to spot. This can involve adding hidden links or hidden text to a page by using CSS or HTML, or it can involve more complex changes like cloaking.
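Hidden content in particular can be hard to spot by eye, so a simple scan of your own pages may help. The Python sketch below flags elements whose inline style hides them; it is a minimal first check under stated assumptions, not a complete detector. HiddenElementFinder, find_hidden_elements, and the sample markup (including the spam.example URL) are hypothetical, and the scan does not look at external stylesheets, scripts, or cloaking.

```python
import re
from html.parser import HTMLParser

# Inline style declarations that hide an element from visitors.
HIDING_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)


class HiddenElementFinder(HTMLParser):
    """Collect (tag, line number) for elements hidden via an inline style."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style") or ""
        if HIDING_STYLE.search(style):
            self.hits.append((tag, self.getpos()[0]))


def find_hidden_elements(html):
    """Return the hidden elements found in an HTML string."""
    finder = HiddenElementFinder()
    finder.feed(html)
    return finder.hits


# Hypothetical page with an injected, hidden block of links.
sample = (
    "<p>Welcome to our shop</p>\n"
    '<div style="display: none"><a href="http://spam.example/">cheap pills</a></div>'
)
for tag, line in find_hidden_elements(sample):
    print("hidden <{}> element near line {}".format(tag, line))
```

A hit is not proof of hacking, since legitimate pages hide elements too; treat each result as something to review by hand.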

If your site has been compromised and you need help cleaning up the hacked content, see our tips here. Remember, the most effective way to combat hacking is to prevent it from happening in the first place. Here are our tips on prevention.