Publishers: Solve Tracking Code, Duplicate Content Issues with the Canonical URL Tag

by Adam Sherk on April 30, 2009

Really, you shouldn’t need the canonical URL tag. What you should be doing is avoiding duplicate content altogether. Every piece of content on your site should live at one permanent, unique URL, and any duplicate pages should be consolidated through 301 redirects. But for a variety of business, technical or editorial reasons, publishing the same content or resolving the same page on more than one URL sometimes can’t be avoided.

On newspaper and magazine sites, the main culprit is typically the practice of appending tracking codes to the end of URLs, for example: www.yoursite.com/article?xid=rss (or ?xid=topstories or ?cid=partner, etc…).
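To make the duplication concrete, here is a minimal Python sketch that strips tracking parameters and shows that several tracked URLs resolve to the same article. The `TRACKING_PARAMS` set and the `strip_tracking` helper are illustrative assumptions rather than part of any particular publishing system; only the `xid` and `cid` parameter names come from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; adjust to whatever your site appends.
TRACKING_PARAMS = {"xid", "cid"}


def strip_tracking(url):
    """Return the URL with known tracking parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


variants = [
    "http://www.yoursite.com/article?xid=rss",
    "http://www.yoursite.com/article?xid=topstories",
    "http://www.yoursite.com/article?cid=partner",
]

# Three distinct URLs, one underlying page.
unique_pages = {strip_tracking(url) for url in variants}
print(unique_pages)
```

Each variant is a separate URL as far as a crawler is concerned, even though all three collapse to a single page once the tracking codes are removed.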

Until recently, the best solution on offer was to append tracking codes with a hash mark (#) instead of a question mark (?), as suggested by Stephan Spencer and Nathan Buggia, among others.
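The hash mark approach works because everything after # is a URL fragment, which browsers do not send to the server and which the engines generally do not treat as part of a separate page address. The short Python sketch below illustrates the difference; the `requested_resource` helper is a hypothetical name used only for this illustration.

```python
from urllib.parse import urlsplit

with_query = urlsplit("http://www.yoursite.com/article?xid=rss")
with_hash = urlsplit("http://www.yoursite.com/article#xid=rss")


def requested_resource(parts):
    """Build the path-plus-query portion of a URL, which is what a server receives."""
    return parts.path + ("?" + parts.query if parts.query else "")


# The query string is part of the requested address; the fragment is not.
print(requested_resource(with_query))
print(requested_resource(with_hash))
print(with_hash.fragment)
```

One consequence of this design is that a hash-based tracking code never reaches the server, so it has to be read by client-side analytics code instead.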

Fortunately, the major search engines came to the rescue in February with the canonical URL tag, which allows sites to identify which URL is the canonical, or primary, version of a page. It is placed in the <head> of any duplicate pages and points to the canonical URL for the page:

<link rel="canonical" href="http://www.example.com/canonical-URL"/>
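As a rough sketch of how a page template might emit this tag, the Python function below builds the markup from a canonical URL and escapes it for use inside an HTML attribute. The function name `canonical_link_tag` is an assumption made for illustration; a real content tool would have its own templating hooks.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Return a canonical <link> element for the given URL, escaped for HTML."""
    # quote=True also escapes double quotes so the href attribute stays well-formed.
    return '<link rel="canonical" href="{}"/>'.format(escape(canonical_url, quote=True))


tag = canonical_link_tag("http://www.example.com/canonical-URL")
print(tag)
```

The escaping matters when the canonical URL legitimately keeps some query parameters, since a raw & inside an attribute value is not valid in stricter markup.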

Google describes the canonical URL tag as a “hint that we honor strongly” as opposed to a firm directive, so using the tag does not guarantee that all duplicate content issues will be resolved. However, it greatly increases the likelihood that the canonical version of a page will be the one displayed in the search results, and if it functions as described, it will also transfer the value of links pointing to the duplicate pages to the canonical URL.

This means that publishers can append tracking codes to URLs as needed while avoiding the duplicate content issues that place the content at a competitive disadvantage.

It’s still early days for the tag, but based on results from some magazine sites we work with that have been experimenting with it, I’d say so far, so good.

{ 2 comments }

Mary Bettinson June 9, 2009 at 2:11 pm

Hi Adam, we met at an SES show a while ago. This is great information we can use for our own client sites. Tracking codes have always been a sticking point for large sites. Do you recommend inserting the canonical URL tag on every page of a site, or only on those that use tracking codes?

Adam Sherk June 9, 2009 at 2:38 pm

Hi Mary,

There shouldn’t be any harm in having a canonical URL tag on every page (regardless of whether or not duplicates exist), and some sites are now setting this up as the default in their content tool. However, since the tag is still fairly new and we’re still learning how the engines are responding to it, I would limit its use to page types that you know have duplicate content issues (such as URLs to which you are appending tracking codes).

There is some more information and basic case studies in this conference session summary from last week: http://outspokenmedia.com/seo/canonical-tag/
