Google Panda was designed to go after sites with a high volume of thin or low-quality content (among other things) in an effort to improve the overall strength of the search results. But the unfortunate reality is that with just about every Panda update, well-established brands with good content get caught in the filter.
We work with a lot of major publishers and nearly all of them have had at least a few titles get hit by Panda in recent years. The types of sites vary but most are trusted, authoritative brands with large audiences that have produced quality, popular content for years.
Here’s an example of a site that was hit by Panda 4.0 and recovered with Panda 4.1:
The good news is that each site has eventually been able to recover. But it is not always an easy process and it often takes significant changes and considerable time. And when search traffic declines substantially for several months or longer that has serious repercussions for the business.
So I thought I’d share some of the most common causes of Panda problems for reputable publishers. For in-depth Panda analysis I also recommend checking out Glenn Gabe, who frequently shares useful information on the subject.
Hopefully this will help others to avoid a similar fate. Like it or not, taking a preemptive approach against Panda and other forms of algorithmic filtering has become a fundamental part of SEO. And nearly every site has some degree of technical, design and editorial issues that are potential risk factors.
Here are 10 things that cause good sites to get caught up in Google Panda:
- Templates with limited text – many content sites have certain non-article templates that by design have relatively little editorial text. While not usually a problem by itself, when combined with other Panda risk factors this can be problematic. Instituting a minimum word count is helpful.
- Slideshows – slideshows and photo galleries are frequently used by publishers both for the visual experience and to increase pageviews. But they can also be a thin content factory and a source of duplicate page elements. So it is essential to carefully manage their setup and design, in particular the amount of content per slide and whether or not each individual slide can be indexed. Many are moving to setups in which the full content is accessible on the main page and that is the only URL that can be indexed.
- Mobile – with mobile averaging 40% of total traffic and continuing to grow, some editorial teams are modifying their writing style to be shorter and more succinct to make the content more digestible on mobile devices. Ironically this has potential to create thin content issues, so there is a balance that needs to be maintained.
- Viewability – advertising viewability has become a top priority for publishers, which has the consequence of altering the content-to-ad ratio above the fold. There is some overlap here with the “top heavy” page layout algorithm but on templates with limited editorial text this becomes a risky combination. It is critical to maintain a substantial amount of editorial content above the fold. (Update: for more on this last point see the comment from Michael Cottam below).
- Other advertisement and promotion issues – overuse of interstitial, prestitial and other forms of interruptive ads and promos can also be problematic, in part because they can lead to poor engagement signals. If it seems like it could be annoying to users, it probably is.
- Duplicate content or page elements – having a significant number of pages with limited text that also have duplicate content or page elements (such as title tags and headings) is another common problem. Publishing the same piece on more than one URL (which often happens unintentionally due to CMS or migration issues), URL canonicalization issues and duplication caused by tracking codes can also contribute to Panda problems, particularly when there are other issues at play.
- Syndicated content – publishing a large amount of syndicated content has been problematic for some brands, particularly when that content appears on quite a few other sites. It is important that non-original content remains a small percentage of the indexable pages on the site.
- Overlapping content – overlapping content can sometimes cause problems, particularly if one version is fairly thin. An example of this would be publishing a blog post, article and gallery all on the same subject, with each version repeating certain portions of the same text and sharing the same headlines and title tags.
- Internal search and tag pages – internal search can be a source of an excessive number of thin, overlapping and partially duplicate pages. If search pages are allowed to be indexed (and often it is best not to) they need to be managed carefully. A similar issue can occur with tag pages. These are commonly used on content sites and by themselves they don’t tend to cause Panda problems. But when tag pages start to become a fairly high percentage of the total pages on the site, and many of them are overlapping or have very few links, this can be problematic.
- Incorrect or empty pages – it is not unusual for a site to have a number of incorrect or even empty pages that were never meant to exist, let alone be indexed. This can inadvertently add a lot of thin, low-value pages and potentially soft 404 errors to the site. Legacy issues and unintended problems related to redesigns and migrations are the most common sources. While typically not serious enough to trigger a Panda hit on their own, they can be a contributing factor.
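Several of the risk factors above (limited text, duplicate title tags) can be checked mechanically before they add up. What follows is only a minimal sketch in Python, not a description of how Panda evaluates pages. The `audit_pages` function, the page dictionary fields and the 300-word threshold are all assumptions made for illustration; the right minimum depends on the site and its templates.

```python
from collections import defaultdict

# Hypothetical threshold: the right minimum depends on the site and template.
MIN_WORDS = 300


def audit_pages(pages, min_words=MIN_WORDS):
    """Flag thin pages and pages that share a title tag.

    `pages` is an iterable of dicts with 'url', 'title' and 'text' keys
    (an assumed structure, e.g. exported from a crawl).
    Returns (thin_urls, duplicate_titles).
    """
    thin_urls = []
    urls_by_title = defaultdict(list)

    for page in pages:
        # Count whitespace-separated words in the editorial text.
        if len(page["text"].split()) < min_words:
            thin_urls.append(page["url"])
        # Normalize the title so trivial differences in case or spacing match.
        urls_by_title[page["title"].strip().lower()].append(page["url"])

    # Keep only titles used by more than one URL.
    duplicate_titles = {
        title: urls for title, urls in urls_by_title.items() if len(urls) > 1
    }
    return thin_urls, duplicate_titles


if __name__ == "__main__":
    sample = [
        {"url": "/a", "title": "Fall Recipes", "text": "word " * 500},
        {"url": "/b", "title": "fall recipes ", "text": "word " * 40},
    ]
    thin, dupes = audit_pages(sample)
    print(thin)
    print(dupes)
```

A report like this only surfaces candidates. Whether a flagged page should be expanded, consolidated or de-indexed is still an editorial decision.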
The Path to Panda Recovery
Recovering from Panda typically requires a mix of template and technical changes combined with the elimination, consolidation and de-indexation of low-value pages through strategic use of 404s, permanent redirects, rel=canonical and meta robots “noindex” tags, among other things.
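As one illustration of the rel=canonical piece, here is a minimal Python sketch that strips tracking parameters from a URL and builds the matching canonical tag. The `canonical_url` and `canonical_link_tag` helpers and the `TRACKING_PARAMS` list are hypothetical; which parameters are tracking-only is an assumption that has to be confirmed per site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking-only parameters; adjust for the site in question.
TRACKING_PARAMS = {"xid", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url, tracking_params=TRACKING_PARAMS):
    """Return the URL without tracking parameters or a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in tracking_params
    ]
    # The empty last element drops any #fragment from the result.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)


if __name__ == "__main__":
    print(canonical_link_tag("https://site.com/page?xid=rss"))
```

Every tracked variant of a page then points at the same clean URL, which is the one that should be indexed.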
When we do Panda analysis for a site I always emphasize a couple of things:
- The path to recovery cannot be definitively known. The most likely causes will be identified but it takes a process of trial and error to address enough of the offending issues to break a site out of the filter.
- Even after the necessary changes are made the recovery will not be instantaneous. Once a site has cleaned up its profile, breaking free of Panda requires a rolling update by Google. For a while these happened roughly monthly, and they are apparently happening more frequently now, which is encouraging. But in some cases it takes a larger Panda update for a site to break out, and that happens only periodically.
So if you have been hit by Panda, time and patience are also a necessary part of the process.
Michael Cottam says
Great job on this Adam! As someone who does a ton of Panda optimization consulting, I have to say you’ve really covered the issues well. About the only thing I’d add would be this: designs with lots of whitespace and content spaced vertically well down the page are popular today, and unfortunately these kinds of designs allow very little content above the fold. I’m seeing a number of clients struggle with this & Panda.
Adam Sherk says
Thanks Michael! That’s a good one.
Adam Melson says
Really great article to point to when a site owner needs to know what to fix/find when a traffic drop corresponds with a known Panda update.
Sites that have succeeded in recovering, from my experience, have been those that are willing to invest the resources in fixing ALL the potential issues. Limiting yourself by fixing only the top 1-2 potential issues with resources when your organic traffic just dropped 40% is a bet I wouldn’t want to take. Fix it all, get your traffic back.
Adam Sherk says
Thanks Adam, and that’s a good point. Even if certain issues end up not being directly part of the Panda problem, you can’t always know for sure, and fixing them all is good for the site regardless.
I have been reading dozens of Panda related posts and this is probably the most succinct and helpful one. Thanks!
Vitaliy Semerenko says
Great article, I have been noticing rank fluctuations through the month of October as Google has been doing a few refreshes. I think that moving forward, the quality of the content will surpass the volume of it. If you are short, straight to the point and relevant then you are on your way to success. Another thing is blogging, which a lot of website owners leave out of the picture; blogs need to be updated regularly, and you need to make sure that the older posts are still eligible to be on the website.
Excellent post Adam!
We suffered twice & came out quickly. Latest one was between 14-18 Oct 2014. It took 2 weeks to recover but it happened automatically. We did not take any specific efforts.
As a policy decision, we moved from 400 words page content to 600+ words. We ensure canonical is correctly used on all pages. Our SEO team regularly evaluates page speed & broken links (ScreamingFrog is the best). Consistency with these methods seems to be the best approach.
You mention duplicate pages caused by “tracking codes”. Could you say more about that?
I commonly install phone metrics systems to track incoming calls. These do typically create an event or a page, depending upon the vendor used.
Are these the sort of “duplicate content” you are speaking of? Is “Tag Manager” itself a likely culprit?
I’m very much awaiting your response…
Adam Sherk says
Max – Thanks! I’m glad you liked it.
Vitaliy – I agree; content quality, publication frequency, trust and expertise of the author, etc. are all critical for SEO success, whether you are having trouble with algorithmic filtering or not.
David – I’m glad you were able to break out so quickly. And since you didn’t make any changes it sounds like you benefited from some tweaking or corrections on Google’s side. So I agree it’s all the more reason to have a solid foundation in place.
Dennis – What I was referring to is when publishers append a parameter to the end of a URL for traffic tracking purposes, for example site.com/page?xid=rss. This creates a form of duplicate content that needs to be offset with rel=canonical tags and preferably URL Parameters instructions in Google Webmaster Tools. It’s a common practice and certainly not something that would trigger something like Panda on its own. But when we look at sites that have been hit it’s often one of several sources of duplicate content, which when combined with other issues can be problematic.
I would like to see some examples also please. Maybe a full case study on a website that was affected by the Panda updates, what you deemed the problem ( like the examples above ) and then the changes you made over time and some analytics to show how these changes improved the website again.
Would you be willing to share any case studies please?
I understand not all websites have all these issues mentioned but it’s nice to see the suggestions above and the impact they have or had in a real working example.
Google has said it long ago that they want content. The Panda updates are just that, the search for content. Keep your website clean and with original content and you will not be affected by it. I spent several months re-writing close to one-hundred pages of content because they were exactly the same, with just some names changed within them. Once that was done and they were unique, they started receiving direct traffic from search engines and some ranked really well.
Copying content may be easier, but if you want your website to do well, take that little extra effort and make unique content, even if you are writing about the same thing like product instructions or describing services.
Great article nonetheless.
Adam Sherk says
Pab – I’m not able to share specific examples due to client confidentiality, but I hope this overview of the types of issues commonly found proves helpful to others.
Dorin – In this post I’m focusing on well-known media sites with lots of quality, original content that still manage to get hit by some of the larger Panda updates. In investigating those hits these are the types of problems that tend to be found.
Mick Kennys says
About this mobile thing!
So how should we deal with that? On one side Google is warning us: do not forget about mobile users – your sites should work smoothly on their devices. On the other side you say – do not focus too much on mobile users!
Adam Sherk says
Mick – To clarify, mobile is a major source of traffic for publishers and technical and editorial optimization for mobile is critical. I’m just making the point that from a content perspective, you need to maintain a balance and watch out for any mobile-focused efforts that could lead to a high volume of pages with limited text content. In addition it is important to watch out for anything that could be harming engagement signals with mobile users. So mobile is very important to pay attention to, be it for Panda or your overall SEO efforts.