The Most Common Google News Errors and How to Avoid Them

In working with publishers, one of the things I’m most frequently asked to do is troubleshoot problems with Google News.

Anyone involved with a news or content site can attest to the importance of Google News and news search optimization. And I’d venture to say just as many people have experienced a range of technical and formatting issues that have impeded the indexation and performance of their news content.

To help with that, I’ve compiled a list of the Google News errors – and publisher mistakes – that I most often come across, along with a recommended solution for each.

Eligibility and Indexation

Let’s start with a few basic mistakes that do actually occur with some frequency.

  • Not an approved source – site not showing up in Google News? First make sure you are an approved source, otherwise there is no point in creating a Google News sitemap or complying with the technical and content guidelines. A quick way to check is to do a [site:domain] search in Google News. If you are not an approved source you can apply using this form.
  • Mixing domains – a related issue that sometimes crops up is when a news property is spread out over multiple domains. If the additional domains are not in the Google News database they won’t be considered part of the approved source, and thus won’t be indexed. If a Google News sitemap has been submitted for an unapproved domain this will trigger an “unknown news site” error in the Google Webmaster Tools profile.
  • Google News sitemap – another mistake, or rather a missed opportunity, is not creating a Google News sitemap. Your articles will still get indexed via regular site crawling, but the news sitemap helps to expedite the process. It also enables you to segment your news content (i.e. articles and blog posts) from other forms of editorial content that are not eligible for indexation. With breaking news and trending topics speed counts; don’t put yourself at a competitive disadvantage. Along those lines also make sure that your Google News sitemap is updated immediately whenever new, eligible content is published.
    The other plus of a Google News sitemap is that it gives you the ability to provide additional information such as image-related tags, keywords and genre tags.

  • Unknown publication name – an error that I frequently come across is the “unknown publication name” error for the Google News sitemap, which is reported in the Sitemaps section of the site’s Google Webmaster Tools profile. This error is triggered when the publication name used in the <news:name> tag in the Google News sitemap is not an exact match to the publication name in Google News’ database. To determine the correct name to use, do a [site:domain] search in Google News and see what publication name is displayed for the indexed articles.
  • Improper URL structure – Google News requires that articles be published on unique, permanent URLs that contain at least three digits. The three-digit requirement is not enterprise-friendly but fortunately it is not required if you use a Google News sitemap. Just make sure that your Google News sitemap contains all eligible news articles and that it is frequently updated (which you should be doing anyway).
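
To make the sitemap advice above concrete, here is a minimal Python sketch of generating a Google News sitemap. The function name build_news_sitemap, the article dictionary keys and the example publication name are hypothetical, and the tag names reflect the news sitemap schema as I understand it, so check them against Google's current documentation. Whatever you pass as publication_name should be the exact name that Google News displays for your indexed articles.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"

# Serialize with a default namespace for sitemap tags and a "news:" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("news", NEWS_NS)


def build_news_sitemap(articles, publication_name, language="en"):
    """Return a news sitemap as an XML string.

    articles: iterable of dicts with hypothetical keys "url", "title"
              and "published" (a timezone-aware datetime).
    publication_name: must match the name shown in Google News exactly.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["url"]

        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        publication = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(publication, f"{{{NEWS_NS}}}name").text = publication_name
        ET.SubElement(publication, f"{{{NEWS_NS}}}language").text = language

        published = article["published"].isoformat()
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = published
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = article["title"]
    return ET.tostring(urlset, encoding="unicode")


# Example with made-up data.
example = build_news_sitemap(
    [
        {
            "url": "http://www.example.com/news/merger-talks",
            "title": "Companies A and B in Merger Talks",
            "published": datetime(2013, 1, 15, 9, 30, tzinfo=timezone.utc),
        }
    ],
    publication_name="The Example Times",
)
print(example)
```

Only eligible articles and blog posts should be passed in, and the file should be regenerated whenever new content is published.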

Article Format / Content Errors

Beyond the fundamental issues covered above, the majority of Google News errors are related to the format or content of specific articles. Many of these errors are reported in the Crawl Errors section of a site’s Google Webmaster Tools profile.

There is a good rundown of all the news-specific crawl errors in the Google News publisher help section so I’m not going to cover them all here. But I will point out the ones that I most frequently come across:

  • Non-literal headlines – this is an editorial mistake as opposed to a technical error, but it is one that comes up a lot so I want to emphasize it. Unlike regular Web search, Google News places more weight on the article headline than the HTML title tag. So the tactic of offsetting witty, print-style headlines with customized title tags works relatively well for Web search (although this is still sub-optimal) but it does not work for news search.
    The news_keywords meta tag was launched to help offset this issue and give publishers more leeway. But as I covered in “Google News news_keywords Meta Tag: More Cons than Pros?”, you still need descriptive, well-balanced headlines for Web search, social media and even for news search. So treat the news_keywords tag as a supplement to your Google News optimization efforts, not as a replacement for sound, fundamental editorial SEO.

  • Headline not H1 – make sure your on-page headline is in an H1 heading tag, and that it is the only H1 on the page. Doing so makes it easier for Google News to correctly extract the headline. On a related note, heading tag coding is changing with HTML5, but for SEO purposes there should still be only one H1 per template at this time (especially on articles).
  • Title not found – this error is less common but occasionally an article template is coded in such a way that the editorial headline is not easily identifiable or even accessible to crawlers. Be sure to place your headlines in simple HTML text right at the top of the article body, in an H1 tag.
  • Incorrect byline or date – every now and then Google News will pull in an incorrect author or date for a particular article. This is usually related to the template design or the way it is coded. Make sure both the author byline and the date are close to the headline and easily accessible to crawlers. Also avoid including additional names or dates in prominent locations that could lead to a misinterpretation.
  • Article too short or article disproportionally short – these two crop up with some frequency especially on blogs, which are more likely to publish some fairly short pieces. The minimum requirement for Google News is 80 words, but I’ve seen articles just over that figure still trigger the error. For any news content that you want to be indexed, include a minimum of 100 words and preferably 250+.
  • Article fragmented – this is another common error that typically occurs when the article body is broken up with things like lists, tables or sometimes embedded multimedia (or even embedded tweets). Another thing that can trigger the error is placing interstitial “speedbump” links too high up in the article body. These issues can usually be offset by including at least 80-100 continuous words prior to inserting such elements into the article. A blog post with a string of very short paragraphs (many just a single sentence) will also trigger this error on occasion.
    I should point out that many of the article too short, article disproportionally short and article fragmented errors (and sometimes the title not found errors) reported in Google Webmaster Tools end up being for non-article content like galleries that were encountered during a site crawl. Since such content is not eligible for Google News indexation anyway, those errors can be disregarded. Just make sure your Google News sitemap includes only articles and blog posts.

  • Article too long – this one is fairly rare but every once in a while an article gets flagged for being too long. Google News does not provide a maximum word count, but the cases I’ve seen were very long blog posts that covered several different topics at length. Beyond the indexation issue, in such cases it is better to create a separate article for each topic so that each one is more focused and better able to compete for related searches.
  • No sentences found – I’ve seen this error recently on articles that do in fact have multiple sentences on the page. It could be a glitch, but it seems to be triggered by blog posts in which all of the editorial content is contained in a single, large block of text instead of being broken up into separate paragraphs. That type of formatting should be avoided regardless since it is more difficult for users to read.
  • Page too large – this does not come up often, but on occasion something gets added to a template (usually in a sidebar or other module outside of the article body) that greatly increases the size of the page. The maximum permitted size is 256KB. You don’t want pages that large for a variety of reasons, so stay well clear of that figure.
  • Date too old – this is one of the most frequently reported news crawl errors in Google Webmaster Tools. Anything that was encountered in a site crawl that is older than 30 days will trigger this error. Since content that old is no longer eligible for indexation these errors can be disregarded. Just make sure your Google News sitemap contains only recent articles. The Google News guideline is to only include articles from the past two days.
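
The thresholds in this section lend themselves to an automated pre-publish check. The following Python sketch is one way to do it, not anything Google provides: the function names and the choice of inputs are hypothetical, 256KB is assumed to mean 256 × 1024 bytes, and passing these checks is no guarantee that Google News will not report an error.

```python
from datetime import datetime, timedelta
from html.parser import HTMLParser

MIN_WORDS = 100                      # floor recommended above; Google's stated minimum is 80
MIN_LEAD_WORDS = 80                  # continuous words before any list, table or embed
MAX_PAGE_BYTES = 256 * 1024          # assumption: 256KB means 256 * 1024 bytes
SITEMAP_MAX_AGE = timedelta(days=2)  # guideline for what belongs in the news sitemap


class H1Counter(HTMLParser):
    """Counts opening <h1> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.count += 1


def check_article(page_html: str, body_text: str, lead_text: str) -> list:
    """Return a list of warnings for one article.

    page_html: full HTML of the rendered page
    body_text: plain text of the article body
    lead_text: plain text of the body before the first list, table or embed
    """
    warnings = []

    counter = H1Counter()
    counter.feed(page_html)
    if counter.count != 1:
        warnings.append(f"expected exactly one H1, found {counter.count}")

    if len(page_html.encode("utf-8")) > MAX_PAGE_BYTES:
        warnings.append("page is larger than 256KB")

    if len(body_text.split()) < MIN_WORDS:
        warnings.append(f"article body has fewer than {MIN_WORDS} words")

    if len(lead_text.split()) < MIN_LEAD_WORDS:
        warnings.append(
            f"fewer than {MIN_LEAD_WORDS} continuous words before the first embed"
        )

    return warnings


def belongs_in_news_sitemap(published: datetime, now: datetime) -> bool:
    """True if the article was published within the past two days."""
    return timedelta(0) <= now - published <= SITEMAP_MAX_AGE
```

An empty list from check_article means none of these particular rules were tripped. belongs_in_news_sitemap can be used to filter the article list before the sitemap is regenerated.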

Reaching Out to Google News

Simply put, Google News is a strange beast and despite your best efforts there are going to be odd situations and errors that occur from time to time. Many will be triggered by something you’ve done (or haven’t done), but sometimes there will be no readily apparent cause.

Those of you who work for well-established news brands probably have some degree of contact with Google News, as publisher relations is something that is important to them. So you can utilize those relationships when needed. For those who don’t, the Google News help forum is a decent option.

But even if you have a contact point, you don’t want to be hitting them up very often; save that for when you really need it.

Most of the time your best course of action when typical troubleshooting efforts have failed is to fill out one of these forms:

In my experience these forms do get attention, particularly if you are an established news site. It may take some time to hear back, but you will typically get a response. And if it is in fact something on their end, they’ll address it as soon as possible.

Comments

  1. says

    Thanks for laying out the cases – this is definitely a part of Search that I hadn’t previously considered (not that I have any news sites as clients).

    Still, thanks for a well laid out ‘this is what you should be thinking about.’

  2. says

    Great article Adam. You really shared a lot of the insider knowledge on Google News indexing. Sites that are not well established could play by all the rules above and still not rank well. Is there any truth to the idea that Google is looking for a minimum of three articles a day to consider a site newsworthy? What can the smaller brands do to get themselves more established?

  3. says

    Thanks Irving, and nice to hear from you. I don’t believe there is a required minimum, but I’d also imagine that any news site that applies for inclusion will need to be producing a frequent and steady stream of eligible content to be accepted.

    So for smaller brands that aren’t yet producing much content, I’d advise investing time and resources into building up an audience and some critical mass first. Focusing on a variety of inbound marketing and audience development efforts will ultimately be a better investment than focusing on news search.

  4. says

    Good article. Thanks. Regarding the H1 tags, do you think it matters how far down the page they come? We list the headline with a title tag near the top, but the H1 tag with the same headline doesn’t show up until, in one story I looked at, line 1,123 when viewing the source.

  5. says

    Thanks John, glad you liked it. For the headline, I’d say the use of an H1 and making sure it’s in a prominent (and easily detectable) place above the article body is more important than exactly where that falls in the source code. But as a general rule higher up seems like a good thing.

    I just checked a random article from five different sites, each of which is frequently visible in Google News. On three of them the headline was line 300 or higher in the source code; on one it was around line 500, and on the last it was around line 1500.

  6. says

    I think the robustness of an article has some weight to it. Good formatting and text are important elements, but contributing quality images and videos to complement or support the data is always crucial.

  7. says

    Thank you Adam. A mistake that I have often made is using multiple instances of the H1 tag to catch my readers’ attention; I now see that this is a huge mistake.

  8. says

    Thanks for this, good article – don’t suppose you’re planning on doing one for the Google News application process? I’ve not been able to even get on Google News with my site, LyricStatus.com, and can’t seem to figure out why – not too many decent blog posts about the Google News application process.

  9. says

    Maciej and Chris – thanks for your input.

    Ed – in my experience sites have received a response (whether it was acceptance or refusal) within 30-60 days of filling out the application form. They were all fairly well-known brands so I’m not sure if that is a typical timeline or not.

    At least from what I’ve seen from some declined applications, outside of failing to meet some of the basic requirements (http://support.google.com/news/publisher/bin/answer.py?hl=en&answer=40787) it often comes down to not producing enough “news” content. For example, lots of fairly short blog posts that are more lifestyle or product oriented. I’ve also seen some magazine sites without a lot of hard news get in by creating a blog or site section dedicated to more newsy content.

    It looks like you’re producing a decent number of music news articles so it seems like you’d have a shot at being considered eligible for the Entertainment category. Maybe you’d need to experiment a bit with the length and format of your articles as well as the areas of coverage.

  10. says

    Hi Adam, just wanted to thank you for responding to my comment – apologies for not replying sooner but I’ve been hard at work getting ready for our next Google News application – I’ve taken on board what you’ve said, as well as some good points on this page – http://www.verticalleap.co.uk/blog/how-to-get-content-seen-google-news/ (sorry for external link, but it is good and relevant) – I’ll follow up in a couple of months and let you know how we get on; hopefully it’ll be good news!

  11. says

    Do you have any idea where ads can be placed on a Google News site? Can they be placed as a left-aligned top rectangle with the article wrapping around it?

  12. Samantha says

    Thank you for this. I’ve seen some publications change the headlines (and as a result their title tags) on blog posts as a story develops. Does it hurt them not to maintain a consistent title throughout the life of the story?

  13. says

    Good question Samantha. Google News is most likely to recrawl articles looking for changes within the first 12-24 hours after publishing. So headline changes in that timeframe would likely get re-indexed; anything after that may not.

    In terms of tactics, sometimes news sites put out a brief article right as the news is breaking and then update it within the first few hours as more information becomes known. But for stories with a decent lifecycle, it’s often a good idea to publish multiple follow-up articles covering different aspects of the story, rather than changing the original article. This allows the sites to better compete for news search visibility in multiple timeframes (e.g. morning and afternoon, the following day, etc.).

  14. says

    Michael – I’ve been reading about that but I don’t have any first-hand knowledge as to whether it is a glitch (on either side) or something those publishers have been intentionally doing as a tactic. Either way I can’t imagine it will last for long. Will be interesting to see how it shakes out.

  15. says

    Hey Adam.

    First up, thanks for putting up this article. Really helpful.

    Do you have any idea about the posting frequency and how important it is for Google News approval? Any official citation would really help.

    Thanks. Keep up the good work.

  16. says

    Hi Vivek – I don’t believe it is listed anywhere in the requirements so there may not be an exact figure, but I’d say on average multiple articles per day (written by multiple authors) is necessary, as would be consistent with a news organization / media outlet.

  17. says

    Thanks for the reply Adam. Consistency is the key.

    Have you come across any readers who have received an official e-mail for site approval? Just curious.

  18. says

    Hi Adam,

    Thanks for the great post – very useful! Have been using Google News for a few months now and have noticed smaller things seem to affect whether an error is generated or not too.

    For example, longer headlines that include one or more exclamation marks, full stops or question marks seem to generate an error. It doesn’t happen every time but it’s interesting to review the errors in GWT as there is definitely a pattern. It seems to generate either a title not found or even an article too short error.

    However, sometimes the articles you think follow all the rules will generate an error, and vice versa. It can be quite frustrating!

    Do you know of any in-depth studies on GNews and the errors, apart from Google’s own News-specific crawl errors?

  19. says

    Thanks Charlotte, I’m glad you like it, and that’s an interesting pattern you found. If you find out anything more I’d love to hear about it.

    I’m not aware of any in-depth studies – that Google help file and my own experiences which I’ve written up here are all I’ve got on that.
