Google has rolled out two new features in Webmaster Tools, soft 404 reporting and a News-specific crawl errors section, and both have raised a number of concerns.
Google announced they added “soft” or “crypto” 404s to their “crawl errors” report in Google Webmaster Tools.
Soft 404s, or crypto 404s, are “page not found” pages that return a server response code of 200. Technically, a 404 page should return the server status code 404, which means “page not found.” Webmasters sometimes create a custom not-found page but forget to return the 404 status code, which is what tells search engines that the page does not exist.
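To make the difference concrete, here is a minimal sketch of a custom not-found page that still sends the right status code. It is written as a small Python WSGI app purely for illustration; the names `KNOWN_PAGES`, `NOT_FOUND_BODY` and `app` are hypothetical, and your own site will almost certainly be built on a different stack.

```python
# Hypothetical site content: the only URL that really exists.
KNOWN_PAGES = {"/": b"<h1>Home</h1>"}

# The friendly custom page shown for anything else.
NOT_FOUND_BODY = b"<h1>Sorry, that page does not exist.</h1>"

HTML_HEADERS = [("Content-Type", "text/html; charset=utf-8")]


def app(environ, start_response):
    """Serve known pages with 200 and everything else with a real 404."""
    path = environ.get("PATH_INFO", "/")
    body = KNOWN_PAGES.get(path)
    if body is not None:
        start_response("200 OK", HTML_HEADERS)
        return [body]
    # A soft 404 would send "200 OK" here while showing the same
    # not-found page. Sending "404 Not Found" is the whole fix.
    start_response("404 Not Found", HTML_HEADERS)
    return [NOT_FOUND_BODY]
```

The visitor sees the same custom page either way; only the status line changes, and that status line is what the crawler reads. To try it locally, `app` can be passed to `wsgiref.simple_server.make_server` from the Python standard library.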
Google’s new report helps you locate these errors, and it shows that Google clearly understands that some webmasters return a 200 status for pages that are really 404s.
Google said, “Soft 404s can limit a site’s crawl coverage by search engines because these duplicate URLs may be crawled instead of pages with unique content.” Google added tips for 404 pages:
1. Check whether you have soft 404s listed in Webmaster Tools
2. For the soft 404s, determine whether the URL:
a. Contains the correct content and properly returns a 200 response (not actually a soft 404)
b. Should 301 redirect to a more accurate URL
c. Doesn’t exist and should return a 404 or 410 response
3. Confirm that you’ve configured the proper HTTP Response by using Fetch as Googlebot in Webmaster Tools
4. If you now return 404s, you may want to customize your 404 page to aid your users. Our custom 404 widget can help.
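Steps 2 and 3 above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not a replacement for Fetch as Googlebot: `fetch_status` and `recommend_action` are hypothetical helper names, and the decision about whether a URL has the correct content is still something you judge by hand.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code of the final response for `url`.

    urlopen follows redirects, so a 301 is reported as the status of
    the page it points to. Connection failures are not handled here.
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; keep the code.
        return error.code


def recommend_action(has_correct_content, better_url=None):
    """Map the three cases from step 2 to a suggested fix."""
    if has_correct_content:
        return "keep 200 (not actually a soft 404)"
    if better_url:
        return "301 redirect to " + better_url
    return "return 404 or 410"
```

For example, `recommend_action(False, "/new-page")` suggests a 301 redirect to `/new-page`, while `recommend_action(False)` suggests returning a 404 or 410. Once the fix is in place, calling `fetch_status` on the URL is a quick way to confirm the server now sends the code you expect.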
Another interesting feature under the Crawl errors section is the list of issues encountered when crawling for Google News. It will only appear in your Webmaster Tools account if your content is included in Google News.
The errors observed by the crawlers are grouped into two main areas: In Sitemap and News-specific. I will be writing about this feature in detail in my next post.