404 Errors and Search Engine Optimization
What is a 404 error (page not found)?
Technically, a 404 error (also called "Page not found" or "HTTP 404 code") is an error message that the web server hosting a site returns to a browser or search engine trying to access a page that does not exist or no longer exists. The browser or crawler requests a URL, the server cannot find a matching resource, and it answers with the 404 status code instead of the page.
The visitor often receives a generic message such as: 404 Error, 404 file not found, page not found or the page does not exist.
Note that the family of error codes also includes the HTTP 410 code, which is similar to the 404 code. Webmasters can use this rarely seen code to state explicitly that a page is gone and will not be restored ("Gone").
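To make the difference between the two codes concrete, here is a minimal sketch in Python. The page sets and the function name status_for are hypothetical and only serve the illustration; a real server would look up its own content.

```python
from http import HTTPStatus

# Hypothetical site content, for illustration only.
EXISTING_PAGES = {"/", "/products"}
PERMANENTLY_REMOVED = {"/old-promo"}


def status_for(path: str) -> HTTPStatus:
    """Pick the HTTP status a server could return for a requested path."""
    if path in EXISTING_PAGES:
        return HTTPStatus.OK
    if path in PERMANENTLY_REMOVED:
        # 410 states explicitly that the page is gone and will not come back.
        return HTTPStatus.GONE
    # 404 only says that nothing was found at this address.
    return HTTPStatus.NOT_FOUND


for path in ("/products", "/old-promo", "/typo"):
    status = status_for(path)
    print(path, status.value, status.phrase)
```

In this sketch, only URLs that the webmaster has deliberately listed as removed receive a 410; every other unknown URL falls back to 404.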
To find out more about HTTP codes, you can read our article on the HTTP protocol.
The causes of a 404 error
A web server returns an HTTP 404 code when it is asked for a resource that does not exist. There are several possible causes:
- a URL that used to exist was permanently deleted and no redirect was set up,
- the webmaster made a mistake when typing an internal or external link, so the target URL does not exist,
- the automatic URL generation of the content management system (a CMS such as WordPress or Joomla) is misconfigured and produces invalid URLs.
The impact of 404 errors on search engine optimization
Overall, 404 errors are not detrimental to SEO as long as their proportion is reasonable.
However, three cases are troublesome:
- One of the important pages of your site returns a 404 code by mistake (for example after a technical error). This must be fixed urgently: Google concludes that the page has disappeared, and visitors who try to reach it are frustrated.
- An interesting external website links to your site with a wrong URL. In this case, we advise you to contact its webmaster and report the error. This is a good opportunity to recover a valuable backlink, which benefits search engine optimization.
- Your site has too many 404 errors. The user experience and the analysis carried out by search engines can suffer, which could ultimately hurt the SEO of your website.
Generally speaking, Google gives priority to quality websites that provide useful content and are accessible to both Internet users and search engines. Regularly correcting 404 errors is good practice for meeting this quality requirement.
It is therefore necessary to detect and correct 404 errors in order to:
- make it easier for search engines to explore (crawl) your website, so that they return more often,
- improve the user experience,
- give a better image of your website,
- attract visitors,
- make sure you do not lose an interesting backlink because of a simple typo in the link placed on the external site.
To better understand the impact of 404 errors on SEO and how Google handles these errors, see the video by Matt Cutts on the subject.
When is it advisable to set up a 404 error (page not found)?
As we saw earlier, when used according to the rules, 404 errors will not have a negative impact on SEO. We should nevertheless be careful not to generate too many 404s.
To decide whether a 404 error is appropriate, ask the following questions:
- How much traffic does the page to be deleted generate?
- Do quality backlinks point to the page to be deleted?
- Does another page of the site offer content similar to the content being deleted?
Depending on the answers, you can decide whether it is advisable to set up a 404 error:
- If the page to be removed receives many visits and/or backlinks, find a page with similar content and use a 301 redirect instead of a 404 error.
- If the traffic and the number of backlinks of the page are close to zero and no other page offers similar content, then it’s okay to return a 404 error.
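The two rules above can be sketched as a small Python function. The name action_for_deleted_page and the thresholds min_visits and min_backlinks are hypothetical values chosen for illustration; the situations the rules do not cover are deliberately left to a manual review.

```python
from typing import Optional


def action_for_deleted_page(visits: int, backlinks: int,
                            similar_url: Optional[str],
                            min_visits: int = 100,
                            min_backlinks: int = 1) -> str:
    """Suggest what to return for a page that is about to be deleted.

    min_visits and min_backlinks are arbitrary example thresholds.
    """
    valuable = visits >= min_visits or backlinks >= min_backlinks
    if valuable and similar_url:
        # Traffic or backlinks worth keeping, and a similar page exists.
        return f"301 redirect to {similar_url}"
    if not valuable and not similar_url:
        # Almost no traffic or backlinks, and nothing similar to offer.
        return "404"
    # Cases not covered by the two rules: decide by hand.
    return "review manually"


print(action_for_deleted_page(2500, 12, "/new-catalog"))
print(action_for_deleted_page(3, 0, None))
```

With these example inputs, the first call suggests a 301 redirect to /new-catalog and the second suggests a 404.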
Each time you delete a web page, you must remove or update all internal links pointing to that URL.
How to detect and correct 404 errors?
To correct 404 errors, you can:
- Use a website analyzer (such as the Yakaferci SEO tool) to detect all internal and external links on your site that point to a URL returning a 404 code. Then edit the pages containing the wrong links to correct or remove them.
- Use Google Webmaster Tools:
  - Go to Crawl > Crawl Errors. There you will see the list of all URLs for which the Google crawler (Googlebot) encountered a 404 error.
  - Click on the error URL, then on "Linked from". There you will see the list of pages that link to this URL.
  - Correct the 404 errors either by contacting the webmaster or by setting up a 301 redirect from the URL reported in Google Webmaster Tools to the correct URL.
- Use the log files of your web server. These files are usually available even if you do not administer the server yourself. A simple grep for the "404" code is enough to find every request for which the web server returned a 404 code. The Referer field indicates the page containing the wrong link.
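As an alternative to grep, the log search can be sketched in Python. This example assumes that the server writes its access log in the common "combined" format, where the status code follows the quoted request and the Referer is the next quoted field; the sample lines, the regular expression and the function name find_404s are illustrative and must be adapted to your own log format.

```python
import re

# Matches: "METHOD /path PROTOCOL" STATUS SIZE "REFERER"
LOG_PATTERN = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)

# Made-up sample lines in combined log format.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /old-page HTTP/1.1" '
    '404 512 "https://example.org/links.html" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "GET / HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0"',
]


def find_404s(lines):
    """Return (requested path, referer) for every request answered with 404."""
    results = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and match.group("status") == "404":
            results.append((match.group("path"), match.group("referer")))
    return results


for path, referer in find_404s(SAMPLE_LOG):
    print(f"404 on {path}, linked from {referer}")
```

To run it on a real file, pass the open file object instead of SAMPLE_LOG. A Referer of "-" means that the browser did not send one, so the source of the wrong link cannot be identified from that line.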
Poorly written URLs placed on other websites deserve particular attention, because they can cost you part of your traffic. To recover the traffic that arrives on a 404 error page, ask the webmaster of the external site to fix the link, or set up a 301 redirect from the wrong URL to the correct one.
404 Error detection tool
Yakaferci provides a tool that detects the 404 errors found on a page. Detecting all the 404 errors of a whole website is only done as part of a comprehensive audit.
A customized 404 page
Instead of showing a neutral technical message to visitors who land on a 404 page, webmasters can set up a custom 404 error page: a more attractive page that keeps the inconvenience for the visitor to a minimum.
Google clearly recommends setting up custom error pages, but only to improve the comfort of Internet users. Custom error pages have NO special impact on the search engine optimization of a website; they are simply good practice for a webmaster.
On Apache-based web servers, a custom error page for the 404 HTTP code can be set up by placing a directive of this type in the .htaccess file:
ErrorDocument 404 /404.html
Here are our recommendations about setting up a custom 404 error page:
- Make sure that the custom 404 error page technically returns a 404 status code, for example by checking it with the Yakaferci SEO tool. It is important not to return a 200 code; otherwise every error URL on the website would be served as a valid page with the same content, that is, duplicate content.
- Display a clear message indicating that the requested page was not found.
- Integrate the 404 page into your site: it should reuse the colors, graphics and navigation of your site.
- Encourage users to visit other pages of your site by adding links to pages that may be interesting.
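The first recommendation can also be checked with a few lines of code. The following Python sketch starts a small local demonstration server that answers unknown URLs with a custom page and a 404 status, then requests a non-existent URL and reads the status code. DemoHandler, CUSTOM_404_BODY and status_of are hypothetical names used for the illustration; against your own site, you would only need a function like status_of (which, as written, handles plain HTTP only).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

CUSTOM_404_BODY = (b"<h1>Page not found</h1>"
                   b"<p><a href='/'>Back to the home page</a></p>")


class DemoHandler(BaseHTTPRequestHandler):
    """Local demo server: a custom page served WITH a 404 status."""

    def do_GET(self):
        if self.path == "/":
            status, body = 200, b"<h1>Home</h1>"
        else:
            status, body = 404, CUSTOM_404_BODY
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def status_of(url: str) -> int:
    """Return the HTTP status code answered for a plain-HTTP URL."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=5)
    try:
        conn.request("GET", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

home_status = status_of(base + "/")
missing_status = status_of(base + "/this-page-does-not-exist")
server.shutdown()
server.server_close()

print("home page:", home_status)
print("missing page:", missing_status)
if missing_status != 404:
    print("Warning: the error page does not return a 404 code")
```

If the missing page answered with 200 instead of 404, the custom error page would need to be reconfigured so that it sends the correct status code.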
What to do when Google Webmaster Tools reports too many 404 errors?
In your Google Webmaster Tools account (in the Crawl > Crawl Errors section), you will see all the HTTP errors detected by Googlebot while it explored your site.
Do not give too much importance to every reported 404 error. The main piece of information to look at in the 404 error table in Google Webmaster Tools is the date on which each error was detected.
The list of 404 errors reported by Google is not always up to date: it often includes old 404 errors that may have been corrected long ago.
To "force" Google to clean this list, simply select the entire list of 404 errors reported in Google Webmaster Tools and mark them as "Corrected".
After several days, look at this table again and see which new URLs have been reported: this updated list contains the errors to correct.
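If you keep a copy of the error table, the same filtering by detection date can be sketched in Python. The URLs, the dates and the function name detected_after are made up for the illustration; they do not come from a real Google Webmaster Tools export.

```python
from datetime import date

# Hypothetical copy of the 404 error table: (URL, date detected).
reported_errors = [
    ("/old-catalog", date(2014, 1, 12)),
    ("/promo-2013", date(2014, 2, 3)),
    ("/contcat", date(2014, 6, 20)),
]


def detected_after(errors, cutoff):
    """Keep only the URLs whose error was detected after the cutoff date."""
    return [url for url, detected in errors if detected > cutoff]


# Hypothetical day on which the whole list was marked as corrected.
marked_corrected_on = date(2014, 6, 1)
print(detected_after(reported_errors, marked_corrected_on))
```

With this sample data, only /contcat remains, because it is the only error detected after the list was marked as corrected.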