Monday, October 15, 2012

HTTP status codes

When a request is made to your server for a page on your site (for instance, when a user accesses your page in a browser or when Googlebot crawls the page), your server returns an HTTP status code in response to the request.
This status code provides information about the status of the request, and it gives Googlebot information about your site and the requested page.
Some common status codes are:
  • 200 - the server successfully returned the page
  • 404 - the requested page doesn't exist
  • 503 - the server is temporarily unavailable
A complete list of HTTP status codes is below. You can also visit the W3C page on HTTP status codes for more information.
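The first digit of a status code identifies its class, which is how the list below is grouped. As a minimal sketch, the hypothetical helper `describe` maps a code to its class heading and looks up a reason phrase with Python's standard `http.HTTPStatus`. The phrases come from the Python standard library, so they may differ slightly from the names used in this list (for example, "OK" rather than "Successful" for 200).

```python
from http import HTTPStatus

# Class names follow the section headings used in this list.
STATUS_CLASSES = {
    1: "Provisional response",
    2: "Successful",
    3: "Redirected",
    4: "Request error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for a three-digit HTTP status code."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not registered in Python's HTTPStatus enum.
        phrase = "Unregistered code"
    return f"{code} {phrase} ({status_class})"


if __name__ == "__main__":
    for code in (200, 404, 503):
        print(describe(code))
```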
1xx (Provisional response)
Status codes that indicate a provisional response and require the requestor to take action to continue.
Code Description
100 (Continue) The requestor should continue with the request. The server returns this code to indicate that it has received the first part of a request and is waiting for the rest.
101 (Switching protocols) The requestor has asked the server to switch protocols and the server is acknowledging that it will do so.
2xx (Successful)
Status codes that indicate that the server successfully processed the request.
Code Description
200 (Successful) The server successfully processed the request. Generally, this means that the server provided the requested page. If you see this status for your robots.txt file, it means that Googlebot retrieved it successfully.
201 (Created) The request was successful and the server created a new resource.
202 (Accepted) The server has accepted the request, but hasn't yet processed it.
203 (Non-authoritative information) The server successfully processed the request, but is returning information that may be from another source.
204 (No content) The server successfully processed the request, but isn't returning any content.
205 (Reset content) The server successfully processed the request, but isn't returning any content. Unlike a 204 response, this response requires that the requestor reset the document view (for instance, clear a form for new input).
206 (Partial content) The server successfully processed a partial GET request.
3xx (Redirected)
Further action is needed to fulfill the request. Often, these status codes are used for redirection. Google recommends that you use fewer than five redirects for each request. You can use Webmaster Tools to see if Googlebot is having trouble crawling your redirected pages. The Crawl Errors page under Health lists URLs that Googlebot was unable to crawl due to redirect errors.
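To check a redirect chain against the "fewer than five redirects" recommendation, you could count the hops yourself. The sketch below works on an in-memory table of responses rather than real network requests; the function name `follow_redirects`, the `responses` mapping, and the set of codes treated as redirects are assumptions made for illustration.

```python
REDIRECT_CODES = {301, 302, 303, 307}
MAX_REDIRECTS = 5  # the text recommends fewer than five redirects per request


def follow_redirects(start_url, responses, max_redirects=MAX_REDIRECTS):
    """Follow a redirect chain described by `responses`.

    `responses` maps each URL to a (status, location) tuple.
    Returns (final_url, hops). Raises ValueError if the chain loops
    or reaches `max_redirects` hops.
    """
    url, hops, seen = start_url, 0, {start_url}
    while True:
        status, location = responses[url]
        if status not in REDIRECT_CODES:
            return url, hops
        hops += 1
        if hops >= max_redirects:
            raise ValueError(f"too many redirects: {hops} hops from {start_url}")
        if location in seen:
            raise ValueError(f"redirect loop at {location}")
        seen.add(location)
        url = location


if __name__ == "__main__":
    table = {
        "/old": (301, "/newer"),
        "/newer": (302, "/newest"),
        "/newest": (200, None),
    }
    print(follow_redirects("/old", table))
```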
Code Description
300 (Multiple choices) The server has several actions available based on the request. The server may choose an action based on the requestor (user agent) or the server may present a list so the requestor can choose an action.
301 (Moved permanently) The requested page has been permanently moved to a new location. When the server returns this response (as a response to a GET or HEAD request), it automatically forwards the requestor to the new location. You should use this code to let Googlebot know that a page or site has permanently moved to a new location.
302 (Moved temporarily) The server is currently responding to the request with a page from a different location, but the requestor should continue to use the original location for future requests. This code is similar to a 301 in that, for a GET or HEAD request, it automatically forwards the requestor to a different location. However, you shouldn't use it to tell Googlebot that a page or site has moved, because Googlebot will continue to crawl and index the original location.
303 (See other location) The server returns this code when the requestor should make a separate GET request to a different location to retrieve the response. For all requests other than a HEAD request, the server automatically forwards to the other location.
304 (Not modified) The requested page hasn't been modified since the last request. When the server returns this response, it doesn't return the contents of the page.
You should configure your server to return this response when a request includes the If-Modified-Since HTTP header and the page hasn't changed since the date given in that header. This saves you bandwidth and overhead because your server can tell Googlebot that a page hasn't changed since the last time it was crawled.
305 (Use proxy) The requestor can only access the requested page using a proxy. When the server returns this response, it also indicates the proxy that the requestor should use.
307 (Temporary redirect) The server is currently responding to the request with a page from a different location, but the requestor should continue to use the original location for future requests. This code is similar to a 301 in that, for a GET or HEAD request, it automatically forwards the requestor to a different location. However, you shouldn't use it to tell Googlebot that a page or site has moved, because Googlebot will continue to crawl and index the original location.
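The 304 (Not modified) entry above describes a comparison between the date in the request's If-Modified-Since header and the time the page last changed. A minimal sketch of that decision follows; the function name `conditional_status` and its parameters are hypothetical, and a real server would typically handle this for you once it is configured to do so.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the page is unchanged since the header date, else 200.

    if_modified_since: raw header value from the request, or None if absent.
    last_modified: timezone-aware datetime of the page's last change.
    """
    if not if_modified_since:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable date: ignore the header and send the full page.
        return 200
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


if __name__ == "__main__":
    changed = datetime(2012, 10, 15, 12, 0, 0, tzinfo=timezone.utc)
    print(conditional_status("Mon, 15 Oct 2012 12:00:00 GMT", changed))
    print(conditional_status("Sun, 14 Oct 2012 12:00:00 GMT", changed))
```

With a 304, the server sends no page body, which is where the bandwidth saving comes from.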
4xx (Request error)
These status codes indicate that there was likely an error in the request which prevented the server from being able to process it.
Code Description
400 (Bad request) The server didn't understand the syntax of the request.
401 (Not authorized) The request requires authentication. The server might return this response for a page behind a login.
403 (Forbidden) The server is refusing the request. If you see that Googlebot received this status code when trying to crawl valid pages of your site (you can see this on the Crawl Errors page under Health in Google Webmaster Tools), it's possible that your server or host is blocking Googlebot's access.
404 (Not found) The server can't find the requested page. For instance, the server often returns this code if the request is for a page that doesn't exist on the server.
If you don't have a robots.txt file on your site and see this status on the Blocked URLs page in Google Webmaster Tools, this is the correct status. However, if you do have a robots.txt file and you see this status, then your robots.txt file may be named incorrectly or be in the wrong location. (It should be at the top level of the domain and named robots.txt.)
If you see this status for URLs that Googlebot tried to crawl, then Googlebot likely followed an invalid link from another page (either an old link or a mistyped one).
405 (Method not allowed) The method specified in the request is not allowed.
406 (Not acceptable) The requested page can't respond with the content characteristics requested.
407 (Proxy authentication required) This status code is similar to 401 (Not authorized), but specifies that the requestor has to authenticate using a proxy. When the server returns this response, it also indicates the proxy that the requestor should use.
408 (Request timeout) The server timed out waiting for the request.
409 (Conflict) The server encountered a conflict fulfilling the request. The server must include information about the conflict in the response. The server might return this code in response to a PUT request that conflicts with an earlier request, along with a list of differences between the requests.
410 (Gone) The server returns this response when the requested resource has been permanently removed. It is similar to a 404 (Not found) code, but is sometimes used in the place of a 404 for resources that used to exist but no longer do. If the resource has permanently moved, you should use a 301 to specify the resource's new location.
411 (Length required) The server won't accept the request without a valid Content-Length header field.
412 (Precondition failed) The server doesn't meet one of the preconditions that the requestor put on the request.
413 (Request entity too large) The server can't process the request because it is too large for the server to handle.
414 (Requested URI is too long) The requested URI (typically, a URL) is too long for the server to process.
415 (Unsupported media type) The request is in a format not supported by the requested page.
416 (Requested range not satisfiable) The server returns this status code if the request is for a range not available for the page.
417 (Expectation failed) The server can't meet the requirements of the Expect request-header field.
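The entries for 404 (Not found) and 410 (Gone), together with 301 (Moved permanently), suggest a simple rule for pages that are no longer at their original URL. The sketch below illustrates that rule only; the function `status_for` and the `pages`, `moved`, and `removed` collections are hypothetical stand-ins for however your site tracks its content.

```python
def status_for(path, pages, moved, removed):
    """Pick a status code for `path`.

    pages: set of paths that currently exist
    moved: dict mapping an old path to its permanent new path
    removed: set of paths that were permanently removed
    Returns (status, location); location is None unless redirecting.
    """
    if path in pages:
        return 200, None
    if path in moved:
        # Permanently moved: point the requestor at the new location.
        return 301, moved[path]
    if path in removed:
        # Used to exist, permanently removed, no replacement.
        return 410, None
    # Nothing known about this path.
    return 404, None


if __name__ == "__main__":
    pages = {"/", "/products"}
    moved = {"/catalog": "/products"}
    removed = {"/2011-sale"}
    for path in ("/", "/catalog", "/2011-sale", "/typo"):
        print(path, status_for(path, pages, moved, removed))
```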
5xx (Server error)
These status codes indicate that the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request.
Code Description
500 (Internal server error) The server encountered an error and can't fulfill the request.
501 (Not implemented) The server doesn't have the functionality to fulfill the request. For instance, the server might return this code when it doesn't recognize the request method.
502 (Bad gateway) The server was acting as a gateway or proxy and received an invalid response from the upstream server.
503 (Service unavailable) The server is currently unavailable (because it is overloaded or down for maintenance). Generally, this is a temporary state.
504 (Gateway timeout) The server was acting as a gateway or proxy and didn't receive a timely response from the upstream server.
505 (HTTP version not supported) The server doesn't support the HTTP protocol version used in the request.
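As an illustration of the 503 (Service unavailable) entry, here is a minimal sketch of a WSGI application that answers every request with a 503 during maintenance. The name `maintenance_app`, the one-hour value, and the use of the standard `Retry-After` response header are assumptions that go beyond the text above.

```python
RETRY_AFTER_SECONDS = 3600  # assumed maintenance window; adjust as needed


def maintenance_app(environ, start_response):
    """WSGI app that returns 503 for every request while the site is down."""
    body = b"Down for maintenance. Please try again later.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Hint to clients about when to retry, in seconds.
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    start_response("503 Service Unavailable", headers)
    return [body]


if __name__ == "__main__":
    # Serve locally with the standard library's reference WSGI server.
    from wsgiref.simple_server import make_server

    with make_server("localhost", 8000, maintenance_app) as server:
        server.serve_forever()
```

Returning a 503 rather than a 200 page that says "down for maintenance" signals that the condition is temporary, which matches the description above.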
 https://support.google.com/webmasters/bin/answer.py?hl=en&answer=40132&topic=1724951&ctx=topic