Steve Taylor photo

The holding page and the 503 status code

NOTE: I’ll leave the information here for reference as it’ll probably still be useful to some. But for anyone using WordPress who wants a convenient way of putting up a holding page without confusing search bots and without blocking yourself from using the site while it’s “down”, I’ve just found the very neat Maintenance Mode plugin. Seems to work like a treat. 14/2/08

Ever wanted to have a system in place that allows you to easily “switch on” a holding page for the whole of your site for when you need to do some maintenance? Well, that’s relatively easy to do; but what about bots? Even if you’re only down for 10 minutes, what if your luck is such that Googlebot makes its random rounds at precisely that time? Depending on how your holding page works, it might register a load of “404 – Not Found” errors, or replace your indexed content with your holding page… Who knows? Not I.

Well, with yet another WordPress upgrade (2.1.3) just out, I thought I would try to get to the bottom of this holding page issue. WordPress upgrades might be made smoother in a future version; but for now, it’s a slightly brutal case of deleting your live files and replacing them. Even without discovering some hair-tearing plugin incompatibilities along the way, or accidentally overwriting a crucial hack you’ve coded into your installation, a proper upgrade can take 10-20 minutes.

The only really useful page I found on the issue of bots and holding pages was over at AskApache.com. You can head over there and work out your own adaptation; I thought I’d document mine here for reference.

Note that the basics of this require PHP running on an Apache web server, with mod_rewrite enabled. The rest assumes you’re using WordPress.

The holding page

I created a file, 503.php, sitting in my site’s root. The code looks something like this:

<?php
header("HTTP/1.1 503 Service Temporarily Unavailable");
header("Status: 503 Service Temporarily Unavailable");
header("Retry-After: 3600");
?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Site upgrade in progress</title>
<meta name="robots" content="none" />
</head>
<body>
<h1>Site upgrade in progress</h1>
<p>This site is being upgraded, and can't currently be accessed.</p>
<p>It should be back up and running very soon. Please check back in a bit!</p>
<hr />
</body>
</html>

Those first two lines of PHP set the response’s HTTP status to the “503 Service Unavailable” status code. HTTP headers are invisible when you’re browsing the web, but are read by the browser itself and by bots crawling around (and can be checked out via tools like Web-Sniffer and the Firefox Web Developer Extension). This status code means:

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay.

Just what we want.

The Retry-After bit is used to indicate how long you expect the site to be down for (in seconds). If you are hit by a search engine bot while down, it’ll probably understand this, but of course it doesn’t mean it’ll return as soon as that time’s elapsed. It’s probably got better things to do. At least it knows not to return too soon.

The rest is a plain page for humans, letting them know in English what’s going on. It needn’t be as minimalist as this; you can add branding if you want, and maybe a link to another site to be polite and give people a “way out”.

Allowing yourself access

The next step is to alter your .htaccess file to make sure all requests go to the 503 page.

But hang on – what about your good self? How do you test out all the maintenance you’re doing if you’re just being sent to 503.php with everyone else?

Basically, you filter requests by IP address. Assuming you have a static IP address, you can test that, and only let your own IP through.

Here’s the code for your .htaccess file:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{REMOTE_ADDR} !^23\.23\.23\.23$
RewriteCond %{REQUEST_URI} !^/503\.php [NC]
RewriteRule .* /503.php [L]

Just replace those numbers on the fourth line with your own IP. (Keep the backslashes before each dot!)

Those two RewriteCond (“rewrite condition”) lines basically mean: only apply the following rewrite if the IP of the request (REMOTE_ADDR) doesn’t match the one given, and the file requested (REQUEST_URI) isn’t 503.php (preventing an infinite loop!). As long as it’s not you and they’re not already going to 503.php, the RewriteRule kicks in and serves that page up. (Though note that the URL in the browser address bar doesn’t change for the user; this is rewriting, not redirecting.)

Note that last [L] in the code. That means, if this rewrite rule is processed, make it the last one. This is handy if you have other rewrites (e.g. your WordPress rewrites) that still need to work when you access the site, but which shouldn’t interfere with this rule for everyone else. Of course, for this reason, the above code should go right at the top of your file.
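If you work from more than one fixed address (home and office, say), one way to adapt the rules is to stack the IP conditions. This is only a sketch: the two addresses below are placeholders taken from the ranges reserved for documentation, not anything from my own setup. Consecutive RewriteCond lines are ANDed together, so the rule only fires for visitors who match neither address.

```apache
RewriteEngine On
RewriteBase /
# Placeholder addresses (assumed for illustration); swap in your own.
# The trailing $ stops 203.0.113.10 from also matching 203.0.113.100 and the like.
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
RewriteCond %{REMOTE_ADDR} !^198\.51\.100\.20$
# Leave requests for the holding page itself alone (avoids a loop).
RewriteCond %{REQUEST_URI} !^/503\.php$ [NC]
RewriteRule .* /503.php [L]
```

Anyone arriving from either listed address skips the rule and carries on to whatever rewrites follow; everyone else gets 503.php.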

The switch

You may have your own technique. I keep 503.php and 503.htaccess in the web root. When I need to switch my holding page in, I just rename my usual .htaccess to something like LIVE.htaccess, and (quickly!) rename 503.htaccess to .htaccess.

All should now be on hold for the rest of the world. Bots should be politely leaving your site be for the while. And you should be able to get in and muck around to your heart’s content.

Looking through others’ eyes

A pesky stumbling block I came up against developing this technique was my obvious inability to see what other people were seeing via all those other IP addresses. My friend Jim, ever abreast of tech developments, directed me to the Torpark browser. Built on the Firefox shell, this browser is designed to anonymize your web surfing by routing your requests through a labyrinthine series of IP relays (or whatever they might call them).

It’s sluggish, but it works. Whatever your IP ends up being seen as by servers while browsing with this nifty tool, it won’t be your actual IP.

Upgrading WordPress

So, with these tricks ready to go, here’s my revised WordPress upgrade guide. (Do also check the official guide if you’re new to this though!)

One thing to bear in mind with more recent WordPress versions is that if the .htaccess file doesn’t contain WordPress’ rewrite rules at the end, it may try to “fix” things – and break them. Make sure your 503 .htaccess file has your usual WordPress rewrite rules at the end.
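For illustration, a 503 .htaccess laid out this way might look something like the following. Treat it as a sketch under a couple of assumptions: the IP address is a placeholder, and the WordPress block shown is the stock one generated for a site installed in the web root. Versions differ slightly, so copy the block from your own live .htaccess rather than from here.

```apache
# Holding-page rules first, so they win for everyone but you.
RewriteEngine On
RewriteBase /
# Placeholder IP (assumed); replace with your own static address.
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
RewriteCond %{REQUEST_URI} !^/503\.php$ [NC]
RewriteRule .* /503.php [L]

# Your usual WordPress rules, unchanged, at the end of the file.
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

Because 503.php exists as a real file, the WordPress conditions leave it alone, so the two sets of rules shouldn't trip over each other.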

  1. Back up all your files and your database.
  2. Make sure you’ve a record of which files you’ve customized.
  3. Do the 503 holding page switch.
  4. De-activate all your plugins.
  5. Delete all files apart from wp-config.php, the wp-content directory, and of course .htaccess and 503.php.
  6. Upload all files in the new version of WordPress (excluding the wp-content directory).
  7. Run the upgrade script (e.g. http://yourdomain.com/wp-admin/upgrade.php).
  8. Re-activate the plugins. Pray.
  9. Check the site out, make sure it’s survived OK.
  10. Revert to the original .htaccess file.
  11. Done!

Update

I’ve just tried implementing this technique to display a holding page for a site that’s yet to be launched. Pretty much the same situation, but an interesting issue came up when I tried to include an image on the holding page. The image wouldn’t appear. I spent a frustrating 5 minutes checking paths and files, but eventually it dawned on me: the browser’s request for the image file was returning a 503 error, not the image!

A little extra .htaccess magic remedies this. Insert the following line before your RewriteRule line:

RewriteCond %{REQUEST_FILENAME} !\.(gif|jpe?g|png)$

Of course you may need to add css or js to that list of filetypes, depending on your holding page’s needs.
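Put together, the rules with css and js added to the exception might look roughly like this. It's a sketch rather than a tested recipe: the IP is a placeholder, and the [NC] flag on the file-type line is my own addition so that upper-case extensions such as .JPG are let through too.

```apache
RewriteEngine On
RewriteBase /
# Placeholder IP (assumed); replace with your own static address.
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
RewriteCond %{REQUEST_URI} !^/503\.php$ [NC]
# Let images, stylesheets and scripts be served normally so the
# holding page can use them.
RewriteCond %{REQUEST_FILENAME} !\.(gif|jpe?g|png|css|js)$ [NC]
RewriteRule .* /503.php [L]
```

Bear in mind that this exempts those file types across the whole site, not just the ones the holding page uses, so anyone with a direct URL to an image can still fetch it while you're "down".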