Don’t let the image resizer crash your server

This is the story of many page requests in a short amount of time: over 500 requests from the same IP in under 10 seconds. They all hit the TimThumb script (a thumbnail generator, very popular among WordPress themes and plugins), asking it for resized, uncached images. Because they arrived all at once, they saturated the server’s 16GB of RAM and stopped other processes from running, thus crashing the server.

The client was unhappy with their site being offline for the past day and a half (they were in fact losing business while being offline) and in reply the host expressed yet more serious concerns:

Naturally as you can imagine, we have to consider other customers on this platform, several of whom have complained about these crashes. Like yourself, downtime costs us money. We either:
a) need to be absolutely sure that you yourselves, armed with the information above, know exactly what the issue is and can rectify it.
b) cut our losses, offer you a refund pro-rata and ask you to move your hosting account away from ourselves.

The memory usage of the script does imply however that you will have the same issue anywhere else. No shared provider will accept a single site making use of 16GB of memory.

So we either needed to guarantee that our website wouldn’t be a trouble maker in the future, or pack our stuff and move to our mother’s.

There could have been multiple reasons for the hundreds of requests. Someone may have been scraping the site (an online store) for product images. Someone may have been pinging the PHP image resizer and asking it for fresh images just for fun. One of our own Ajax scripts may have been doing that accidentally under certain circumstances. Hell, one of our Ajax scripts may have become self-aware and was trying to take over the hosting company for all I knew. I needed a solution to prevent any of these.

“500 requests in 10 seconds,” I thought… “No single IP would need to make more than a dozen requests in any given 5 seconds.”

And there’s the answer. The scripts shouldn’t honor more than 10 requests in under 5 seconds. You may be tempted to simplify that to 2 requests per second, but don’t. Sometimes you go from one page to another in under one second, or you open 5 new tabs in a couple of seconds and there’s nothing illegitimate about that. In fact, let’s change the numbers to say that “no more than 40 requests in 20 seconds” are acceptable from the same IP.
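To see why the longer window matters, here is a small sketch that counts how many requests each rule would turn away for the same burst. The count_rejected() helper and the sample timestamps are made up for illustration, and the windows are simply aligned to multiples of $t, which is a simplification of what the real script does.

```php
<?php
// Hypothetical helper: count how many requests a "max $nr per $t seconds"
// rule would reject. $times holds request timestamps in whole seconds.
// Windows are aligned to multiples of $t (a simplification).
function count_rejected(array $times, int $nr, int $t): int {
    $perWindow = [];
    $rejected = 0;
    foreach ($times as $ts) {
        $window = intdiv($ts, $t);                       // which window this request falls into
        $perWindow[$window] = ($perWindow[$window] ?? 0) + 1;
        if ($perWindow[$window] > $nr) {
            $rejected++;                                 // over the limit for this window
        }
    }
    return $rejected;
}

// Five tabs opened within two seconds (made-up timestamps).
$burst = [10, 10, 10, 10, 11];

echo count_rejected($burst, 2, 1), "\n";   // prints 2: "2 per second" blocks a legitimate burst
echo count_rejected($burst, 40, 20), "\n"; // prints 0: "40 per 20 seconds" lets it through
```

Both rules allow the same average rate of 2 requests per second, but only the wider window tolerates short, harmless bursts.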

Throttling the requests on a per IP basis

The rule “no more than 40 requests in 20 seconds from the same IP” is implemented in the PHP script I will paste below. The code needs to be saved as “limit.php” and included using:

include_once('limit.php');

…at the top of any scripts that are prone to using a lot of memory. If you’re using WordPress and are paranoid enough (as I was at the time) you can include it at the top of your theme’s functions.php file. But even more importantly, if your theme is using timthumb.php or another image resizer you should also include it at the top of that file.

TimThumb is very prone to using a lot of memory. It does set a “MEMORY_LIMIT”, but that limit applies to each run of the script separately. That means processing a very large image will be prevented by default, but running the TimThumb script 500 times on regular-sized images within a few seconds will not be prevented.

If you have the timthumb.php file somewhere on your server, anyone could write a simple piece of JavaScript that generates different URLs and continuously requests new images from your TimThumb script until the server goes the way of the Titanic.

A simple PHP throttling script

These are the contents of “limit.php”:

[embed_snipt:ughA0]

The code, explained

I’ll go through every piece of it below, but you’ll probably need to know how PHP sessions work to completely understand everything.

The limit_requests() function takes two parameters: the number of allowed requests ($nr with a default of 40) and a number of seconds ($t with a default of 20) — the time interval in which those requests can be made.

We will use a session to store a couple of variables:

  • $_SESSION['hits'] — the number of hits (requests) made in the last interval
  • $_SESSION['tzero'] — a starting point for the time interval which we’ll use to check how much time has passed (by comparing it with the current time)

We will only start a session if one isn’t currently active:

if (!session_id()) {
  start_session_based_on_ip();
}

Instead of PHP’s default “session_start()” we use a custom start_session_based_on_ip() function. If we only used session_start() the script would work exactly the same for regular visitors, but a malicious user could bypass it by turning off their cookies: without the session cookie every request starts a brand new session, so the hit counter never gets past 1. So instead we’re giving our session an id based on the visitor’s IP address.
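The body of start_session_based_on_ip() isn’t shown in this walkthrough, so here is a minimal sketch of how it might look. Treat it as an assumption rather than the original code: the session_id_for_ip() helper and the choice of md5() are mine, picked because a 32-character hex string is a valid PHP session id.

```php
<?php
// Hypothetical helper: derive a stable session id from an IP address.
// md5() returns 32 hex characters, all of which are allowed in a session id.
function session_id_for_ip(string $ip): string {
    return md5($ip);
}

// Assumed shape of start_session_based_on_ip(); the original may differ.
function start_session_based_on_ip(): void {
    // REMOTE_ADDR is missing on the command line, hence the fallback.
    $ip = $_SERVER['REMOTE_ADDR'] ?? '0.0.0.0';
    session_id(session_id_for_ip($ip)); // must be set before session_start()
    session_start();
}
```

Two things to keep in mind with this approach: everyone behind the same router or proxy shares one counter, and an id that can be guessed from an IP address shouldn’t double as a login session, so it fits best in scripts that don’t use sessions for anything else.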

if( !isset($_SESSION['tzero']) ) {
  $_SESSION['tzero']=time();
}

If $_SESSION['tzero'] isn’t set, it means it’s the first time this user is running the script and we’ll set it to the current time.

$since_interval_start = time() - $_SESSION['tzero'];

This is how much time (in seconds) has passed since $_SESSION['tzero']. We’ll need it below.

if( $since_interval_start > $t ) {
  $_SESSION['tzero'] = time();
  $_SESSION['hits'] = 1;
} else {
  $_SESSION['hits']++;
}

If $since_interval_start is larger than our time interval it means our interval is up. We reset ['tzero'] to the current timestamp (thus we start a new interval) and the hits to 1.

If not (the else bit) we increment the hits.

if( $_SESSION['hits'] > $nr ) {
  die('<h1>Too many requests!</h1> You will be able to make a new request in '.($t-$since_interval_start).' seconds.');
}

This is the most important part. If the number of hits exceeds the $nr limit we kill the running script, effectively blocking any further requests until the time interval runs out. At that point a new interval starts and the hits are reset (see the previous paragraph).
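Putting the pieces above together, the whole thing might be sketched like this. It is my own arrangement, not necessarily identical to the original limit.php: the counting is split into a hypothetical count_hit() function that takes the state and the current time as arguments (so it can be tried without a real session), the hit counter is read with a default of 0 to avoid an undefined-key warning on the first request, and the 429 status code and Retry-After header are additions of mine.

```php
<?php
// Hypothetical pure counting step: $state stands in for $_SESSION and
// $now for time(). Returns null when the request is allowed, or the
// number of seconds left to wait when the limit is exceeded.
function count_hit(array &$state, int $now, int $nr = 40, int $t = 20): ?int {
    if (!isset($state['tzero'])) {
        $state['tzero'] = $now;                 // first request: start the interval
    }
    $since_interval_start = $now - $state['tzero'];
    if ($since_interval_start > $t) {
        $state['tzero'] = $now;                 // interval is up: start a new one
        $state['hits'] = 1;
    } else {
        $state['hits'] = ($state['hits'] ?? 0) + 1;
    }
    return $state['hits'] > $nr ? $t - $since_interval_start : null;
}

// Session-backed wrapper, following the steps explained above.
function limit_requests(int $nr = 40, int $t = 20): void {
    if (!session_id()) {
        start_session_based_on_ip();            // the helper described earlier
    }
    $wait = count_hit($_SESSION, time(), $nr, $t);
    if ($wait !== null) {
        http_response_code(429);                // "Too Many Requests" (my addition)
        header('Retry-After: ' . $wait);        // also my addition
        die('<h1>Too many requests!</h1> You will be able to make a new request in ' . $wait . ' seconds.');
    }
}
```

With the defaults, the first 40 calls inside a 20-second interval go through, the 41st is refused, and the first call after the interval has passed starts counting from 1 again.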


Check out a demo of the script here. If you Refresh the page a few times, you’ll see the hit tracker increase.


Where else to use it

The best place to include this script would be in CPU-intensive PHP programs, to prevent them from being repeatedly accessed from the same IP. I’m thinking of PHP scripts that generate images or PDF files, scripts that work with audio or video files, and scripts that compress files into archives.
