Django performance tips

Jacob Kaplan-Moss

December 12, 2005

Django handles lots of traffic with ease; Django sites have survived slashdottings, farkings, and more. Here are some notes on how we tweak our servers to get that type of high performance.

Use a separate media server

Django deliberately doesn’t serve media for you, and it’s designed that way to save you from yourself. If you try to serve media from the same Apache instance that’s serving Django, you’re going to absolutely kill performance. Apache reuses processes between each request, so once a process caches all the code and libraries for Django, those stick around in memory. If you aren’t using that process to service a Django request, all the memory overhead is wasted.

So, set up all your media to be served by a different web server entirely. Ideally, this is a physically separate machine running a high-performance web server like lighttpd or tux. If you can’t afford the separate machine, at least have the media server be a separate process on the same machine.
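On the Django side, this mostly comes down to pointing your media URL at the other server. Here is a minimal settings sketch; the hostname `media.example.com` and the directory path are placeholders made up for illustration, not values from our setup.

```python
# settings.py (fragment)

# Assumption: "media.example.com" is a hypothetical host name for the
# separate media server (lighttpd or similar). Templates that build media
# links from MEDIA_URL will then send browsers there instead of to the
# Apache instance that runs Django.
MEDIA_URL = "http://media.example.com/"

# Assumption: a placeholder directory. Django writes uploaded files here,
# so the media server needs read access to it (same disk, a shared mount,
# or a sync job).
MEDIA_ROOT = "/var/www/media/"
```

If the media server is only a separate process on the same machine, the same idea applies with a different port or virtual host in `MEDIA_URL`.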

Use a separate database server

If you can afford it, stick your database server on a separate machine, too. All too often Apache and PostgreSQL (or MySQL or whatever) compete for system resources in a bad way. A separate DB server — ideally one with lots of RAM and fast (10k or better) drives — will seriously improve the number of hits you can dish out.
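As far as Django is concerned, moving the database to another machine is just a settings change. The sketch below uses the `DATABASES` dictionary from current Django releases (older releases used separate `DATABASE_*` settings); the host name, database name, user, and environment variable are all hypothetical.

```python
# settings.py (fragment)
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        # Assumption: placeholder database name and user.
        "NAME": "mysite",
        "USER": "mysite",
        # Read the password from the environment rather than hard-coding it.
        # DJANGO_DB_PASSWORD is a made-up variable name.
        "PASSWORD": os.environ.get("DJANGO_DB_PASSWORD", ""),
        # Assumption: "db.internal.example" stands in for the separate
        # database machine on the local network.
        "HOST": "db.internal.example",
        "PORT": "5432",
    }
}
```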

Use PostgreSQL

I’ll probably get lots of push-back from the MySQL community about this one, but in my experience PostgreSQL is much faster than MySQL in nearly every case.

There’s no such thing as too much RAM

Even really expensive RAM costs only about $200 per gigabyte. That’s SO much cheaper than the cost of programmer time it isn’t even funny. Buy as much RAM as you can possibly afford, and then buy a little bit more.

Faster processors really won’t improve performance all that much; most web servers spend up to 90% of their time waiting on IO! As soon as you start swapping, performance will just die. Faster disks might help slightly, but they’re so much more expensive than RAM that it doesn’t really matter.

If you’ve got multiple servers, the first place to put your RAM is in the database server. If you can afford it, get enough RAM to fit your entire DB. This shouldn’t be too hard; our database, including half a million articles dating back to 1989, is only 1.5 gigs.

Next max out the RAM on your web server. The ideal situation is one where neither swaps — ever. If you get to that point you should be able to withstand most normal traffic.

Turn off KeepAlive

I don’t totally understand how KeepAlive works, but turning it off on our Django servers increased performance by something like 50%. Of course, don’t do this if the same server is also serving media… but you’re not doing that, right?
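A rough way to picture why this can matter is the back-of-the-envelope sketch below. It assumes a process-per-connection server where a kept-alive connection holds its process idle until the timeout expires, and a worst case where each client sends exactly one request per connection. The function name and the numbers are made up for illustration; they are not measurements from our servers.

```python
def worker_busy_fraction(service_seconds, keepalive_timeout_seconds):
    """Fraction of a worker's time spent serving requests.

    Worst-case model (an assumption, not a measurement): every connection
    carries one request, then sits idle for the full keep-alive timeout
    while still occupying the worker process.
    """
    if service_seconds <= 0:
        raise ValueError("service_seconds must be positive")
    if keepalive_timeout_seconds < 0:
        raise ValueError("keepalive_timeout_seconds must not be negative")
    return service_seconds / (service_seconds + keepalive_timeout_seconds)


# Hypothetical 100 ms page render, compared across a few timeouts.
for timeout in (0, 2, 15):
    fraction = worker_busy_fraction(0.1, timeout)
    print(f"timeout={timeout:>2}s  busy fraction={fraction:.3f}")
```

In this model, a timeout of zero (KeepAlive off) keeps the worker fully busy, and any idle wait eats into the share of time those memory-heavy Django processes spend on real work. Pages that pull many files from the same server change the picture, which is why this only applies when media lives elsewhere.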

Use memcached

Although Django has support for a number of cache backends, none of them perform even half as well as memcached does. If you find yourself needing the cache, do yourself a favor and don’t even play around with the other backends; go straight for memcached.
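Switching backends is a settings change. The sketch below uses the `CACHES` dictionary and the pymemcache-based backend found in recent Django releases; older releases instead took a single `CACHE_BACKEND` string such as `memcached://127.0.0.1:11211/`. The address assumes a memcached daemon running locally on its default port, which may not match your setup.

```python
# settings.py (fragment)

CACHES = {
    "default": {
        # Backend shipped with recent Django releases; it needs the
        # pymemcache package installed.
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        # Assumption: memcached is listening on localhost, default port.
        "LOCATION": "127.0.0.1:11211",
    }
}
```

With that in place, the rest of Django's cache framework (per-view caching, or `cache.get()` and `cache.set()` from `django.core.cache`) talks to memcached without further changes.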

Tune, tune, tune

(With apologies to the Byrds.)

Chances are the defaults for your web server, database engine, or machine are not tuned as nicely as they could be. This is far from a comprehensive list, but below are some of the resources I used to make my stuff scream:

Again, far from comprehensive, but those should help anyone involved in tuning a Django site.

Future directions

Running some simple benchmarks seems to imply that Django under lighttpd and FastCGI outperforms Apache/mod_python. I need to play around with it some more, but there’s a good chance that Django just doesn’t need the overhead of Apache.

Also, for very large sites some sort of database replication/federation is going to be needed eventually. Nothing I’ve done has hit that point yet, but when it does that will make things very interesting. Tools like Slony and/or pg_pool will likely come in handy at that point.

Comments:

Cesar:

How stable is Django on lighttpd+FastCGI? I read a while back in the IRC logs, there were some issues with lighttpd. Has this situation changed?

Jacob:

I'm not really sure about Django's stability with FCGI. In theory it should be pretty good, and I know there are a few people doing it. It's certainly something I want to investigate.

Simon Willison:

I'm personally very interested in SCGI as an alternative to FastCGI. SCGI is a much simpler protocol (but with the same theoretical performance benefits) and was originally designed for use with Quixote, another Python framework. It's getting quite a bit of interest in the Rails community at the moment.

hugo:

SCGI works quite fine with Django and Apache, using the mod_scgi module. And with some tests I did, it's mostly equivalent, performance wise - was to be expected, as in the case of SCGI and FCGI the web server only has to hand over the actual request and there shouldn't be a big difference in that stuff.

And don't forget the additional security possibilities you gain by running the application in a different user context.

Oh, and there isn't much difference with regard to performance between SCGI and FCGI - that might be due to FLUP running in both cases on the application side to handle stuff, it's possible that dedicated FCGI or SCGI only servers will push performance a bit.

Another thing people should keep in mind: use processes instead of threads. Especially if your application has many database queries, and some of them are larger and more involved, threading can kill your application performance because of the GIL in Python. Even though processes have a larger overhead than threads on the OS side of things, on the Python side of things processes win hands down.

Frank Wiles:

On KeepAlive, essentially the same process waits around hoping the client's browser is going to request more information in X number of seconds. The Apache defaults are horrible IMHO at 300 seconds... but if you leave KeepAlive on, but limit it down to the 1-5 second range you should notice a slight performance increase, without any bad things happening. I guess the 300 seconds is a throw back to the days of dialup.

That being said, if a particular media server or webserver is only serving up 1-2 things (i.e. HTML and CSS) then it is best to leave it off. But if you have a site with several images, multiple CSS files, etc. then it is wise to leave it on.

Jay Pipes:

Hi,

Regarding: "but in my experience PostgreSQL is much faster than MySQL in nearly every case."

Just wondering if you could point us to specific benchmarks that show PostgreSQL outperforming MySQL, especially for Django-based sites. Our performance team is always looking for case studies of this sort, so that we can understand instances where MySQL can be improved. Thanks!

Jay Pipes
Community Relations Manager, North America
MySQL, Inc.

Jacob:

Hey Jay --

I've not done any real benchmarks; that comment is based more on aggregate experience with both DBMSes over a few years of use. I do really wish there were some really good, unbiased benchmarks out there, but I don't have any.

I do have some anecdotal evidence that I left out of this article; drop me an email (jacob at this domain) if you'd like to hear 'em.

Jay Pipes:

Will do. Thanks, Jacob!

Frankie Robertson:

Just a note to people who have read this: if you do have a DB server on a separate machine, then make sure to keep your local network nice and fast. Giganet might be a worthwhile investment.

Peter Ferne:

Do you have any figures on the amount of RAM that Django itself uses?

Antonio Cangiano:

Thanks for the tips, Jacob.

bluszcz:

what about mysql / postgresql comparison? can you write more?

Milton:

turn off sessions and it runs a LOT quicker... far more difference than anything mentioned above.

Jeremy Dunck:

Milton, with respect, that depends quite a bit on your app and DB.

And sessions are required for auth, etc.

Rene L.:

nginx is also a very fast alternative server to deliver static (Media) files:
http://nginx.net/

There is now a MySQL Proxy being developed which will allow load balancing, failover, query analysis, query filtering and modification:
http://forge.mysql.com/wiki...

If you are looking for the slowest MySQL queries just turn on the MySQL Slow Query Log and use a free filter script (Python or PHP) like this one:
http://code.google.com/p/my...
You can even use it with MySQL 5.1 (if you deactivate the new CSV table based log) and with microsecond granularity (if you already have a very fast [AJAX] application).

bir2su:

yea django is a great framework
i love django.
to visualise performance of django statistically.
visit http://bir2su.blogspot.com/...

Eugene:

Hi!

What about working python on multiprocessor and multicore systems?
I've read python has some troubles with supporting this hardware.

Eugene:

@Eugene
You must be new here.

GIL matters if you're compute-intensive and multi-threading. There's no trouble if you're multi-process or IO-bound.

J. Heasly:

The "annotated postgresql.conf" seems to have moved (link is broken). Is this [1] what the link was pointing to?
[1] http://www.varlena.com/Gene...
