
Django performance tips

Django handles lots of traffic with ease; Django sites have survived slashdottings, farkings, and more. Here are some notes on how we tweak our servers to get that type of high performance.

Use a separate media server

Django deliberately doesn’t serve media for you, and it’s designed that way to save you from yourself. If you try to serve media from the same Apache instance that’s serving Django, you’re going to absolutely kill performance. Apache reuses processes between requests, so once a process has loaded all the code and libraries for Django, those stick around in memory. Whenever that process is serving a static file instead of a Django request, all that memory overhead is wasted.

So, set up all your media to be served by a different web server entirely. Ideally, this is a physically separate machine running a high-performance web server like lighttpd or tux. If you can’t afford the separate machine, at least have the media server be a separate process on the same machine.
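As a rough sketch of what that looks like on the Django side: you point media URLs at the other server and let it do the work. The hostname media.example.com and the media_url helper below are made up for illustration; only the MEDIA_URL setting name is Django's.

```python
from urllib.parse import urljoin

# Hypothetical settings value: the base URL of the separate media server.
MEDIA_URL = "http://media.example.com/"


def media_url(path):
    """Build an absolute URL for a media file on the separate server.

    Leading slashes are stripped so the path is always resolved
    relative to MEDIA_URL rather than to the host root.
    """
    return urljoin(MEDIA_URL, path.lstrip("/"))


print(media_url("css/site.css"))
```

The point is only that every image, stylesheet, and script URL ends up on a host that never loads Django at all.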

Use a separate database server

If you can afford it, stick your database server on a separate machine, too. All too often Apache and PostgreSQL (or MySQL or whatever) compete for system resources in a bad way. A separate DB server, ideally one with lots of RAM and fast (10k RPM or better) drives, will seriously improve the number of hits you can dish out.

Use PostgreSQL

I’ll probably get lots of push-back from the MySQL community about this one, but in my experience PostgreSQL is much faster than MySQL in nearly every case.

There’s no such thing as too much RAM

Even really expensive RAM costs only about $200 per gigabyte. That’s SO much cheaper than the cost of programmer time it isn’t even funny. Buy as much RAM as you can possibly afford, and then buy a little bit more.

Faster processors really won’t improve performance all that much; most web servers spend up to 90% of their time waiting on IO! As soon as you start swapping, performance will just die. Faster disks might help slightly, but they’re so much more expensive than RAM that it doesn’t really matter.

If you’ve got multiple servers, the first place to put your RAM is in the database server. If you can afford it, get enough RAM to fit your entire DB in memory. This shouldn’t be too hard; our database, including half a million articles dating back to 1989, is only 1.5 gigs.

Next, max out the RAM on your web server. The ideal situation is one where neither machine ever swaps. If you get to that point you should be able to withstand most normal traffic.

Turn off KeepAlive

I don’t totally understand how KeepAlive works, but turning it off on our Django servers increased performance by something like 50%. Of course, don’t do this if the same server is also serving media… but you’re not doing that, right?
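One back-of-the-envelope way to see why this could matter (a sketch under assumptions, not a measurement of our servers): if every connection holds an Apache process for the time it takes to serve the request plus however long it idles waiting for a follow-up, then Little's law says the average number of busy processes is the request rate times that holding time. The function name and the numbers below are made up, and the model assumes each connection carries a single request and then idles until the timeout.

```python
def busy_workers(requests_per_second, service_seconds, keepalive_seconds):
    """Estimate the average number of occupied server processes.

    Little's law: average occupancy = arrival rate x time each request
    holds a process. Assumes one request per connection, followed by an
    idle wait of keepalive_seconds before the process is released.
    """
    return requests_per_second * (service_seconds + keepalive_seconds)


# Hypothetical load: 40 requests/second, 0.25 seconds to serve each one.
print(busy_workers(40, 0.25, 0))  # KeepAlive off
print(busy_workers(40, 0.25, 5))  # KeepAlive on with a 5 second timeout
```

Under those assumptions the idle wait, not the actual work, dominates how many heavyweight Django processes you need in memory.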

Use memcached

Although Django has support for a number of cache backends, none of them perform even half as well as memcached does. If you find yourself needing the cache, do yourself a favor and don’t even play around with the other backends; go straight for memcached.
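Whichever backend you pick, the usage pattern is the same: look in the cache first, and only do the expensive work on a miss. Here is a minimal sketch of that pattern, using an in-process dictionary as a stand-in for memcached so it runs on its own. The LocalCache class and cached_call helper are hypothetical names for illustration, not Django or memcached APIs.

```python
import time


class LocalCache:
    """In-process stand-in for a cache client (illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value, timeout=300):
        # Store the value together with its expiry time.
        self._store[key] = (value, time.monotonic() + timeout)


def cached_call(cache, key, compute, timeout=300):
    """Return the cached value for key, computing and storing it on a miss.

    Note: a computed value of None is indistinguishable from a miss here,
    so it would be recomputed on every call.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, timeout)
    return value


cache = LocalCache()
page = cached_call(cache, "front-page", lambda: "rendered front page")
print(page)
```

With a real memcached server the stored values live outside the web server processes, so every process shares the same cache instead of each keeping its own copy.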

Tune, tune, tune

(With apologies to the Byrds.)

Chances are the defaults for your web server, database engine, or machine are not tuned as nicely as they could be. This is far from a comprehensive list, but below are some of the resources I used to make my stuff scream:

Again, far from comprehensive, but those should help anyone involved in tuning a Django site.

Future directions

Running some simple benchmarks seems to imply that Django under lighttpd and FastCGI outperforms Apache/mod_python. I need to play around with it some more, but there’s a good chance that Django just doesn’t need the overhead of Apache.

Also, for very large sites some sort of database replication/federation is going to be needed eventually. Nothing I’ve done has hit that point yet, but when it does, that will make things very interesting. Tools like Slony and/or pg_pool will likely come in handy at that point.

Comments

Cesar

Dec. 12th, 2005

8:20 p.m.

How stable is Django on lighttpd+FastCGI? I read a while back in the IRC logs, there were some issues with lighttpd. Has this situation changed?

Jacob

Dec. 12th, 2005

10:02 p.m.

I'm not really sure about Django's stability with FCGI. In theory it should be pretty good, and I know there are a few people doing it. It's certainly something I want to investigate.

Simon Willison

Dec. 13th, 2005

2:06 a.m.

I'm personally very interested in SCGI as an alternative to FastCGI. SCGI is a much simpler protocol (but with the same theoretical performance benefits) and was originally designed for use with Quixote, another Python framework. It's getting quite a bit of interest in the Rails community at the moment.

hugo

Dec. 13th, 2005

2:51 a.m.

SCGI works quite fine with Django and Apache, using the mod_scgi module. And in some tests I did, it's mostly equivalent performance-wise. That was to be expected: with both SCGI and FCGI the web server only has to hand over the actual request, so there shouldn't be a big difference.

And don't forget the additional security possibilities you gain by running the application in a different user context.

Oh, and there isn't much difference with regard to performance between SCGI and FCGI. That might be due to FLUP running on the application side in both cases; it's possible that dedicated FCGI-only or SCGI-only servers would push performance a bit.

Another thing people should keep in mind: use processes instead of threads. Especially if your application has many database queries, some of which might be larger and more involved, threading can kill your application performance because of the GIL in Python. Even though processes have a larger overhead than threads on the OS side of things, on the Python side of things processes win hands down.
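To illustrate the general point in plain Python (this is not how a FastCGI or SCGI server is configured, just a sketch of spreading CPU-bound work over processes so that each worker has its own interpreter and its own GIL): the function names and the prime-counting workload below are made up for the example.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_in_processes(limits, workers=2):
    """Run count_primes for each limit in a pool of worker processes.

    Each worker is a separate OS process, so the jobs can run on
    different cores instead of taking turns holding one GIL.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

Call count_primes_in_processes from under an `if __name__ == "__main__":` guard, since some platforms re-import the main module in each worker process.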

Frank Wiles

Dec. 14th, 2005

11:25 a.m.

On KeepAlive: essentially, the same process waits around hoping the client's browser is going to request more information within X seconds. The Apache defaults are horrible IMHO at 300 seconds... but if you leave KeepAlive on and limit it to the 1-5 second range, you should notice a slight performance increase without any bad things happening. I guess the 300 seconds is a throwback to the days of dialup.

That being said, if a particular media server or web server is only serving up 1-2 things (i.e. HTML and CSS), then it is best to leave it off. But if you have a site with several images, multiple CSS files, etc., then it is wise to leave it on.

Jay Pipes

Feb. 6th, 2006

2:23 p.m.

Hi,

Regarding: "but in my experience PostgreSQL is much faster than MySQL in nearly every case."

Just wondering if you could point us to specific benchmarks that show PostgreSQL outperforming MySQL, especially for Django-based sites. Our performance team is always looking for case studies of this sort, so that we can understand instances where MySQL can be improved. Thanks!

Jay Pipes
Community Relations Manager, North America
MySQL, Inc.

Jacob

Feb. 6th, 2006

4:43 p.m.

Hey Jay --

I've not done any real benchmarks; that comment is based more on aggregate experience with both DBMSes over a few years of use. I do really wish there were some really good, unbiased benchmarks out there, but I don't have any.

I do have some anecdotal evidence that I left out of this article; drop me an email (jacob at this domain) if you'd like to hear 'em.

Jay Pipes

Feb. 7th, 2006

7:33 a.m.

Will do. Thanks, Jacob!

Frankie Robertson

April 6th, 2006

4:50 p.m.

Just a note to people who have read this: if you do have a DB server on a separate machine, then make sure to keep your local network nice and fast. Giganet might be a worthwhile investment.

Peter Ferne

June 22nd, 2006

8:50 a.m.

Do you have any figures on the amount of RAM that Django itself uses?

Antonio Cangiano

Sept. 2nd, 2006

8:02 p.m.

Thanks for the tips, Jacob.

bluszcz

Nov. 20th, 2006

7:21 a.m.

what about mysql / postgresql comparison? can you write more?

Milton

Jan. 22nd, 2007

6:38 a.m.

turn off sessions and it runs a LOT quicker... far more difference than anything mentioned above.

Jeremy Dunck

Jan. 22nd, 2007

2:03 p.m.

Milton, with respect, that depends quite a bit on your app and DB.

And sessions are required for auth, etc.

Rene L.

July 25th, 2007

11:44 a.m.

nginx is also a very fast alternative server to deliver static (Media) files:
http://nginx.net/

There is now a MySQL Proxy being developed which will allow load balancing, failover, query analysis, query filtering and modification:
http://forge.mysql.com/wiki/MySQL_Proxy

If you are looking for the slowest MySQL queries just turn on the MySQL Slow Query Log and use a free filter script (Python or PHP) like this one:
http://code.google.com/p/mysql-log-filter/
You can even use it with MySQL 5.1 (if you deactivate the new CSV table based log) and with microsecond granularity (if you already have a very fast [AJAX] application).

bir2su

Nov. 27th, 2007

11:02 a.m.

yea django is a great framework
i love django.
to visualise performance of django statistically.
visit http://bir2su.blogspot.com/2007/11/django-rocks-with-memory-but-rails.html

Eugene

Dec. 11th, 2007

5:27 p.m.

Hi!

What about running Python on multiprocessor and multicore systems?
I've read that Python has some trouble supporting this hardware.

Eugene

March 28th, 2008

2:59 p.m.

@Eugene
You must be new here.

GIL matters if you're compute-intensive and multi-threading. There's no trouble if you're multi-process or IO-bound.
