Jacob Kaplan-Moss

Five things I hate about Python

I wrote this post in 2007, more than 17 years ago. It may be very out of date, partially or totally incorrect. I may even no longer agree with this, or might approach things differently if I wrote this post today. I rarely edit posts after writing them, but if I have there'll be a note at the bottom about what I changed and why. If something in this post is actively harmful or dangerous please get in touch and I'll fix it.

Inspired by Titus (who was in turn inspired by brian d foy), here’s what I hate about Python. I completely agree with Brian that you can’t trust any advocate who doesn’t know enough to find stuff to hate. Given that I spend a lot of time advocating Python, writing down what I hate seems a good exercise. Perhaps I’ll do the same for Django in the future…

Anyway, here are the five things I hate about Python, presented Letterman-style:

5. Where are my interfaces?

I’m pretty convinced that every piece of software has at least one good idea. Even the steaming hunk of crap that we call “Java” has one: interfaces.

Now, I’m pretty convinced that Python strikes a great balance between expressiveness and minimalism; I can do nearly everything I want with the small set of built-in types. However, the one thing I can’t effectively do is codify a set of behaviors into an interface. There are a number of third-party libraries that implement interfaces (PyProtocols, zope.interface), but they feel “clunky” compared to what first-class interfaces would offer. It seems to me that interfaces are a perfect match to duck typing; I shouldn’t need to care about the difference between something that pretends to be a list and something that really is a list.

4. What the hell is a list, anyway?

I think this next one follows naturally out of the above. Many, many functions are documented as operating on “list-like” or “file-like” or “dict-like” objects, but there’s a general lack of agreement about what exactly constitutes a minimal interface for any of these objects. Is something that supports read() and write() enough to be a “file,” or is seek() a necessity, too?

Does a list just need to implement __getitem__, or is __len__ required? I could go on and on.
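To make the ambiguity concrete, here's a minimal sketch (the Squares class is made up for illustration). An object that defines only __getitem__ can be indexed and iterated, yet len() fails on it, so whether it counts as “list-like” depends entirely on what the caller happens to do with it.

```python
class Squares:
    """Hypothetical 'list-like' object that implements only __getitem__."""

    def __getitem__(self, index):
        # Raising IndexError is what ends iteration when Python falls
        # back to the old sequence protocol.
        if not 0 <= index < 5:
            raise IndexError(index)
        return index * index


squares = Squares()

# Indexing and iteration both work with __getitem__ alone.
first_three = [squares[i] for i in range(3)]
everything = list(squares)

# len() does not: there is no __len__, so this raises TypeError.
try:
    len(squares)
    has_len = True
except TypeError:
    has_len = False
```

A function documented as taking a “list-like” object would work with Squares if it only loops, and blow up if it calls len() first.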

This has real – and sometimes nasty – repercussions: the HttpResponse object in Django implements (our idea of) a minimal file interface, and so much of the time you can pass one in anywhere you’d normally use a file. However, some libraries with different ideas of “file-ness” will break when handed an HttpResponse (for a while reportlab wouldn’t handle an HttpResponse; I forget what we had to add to make that work).

Python 3000 supposedly will solve this with the use of ABCs (Abstract Base Classes); that’s OK, but I’d still rather see interfaces.
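For reference, here's roughly what that looks like with the ABCs as they ship in Python 3's collections.abc module. This is a sketch, not a recommendation; the Playlist class and its contents are hypothetical.

```python
from collections.abc import Sequence


class Playlist(Sequence):
    """Hypothetical sequence: only __getitem__ and __len__ are hand-written."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __getitem__(self, index):
        return self._tracks[index]

    def __len__(self):
        return len(self._tracks)


playlist = Playlist(["a", "b", "c"])

# The ABC supplies the rest of the read-only sequence API as mixin methods.
position = playlist.index("b")
contains_c = "c" in playlist
backwards = list(reversed(playlist))

# Callers can check for the interface instead of guessing at it.
is_sequence = isinstance(playlist, Sequence)
```

Here the ABC spells out the minimal interface (__getitem__ and __len__) and refuses to instantiate a subclass that leaves either one out.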

3. Concurrency is is really really hard hard

Concurrency in Python sucks balls.

Threading is slow because of the GIL and hard because the built-in threading module is rudimentary (at best).

Coroutines/microthreads are only supported if you want to experiment with Stackless.

If you want asynchronous IO you’ve basically got two shitty options: the standard library asyncore module (which is far too low-level and rudimentary to be considered “easy”), and Twisted, which is huge and chronically under-documented.

2. Egg on my face

OK, they’re only really a de facto part of Python, but I’m still gonna bitch about Eggs: I freaking hate them.

Sure, easy_install Something is quite awesome, but I couldn’t be more unhappy about how that actually works. I could bitch for ages about ’em, but I’ll just point out the two big things I can’t stand:

First, have you ever looked at sys.path with a couple of eggs installed? Here’s a bit of mine:

...
'.../site-packages/setuptools-0.6c3-py2.4.egg',
'.../site-packages/mechanize-0.1.6b.dev_r36082-py2.4.egg',
'.../site-packages/ClientForm-0.2.6.dev_r36070-py2.4.egg',
'.../site-packages/geopy-0.93-py2.4.egg',
'.../site-packages/kid-0.9.4-py2.4.egg',
'.../site-packages/elementtree-1.2.6_20050316-py2.4.egg',
'.../site-packages/Unipath-0.1.0-py2.4.egg',
'.../site-packages/PyYAML-3.04-py2.4.egg',
'.../site-packages/simplejson-1.5-py2.4.egg',
'.../site-packages/Paste-1.2.1-py2.4.egg',
'.../site-packages/selector-0.8.11-py2.4.egg',
'.../site-packages/resolver-0.2-py2.4.egg',
'.../site-packages/wsgiref-0.1.2-py2.4.egg',
'.../site-packages/svnshelve-0.1-py2.4.egg',
'.../site-packages/paramiko-1.7-py2.4.egg',
'.../site-packages/pycrypto-2.0.1-py2.4-macosx-10.3-fat.egg',
...

That’s right: each egg gets its own entry on sys.path. This is bad because of the way Python’s import mechanism works: importing a package makes around ten different open syscalls for each entry on sys.path. That is, import foo looks for:

  • foo.so
  • foomodule.so
  • foo.py
  • foo.pyc
  • foo.pyo
  • foo/__init__.so
  • foo/__init__module.so
  • foo/__init__.py
  • foo/__init__.pyc
  • foo/__init__.pyo

Multiply that by 10 or 20 or however many eggs you’ve got and you’re actually talking serious import overhead here.
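The arithmetic is easy to sketch. The snippet below just enumerates the file names from the list above for every path entry; it doesn't touch the real import machinery, the egg names are placeholders, and it assumes the worst case where the module lives in the last entry (or isn't found at all).

```python
# Candidate file names probed per sys.path entry for "import foo",
# copied from the list above (Python 2.4-era behavior as described).
SUFFIXES = [
    ".so", "module.so", ".py", ".pyc", ".pyo",
    "/__init__.so", "/__init__module.so",
    "/__init__.py", "/__init__.pyc", "/__init__.pyo",
]


def candidate_paths(module_name, search_path):
    """Return every file name a worst-case lookup would probe."""
    return [
        entry + "/" + module_name + suffix
        for entry in search_path
        for suffix in SUFFIXES
    ]


# 16 placeholder egg entries, the same count as the sys.path excerpt above.
fake_sys_path = ["site-packages/egg%d.egg" % i for i in range(16)]
probes = candidate_paths("foo", fake_sys_path)
worst_case = len(probes)  # 16 entries * 10 names = 160
```

That's 160 failed lookups before a single line of the module runs, and it's paid again for every distinct import.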

Second – and more distressing to me – is the fact that eggs get in the way of viewing source. Packages installed from eggs are installed as ZIP files; this means that viewing their source requires unpacking the egg. Yes, this is a bit trivial, but newbies to Python aren’t even going to know how to do this! I’ve talked to a number of first-time Python people who assume that CheeseShop packages are closed-source in some way.

You can fix this, by the way, with these lines in ~/.pydistutils.cfg:

[easy_install]
zip_ok = false

But that’s a hack, really.
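For eggs that are already zipped, it helps to know that an egg is an ordinary ZIP archive, so the standard library's zipfile module can read the source without unpacking anything. A minimal sketch: the archive here is built in memory so the example is self-contained, and the foo.py module inside it is made up.

```python
import io
import zipfile

# Build a tiny stand-in for a zipped egg in memory; the "foo.py"
# module and its contents are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as egg:
    egg.writestr("foo.py", "def greet():\n    return 'hello'\n")

# Read a module's source straight out of the archive, no unpacking needed.
with zipfile.ZipFile(buffer) as egg:
    names = egg.namelist()
    source = egg.read("foo.py").decode("utf-8")
```

With a real egg you'd pass its path on disk to zipfile.ZipFile instead of the in-memory buffer.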

1. Whose standards are these, anyway?

Finally, we come to the standard library.

I should say off the top that for many, many uses the standard library is awesome. The concept of “batteries included” is fantastic, and the particular batteries that Python includes are for the most part amazing. I feel comfortable criticizing the standard library precisely because it’s so good – with a minimal amount of work, it could be even better.

Titus covered the problems of inconsistency and spotty documentation pretty well, so I’ll focus on my specific concern: what exactly is considered “standard”, anyway?

There’s a whole set of libraries out there that I basically consider required to get any real work done, but aren’t included in the standard library. A (partial) list would include:

Python would be about 100% more useful if the standard library included more packages that people actually consider “standard”.

Update:

Fixed the spelling of brian d foy’s name. Sorry!