What to do when PyPI goes down

Jacob Kaplan-Moss

July 20, 2010

Lately PyPI, the Python package index, has been having some availability issues. When PyPI goes down it really hurts Python developers: we can’t install new packages, bring up new development environments or virtualenvs, or deploy code with dependencies.

Work is ongoing (see PEP 381) to add much-needed resiliency to PyPI, and the fruits of these labors are starting to become available. In particular, a number of PyPI mirrors are now available: b.pypi.python.org, c.pypi.python.org, and d.pypi.python.org are up and running. Each mirror has fairly up-to-date copies[1] of all the packages and metadata on PyPI.

[1] Exactly how often each mirror updates is up to the maintainer. Each mirror maintains a last-modified file that informs you when the mirror was last updated; it’s at http://<mirror>.pypi.python.org/last-modified.

They’re not totally perfect clones — the public parts of the web interface aren’t on the clones — but they have everything that package installation tools need. This means that when PyPI goes down, you can make a few tweaks, switch to a mirror, and get on with your day.
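If you want to check how stale a mirror is before switching to it, the last-modified file from the footnote is the thing to look at. Here's a minimal sketch that turns the contents of that file into an age in seconds. The timestamp layout (YYYYMMDDThh:mm:ss, in UTC) is my assumption based on my reading of PEP 381, not something this post confirms, and the function name is made up for illustration. Fetching the file is left out so the example runs offline.

```python
from datetime import datetime, timezone

# ASSUMPTION: the last-modified file holds a UTC timestamp in this layout.
ASSUMED_FORMAT = "%Y%m%dT%H:%M:%S"


def mirror_age_seconds(last_modified_text, now=None):
    """Return how many seconds ago the mirror reports it last synced."""
    stamp = datetime.strptime(last_modified_text.strip(), ASSUMED_FORMAT)
    stamp = stamp.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - stamp).total_seconds()


# Made-up file contents and a fixed "now", so no network access is needed.
example_now = datetime(2010, 7, 20, 12, 0, 0, tzinfo=timezone.utc)
age = mirror_age_seconds("20100720T11:30:00\n", now=example_now)
print(age)  # 1800.0
```

If a mirror turns out to use a different layout, only ASSUMED_FORMAT should need to change.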

In the future I expect most Python packaging tools will gain native, transparent support for failing over to these mirrors, but for now you’ll need to manually do so. All of the Python packaging tools support mirrors, each in a slightly different way:

pip

If you’re using a recent pip (0.8.1 or later), use the --use-mirrors flag:

pip install --use-mirrors $PACKAGE

You can also set the PIP_USE_MIRRORS environment variable.

This’ll automatically query the list of mirrors and keep trying until one responds. It can take a bit of time when PyPI’s down as it waits for PyPI to time out, but it’ll work.
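The "keep trying until one responds" idea is easy to picture in code. What follows is a rough sketch of that general failover loop, not pip's actual implementation: the function names are hypothetical, the candidate list is hard-coded rather than queried, and the demonstration uses a fake probe so it runs without touching the network.

```python
import urllib.request


def responds(url, timeout=5):
    """Return True if the URL answers an HTTP request within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False


def first_responding(index_urls, probe=responds):
    """Return the first index URL for which probe() is true, else None."""
    for url in index_urls:
        if probe(url):
            return url
    return None


# Hard-coded candidates: the main index first, then the mirrors named above.
CANDIDATES = [
    "http://pypi.python.org/simple",
    "http://b.pypi.python.org/simple",
    "http://c.pypi.python.org/simple",
    "http://d.pypi.python.org/simple",
]


def fake_probe(url):
    """Stand-in probe for the demo: pretend only the c mirror is up."""
    return url.startswith("http://c.")


chosen = first_responding(CANDIDATES, probe=fake_probe)
print(chosen)  # http://c.pypi.python.org/simple
```

The timeout is what makes the real thing slow when PyPI is down: each dead host costs you up to the full timeout before the loop moves on.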

For older versions of pip, or if you want to force the use of a particular mirror, use:

pip install -i http://d.pypi.python.org/simple $PACKAGE

If you want to instruct pip to always use the mirror — good if you’ll be doing a lot of installation, or if you’re using pip as part of a bigger automation tool — then put:

[global]
index-url = http://d.pypi.python.org/simple

Into ~/.pip/pip.conf.
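If you're driving this from an automation tool, you may prefer to make that edit programmatically rather than by hand. Here's a small sketch using the standard library's configparser. The helper name set_option is made up, and the demo writes to a temporary directory instead of your real ~/.pip/pip.conf.

```python
import configparser
import os
import tempfile


def set_option(path, section, option, value):
    """Set one option in an INI-style file, keeping other settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, option, value)
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w") as handle:
        parser.write(handle)


# Demo against a throwaway location, not the real home directory.
demo_path = os.path.join(tempfile.mkdtemp(), ".pip", "pip.conf")
set_option(demo_path, "global", "index-url", "http://d.pypi.python.org/simple")

check = configparser.ConfigParser(interpolation=None)
check.read(demo_path)
print(check.get("global", "index-url"))  # http://d.pypi.python.org/simple
```

One caveat: configparser drops comments when it rewrites a file, so don't point this at a hand-annotated config you care about. The same shape of edit should work for the [easy_install] section of ~/.pydistutils.cfg.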

virtualenv

Unfortunately, virtualenv uses easy_install to bootstrap a new environment, so simply making the pip changes isn’t enough to allow you to create new virtualenvs when PyPI is down.

Instead, you’ll need to make the global change to ~/.pydistutils.cfg described under easy_install, below.

Buildout

To use a mirror with Buildout, add:

[buildout]
index = http://d.pypi.python.org/simple

Into your buildout.cfg.

To use the mirror globally in any Buildout on your system, put those same lines above into ~/.buildout/default.cfg.

If you need to run a Buildout bootstrap.py, note that it currently uses easy_install as part of the bootstrap procedure, so you’ll need to add the lines to ~/.pydistutils.cfg described below.

Like pip, I expect that Buildout will eventually gain native mirror support.

easy_install

Really, you shouldn’t be using easy_install any more — please switch to pip.

If you must, though, you can use a mirror with easy_install:

easy_install -i http://d.pypi.python.org/simple $PACKAGE

To make easy_install use a mirror globally, put:

[easy_install]
index_url = http://d.pypi.python.org/simple

Into ~/.pydistutils.cfg.

Because easy_install is no longer under active development, I don’t expect easy_install to ever gain native mirror support. Once again, you should switch to pip.

Correction

I’m informed by PJ Eby, the maintainer of setuptools and easy_install, that those projects aren’t quite dead yet and that he’s got plans to add mirroring support to easy_install.

I apologize for getting my facts wrong about the state of setuptools and easy_install.

Still, I suggest that folks switch to pip: it’s got a larger development community, more active forward progress, and it’s a better tool in general.

What’s next?

As I’ve been saying, the next steps here are to add support for automatic failover to a mirror. PEP 381 describes the mirroring protocol and the validation procedure client libraries should take to determine which mirrors exist and which are valid.
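To give a flavor of the "which mirrors exist" half of that, here's a sketch of the naming scheme as I understand it from PEP 381: mirrors are lettered a, b, c and so on (continuing with aa, ab after z), and a well-known DNS name points at the last one, so a client can enumerate everything up to it. Treat that reading as an assumption and check the PEP before relying on it. The function names are hypothetical, and the DNS lookup is left out so the example runs offline; you pass in the hostname the lookup would have returned.

```python
import itertools
import string


def mirror_labels(last):
    """Yield labels a, b, ..., z, aa, ab, ... up to and including last."""
    for length in range(1, len(last) + 1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            label = "".join(letters)
            yield label
            if label == last:
                return


def mirror_hosts(last_host):
    """Expand the hostname of the last mirror into the full host list."""
    label, domain = last_host.split(".", 1)
    return ["%s.%s" % (name, domain) for name in mirror_labels(label)]


# Pretend the DNS lookup told us the last mirror is d.pypi.python.org.
hosts = mirror_hosts("d.pypi.python.org")
print(hosts)
```

Validation (checking that a listed mirror is actually serving trustworthy, current data) is the harder half, and this sketch doesn't attempt it.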

I expect that the maintainers of the various packaging tools are working on this problem, but I’m also sure that patches to distribute, pip, virtualenv, and Buildout (and any other tool that requires PyPI to run) would be more than welcome.

Comments:

Simon Willison:

I've run in to this problem a bunch recently as well.

Sadly, just mirroring PyPI itself is not enough - many PyPI packages have the actual .tar.gz file hosted elsewhere, so even if PyPI is up a file might still 404. The "python-openid" package was linking to a 404ing URL a few weeks ago, which broke my own django-openid package and meant I couldn't deploy updates to one of my sites using my Fabric deployment script.

At work, the first line in our requirements.txt file is "-f http://pypi.internal/" - which causes pip to use our own internal PyPI mirror (just an Apache directory listing which pip/setuptools is smart enough to scrape).

For my personal projects I haven't set this up yet, but I've experimented with putting full URLs to copies of the .tar.gz files that I host on my own server directly in to the requirements.txt file.

To my mind, any deployment script that can fail if a server under someone else's control throws a 404 error is a bad thing. I haven't got the perfect solution yet, but it's definitely something that the Python and Django communities would benefit from solving.

Zellyn Hunter:

We're doing something similar to Simon here at work. We're also mirroring git, svn, and mercurial repositories for those things that we install directly from source.

An easy-to-setup PyPI clone that also knew how to download from the real PyPI once and then cache indefinitely would automate all of this headache.

John Keyes:

I've tried the full URL to a .tar.gz approach but it lacks version support. Why not force packages registered in PyPi to also host the package there?

Anders Pearson:

I have a 'requirements/' directory that gets checked into every project with tarballs of every single library that it uses (even Django) and a bootstrap.py script that, when run, creates a virtualenv and installs all the tarballs from the requirements directory into it. manage.py and my .wsgi file all point at the ve/bin/python executable instead of the system one. My deployment system runs the bootstrap.py automatically on a push. This keeps me from ever having to worry about PyPI (or even a mirror on a local server) being down. It also ensures that I can have multiple projects all deployed on the same server and they can each have different versions of the same libraries without worrying about conflicts. The cost is a bit of extra disk space and bandwidth, but I find that it's trivial compared to the peace of mind of never having my deployments fail and never having to deal with version conflicts between projects deployed to the same server.

John Keyes:

Anders, I do something similar but I use pip to install the requirements. Having the files locally also means I lose the version support of PyPI (or maybe I'm missing something).

I like Simon's internal PyPI mirror idea. I'd get the best of both worlds then.

Marius Gedminas:

There's a typo in the footnote: "last-modofied".

PJ Eby:

Wow, who died and put you in charge of easy_install's development plans? For your information, PEP 381 is on my radar and I've been investigating how to implement a fail-over mechanism.

I'd also appreciate it if you didn't spread FUD about the state of setuptools' development.

Mat Lehmann:

I just recently explored the use of collective.eggproxy - and it works quite nice. But since it only caches packages, that were asked for before, it is mainly useful to ensure the setup and deployment of a product with known dependencies. But thats at least a very important point - to be able to install a product, even if PyPI is down.

Anders Pearson:

John Keyes, what exactly do you mean by "lose the version support of PyPI"?

Dominic Mitchell:

Simon, you're worried about a server somewhere giving you a 404? Fallacy#1. http://en.wikipedia.org/wik...

matt harrison:

Thanks for the post. I've been trying to migrate to automated pip installs (from manual virtualenv) and have been hitting some bumps (upgrades not working, wrong version being installed, ...). This has given me the idea to just upload the sources and then start a little webserver. Perhaps that will cure my ails...

Anders Pearson:

To follow up a bit, here is an example of my bootstrap.py: http://github.com/thraxil/s...

You can see that it uses pip to install (and I keep a local copy in the same directory so python-virtualenv and python-setuptools are the only Ubuntu packages that I have to have globally installed to bootstrap). (it also installs one egg since I can't seem to get that one to compile reliably for some reason).

The bootstrap.py and requirements directory are automatically set up for me when I create a new project. I use this paster template: http://github.com/ccnmtl/cc... and do 'paster create --template=ccnmtldjango' instead of startproject.

Jacob Kaplan-Moss:

Phillip, please watch your tone here. This is my blog, and I don't have to tolerate hostility here.

I'm not "in charge" of anything. I'm a user, not a developer, and I'm trying to read the tea leaves and figure out where things are going. Distribute, Pip, and Buildout all have vibrant communities, multiple committers, responsive and kind maintainers, and clear forward motion.

Setuptools has you, a single maintainer with a track record of empty promises and long absences. Further, the Python community is ill-served by having a fork between Setuptools and Distribute. Again, Distribute has a large community, an active maintainer, and implicit -- if not explicit -- community blessing.

As a user, which would you choose? Which would you bet your future on?

PJ Eby:

I'm not sure why you're telling me that the community is ill-served by a fork, since I didn't fork it. I also personally think that there are some good qualities in Distribute, as you will see here:

http://mail.python.org/pipe...

So, there are definitely some things that Distribute has moved forward on. Unfortunately, bug fixing is not one of them! The post I linked to above was towards the end of a rather long thread about a bug that was fixed in setuptools 9 months ago, but as of this month, had still not been fixed in Distribute.

(And, if you look at the full thread, you will see that "responsive and kind" is not really an apt description of how Distribute's lead handles public criticism.)

All that being said, I'll be honest with you here: I do regret the tone I took with you and in fact couldn't get to sleep last night when I thought back on it and realized I was allowing my frustration with other people to spill over onto you, and that I should've considered my words more carefully.

I also realize that I've become far too easily trolled on this topic, and that I'm allowing partisan criticisms to get the better of me.

The problem is that I'm in a damned-if-you-do, damned-if-you-don't situation. If I release updates, I'm accused of doing something mean to Distribute. If I don't release updates, I'm being a bad maintainer. If I announce my development plans, I'm trying to divert effort from Distribute. If I don't announce my plans, I'm abandoning my users.

Until now, though, it hadn't really sunk in that that means I should probably just do whatever the hell I want and pretend Distribute doesn't exist, since no matter what I do, *some* of its partisans will find a reason to complain, and a way to paint my actions as evil.

That being the case, I might as well stop trying to find ways to make them realize I'm not actually evil, and instead just do whatever I would've done if Distribute did not exist.

Paradoxically, Distribute, or more precisely, its FUDslingers, have had a *negative* impact on setuptools' release frequency. There have been several times in the last two years that I had the opportunity to work on a new release, and didn't, simply because contemplating the probable round of fresh criticism put me off the idea.

If it were purely a technical competition, that wouldn't bother me. But being branded as evil for releasing updates to software, that those same people *just* complained about not being updated often enough, well, that really yanks my chain.

Sorry to subject you to the crossfire, although, again, I'd really rather people promoted Distribute by talking *up* Distribute, rather than talking *down* setuptools. Surely if *I* can find positive technical points to say about Distribute, so can you.

Jacob Kaplan-Moss:

You're completely right that the right way to promote Distribute -- or anything -- is to talk about its benefits, not the alternative's flaws. I'll try to do that.

I do think you're wrong that you're in a damned-if-you-do, damned-if-you-don't situation. The fact is that the momentum has moved on from Setuptools to Distribute, so, yes, you will be fighting upstream whether you work on Setuptools or not. There is, however, a third way: cooperate with Distribute and end the fork. As long as there's a fork it's bad for the community, so you could be the hero who forgoes his ego, cooperates with the competition, and betters the whole community.

I really wish we didn't have to have "partisans" and "sides" when it comes to Python. It fractures the community, drives otherwise polite people towards impolite behavior, and really harms the casual user. I hope that this period of conflict eventually resolves quickly.

PJ Eby:

On at least two previous occasions, I've attempted to negotiate an official handoff of the setuptools 0.6 line to the Distribute team... both of which were brought to a halt by name-calling from the Distribute side.

So, it may be that there are *individuals* who want things to go well, but collectively, the Distribute team is not in agreement as to what they want from me. Some want me on the team, and some demonstrably do not.

So, AFAICT, it's still slammed-if-I-do, slammed-if-I-don't. If I do the work myself, I'm being "closed". If I offer to bless the fork or merge the team, my "motives are suspicious" and I'm "trying to steal their thunder".

"Responsive and kind" is a definite misnomer for this particular part of the Distribute team. Had the parties in question not been a part of the Distribute team, the fork would've been over almost before it started, and I'd have been working on the 0.7 line in my spare time these last couple years, instead of shuddering every time I think of updating 0.6, let alone imagining the conniptions and accusations I still expect to recieve from that quarter the minute I roll out 0.7a1, or even announce a firm plan to do so.

On the bright side, this conversation has at least made me realize that it's silly to hold up setuptools development on account of somebody else's whinging. There may be nothing I can do about the whinging either way, but I can certainly do something about my reaction to it.

And, on the whole, I think I'd rather be criticized for making new releases than for *not* making them.

Jacob Kaplan-Moss:

This whole situation just gives me heartburn, so I'm going to stop discussing it now.

My opinion is that there's too much ego here; someone needs to be the bigger man, offer to drop their ego, and submit patches to the other that'll end the fork.

Until that happens, I'm going to continue to look for pragmatic, real-world solutions. For me, that's picking a winner, and for me that winner's Distribute.

Brett Haydon:

For development pypi availability is like not getting your fix..the withdrawal symptoms will pass. For production running pip bundle -r [requirements.txt] beforehand (when you do have availability) means you're not dependent on pypi on production/staging deployment. The great thing about the pip bundle is that it nicely freezes your package versions into a pip-manifest in the bundle, meaning you can leave your requirements file fairly loose, and it's also a lot faster to install.

John Keyes:

Anders, if I refer to a .tar.gz in a requirements file pip will always install it, regardless of whether it has already been installed. By using package==version pip will not install that version if it's already in my virtualenv.

Adam Groszer:

Just simply setup your in-house pypi mirror+private egg server from: http://pypi.python.org/pypi...

Drew Engelson:

In addition to pypi itself being down, we also suffer when pypi points to download links that may be down as well. This was especially problematic in our automated cloud-based systems which depend heavily on pypi.

As a result, I've implemented my own mirror using z3c.pypimirror, and then used the same techniques you describe above to use my local mirror instead of public pypi. The entire repo is quite beefy, but you can configure it to just pull in the packages you need.

Andreas Jung:

[buildout]
index = http://d.pypi.python.org/si...

is not sufficient for buildout. Buildout will still try to contact PyPI in the first place. You can block those PyPI lookups by using the allow-hosts option of zc.buildout...but this has side-effects. Right now official PyPI mirrors don't help much.

Charlie DeTar:

Thanks, this was helpful. Using pip 0.3.1 (the current stable for Ubuntu), setting configuration in ~/.pip/pip.conf did not work for me (pip still used the default servers, not a mirror). However, providing the `-i` switch with the mirror URL worked fine.

Alfredo Deza:

This is not a good situation. It is embarrassing for the Python community to go through things like this. In the meantime, I spent a couple of hours writing an outgoing self-balancer for PYPI, that will figure out if PYPI is up or not and forward your request to the next mirror.
http://code.google.com/p/yo...

I truly hope for this to be over soon (and I so wish Ian's idea of App Engine goes through)

Steve Holden:

If there is friction between the setuptools and distribute maintainers that's a pity, and it will hurt the Python community as a whole whichever side it comes from. I hope both sides, if there have to /be/ sides, will exercise moderation.

Setuptools was a brilliant idea, and the community owes Phillip a debt of gratitude for developing it. He put a lot of hard work into proving the concept's practicality. The fact that there is now a fork underlines the desirability of the goal.

The real issue, I believe, was the long hiatus where setuptools was left in an incomplete (usable, but with known bugs that received no attention) state at around r0.6. So Phillip is paying for that period of negligence rather than anything that's happened more recently, since I have seen evidence that he is once more actively developing the code base.

As far as I know, since the PyCon 2010 Language Forum the official policy has been to deprecate setuptools so that it can be replaced by setuptools2 in some distant version, but I am not close to this topic so I could be mistaken.

The PSF would like to provide a better infrastructure for PyPi/the cheeseshop. It seems desirable long-term to fold the actual tarfiles into the redundant infrastructure that gets developed, to avoid reliance on individual developer web sites that don't offer redundancy. I don't believe PEP 381 has much to say about that.
