Fixing PostgreSQL’s default encoding on Ubuntu 9.10

Jacob Kaplan-Moss

December 12, 2009

Ubuntu 9.10 installs PostgreSQL with a default encoding of SQL_ASCII. This is dumb: SQL_ASCII basically means “I don’t care about the encoding of my data; just store garbage.”

This is especially annoying since PostgreSQL will actually prevent you from creating new databases with a different encoding: if you try, you’ll be told that the “new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII).”

My fix was to just blow away the default cluster and re-create it with a UTF8 encoding. It should go without saying that you shouldn’t do this on a server with data you’d actually like to keep, but on a new server this works fine:

pg_dropcluster --stop 8.4 main
pg_createcluster --start -e UTF-8 8.4 main

If someone knows of an easier way — one that doesn’t require destroying data — I’d love to know about it.

Comments:

Matt Howell:

I fixed this problem by rebuilding template1 with UTF-8 so that any new databases would be created using the proper encoding.

The basic steps in psql are here: http://wiki.postgresql.org/...

Just make sure you specify the encoding correctly when you recreate template1 -- like this:

create database template1 with template = template0 encoding = 'UTF8';

Ben Turner:

I just checked an install of 9.10 server edition I setup a couple weeks ago and it's default was set to UTF8 on postgresql 8.4. I don't recall doing anything special when I installed... odd. Is your setup on a desktop or server?

Brent Tubbs:

Gentoo does the same thing, unfortunately, as I learned when switching a Django project from Mysql to Postgres last week. Is it a built in Postgres default?

I still haven't switched my development DB over to UTF-8. Luckily my Webfaction host does the right thing and uses UTF-8, so my production server is happy.

Matt Howell:

I just came back across this and wanted to share --

The underlying problem with my Postgres encoding problem was that my locales weren't properly defined on my Ubuntu VPS. Postgres was apparently taking its cue from the OS, but the OS didn't have its default encoding set properly.

I ended up fixing that issue by following the instructions here: http://blog.peopleareducks....

tbielawa:

Holy crap! Some one found my website useful! I saw a bunch of referrals from jacobian.org. I'm glad I'm not the only one who benefited from that post on setting locales :)

Peter Baumgartner:

I was just bit by this today. I needed to keep some of the data around, so I used Matt's trick and then did a

pg_dump mydb > mydb.psql
dropdb mydb
createdb mydb
psql mydb < mydb.psql

From what I understand, Postgres handles the encoding conversion gracefully and it seems to work for me.

Leave a comment:

Use your real name, or risk deletion.

Optional.

No markup allowed. Linebreaks will be converted; links will be linkified.

Be nice; don't be that guy.