Google+ Followers

zaterdag 23 november 2013

To ENUM or not to ENUM?

An ENUM restricts the allowed values for a column to a hardcoded set. This sounds exactly like what a foreign key does, but there are a few significant differences.

Firstly, an ENUM is part of the database schema, not of the data stored *in* the schema. It is not meant to be changed by a user, in fact it is supposed to be impossible for anyone other than the administrator to change the values.

Secondly, a column that is controlled by an ENUM will sort it's values accoring to the structure of the ENUM, not according to the actaul value. If the ENUM defines 'Zed' before 'Cucumber' then order-by will return 'Zed' before 'Cucumber'.

So why would you use an ENUM if it's not easy to change and doesn't sort properly?
Well, that's exactly why: the options are not going to be changed by a user and the sorting is hardcoded. This means that the rest of the database can make assumptions about this without having to look values up and manually sort them.

How does ENUM work in real life?

For MySQL the answer is simple: it doesn't. MySQL's implementation is very bad, it does not even check for duplicate values in the declaration:
mysql> create temporary table foo (a ENUM('a','a'));
Query OK, 0 rows affected, 1 warning (0.01 sec)
And removing existing options from an ENUM... well best not to think about that because MySQL wouldn't be MySQL if it didn't just NULL the fields that point to the removed value (unless it's NOT NULL, in which case... well.. bye bye data).
MySQL has no alternatives besides messing about with triggers, so use a lookup table instead.

In PostgreSQL the ENUM is enforced properly and it is done trough a custom datatype so you can re-use the definition in all tables that need it, eliminating the chance of discrepancies. as per the manual:

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
name text,
current_mood mood

The ENUM has one property that is often unexpected (and undesired) which is that they sort by the order of the values in the ENUM definition, so in this example an "ORDER BY currentmood" would list 'sad' first, then 'ok' and then 'happy'.

Of you do need ordering to work normally, you can use a CHECK constraint to simply see check if the value is IN('sad','ok','happy'). And you can make that re-usable by creating a DOMAIN for it:

CREATE DOMAIN current_mood_domain AS TEXT
   VALUE IN ('happy', 'very happy', 'ecstatic')
CREATE TABLE foo (current_mood current_mood_domain);
INSERT INTO foo(current_mood) VALUES ('happy');
INSERT INTO foo(current_mood) VALUES ('angry');