Enum Types in Postgres
This is another in the series “Yeah, Postgres can do that.” You can check out the first post about Postgres history tracking.
The Rails framework, including Active Record, is built so that you can switch from one database platform to another. If you want to switch from Postgres to MySQL, in theory we only have to update a few lines in a configuration file.
It’s a nice sentiment, but in my seven years as a web developer I have never seen this happen nor do I know anyone who has done this. In my experience it’s actually more common that Postgres remains in place and Rails is the thing that gets swapped for something else such as NodeJS, Python, or Elixir.
If we accept that Postgres is not going anywhere, we may as well take advantage of its rich features, such as the enum type.
Enum Types 101
An enum type is a data type whose value is chosen from a finite enumerated list of possibilities. Imagine a website with a job board. Each job application can have a status of approved
or denied
. We could store those values as plain text, but that requires more storage space than necessary. If there are only two possible choices, we can store that information with a single bit. We should also not be allowed to store anything other than approved
or denied
in that column. Enum types provide two features: data validation and small storage space.
Active Record Enum Type
Active Record implements its own enum type. It’s designed to work on all database platforms including the simplest of platforms such as SQLite. The value is stored in the database as an integer and then Ruby converts it to and from its plain-text value.
class JobApplication < ActiveRecord::Base
enum status: [ :approved, :denied ]
end
In the above example, if a job application is approved
, the status
is stored in the database as 0. If it’s denied
, the status
is stored as 1.
Using this approach has some drawbacks. First, if your company or client has a data analytics team they will despise you. Often the business analytics teams need to run straight SQL queries:
-- The business analyst needs to remember 1 equals "denied"
SELECT COUNT(*)
FROM job_applications
WHERE status = 1
Wouldn’t it be nicer if they could do this instead?
SELECT COUNT(*)
FROM job_applications
WHERE status = 'denied'
Furthermore, the data risks being corrupted by a faulty change in the code. Imagine we later decide to add a new status, pending
, but we accidentally do it like this:
class JobApplication < ActiveRecord::Base
enum status: [ :pending, :approved, :denied ]
end
Now our data will be corrupted. Applications that were previously approved were saved with status
equal to zero. Now they will show up as pending
. So if we catch this error, all we have to do is revert it, right? Not so fast. What happens to data that was saved around the time of the deployment? Was status
= 0
meant to be approved
or pending
? Depends on whether that value was saved shortly before or shortly after the deployment, which is difficult to tell.
In a perfect world no developer would ever make this mistake, but mistakes happen. A robust system means that a mistake is an annoyance, not a catastrophe.
Postgres Enum Type
What if we could have our cake and eat it too? What if the data appeared as plain text in Postgres but its values were validated and efficiently stored? That’s where the Postgres enum type comes into play.
Creating a New Type
Postgres allows you to define custom types. As soon as the type exists, we can add a column to an existing table using this type.
CREATE TYPE type_status AS ENUM ('approved', 'denied');
ALTER TABLE job_applications ADD COLUMN status type_status;
Now Postgres will figure out the most efficient way to store these values. It will also raise an error if anybody tries to store an invalid value.
In fact, newer versions of Rails migrations allow you to generate enum types automatically. Just be aware that those migrations are not yet reversible, so some custom SQL commands are still necessary.
Modifying an Existing Type
Let’s say we decide later on that we want to add a pending
status:
ALTER TYPE type_status ADD VALUE 'pending';
If we want to remove it later on (either in a down
migration or a reversible migration):
ALTER TYPE type_status REMOVE VALUE 'pending';
We can even rename types later on, converting existing values:
ALTER TYPE type_status RENAME VALUE 'pending' TO 'needs_review';
Integrating Postgres and Active Record Enums
Once we have run the migrations, we can replace the array in Postgres with a hash where each value equals its key.
class JobApplication < ActiveRecord::Base
enum status: {
approved: 'approved',
denied: 'denied',
pending: 'pending'
}
end
Now we have two layers of validation. The first is at the Rails level and the second is at the Postgres level.
In general, Rails validation errors make for a nicer experience for the user. If Postgres throws an error the user will get a frustrating “500: Something went wrong” error. That’s still not as bad as a database with corrupted data.
Optional: Ditch Active Record Enum
One of the signature traits of software engineers is that we love to disagree. The Active Record enum types provide a host of helper methods. Many of my talented colleagues at The Gnar disagree with me on this one, but I think it’s way clearer to write something like this standard Active Record query:
# This is standard Rails
job_application.update!(status: 'denied')
than to use the helper method created with enum:
# This is the enum method:
job_appication.denied!
and this code:
JobApplication.where(status: 'denied')
Is at least as intuitive, if not more so, than this:
JobApplication.denied
In my humble opinion the convenience methods provided by Active Record enums don’t justify the confusion they create. But before we ditch Rails enums, we should remember that enum provides some very nice data validations. No need to throw the baby out with the bathwater. Luckily ActiveModel::Validations
, which is part of ActiveRecord::Base
, gives us a clean way to validate our data.
class JobApplication < ActiveRecord::Base
validates :status, inclusion: { in: ['approved', 'denied', 'pending'] }
end
Then we can leave all that enum magic to Postgres.
Conclusion
A colleague of mine used to say “Postgres can probably do the thing I want. I just haven’t gotten to that part of the documentation.” This is just one of the many rich features available by default to us Rails developers. In a future blog post, I’ll cover Postgres’s full text search feature.