Naming Surrogate Primary Keys

db-schema

It’s one of those questions of style that aren’t always possible to be objective about. Often they come down to personal preference. But we still like to think we have decent enough reasons for our choices. So we try to justify them, often to ourselves but occasionally to everyone else. Sometimes we just darn call out anyone who disagrees and tell them we think they are foolish and uneducated. And a flame war ensues. I am not trying to start one now.

Primary key politics

I’ve designed a good number of databases in my time and, needless to say, confronted some of the more divisive primary key politics on many occasions. Surrogate or natural? Integer or UUID? Lookup table or simple column? Composites are the work of the devil…

But today I want to talk about something more benign: How to name a primary key. And I have to tell you from the onset that I have mostly been firmly planted in the surrogate camp from the day I designed my first production database. We have to get that out of the way otherwise nothing else in this post will make sense.

I could tell you a lot about why I love surrogates, but that’s not the goal of my post today. I have to tell you though, that I like how they jump out and grab your attention the moment you open a table. Because often they have beautiful, predictable and intuitive names: id, item_id, uuid, item_uuid… Placed at the beginning of a table, columns like these are like a skimpily dressed, smartphone-wagging young woman sashaying into a restaurant. Vain she may be, but she is impossible to miss. Natural keys, on the other hand, are not unlike the nondescript damsel curled up in a corner, all dignified and covered up and reading a copy of the Business Daily. Sure she may be full of substance, inner beauty and all that is good and holy, but it takes a good amount of prowling to know that she’s even there.

To prefix or not to prefix

But if surrogate key names are so beautiful, predictable and intuitive, what is this post for? Well, it turns out that the choice between item.id and item.item_id isn’t immune to the emotional flare-ups and touchiness of software developers and data modelling specialists. I have personally always preferred that the primary key be a single short name such as item.id rather than the uselessly repetitive and verbose table name plus short name variety of the item.item_id kind. However, I do use this later notation for foreign keys, where I think it make sense since the referenced table is not otherwise obvious.

In my experience, those who favor the shorter primary key approach are the minority. To be sure, proponents of the longer primary key nomenclature generally apply it also for their foreign keys, creating a welcome point of convergence for both camps. They argue that they end up with the same corresponding primary and foreign key names, which makes join queries a lot more joy to read and write. But I say what, pray tell, do you gain by writing item.item_id = stock.item_id that you don’t by writing item.id = stock.item_id?

Some will say that you don’t always include the table name when mentioning a column in an SQL query. I say that is bad practice and you really should be including the name of the table. For really huge queries, the last thing a reader wants is to have to sift through the inconsistency of some column names being table-prefixed while other column names (presumably the unambiguous ones) are left un-prefixed. Many times the reader of your queries will scarcely be familiar with your database and will appreciate the ability to instantly recognize a column as belonging to a particular table.

How do you name your (surrogate) primary keys?