How to Design a Clean Database

Mohammad Faisal - Jul 21 '23 - - Dev Community

To read more articles like this, visit my blog

No matter what kind of developer you are, every once in a while, we come across an API which returns data in such a way that we don’t have to spend much time understanding it.

But generating this type of clean and consistent result takes time, effort, and experience. Today we will take the first step towards designing a clean database.

We are keeping it short and to the point. Let’s start

Some Terminology

Table: this is a collection of data

Primary Key: This is the unique identifier of a table

Attribute: means property of your data. For example, name is an attribute of a user .

Data Type: Data types represent the various types of your data. For example -string, int, timestamp, etc.

1. Words should be underscore separated

When your attribute name has more than 1 word, then separate it with snake_case. Don’t use camelCase or any other case for consistency.

bad

wordcount or wordCount
Enter fullscreen mode Exit fullscreen mode

Good

word_count
Enter fullscreen mode Exit fullscreen mode

Reason

  • Improves readability

  • Names can become more platform-independent

2. Data Types Should not Be Names

Never have data-types as your column name. This happens mostly for timestamp parameters. Give a meaningful name to it.

Bad

timestamp or text
Enter fullscreen mode Exit fullscreen mode

Good

created_at or description
Enter fullscreen mode Exit fullscreen mode

Reason

  • Using data types can create confusion on the other end of the application.

  • Giving a proper name gives more context to the usage of the parameter.

3. Attribute names should be lowercase

Don’t use upper-case names for your attributes.

Bad

Description
Enter fullscreen mode Exit fullscreen mode

Good

description
Enter fullscreen mode Exit fullscreen mode

Reason

  • This practice avoids confusion from upper-case SQL keywords

  • It can improve typing speed

4. Write Full Words

Don’t try to shorten the names of columns for the sake of space or any other logic. Try to be as explicit as possible.

Bad

mid_name
Enter fullscreen mode Exit fullscreen mode

Good

middle_name
Enter fullscreen mode Exit fullscreen mode

Reason

This rule promotes self-documenting design

5. But use common abbreviations

An exception of rule-4 is when you have a widespread abbreviation. In those situations, go for the short one.

Good

i18n
Enter fullscreen mode Exit fullscreen mode

But if you find yourself in confusion, go for the full name. It’s an investment you are making for the future.

6. Avoid having numbers in a column name

Believe it or not, I have seen it enough. Never have numbers in your column name.

Bad

address1 , address2
Enter fullscreen mode Exit fullscreen mode

Good

primary_address, secondary_address
Enter fullscreen mode Exit fullscreen mode

Reason

This is a sign of very poor normalization on your end. So try to avoid it as much as possible.

7. Use short table names

Be very careful when naming tables because long table names can have a huge bad impact in the future.

Bad

site_detail
Enter fullscreen mode Exit fullscreen mode

Good

site
Enter fullscreen mode Exit fullscreen mode

Reason

Short table names will help you when you create relational columns and linking tables.

8. Lookout for reserved words

Each database has some reserved words. Learn them and avoid them.

Bad

user lock table etc
Enter fullscreen mode Exit fullscreen mode

List of reserved words for some popular database

9. Singular names for tables

Always try to use singular names for tables. This is a controversial one, and different people have different opinions. But stick to one.

Bad

users and orders
Enter fullscreen mode Exit fullscreen mode

Good

user and order
Enter fullscreen mode Exit fullscreen mode

Reason

  • This promotes consistency with primary keys and lookup tables

  • Pluralization can be tricky sometimes. So having singular table names can make it easier to program.

10. Linking tables should have alphabetical order

When creating a junction table, concatenate the names of the two tables in alphabetical order.

Bad

book_author
Enter fullscreen mode Exit fullscreen mode

Good

author_book
Enter fullscreen mode Exit fullscreen mode

11. Singular Column Names

Usually, it’s the best practice unless you are breaking data normalization rules.

Bad

books
Enter fullscreen mode Exit fullscreen mode

Good

book
Enter fullscreen mode Exit fullscreen mode

12. Primary key name

If it’s a single column, then it should be named as id

CREATE TABLE order (
  id            bigint PRIMARY KEY,
  order_date    date NOT NULL
);
Enter fullscreen mode Exit fullscreen mode

13. Foreign key name

It should be the name of the other table and the referred field. For example, if you are referencing a person inside your team_member the table then you can do it like this.

CREATE TABLE team_member (
  person_id     bigint NOT NULL REFERENCES person(id),
);
Enter fullscreen mode Exit fullscreen mode

14. Never suffix column names with types

There is no point in suffixing your column names with types of data. Avoid doing this.

Bad

name_tx
Enter fullscreen mode Exit fullscreen mode

Good

name
Enter fullscreen mode Exit fullscreen mode

15. Indexes should have both table and column name

If you are creating an index, then have the table name followed by the column names that you are referencing

CREATE TABLE person (
  id          bigserial PRIMARY KEY,
  first_name  text NOT NULL,
  last_name   text NOT NULL,
);

CREATE INDEX person_ix_first_name_last_name ON person (first_name, last_name);
Enter fullscreen mode Exit fullscreen mode

16. Date type column names

Suffix your date-type column names with _on or _date .

For example, if you have a column for storing the updated date, then do this,

Good

updated_on or updated_date
Enter fullscreen mode Exit fullscreen mode

17. Date-Time type column names

If your column name has time with it, then suffix them with _at or _time .

For example, if you want to store the order time, then

Bad

ordered
Enter fullscreen mode Exit fullscreen mode

Good

ordered_at or order_time
Enter fullscreen mode Exit fullscreen mode

18. Boolean type column Names

If you have boolean type column names, then prefix them with is_ or has_ .

Good

is_admin or has_membership
Enter fullscreen mode Exit fullscreen mode

Final Words

If you are already working on a project, stick to the convention that the project is already following. Because

Only thing worse than a bad convention is multiple conventions

But if you are learning or designing a database from scratch, having these rules in mind will take you a long way.

What are your thoughts? Is there any rule you disagree with? I am more than happy to have some productive conversations in the comment section!

Have a great day! :D

References

Get in touch with me via LinkedIn or my Personal Website.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .