Why do I put a table between the images and the data
If you have a table database, you might encounter a situation where you need to put images in that database. It is common to put the images in a separate database that is more suitable for images and other files. You then put a reference to the image in your table — a direct approach.
However, I suggest creating a separate table for all your images and referring to this table instead. This makes the location of the image way more dynamic.
A simple database with 2 tables both storing images. The tables make direct reference to the images
A simple database with 2 tables both storing images. The tables refer to a table of “images”. This table in turn refers to the image locations
Easier deleting of redundant images
If you want to have a good example of why you need to delete unused images, read this story by Alexandre Olive called: cloud buckets don’t have to be an upload and forget mess
It is very easy to lose track of where images are used, but by leveraging the foreign keys of Postgres it is fairly straightforward.
You can find every place an image is used with the query below. With a bit of code or even more SQL query magic, you can even find out if a specific image is still being used.
This will be something you run every month or so in a cron job making it less problematic if it takes a bit longer to run.
SELECT
tc.table_name AS referencing_table,
kcu.column_name AS referencing_column,
ccu.table_name AS referenced_table
FROM
information_schema.table_constraints AS tc
JOIN
information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN
information_schema.constraint_column_usage AS ccu
ON tc.constraint_name = ccu.constraint_name
WHERE
tc.constraint_type = 'FOREIGN KEY'
AND ccu.table_name = 'images'
AND kcu.column_name IS NOT NULL;
Vendor lock-in
Vendor lock-in happens when you get stuck using a particular company’s products or services. It becomes hard or expensive to switch to a different company because you’re too dependent on what the first one offers. This can limit your choices and make it tricky to change to better or cheaper options.
To give you an idea of how much this can matter, Azure’s cheapest file storage is $10 TB/Month whereas BackBlaze is $6 TB/Month. This is a 40% cost saving on storage. Given how the additional cost of the services works, I had a personal experience where it totaled 80% cost saving on storage by changing vendors.
This is why you need to fight vendor lock-in as much as possible.
Vendor lock-in will also happen if you use a built-in feature to store image meta-data. Moving the data to a new vendor will increase your cost to move, meaning you are more locked in.
When you have a separate table for all the images you only have to change the references in one place to switch vendors. This can be done on everything in one go, but also can be done only for new images, or with numerous other methods. As long as the image table is up-to-date, you as a back-end developer can choose where data is stored.
Keep in mind that if you have private storage and you need to deal with keys, it might become a bit harder. It is a good idea to look into the S3 protocol, it has a wide adoption amongst storage vendors and can make switching easier.
Image transformation
If you let users upload any image they have you might get some users that upload raw or PNG images with 4k resolution. These might be a bit too much for you. Or as technology moves on, you might want to convert to a new image format. Maybe WEBP or AVIF hit the threshold on caniuse.com and you want to make the switch.
If the images are in one place you can just upload a new image, update the row, and delete the old image. Done!
Keep in mind that the front-end or application needs to be able to deal with the new image! This line might have been added because I made a front-end developer really angry with my storage savings.
Conclusion
In summary, putting a table between images and data in a database offers benefits like efficient image management, easier deletion of unused images, and flexibility in handling vendor changes. This approach simplifies tracking image references, reduces the risk of vendor lock-in, and streamlines updates, making database design more adaptable and cost-effective.