Day2 -> Learning Databases

Remember 90% of battle is won, when you select correct database for your system.

There are 2 types of Databases Relational(SQL) and Non-Relational(NoSql). In todays post we will learn about Relational Databases, ACID properties & Scaling them.

RELATIONAL DBs

Relational DBs were originally created for financial purposes. In 90's there were financial companies had ledgers that were filled by people to mantain financial records and it was tiresome as well as error prone.
So Relational DB's were inspired from those ledgers like rows and columns. They had to follow all the properties those ledgers had.
All the properties provided by Relational DB's can be generalised to ACID properties :

A -> Atomicity
C -> Consistency
I -> Isolation
D -> Durability

ATOMICITY

This simply means that if you have multiple queries under a transaction, either all of then will be executed or none of them will be executed.

Eg. Let's say Person A is transferring 50₹ to Person B. Now we have 2 statements to execute, one would deduct 50₹ from Person A and add 50₹ to Person B. So, either both of them will have to be executed or none of them should be executed to mantain consistency between data.

CONSISTENCY

This simply means that Relation DB's provides lots of tools and features to make sure that data is consistent. Some of the features are like Foreign Keys, Constraints, Cascades, triggers and many more.

ISOLATION

This simply means that Relational DB's provide multiple levels of transparency between two transactions.

Eg. Let's say Person A and Person B are running transactions at the same time on the same data, Now we can decide on our requirements when, where or how much transparency is there about changes.

Isolations have 4 levels

READ UNCOMMITTED
This isolation level allows dirty reads, i.e., where a transaction may see uncommitted changes made by some other transaction. This means values in the data can be changed and rows can appear or disappear in the data set before the transaction completes.
In general, don’t use READ UNCOMMITTED. The one and only time it ever makes sense is when you are reading data that will never be modified in any way. For example, if you need to scan a large volume of data to generate high-level analytics or summaries, and absolute moment-of-query accuracy is not critical.
READ COMMITTED
It is the default isolation level in most Postgres databases, and older SQL databases in general.
It restricts the reader from seeing any intermediate, uncommitted, ‘dirty’ read during a transaction and guarantees that any data read was committed at the moment it was read.
REPEATABLE READ
REPEATABLE READ is the ideal isolation level for read-only transactions.
REPEATABLE READ would be a good choice for a financial app that calculates the total balance of a user’s accounts. This isolation level ensures that if a row is read twice in the same transaction, it will return the same value each time, preventing the nonrepeatable read anomaly mentioned above. In this case, when calculating a user’s total balance, REPEATABLE READ guarantees that the balance does not change during the calculation process due to concurrent updates.
SERIALIZABLE
SERIALIZABLE is the highest level of isolation (and the default isolation level in CockroachDB).
Transactions are completely isolated from each other, effectively serializing access to the database to prevent dirty reads, non-repeatable reads, and phantom reads.

SCALING DBs

There are 2 types of scaling - VERTICAL & HORIZONTAL.

VERTICAL SCALING

Vertical scaling is simpler and should always be the first option in terms of scaling.
Vertical scaling is adding more power like increasing RAMs, CPU or PROCESSOR power, MEMORIES etc.
Vertical scaling is limited by hardware, like whatever source used will have certain hardware limit.

HORIZONTAL SCALING

We know that 90% of the time we perform read operations. So, for these read operations we can have seperate db which handles only read operations.
We can do sharding, partitioning and many more things for scaling.

We will read SHARDING, REPLICATION and PARTITIONING in the next blog.

If you love posts like this, and think its beneficial, do follow and if any confusion do post any questions in comments.

Day1-100 series - Learning System Design for Interviews.

Day2 -> Learning Databases