Hi everyone, I recently got hooked on system design and I want to share what I have learned. A lot of things started making sense once I took a deep dive into it, and I am very excited to share this with the world. Before we go deeper, let us understand what system design actually means.
System design is the process of designing systems that are scalable, reliable, maintainable and efficient, that tolerate faults, and that meet the business requirements. Every system is unique, so the design should be tailored to the specific situation. (I explain terms like scalable and reliable below, don’t worry.)
A design describes the components, APIs and data models of a system, along with the relationships and dependencies among them.
Obviously no design is perfect and there is always room for improvement, but we try to pick the solution that best fits our needs within our constraints, such as cost.
For example, if we have a lot of unstructured data, we might prefer a NoSQL database, or maybe a combination of different types of databases. Or if we are expecting a lot of data in blob format (images/videos), we might use object storage such as Amazon S3 buckets with a CDN in front. If these terms scare you, don’t worry, because I used to feel the same whenever I saw the word CDN. (It gets better.)
🤔 But why system design? Why can’t we just start coding?
You can, but it will take more time and more redos, and the result will be far less efficient and way more CHAOTIC.
Why would you hammer a nail with your hand when you can use a hammer? I know, lame example, but I hope you get the point. The difference between doing a huge task with planning and without it is insane. A well-designed system scales, performs better and makes your life easier as a dev.
Let us go through the building blocks, the basic stuff you need to know before attempting to design systems:
Scalable: Scalability means the system keeps functioning properly as it grows from x to 100x (or 200x) users, or as it processes larger volumes of data than before. In other words, a ‘scalable’ system is one that can handle scale.
Reliable: The system should perform as the user expects under all conditions. It should not break when the user makes mistakes, and it should be able to handle the expected load. Basically, it should work as expected in every scenario.
Maintainable: Once the system is up and running, the main task over the years is to maintain it and keep it running. Being maintainable means the system keeps functioning properly over time and adding new functionality is smooth.
Load balancer: Responsible for dividing incoming load among servers, so that the service keeps functioning correctly when the load increases. It distributes the requests coming from users across the different servers so that no single server takes all the load. This helps in achieving scalability.
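To make this concrete, here is a minimal sketch of one common strategy, round robin, where requests are handed to servers in a fixed rotating order. The class name and the server names are made up for illustration, and real load balancers also handle health checks, weights and much more.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order (hypothetical example class)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        # cycle() repeats the list forever: a, b, c, a, b, c, ...
        self._servers = cycle(servers)

    def next_server(self):
        # Each call returns the next server in the rotation.
        return next(self._servers)


# Server names are placeholders, not real hosts.
balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
```

Six requests end up spread evenly, two per server, which is the whole idea: no single server gets all the load.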
Caching: Caching means storing frequently needed data or computed results so that we do not need to query the database or recompute those results for each request. You can think of a cache as a smaller and faster storage layer. Common examples: Redis, Memcached. It reduces latency and improves user experience.
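Here is a small sketch of the idea, using a plain Python dict as the cache and a fake function standing in for the database. Everything here (function names, the user record) is assumed for illustration. In a real system the dict would be something like Redis or Memcached, and you would also need to think about expiry and invalidation.

```python
cache = {}                 # stand-in for Redis/Memcached
stats = {"db_calls": 0}    # counts how often we hit the "database"


def slow_db_lookup(user_id):
    # Hypothetical database query; imagine this takes tens of milliseconds.
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id):
    # 1. Try the cache first.
    if user_id in cache:
        return cache[user_id]
    # 2. On a miss, go to the database and remember the result.
    user = slow_db_lookup(user_id)
    cache[user_id] = user
    return user


get_user(42)   # miss: goes to the database
get_user(42)   # hit: served from the cache
print(stats["db_calls"])
```

The second call never touches the database, so the counter stays at 1.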
CDN: A CDN, or content delivery network, is a group of geographically distributed servers that cache content (such as images, videos and other static files) close to users. This lets us serve the same data quickly to a very large number of users, whether that is hundreds or billions.
API: An API, or application programming interface, is a defined way for clients to communicate with backend servers, or for one server to communicate with another. REST APIs are a common choice in modern system designs.
DNS: DNS, or Domain Name System, translates domain names to IP addresses so browsers can load Internet resources. In simple words, a domain name is the public address of the server in human-readable form.
Databases: Storage for data so that we can access it later. When you are designing your system, you do not want to worry about the low-level details of how and where the data is stored, so we depend on databases, or database management systems. We have a wide range of databases, and we choose among them according to the type and amount of data and our budget. A lot of times we use a combination of different database options to fulfill our needs. Examples include PostgreSQL, MySQL, MongoDB, Amazon DynamoDB and Cassandra.
Database Sharding: Dividing your database into shards, or partitions, can make queries faster because each query now has to deal with less data. For example, if 4 million rows are divided evenly into 4 shards, a query that targets a single shard reads from just 1 million rows. We also need a query routing mechanism that decides which shard holds a given piece of data.
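One simple way to do the routing is to hash a key and take it modulo the number of shards. The sketch below assumes 4 shards and uses plain dicts as stand-ins for the shard databases. The function names and keys are made up, and real systems often prefer schemes like consistent hashing so that adding a shard does not move most of the data.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, matching the example above

# Each dict stands in for one shard (a separate database in real life).
shards = {i: {} for i in range(NUM_SHARDS)}


def shard_for(key):
    # Use a stable hash. Python's built-in hash() for strings changes
    # between runs, which would send the same key to different shards.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    # Only one shard is consulted, not the whole dataset.
    return shards[shard_for(key)].get(key)


for n in range(1000):
    put(f"user:{n}", {"id": n})

print(get("user:123"))
print({shard: len(rows) for shard, rows in shards.items()})
```

The second print shows how many keys landed on each shard. With a decent hash the counts come out roughly even.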
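Messaging queues: Helpful for asynchronous flows, and they make the system more fault tolerant. A producer puts messages on the queue and a consumer processes them later, so the producer does not have to wait, and messages can sit in the queue if the consumer is temporarily down. Examples: RabbitMQ, Amazon SQS. (Read about them.)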
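To get a feel for the asynchronous flow, here is a tiny in-process sketch using Python's standard `queue` and `threading` modules. This is only an analogy: RabbitMQ and Amazon SQS are separate services that work across machines, and the "send email" job and addresses below are made up.

```python
import queue
import threading

tasks = queue.Queue()   # stand-in for a real message queue
processed = []


def worker():
    # Consumer: picks up messages whenever they arrive.
    while True:
        message = tasks.get()
        if message is None:          # sentinel value: time to stop
            tasks.task_done()
            break
        processed.append(f"sent email to {message}")
        tasks.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# Producer: drops messages on the queue and moves on without waiting
# for each email to actually be sent.
for address in ["a@example.com", "b@example.com"]:
    tasks.put(address)
tasks.put(None)

tasks.join()        # wait until every message has been handled
consumer.join()
print(processed)
```

The producer and the consumer only share the queue, so either side can be slow, restarted or scaled without the other one knowing about it.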
Rate limiter: No matter how scalable your system is, there is always a limit. Suppose a malicious user sends too many requests: at some point your service will start overloading or malfunctioning, and we do not want that, because we want our users to be able to rely on our system. Rate limiting means limiting the number of requests per second or per minute according to our needs. There are multiple ways to do it. For example, we can limit the number of API calls made to our backend per second from a single IP address.
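Here is a minimal sketch of the per-IP idea using a fixed window counter, which is one of several possible algorithms (token bucket and sliding window are other common ones). The class name, the limit of 3 requests per second and the IP address are all assumptions for the example. The clock is passed in so the example can use a fake one and stay predictable.

```python
import time


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per IP in each time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}   # ip -> [window_start, request_count]

    def allow(self, ip):
        now = self.clock()
        window = self._windows.get(ip)
        # First request from this IP, or the old window has expired:
        # start a fresh window.
        if window is None or now - window[0] >= self.window_seconds:
            self._windows[ip] = [now, 1]
            return True
        # Still inside the window: allow only while under the limit.
        if window[1] < self.limit:
            window[1] += 1
            return True
        return False


# A fake clock we control by hand, just for this demo.
fake_now = [0.0]
limiter = FixedWindowRateLimiter(limit=3, window_seconds=1.0,
                                 clock=lambda: fake_now[0])

results = [limiter.allow("203.0.113.7") for _ in range(5)]
print(results)                        # first 3 allowed, last 2 rejected

fake_now[0] = 1.0                     # one second later
print(limiter.allow("203.0.113.7"))   # allowed again in the new window
```

In a real service the rejected requests would typically get an HTTP 429 (Too Many Requests) response, and the counters would live in shared storage so that every server sees the same numbers.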
Microservices and monolithic architecture: In simple words, a microservices-based architecture has different independent services for different functions. The code is loosely coupled, flexible, and easier to maintain and update. A monolith is the opposite: one whole service contains all the parts. It is often faster and simpler to develop and easier to debug, and you might have seen it a lot in open source or personal projects.
Logging: A mechanism to keep a record of what happened in our system, so that we can debug later in case the system runs into an issue. For example, when a metric spikes, we can check the logs in Amazon CloudWatch to see what went wrong.
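As a small illustration, here is a sketch using Python's standard `logging` module. The `charge` function, the logger name and the order IDs are invented for the example, and the logs go to an in-memory buffer here. In production they would be shipped to a file or to a service such as CloudWatch.

```python
import io
import logging

# Collect logs in memory for this demo.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("checkout")   # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False                 # keep output in our handler only


def charge(order_id, amount):
    # Log what we are about to do, with enough context to debug later.
    logger.info("charging order=%s amount=%s", order_id, amount)
    if amount <= 0:
        logger.error("rejected order=%s reason=invalid_amount", order_id)
        return False
    return True


charge("A1", 25)
charge("A2", 0)
print(log_stream.getvalue())
```

When something breaks later, searching the logs for `ERROR` or for a specific order ID tells you what the system was doing at that moment.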
Elasticsearch: A lot of systems need search functionality these days, where the user can enter some text or a query and get results. We can run text queries against a traditional database, but that comes with a lot of intricacies, and this is where Elasticsearch comes in. It is a text search engine built on Apache Lucene. It supports fuzzy search, because we do not want the search to come back empty when the user types bracekets instead of bracelets.
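To show what fuzzy matching buys us, here is a toy sketch using Python's standard `difflib` module. To be clear, this is not the Elasticsearch API and not how Elasticsearch works internally. It only demonstrates the idea of matching a misspelled query against known terms, and the product catalog is made up.

```python
import difflib

# Hypothetical product catalog.
catalog = ["bracelets", "necklaces", "earrings", "rings"]

query = "bracekets"   # the user's typo

# An exact lookup finds nothing.
exact = [item for item in catalog if item == query]
print(exact)

# A fuzzy lookup returns the closest term above a similarity cutoff.
fuzzy = difflib.get_close_matches(query, catalog, n=1, cutoff=0.6)
print(fuzzy)
```

The exact lookup prints an empty list, while the fuzzy lookup finds bracelets, so the user still gets results despite the typo.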
A lot of these topics need detailed discussion, which I will cover in later blogs. This is a system design blog series, and we will learn more as we go along this journey. Feel free to google the topics and learn more about them.
I hope you enjoy designing systems. :)