Distributed systems, stream processing, and Big Data systems are making headlines. But WHY do we need them?
- 4 reasons for building a distributed system and how Kubernetes can help us get started.
What is a web app?
A web app is an application that is accessed over the network, usually through a web browser. It has both a client and a server. The client contains the user interface and the client-side logic. The server usually contains the database and some logic, and it can call other servers. On the server side, we often define an API through which the server and the client communicate. The API is the language they speak to each other. The server can talk to other servers through the same API as well. There are many ways to define an API; the definition I like most is by Petr Gazarov.
That’s all nice, but what is a distributed web app and why do I need it?
A distributed system is a system whose components are located on multiple machines, often on different networks, and communicate by passing messages to one another. The messaging protocol and the way the components communicate are defined by an API built for that purpose. Using these messages, the components coordinate their actions according to a communication model. A common communication model is master/slave, where one master controls the flow and multiple slaves act on the tasks the master gives them. Apache Spark implements this model with a driver and executors: the driver is the master, and the executors are the slaves.
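To make the driver and executors idea concrete, here is a minimal sketch in Python. It is not Spark's API, and it runs in a single process: threads stand in for machines and in-memory queues stand in for the network. The names `executor` and `run_driver` are made up for this illustration.

```python
import queue
import threading


def executor(name, tasks, results):
    """Worker loop: take tasks from the driver until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: the driver has no more work for us
            break
        # The "work" here is just squaring a number.
        results.put((name, task, task * task))


def run_driver(numbers, num_executors=3):
    """Driver: starts executors, hands out tasks, and collects the results."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=executor, args=(f"executor-{i}", tasks, results))
        for i in range(num_executors)
    ]
    for worker in workers:
        worker.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)  # one stop message per executor
    for worker in workers:
        worker.join()

    squares = {}
    while not results.empty():
        _, task, value = results.get()
        squares[task] = value
    return squares


if __name__ == "__main__":
    print(run_driver([1, 2, 3, 4]))
```

The driver never does the work itself; it only decides what needs to be done and gathers the answers. In a real distributed system the queues would be replaced by messages over the network.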
There are multiple ways of architecting a distributed system, with various communication models such as client–server, three-tier, n-tier, or peer-to-peer, and with either loose or tight coupling between the components. But today we are going to focus on the why.
So, why do we need a distributed system?
Let’s say that we build a web app. At the beginning we serve 100 happy customers. Our product grows, and within a year we need to serve 10 million customers. We would like to serve these customers as fast as possible, without them feeling any latency. On top of that, we would like the app to be available at all times, no matter what. And we want the app to perform its intended functionality fully, meaning we need reliability.
4 Reasons for architecting a distributed system:
1 - Availability
Availability is the probability that a system is available at any given time. It means making sure that if one server fails, another server replaces it without the user noticing any delay. Even better, we constantly monitor the servers to make sure they are available for upcoming customer requests. Having multiple components ready to take on the role helps make the system highly available.
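A small sketch of why extra servers help. It assumes that servers fail independently of each other, which is a simplification, and the function names `combined_availability` and `first_healthy` are made up for this example.

```python
def combined_availability(p, replicas):
    """Probability that at least one of `replicas` independent servers is up.

    `p` is the availability of a single server, e.g. 0.99.
    Assumes failures are independent, which real systems only approximate.
    """
    return 1 - (1 - p) ** replicas


def first_healthy(servers, is_healthy):
    """Return the first server that passes the health check, or None."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "replica(s):", round(combined_availability(0.99, n), 6))
```

Under that assumption, two servers that are each available 99% of the time give roughly 99.99% combined availability, because both have to be down at the same moment for the system to be unavailable.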
2 - Scalability
There are multiple components to scalability: scalable databases, systems that can hold and manage petabytes of data, and/or systems that can handle millions of concurrent requests. In each of these cases, we can use a distributed approach to build a system that can scale when we need it to. Adding components to the system shouldn’t be hard if we architect our system for scale from the beginning.
3 - Reliability
Reliability is the probability that a system will produce correct outputs. A reliable system does not silently continue and deliver results with missing or corrupted data. Instead, it detects the failure, fires an alert and, if possible, recovers from it. Building a system with reliability in mind means making sure that our customers can rely on the responses and data the system provides. This might include data that they build their business on, such as business development analysis, risk management analysis for financial institutions, and more. As we make a system more reliable, we reduce the chance of it failing within a specific time period. We do this by increasing the redundancy or replication of the components and managing them for high availability, which makes them more reliable.
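One way to picture "detect, alert, recover" is a checksum stored next to each copy of the data. The sketch below is only an illustration: the names `pack`, `verify`, `read_with_recovery`, and `CorruptedDataError` are hypothetical, and a real system would send an alert where the comment says so.

```python
import hashlib


class CorruptedDataError(Exception):
    """Raised when a payload does not match its stored checksum."""


def pack(payload: bytes):
    """Store the payload together with its SHA-256 checksum."""
    return payload, hashlib.sha256(payload).hexdigest()


def verify(payload: bytes, checksum: str) -> bytes:
    """Return the payload only if it still matches the checksum."""
    if hashlib.sha256(payload).hexdigest() != checksum:
        raise CorruptedDataError("checksum mismatch")
    return payload


def read_with_recovery(replicas):
    """Try each (payload, checksum) replica and return the first valid payload."""
    for payload, checksum in replicas:
        try:
            return verify(payload, checksum)
        except CorruptedDataError:
            # A real system would fire an alert here before trying the next copy.
            continue
    raise CorruptedDataError("all replicas are corrupted")


if __name__ == "__main__":
    good = pack(b"quarterly report")
    bad = (b"quarterly rep0rt", good[1])  # same checksum, damaged payload
    print(read_with_recovery([bad, good]))
```

The corrupted copy is never returned to the customer. The read either recovers from a healthy replica or fails loudly.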
4 - Transparency
The customers are not aware of the distributed system, since they interact with the product the same way they would with a centralized system. In other words, the distribution of the system is hidden from the customer. We gain all the benefits of a distributed system without the customer feeling it.
Now that we understand the WHY, let's hit the WHAT. What is the easiest way to start?
A managed Kubernetes service such as AKS can provide a solid platform for building a distributed system. We define the number of nodes, which is the number of Azure Virtual Machines, and Azure Kubernetes Service takes care of the control plane, or master nodes, which are managed and provided for free. From the picture, you can see a very simplified diagram of the worker nodes of our Kubernetes cluster, which host the web app that our customers will access. A customer request is received by the load balancer and delivered to our web application, which has its components distributed across our pool of worker nodes. The components can be replicated, and our pool of nodes can grow or shrink according to the load we have.
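The flow described above, a load balancer in front of a pool of worker nodes that can grow or shrink, can be sketched in a few lines of Python. This is a toy model and not the Kubernetes or AKS API. The class `RoundRobinBalancer`, its methods, and the node names are all invented for this illustration, and round-robin is just one simple routing strategy.

```python
class RoundRobinBalancer:
    """Toy load balancer that spreads requests evenly over a pool of nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next = 0  # counter used to pick the next node in turn

    def route(self, request):
        """Pick the node that should handle this request."""
        if not self.nodes:
            raise RuntimeError("no worker nodes available")
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node

    def scale_to(self, count):
        """Grow or shrink the pool to `count` nodes (hypothetical node names)."""
        self.nodes = [f"node-{i}" for i in range(count)]


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["node-0", "node-1"])
    for request_id in range(4):
        print(request_id, "->", balancer.route(request_id))
    balancer.scale_to(3)  # load went up, so we add a node
    print("pool is now", balancer.nodes)
```

The customer only ever talks to the balancer, so nodes can be added or removed behind it without the customer noticing.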
In this post you will learn all the details of how we can create a distributed web application :)