The Saga pattern is often positioned as a better way to handle distributed transactions. I see no point in discussing Saga's disadvantages, because the real problem is that Saga should not be used in microservices at all:
If you need distributed transactions across several microservices, you have most likely defined and separated the domains incorrectly.
Below is a long explanation of why.
Microservices as a Distributed System
Any microservices-based system is a distributed system. To be precise, it is the simplest possible, most basic version of one. Such a distributed system does not maintain any form of consensus. In other words, such a system:
- Lacks any built-in means to coordinate nodes
- Lacks any built-in means to get information about nodes
- Requires any communication between nodes to be part of the business logic
These properties make a microservices-based system unable to perform certain tasks: for example, to perform transactions, to maintain consistency (even eventual), or to find out whether all necessary nodes are up and running (i.e. whether the system is available). As a result, if one node needs a piece of information from another node, it must explicitly include a request to the remote service as one of its business steps. This looks surprisingly similar to the interaction between, for example, browsers and web servers. Note that such interaction means that each request is completely independent of the others. Transactions, where they are necessary, never cross request boundaries. Only with such a deep separation of domains does it make sense to require a microservice to maintain its own data, independently and separately, to claim independent deployability or testability, or to accept a possibly failed request to another node as an explicit step in the business logic.
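To illustrate what "a request to a remote service as a business step" can look like, here is a minimal sketch. All names in it (place_order, fetch_credit_limit, RemoteCallFailed, the two stand-in services) are hypothetical and exist only for this example; a real service would make an HTTP or RPC call where the stand-ins are used.

```python
from dataclasses import dataclass
from typing import Callable


class RemoteCallFailed(Exception):
    """Raised when the (hypothetical) remote service cannot be reached."""


@dataclass
class OrderResult:
    accepted: bool
    reason: str


def place_order(
    customer_id: str,
    amount: int,
    fetch_credit_limit: Callable[[str], int],
) -> OrderResult:
    """Handle one independent request; the remote call is an ordinary business step."""
    try:
        # Business step: ask the other service for data this service does not own.
        limit = fetch_credit_limit(customer_id)
    except RemoteCallFailed:
        # A failed remote request is a business outcome, not a transaction rollback.
        return OrderResult(False, "credit service unavailable")
    if amount > limit:
        return OrderResult(False, "credit limit exceeded")
    # Anything written from here on would belong to this service alone.
    return OrderResult(True, "ok")


def healthy_credit_service(customer_id: str) -> int:
    return 100  # stand-in for a real remote call


def broken_credit_service(customer_id: str) -> int:
    raise RemoteCallFailed(customer_id)  # stand-in for a timeout or refused connection
```

Note that nothing in this sketch spans two services atomically: the caller either gets an answer and continues, or records the failure as the result of its own request.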
Since no transaction can cross the request boundary, the service must govern all data included in the transaction. This can be used as a validation criterion for the separation of domains: if any cross-service transaction is necessary, then the split was done incorrectly.
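The criterion is mechanical enough to be checked against a design on paper. The following is a rough sketch, assuming we can list which service governs each data set and which data sets each transaction touches; the service, table and transaction names are invented for illustration.

```python
def cross_service_transactions(owner_of, transactions):
    """Return transactions whose data is governed by more than one service."""
    violations = {}
    for name, datasets in transactions.items():
        owners = {owner_of[dataset] for dataset in datasets}
        if len(owners) > 1:
            # The transaction would have to cross a service boundary.
            violations[name] = sorted(owners)
    return violations


# Hypothetical split of a domain into two services.
owner_of = {
    "orders": "order-service",
    "order_items": "order-service",
    "stock": "inventory-service",
}

# Hypothetical transactions and the data sets each one must update atomically.
transactions = {
    "add_item_to_order": {"orders", "order_items"},
    "reserve_stock_for_order": {"orders", "stock"},
}

violations = cross_service_transactions(owner_of, transactions)
```

Here reserve_stock_for_order touches data governed by two services, so by the criterion above this particular split is wrong: the two services belong to one domain.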
Domain Size Issue
As soon as we start separating domains according to data governance, we may quickly realize that in the vast majority of cases microservices look a lot like traditional monoliths, and that the domain each of them should handle is big. Or we may realize that traditional monoliths are, in fact, microservices. This happens because most organizations have only a very limited number of truly independent domains. Most often, there is just one.
Unfortunately, the whole microservices hype ignores this fact, and we get "best practices", "design patterns", books, articles, etc. that stretch the initial idea of loosely coupled independent services into areas where it does not fit. This results in mind-blowing, devastating consequences:
- We build unreliable systems (see above about what kind of distributed systems microservices are) on top of reliable ones (cloud infrastructure)
- We get an ugly, inherently broken design in which layers are mixed up: communication error handling, retries and transaction handling all happen at the business logic level
- We split data into parts and then try to collect them to process the request, introducing unpredictable and barely controllable tail latency
- We get systems that are unable to ensure data integrity and consistency
- We have to perform end-to-end testing before deployment, because there are no guarantees that a new version of a service does not break the whole system. This completely negates any independent testability and deployability of the services
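The tail latency point in the list above can be made concrete with a little arithmetic. This is a minimal sketch, assuming every remote call is independently slow with the same probability and that a request has to wait for all of its calls; the 1% figure and the function name are illustrative assumptions, not measurements.

```python
def slow_request_probability(p_slow_call: float, fan_out: int) -> float:
    """Probability that at least one of `fan_out` independent calls is slow."""
    # The request is fast only if every single call is fast.
    return 1.0 - (1.0 - p_slow_call) ** fan_out


# Assume 1% of calls to any single service are slow.
one_call = slow_request_probability(0.01, 1)         # 0.01
ten_calls = slow_request_probability(0.01, 10)       # about 0.096
hundred_calls = slow_request_probability(0.01, 100)  # about 0.634
```

Under these assumptions, a request that has to collect its data from ten services is slow almost ten times as often as a request served from one place.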
The list above is definitely incomplete. Incorrectly applied microservices can cause all kinds of harm, especially when combined with "cloud native" "microservices" frameworks like Spring, which turn the whole system into a bunch of slowly moving monoliths.
All the advantages and requirements that are inherent properties of, and a natural fit for, microservices either disappear or turn into painful and expensive obstacles.
Unfortunately, all these considerations might result in much bigger domains than could be considered acceptable for traditional microservices design. That’s fine and just means that we should not use microservices. What can we use then? Let’s take a look.
Handling Big Domain
Since we’re going to handle a big but single domain, we need to use something capable of maintaining consensus. There are at least three options:
- Modular monolith (also known as modulith)
- Event-driven architecture
- Cluster-based architecture
Modular Monolith
This option addresses most monolith pain points, in particular maintainability and concurrent development. Mostly, this is achieved by improving the design through DDD and other techniques. The ability to access all data and perform regular transactions, together with the absence of communication errors, makes this approach an appealing choice for many use cases. In addition, a system built this way is much easier to fix, rework, refactor or update in an atomic fashion as it evolves. Note that inner sub-services are not required to maintain their own data (but they can, if necessary). Shared data is often inherent and natural in this approach, as we’re talking about a single domain. Since this is just a monolith, after all, there are no issues with maintaining consensus, or with deployment, monitoring and maintenance.
The main disadvantage of moduliths is limited scalability. At some point, there might be a need to switch to another design. Fortunately, modularity significantly simplifies the transition.
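As an illustration of "regular transactions" across modules, here is a small sketch using Python's built-in sqlite3 module. The two "modules" are plain functions, and the schema and all names are made up for the example; the only point is that both modules write through one connection, so a single local transaction covers them.

```python
import sqlite3


def create_schema(conn):
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
    conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))")
    conn.execute("INSERT INTO stock VALUES ('widget', 5)")
    conn.commit()


# "Orders" module (hypothetical)
def record_order(conn, item, qty):
    conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))


# "Inventory" module (hypothetical)
def reserve_stock(conn, item, qty):
    # The CHECK constraint rejects a reservation that would make stock negative.
    conn.execute("UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item))


def place_order(conn, item, qty):
    """Run both modules inside one ordinary database transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            record_order(conn, item, qty)
            reserve_stock(conn, item, qty)
        return True
    except sqlite3.IntegrityError:
        # The order row inserted above was rolled back together with the update.
        return False
```

If the stock update fails, the order row disappears with it. No compensation logic is needed, because nothing ever left the transaction.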
Event-Driven Architecture (EDA)
EDA is quite a popular choice. It is often mentioned in the context of microservices, but this is incorrect. By design, EDA relies on a reliable data-sharing infrastructure in the form of message brokers or a pub-sub service, which explicitly conflicts with how microservices maintain their data. This architecture is described in detail in many sources, so I see no point in repeating them here.
Its main disadvantage is its similarity to microservices in its reliance on infrastructure and in the complexity of deployment, monitoring, etc.
Cluster-Based Architecture
This design is quite a rare animal. It has several advantages, a few of which are worth noting separately:
- Simplicity of transition to this architecture from a modulith
- No reliance on infrastructure. Often, the whole system is self-contained and can be deployed on premises, in the cloud or across several clouds
- Simplicity of deployment — there is only one deployable artifact
- Nearly linear scalability
- Real fault tolerance: a failed node neither brings down the whole system nor causes an error to be returned for a request. Requests are just processed somewhat more slowly
- Great resource utilization and fine-grained, two-level scaling
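The fault tolerance point can be pictured with a deliberately simplified sketch. It assumes that any live node can process any request and that the caller simply tries the next node when one is down; real clusters decide membership and routing through consensus, which is not modelled here, and every name below is hypothetical.

```python
class NodeDown(Exception):
    """Raised by a (hypothetical) node that cannot process requests."""


def process_with_failover(request, nodes):
    """Try nodes in order; return the result and the number of attempts made."""
    attempts = 0
    for node in nodes:
        attempts += 1
        try:
            return node(request), attempts
        except NodeDown:
            # A failed node costs extra time, not an error for the caller.
            continue
    raise RuntimeError("no live node could process the request")


def dead_node(request):
    raise NodeDown()


def live_node(request):
    return request.upper()


result, attempts = process_with_failover("ping", [dead_node, live_node])
```

The request above still succeeds. It just takes two attempts instead of one, which is the "somewhat more slowly" from the list.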
Conclusion
Microservices were conceived as a way to solve problems, but their blind application causes more harm than good. The main issue is that there are no clear criteria for where they are applicable. I have no illusions: this article will not solve that problem, but at least it provides some meaningful criteria for assessing where microservices should not be used.