A Hitchhiker’s Guide to Caching Patterns

Nicolas Fränkel - Dec 8 '20 - Dev Community

When your application starts slowing down, the reason is probably a bottleneck somewhere in the execution chain. Sometimes, this bottleneck is due to a bug. Sometimes, somebody didn't set up the optimal configuration. And sometimes, the process of fetching the data is the bottleneck.

One option would be to change your whole architecture. Before resorting to such a drastic and probably expensive measure, you can consider a trade-off: instead of fetching remote data every time, you store the data locally after the first read. This is the trade-off that caching offers: stale data vs. speed.

Deciding to use caching is just the first step in a long journey. The next step is to think about how your application and the cache will interact. This post focuses on your options regarding those interactions.

Cache-Aside

Cache-Aside is probably the most widespread caching pattern. With this approach, your code handles the responsibility of orchestrating the flow between the cache and the source of truth.

Regarding reads, it translates as the following:

Cache Aside Read

For writes, it's even simpler:

Cache Aside Write
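The read and write flows above can be sketched in a few lines. This is a minimal illustration, not tied to any cache provider: the `Datastore` class, the plain `dict` used as a cache, and the `read` and `write` functions are all hypothetical names introduced for the example.

```python
class Datastore:
    """Stand-in for a slow, remote source of truth (hypothetical)."""

    def __init__(self):
        self._rows = {}
        self.reads = 0  # counts round trips, to show what the cache saves

    def load(self, key):
        self.reads += 1
        return self._rows.get(key)

    def save(self, key, value):
        self._rows[key] = value


cache = {}  # Cache-Aside only needs a cache that can get and set
datastore = Datastore()


def read(key):
    # 1. Ask the cache first.
    value = cache.get(key)
    if value is None:
        # 2. On a miss, the application itself fetches from the datastore...
        value = datastore.load(key)
        if value is not None:
            # 3. ...and populates the cache for the next read.
            cache[key] = value
    return value


def write(key, value):
    # The application updates both sides, cache first in this sketch.
    cache[key] = value
    datastore.save(key, value)


write("answer", 42)
cache.clear()            # simulate a cold cache
first = read("answer")   # miss: goes to the datastore
second = read("answer")  # hit: served from the cache
```

Note that all the orchestration lives in `read` and `write`, which is exactly what makes the pattern easy to follow and easy to port to another provider.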

The biggest advantage of using Cache-Aside is that anybody can read the code and understand its execution flow. Moreover, the requirements toward the cache provider are at their lowest: it just needs to be able to get and set values. That allows for pretty straightforward migrations from one cache provider to another (e.g. to Hazelcast).

The biggest issue with Cache-Aside is that your code needs to handle the inconsistency gap between the cache and the datastore. Imagine that you've successfully updated the cache but the datastore fails to update. The code needs to implement retries. Worse, for as long as the retries keep failing, the cache contains a value that the datastore doesn't.

Switching the logic to update the datastore first doesn't change the problem. What if the datastore updates successfully but the cache doesn't?
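To make the inconsistency gap concrete, here is a small sketch of the failure scenario described above. The `FlakyDatastore` class and the `write_with_retries` function are hypothetical and only simulate an unavailable datastore. They are not part of any library.

```python
class FlakyDatastore:
    """Hypothetical datastore whose first `failures` writes raise an error."""

    def __init__(self, failures):
        self._rows = {}
        self._failures = failures

    def save(self, key, value):
        if self._failures > 0:
            self._failures -= 1
            raise ConnectionError("datastore unavailable")
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


def write_with_retries(cache, datastore, key, value, max_attempts=3):
    # Cache first, as in the Cache-Aside write flow.
    cache[key] = value
    for _ in range(max_attempts):
        try:
            datastore.save(key, value)
            return True
        except ConnectionError:
            continue  # a real system would back off before retrying
    return False


cache = {}
datastore = FlakyDatastore(failures=5)
ok = write_with_retries(cache, datastore, "user:1", "Alice")
# All three attempts failed, yet the cache already serves "Alice".
```

After the call, `ok` is `False` and the cache holds a value that the datastore has never seen. The application has to decide what to do about it: evict the entry, keep retrying, or report the error.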

Read-Through

Compared to Cache-Aside, Read-Through moves the responsibility of getting the value from the datastore to the cache provider.

Read Through

Read-Through implements the Separation of Concerns principle. Now, the code interacts with the cache only. It's up to the cache to manage the synchronization between itself and the datastore. This requires a more advanced cache provider than Cache-Aside does, as the provider must be able to load values from the datastore on its own.

Hazelcast provides the MapLoader interface for this usage.
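As a rough, provider-independent illustration of the idea, the sketch below wires a loader function into the cache once, at setup time. The `ReadThroughCache` class and `load_from_datastore` function are hypothetical. They mimic the role of a loader but are not the Hazelcast API.

```python
class ReadThroughCache:
    """Hypothetical cache that loads missing entries by itself."""

    def __init__(self, loader):
        self._loader = loader  # callable: key -> value (or None if absent)
        self._entries = {}

    def get(self, key):
        if key not in self._entries:
            # The cache, not the application, talks to the datastore.
            value = self._loader(key)
            if value is None:
                return None
            self._entries[key] = value
        return self._entries[key]


rows = {"user:1": "Alice"}  # stands in for the datastore
calls = []                  # records every trip to the datastore


def load_from_datastore(key):
    calls.append(key)
    return rows.get(key)


cache = ReadThroughCache(load_from_datastore)
a = cache.get("user:1")  # miss: the cache invokes the loader
b = cache.get("user:1")  # hit: no loader call
```

The calling code only ever sees `cache.get`, which is the separation of concerns mentioned above.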

Write-Through

Similar to Read-Through but for writes, Write-Through moves the writing responsibility to the cache provider.

Write Through

The main benefit of Write-Through is that the code is now free of failure handling and retry logic. Of course, it's now up to the cache to manage them.

Hazelcast provides the MapStore interface for this usage. Because Write-Through implies Read-Through in most cases, MapStore is a child interface of MapLoader, so that all interactions with the datastore are co-located in the same implementation class.
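Here is a minimal sketch of the pattern with the retry logic moved inside the cache. `InMemoryStore` and `WriteThroughCache` are hypothetical names, and the choice to write to the datastore before updating the cache's own entries is an assumption of this sketch, not a statement about how any particular provider behaves.

```python
class InMemoryStore:
    """Hypothetical datastore adapter offering both load and store."""

    def __init__(self):
        self.rows = {}

    def load(self, key):
        return self.rows.get(key)

    def store(self, key, value):
        self.rows[key] = value


class WriteThroughCache:
    def __init__(self, store, max_attempts=3):
        self._store = store
        self._max_attempts = max_attempts
        self._entries = {}

    def get(self, key):
        # Read-Through: load missing entries from the store.
        if key not in self._entries:
            value = self._store.load(key)
            if value is None:
                return None
            self._entries[key] = value
        return self._entries[key]

    def put(self, key, value):
        # Write-Through: the cache writes synchronously and owns the retries.
        for attempt in range(self._max_attempts):
            try:
                self._store.store(key, value)
                self._entries[key] = value
                return
            except ConnectionError:
                if attempt == self._max_attempts - 1:
                    raise  # give up and surface the failure to the caller


store = InMemoryStore()
cache = WriteThroughCache(store)
cache.put("user:1", "Alice")
```

The application code shrinks to a single `put` call, and a write that ultimately fails leaves the cache entry untouched.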

Write-Behind

Write-Behind looks pretty similar to Write-Through.

Write Behind

I believe some of you, dear readers, didn't even see the difference. And if you did, you might be wondering what it means.

To make it clear, the difference lies in the last arrow's arrowhead: it changed from a solid arrowhead to an open one. If your UML days are past (I had to look up how to represent it), it means that the cache sends an asynchronous message to the datastore.

Up to this point, all messages exchanged between actors were synchronous: the caller needs to wait until the callee has finished processing and returned before continuing its flow. With Write-Behind, the cache sends the value to the datastore and doesn't wait for confirmation.

On the plus side, this approach speeds up the whole process, since the datastore is the slowest component - it sits somewhere over the network and writes to disk. On the other hand, it runs the risk of introducing inconsistencies between the cache and the datastore. In Write-Through, you could retry to your heart's content until the value was set. In Write-Behind, you don't know if the set was even successful.

With Hazelcast, changing from a Write-Through approach to a Write-Behind one is just a matter of configuring the write-delay-seconds property to a value higher than 0.
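The following sketch shows the general mechanism in isolation. It is not how Hazelcast implements it: `DictStore` and `WriteBehindCache` are hypothetical, and the explicit `flush` call stands in for the background timer a real provider would use, which keeps the example deterministic.

```python
class DictStore:
    """Hypothetical datastore adapter."""

    def __init__(self):
        self.rows = {}

    def store(self, key, value):
        self.rows[key] = value


class WriteBehindCache:
    def __init__(self, store):
        self._store = store
        self._entries = {}
        self._pending = {}  # key -> latest value not yet written

    def put(self, key, value):
        # Returns immediately: the datastore write is only queued.
        self._entries[key] = value
        self._pending[key] = value  # later writes to a key replace earlier ones

    def get(self, key):
        return self._entries.get(key)

    def flush(self):
        """Write queued entries; a real provider would do this on a timer."""
        for key, value in list(self._pending.items()):
            self._store.store(key, value)
            del self._pending[key]


store = DictStore()
cache = WriteBehindCache(store)
cache.put("counter", 1)
cache.put("counter", 2)
before_flush = dict(store.rows)  # the datastore has not been touched yet
cache.flush()
after_flush = dict(store.rows)
```

Between `put` and `flush`, the cache and the datastore disagree. That window is the price paid for the faster write path.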

Refresh-Ahead

The old saying goes that there are two hard things in computer science: naming things and cache invalidation. Cache invalidation is about planning how long an item should be stored in the cache before it expires. When it does or when the cache is still empty, you need to fetch the item from the datastore using one of the patterns above - Cache-Aside or Read-Through.
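A simple way to picture expiration is a time-to-live (TTL) attached to each entry. The `TtlCache` class below is a hypothetical sketch. The injectable `clock` parameter is an assumption made so that the example does not have to sleep.

```python
import time


class TtlCache:
    """Hypothetical cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: evict, so the caller falls back to the datastore.
            del self._entries[key]
            return None
        return value


now = [0.0]  # fake clock, advanced by hand
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])
cache.set("user:1", "Alice")
fresh = cache.get("user:1")
now[0] = 61.0
expired = cache.get("user:1")
```

Once `get` returns `None`, the application (Cache-Aside) or the cache itself (Read-Through) has to fetch the item from the datastore again.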

Both patterns implement a flow that involves the code, the cache, and the datastore. As mentioned above, reading from the datastore is an expensive operation: you need to first cross the network and then request data from the datastore. What if you could prefetch the data, making it available before you even request it, thus saving you from incurring the performance hit on the critical path? That's exactly what Refresh-Ahead does.

Implementations of Refresh-Ahead are cache provider-dependent. A safe bet that is agnostic to the provider is to use Hazelcast Jet. With its Change-Data-Capture (CDC) capability, Jet lets you connect to any cache provider with a public API and update cached entities as soon as the datastore is updated. Here's a bird's eye view of CDC in action:

[Figure: bird's eye view of CDC in action]

For more details on Refresh-Ahead, please check my previous post Designing an Evergreen Cache with Change-Data-Capture.
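The core idea, stripped of any tooling, is that changes flow from the datastore to the cache without a read request triggering them. The sketch below is only an analogy under strong assumptions: `ObservableDatastore` and `refresh_cache` are hypothetical, and the in-process callback stands in for a real CDC pipeline, which would typically read the database's change log instead.

```python
class ObservableDatastore:
    """Hypothetical datastore that publishes a change event on every write."""

    def __init__(self):
        self._rows = {}
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def save(self, key, value):
        self._rows[key] = value
        for listener in self._listeners:
            listener(key, value)  # emit the change event


cache = {}


def refresh_cache(key, value):
    # Runs outside the read path: the cache is updated before anyone asks.
    cache[key] = value


datastore = ObservableDatastore()
datastore.subscribe(refresh_cache)

datastore.save("user:1", "Alice")
datastore.save("user:1", "Alicia")  # the update reaches the cache as well
```

A later read of `user:1` finds the latest value already in the cache, so the datastore round trip stays off the critical path.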

Summary

Here's a quick summary of the patterns and the contexts they best fit:

| Pattern | Consider | Cons |
|---|---|---|
| Cache-Aside | When you're limited by the capabilities of your cache provider | The application is responsible for the cache orchestration flow |
| Read-Through | Solid default | |
| Write-Through | Solid default | |
| Write-Behind | When performance considerations outweigh short-term consistency | Asynchronous systems are harder to reason about |
| Refresh-Ahead | When fetching data from the datastore impairs throughput | Additional component to develop, deploy and maintain |