A not-so-common scenario for Distributed Cache

#redis #cache #database #softwarearchitecture #softwareengineering

Have you tried to improve the performance of your API, but it is still not fast enough? Try a caching mechanism next.

Most of the time, caching is introduced after all the other bottlenecks of the API request have been addressed. At that point, the endpoint is already running as fast as it can on its own.


The most common scenario for caching meets the following criteria:

✅ an endpoint that is called frequently to retrieve data

✅ the requested data is rarely updated

If the data is cached, the next call can return it immediately, without a round trip to the database.


To know which endpoints deserve caching, monitor them for a while. Gather some metrics and decide where caching would actually help.


In a microservice world, a distributed cache is the way to go!


I decided to go with Redis, and here is the minimum you need to know about it (a quick code sketch follows this list):

👉🏻 each piece of data saved has a unique key

👉🏻 the SET command adds or updates data for a key

👉🏻 the GET command retrieves the data for a key

👉🏻 transactions are available

👉🏻 Hashes, for when just a key is not enough
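
Here is a minimal sketch of those basics in Python with redis-py; the connection details, keys, and values are placeholders of mine, not part of the original post:

import redis

# Connect to Redis (host/port are placeholder values)
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# SET adds or updates the value stored at a key
r.set("group:42", '{"name": "admins"}')

# GET retrieves the value for a key (None if the key does not exist)
value = r.get("group:42")

# Transactions: both commands are queued and executed atomically
pipe = r.pipeline(transaction=True)
pipe.set("group:42", '{"name": "admins"}')
pipe.set("group:43", '{"name": "editors"}')
pipe.execute()

# Hashes: several fields under one key, for when just a key is not enough
r.hset("group:42:meta", mapping={"owner": "ovidiu", "members": "3"})
owner = r.hget("group:42:meta", "owner")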


For the most common scenario, I already shared a post here with an article covering it.


Now let's talk about a different scenario where caching is needed:

✅ an endpoint that inserts thousands of records into the database

✅ the data should be immediately available


In this scenario, the endpoint may even time out before it finishes. Imagine you have to create thousands of groups and then add a specific user to each of them. One loop to create the groups and another to update the UserGroup table is a lot of work for a single request.

A solution is to have the endpoint pass the data to an event and return 200 right away. Here we are talking about microservices, messages and queues, and functions triggered by a message in the queue (out of the scope of this article), but I will leave you an article on that topic here.
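
To make the hand-off concrete, here is a rough sketch that uses a Redis list as a stand-in for a real message broker; the queue name, payload shape, and handle_bulk_insert function are my own illustration, not the author's setup:

import json
import uuid
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def handle_bulk_insert(records: list[dict]) -> int:
    # The cache key that the queue-triggered functions will work against
    message = {"key": f"groups:{uuid.uuid4()}", "records": records}
    # Enqueue the work; a function triggered by the queue picks it up later
    r.lpush("bulk-insert-queue", json.dumps(message))
    # Respond right away instead of waiting for thousands of inserts
    return 200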


Let's say you went for the microservice architecture:

👉🏻 one event deals with inserting and updating the data

👉🏻 in the meantime, another event is already updating the data for the same ID

👉🏻 on top of that, an event is requesting the updated version of the data (which is not ready yet, since the other events are still running)

In this case, we run into trouble.


The solution here is a distributed cache that holds the source of truth, ready immediately, since you only have to update the data for a specific key. The cached value is an object, so in this case it can be a list of whatever type you need. The flow looks like this (a code sketch follows the list):

💾 every message in the queue carries the key for the data it is trying to process

💾 each triggered event gets the updated data from Redis by that key and processes it

💾 if the data needs to be retrieved, the final version is already in Redis

💾 when the data is missing from Redis, it is retrieved from the database and Redis is updated (a request that takes more time)
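
A minimal cache-aside sketch of that flow, assuming a JSON-encoded list stored under each key and a hypothetical load_from_database helper (not from the original post):

import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def load_from_database(key: str) -> list[dict]:
    # Hypothetical helper: fetch the records for this key from the database
    raise NotImplementedError

def get_data(key: str) -> list[dict]:
    cached = r.get(key)
    if cached is not None:
        # Fast path: the final version is already in Redis
        return json.loads(cached)
    # Slow path: fall back to the database and repopulate Redis
    data = load_from_database(key)
    r.set(key, json.dumps(data))
    return data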


What about race conditions, when two events try to update data that partially overlaps?

I used a special unique ID for the data that was being processed, so any subsequent events simply ignore it. This way, you do not end up with duplicates in the database for a specific ID.


Where can that unique ID live until the function finishes processing the data?

You can use a special ID in the key of the data that is being processed, so the next event skips it or retries later, depending on the scenario. This means that every time a function processes a set of data, it first GETs the latest version from Redis again. Then the function SETs the data back under a special ID (e.g. Processing:GUID). Finally, once everything is updated in the database, the special ID is switched back to the normal one.
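
Sketched in code, the pattern could look like this; the key naming and the update_database helper are my own assumptions, since the post describes the idea rather than the implementation (in production the check-and-park step would also need to be made atomic, e.g. with a transaction):

import json
import uuid
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def update_database(data: list[dict]) -> None:
    # Hypothetical helper: persist the processed records
    raise NotImplementedError

def process(key: str, new_records: list[dict]) -> bool:
    cached = r.get(key)
    if cached is None:
        # The data is parked under a Processing:<guid> key (or missing):
        # skip this message or retry later, depending on the scenario
        return False
    data = json.loads(cached)
    data.extend(new_records)
    # Park the data under a special ID so the following events skip it
    marker = f"Processing:{uuid.uuid4()}"
    r.set(marker, json.dumps(data))
    r.delete(key)
    update_database(data)
    # Everything is in the database: switch back to the normal key
    r.set(key, json.dumps(data))
    r.delete(marker)
    return True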


Do not forget: anything that is still in the database but no longer in Redis should be removed from the database as well.
