High Availability and Scalability with Redis Enterprise
As you probably already know, Redis is a key-value database. I’m sure when you hear Redis, the first term that might come to your mind is: Cache. That’s true, Redis works great as a cache, but it is much more than that. Trust me, Redis Labs is trying hard to move away from the Cache-only image. Let me try to change that image for you.
Why Redis?
- Redis is also an in-memory data store and it is super fast.
- It can be used as a database, cache and message broker.
- Yes it is a key-value store, but it does not restrict value to be a String. It supports list, set/sorted set, hash (field-value), bit arrays, Streams or HyperLogLogs as a value for a key. This makes Redis more powerful than other key-value stores, and it’s often referred to as data structures server.
- Key lookups are super fast, with O(1) complexity.
- It supports several multi-key commands, such as set intersection and union.
- It supports a form of transactions for operations involving multiple commands.
In this article, we will take you on a journey to achieve high availability and scalability using the Redis Enterprise version provided by RedisLabs, and by the end of the article, we promise to make you an expert at it!
Let’s get started with an overview of scalability and high availability concepts from the Redis Enterprise point of view.
Redis Cluster
Redis Enterprise offers Redis Cluster. A Redis cluster is just a set of Redis nodes (an OS with Redis installed). The cluster is self-managed, so all you have to do is create a database with the required options, and it takes away the pain of worrying about nodes, masters and slaves.
It’s pretty straightforward to set up a Redis cluster: all you need to do is install Redis Enterprise on a node. Redis Enterprise comes with a UI to interact with the cluster. So, post installation, go to https://nodeipaddress:8443 and follow the simple steps to create a cluster. For every additional node that you want in the cluster, follow the same process and choose the option to join the cluster you created on the very first node.
This is what the Redis Enterprise Cluster UI looks like for a cluster containing 3 nodes:
One of the drawbacks of the Redis UI is that it occasionally shows cryptic errors. But once you get the hang of those, you will be able to resolve them! Anyway, those cryptic errors are definitely not part of the takeaways of this blog post, so let’s focus on what is important.
This is what we have as of now:
High Availability and Scalability With Redis In General
Redis Enterprise offers concepts of Replication for high availability and Sharding for horizontal scaling.
Let’s say you create a database with 4 shards and replication enabled: a total of 8 shards would be created, 4 master shards and 4 replica shards. (Unlike other NoSQL DBs, you cannot choose to have more than one replica per shard.) This is how you can create them from the Redis Enterprise UI:
Redis Cluster takes care of choosing the right nodes to host the database.
To understand all these better, let’s think of a use case.
What do we have?
- We have roughly 100 GB of data
- The connecting client application is scaled in different Data Centers. So, it would be nice if the application can read/write locally in its own zone.
What do we want to achieve?
- Set up an on-premise Redis cluster with right topology
- Scalability
- Of course, high availability
To be more specific, we should be able to recover from:
Redis Process failure
Node failure
Multi nodes failure
Data Center failure
Now that we have got some context, let’s have a look at different options available in Redis Enterprise and see how they can be the right fit or not.
Let’s first look at Scalability.
Scalability with Redis
As we know, with sharding (or partitioning) we are not restricted to storing data in the memory of a single computer. Another advantage of sharding is that we can use the computational power of multiple cores.
And what is the Redis way to perform partitioning? It is always based on the key, and the strategy it uses is hash partitioning. There are 16384 hash slots in Redis Cluster, and these hash slots are divided equally across all the nodes. To find the hash slot of a key, Redis computes the CRC16 of the key modulo 16384, and it places the key on the node that owns that slot.
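To make the slot calculation concrete, here is a minimal sketch in Python. It assumes the CRC16 variant documented for open-source Redis Cluster (XMODEM, which the standard library exposes as binascii.crc_hqx with an initial value of 0). The owner helper and its contiguous, equal-sized slot ranges are our own illustration of "equally divided", not necessarily how Redis Enterprise assigns slots internally.

```python
import binascii

HASH_SLOTS = 16384  # slot count quoted in the article


def key_slot(key: str) -> int:
    """Return the hash slot of a key: CRC16(key) mod 16384."""
    # crc_hqx uses the CRC-CCITT polynomial 0x1021; with initial value 0
    # this is the XMODEM variant described in the Redis Cluster spec.
    crc = binascii.crc_hqx(key.encode("utf-8"), 0)
    return crc % HASH_SLOTS


def owner(slot: int, shard_count: int) -> int:
    """Hypothetical helper: map a slot to a shard index, assuming the
    slots are split into equal contiguous ranges."""
    return slot * shard_count // HASH_SLOTS


for key in ("user:1000", "user:1001", "order:42"):
    slot = key_slot(key)
    print(f"{key!r} -> slot {slot} -> shard {owner(slot, 3)}")
```

Because the slot depends only on the key, every client and every node computes the same placement without any coordination.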
Now that we have a very basic understanding of sharding, the question is: how do we know whether we need it or can do without it?
Redis Enterprise recommends sharding if you have more than 25 GB of data and a high number of operations. Separately, if you have more than 25,000 operations per second, sharding can improve performance. With fewer operations per second, a single shard can handle up to 50 GB of data too.
In our case we are talking about 100 GB of data, and our application has a high number of operations. We will choose 3 shards. (You can fine-tune this by running some performance tests on your DB.)
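As a rough back-of-the-envelope check, the sketch below computes what a given shard count means for our 100 GB. It assumes that data spreads evenly across master shards and that each replica holds a full copy of its master. The shard_plan function is our own hypothetical helper, and it ignores memory overhead, so treat the numbers as a starting point for performance tests rather than a sizing rule.

```python
def shard_plan(data_gb: float, master_shards: int, replication: bool) -> dict:
    """Estimate shard counts and memory for a database (illustrative only)."""
    copies = 2 if replication else 1  # one replica per master shard at most
    return {
        "total_shards": master_shards * copies,
        "gb_per_master_shard": data_gb / master_shards,
        "total_memory_gb": data_gb * copies,
    }


plan = shard_plan(data_gb=100, master_shards=3, replication=True)
print(plan)
```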
But, sharding comes with some limitations:
- Redis does not support multi-key operations (like SINTER) on keys that are stored in different hash slots.
- Redis does not support transactions on data spread across different slots.
Two more things that are tunable for sharding are:
- Sharding policy: which keys should end up in the same slot. The default strategy is hash partitioning based on the key, or you can choose custom RegEx sharding to ensure that keys which need to be in the same hash slot (for multi-key operations) are identified correctly.
- Shard placement policy: with Dense (the default), the shards of a database are placed on the same node as long as that node has enough memory, and a new node is chosen only after it is full. With Sparse, the shards of a database are distributed across as many nodes in the system as possible.
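To illustrate how a RegEx sharding policy keeps related keys in one slot, here is a small sketch. The two rules are modelled on the common hash-tag convention (hash only the part inside curly braces, otherwise hash the whole key). The rule list, the hashing_tag helper and the key names are assumptions for illustration, written in Python's named-group syntax, and should not be read as the exact rules configured in any given cluster.

```python
import binascii
import re

# Ordered rules: the first one that matches the key wins (illustrative).
RULES = [
    re.compile(r".*\{(?P<tag>.*)\}.*"),  # hash only the text inside {...}
    re.compile(r"(?P<tag>.*)"),          # fallback: hash the whole key
]


def hashing_tag(key: str) -> str:
    """Return the part of the key that is fed to the hash function."""
    for rule in RULES:
        match = rule.fullmatch(key)
        if match:
            return match.group("tag")
    return key


def key_slot(key: str) -> int:
    """CRC16 (XMODEM) of the hashing tag, modulo 16384."""
    return binascii.crc_hqx(hashing_tag(key).encode("utf-8"), 0) % 16384


followers = "{user1000}.followers"
following = "{user1000}.following"
# Both keys hash on "user1000", so a multi-key command such as
# SINTER over them stays within a single slot.
print(key_slot(followers) == key_slot(following))
```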
To create a sharded database, you just need to give the number of shards in the Redis Enterprise UI; the data is sharded automatically, and appropriate nodes are chosen to host those shards. This is what my rladmin status output looks like for 3 shards with replication enabled (rladmin is the command-line tool to monitor the cluster).
HA at Shard level
Notice that in the above image, there are two processes running for each shard: one master shard and one replica shard. Redis Enterprise makes sure that the replica shard process is always created on a different node from its master, so that failover is possible. So if any of the nodes goes down, the replica shard process on another available node becomes the new master shard. But remember, you can have only one replica per shard. So in this situation, when one node is down, the newly elected master shard process becomes a single point of failure. To avoid this, you can enable the slave_ha option, which makes sure that a new replica shard is created on one of the other available nodes.
One more thing worth mentioning here is that in a Redis cluster, a majority of the nodes should be available at any point in time. In our case, with 3 nodes, the cluster can tolerate the failure of just one node. Once 2 nodes are down, the cluster is down too.
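The majority rule is simple arithmetic, sketched below. The function name is our own; it only encodes the statement that more than half of the nodes must stay up.

```python
def tolerated_failures(node_count: int) -> int:
    """Number of nodes that can fail while a majority is still available."""
    majority = node_count // 2 + 1  # smallest group larger than half
    return node_count - majority


for nodes in (3, 4, 5):
    print(f"{nodes} nodes -> tolerates {tolerated_failures(nodes)} failure(s)")
```

Note that a fourth node does not buy any extra tolerance over three, which is one reason odd node counts are the usual choice.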
Ok, after sharding, let’s discuss High availability in the next section.
Replica-Of (Active-Passive)
You can choose a database to be a replica-of one or more databases, and this can be geo-distributed. In this case, all the initial data is loaded from the master to the destination (the replica-of database), and then all writes are synchronised to the destination.
This passive replica is not to be confused with replication, though; it has nothing to do with high availability. It can act as a disaster recovery database, or be used to scale your reads. So it does not offer failover, but it works great as cold storage.
However, since high availability is one of our goals, we would not like to go with this approach. Let’s consider other options.
Multi AZ
In this mode, the cluster nodes are tagged with the zone/rack they have been deployed in, and Redis will ensure that master and slave Redis processes of the same shard are never hosted on nodes that are located in the same AZ/rack.
This sounds like a good option, as we have multiple Data Centers spread across different zones, and this rack-zone awareness can help us spread our shards across different zones/racks so that the database can survive an availability zone failure. However, there are some conditions for a multi-AZ cluster. Let’s check them out:
- It needs an odd number of zones/racks. In our case, we have only 2 zones. No problem here though, as a zone can be a logical rack as well. So we can choose to create two racks in one AZ and a third rack in the other AZ.
- Network latency between the AZs/racks should be <10ms. We are good here.
- There can be only one master for a shard, located in one of the AZs (the active node), and the rest would just act as replicas.
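The rack-zone rule can be expressed as a simple check over a shard layout. The sketch below is only our own illustration of the constraint (master and replica of the same shard must not share a rack); the node names, rack ids and the rack_violations helper are hypothetical, and Redis Enterprise does this placement for you.

```python
from typing import Dict, List, Tuple


def rack_violations(
    node_rack: Dict[str, str],
    shards: Dict[int, Tuple[str, str]],
) -> List[int]:
    """Return ids of shards whose master and replica nodes share a rack.

    shards maps a shard id to a (master_node, replica_node) pair.
    """
    return [
        shard_id
        for shard_id, (master, replica) in shards.items()
        if node_rack[master] == node_rack[replica]
    ]


node_rack = {"node1": "rackid1", "node2": "rackid2", "node3": "rackid3"}
layout = {1: ("node1", "node2"), 2: ("node2", "node3"), 3: ("node3", "node1")}
print(rack_violations(node_rack, layout))  # an empty list means the layout is rack-safe
```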
So, should we seal the deal with Multi AZ? Not yet, we want to scale writes too. Let’s see if there is any way to do that.
Active-Active
With Redis Active-Active, you can have a database that spreads across multiple participating clusters, and these clusters can belong to different geo-distributed Data Centers or Availability Zones (AZs). For such a database, each cluster has one active master, and each active master can read from and write to the database.
As we have multiple clusters writing to the DB, there could be conflicts around whose write should win. To resolve them, Redis uses Last Write Wins resolution. (more details: Redis Active-Active). This kind of DB is referred to as a conflict-free replicated database (CRDB).
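To give a feel for Last Write Wins, here is a generic sketch of the idea. It is not Redis’s actual implementation, which is more involved and depends on the data type. The Write class, the use of a plain timestamp and the cluster id as a tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: float  # when the write happened (assumed comparable across clusters)
    cluster_id: str   # tie-breaker when timestamps are equal


def merge(a: Write, b: Write) -> Write:
    """Keep the later write; break ties deterministically by cluster id."""
    return max(a, b, key=lambda w: (w.timestamp, w.cluster_id))


from_dc1 = Write("blue", timestamp=100.0, cluster_id="dc1")
from_dc2 = Write("green", timestamp=101.5, cluster_id="dc2")

# Both clusters end up with the same winner, whatever the merge order.
print(merge(from_dc1, from_dc2).value, merge(from_dc2, from_dc1).value)
```

The important property is that the merge is deterministic and order-independent, so both clusters settle on the same value without talking to a coordinator.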
The major advantage is that applications hosted in different zones can access (read-write) the database locally.
So let’s have an Active-Active setup with two Redis clusters, each in a different DC (AZ). In the Redis Enterprise UI of one of the clusters, we will create a database with:
- Two participating clusters
- Three shards
- Replication enabled
- Rack-zone awareness enabled at the cluster level (as we discussed in the above section). This will make sure that a master shard process and its replica shard process are placed on different racks, and it enables failover at the node level. We will choose “rackid1” for node 1, “rackid2” for node 2, and so on.
This is what our final architecture looks like. Notice how master and replica shards are not placed on the same node:
However, please note that with CRDBs, we have to compromise on strong consistency. CRDBs are often referred to as Strongly Eventually Consistent (SEC). How does that differ from Eventually Consistent (EC) DBs? SEC is a special case of EC that holds for some data types, such as counters. It ensures that any two nodes that have received the same (unordered) set of updates, like add or subtract, will eventually reach the same value.
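The counter case is easy to demonstrate. The sketch below is a toy replica that applies add/subtract updates and ignores duplicates; it then feeds the same set of updates in every possible order and collects the final values. The ReplicaCounter class and the update ids are made up for illustration and say nothing about how Redis implements its counters internally.

```python
import itertools


class ReplicaCounter:
    """Toy counter replica: applies each update once, in any order."""

    def __init__(self) -> None:
        self.value = 0
        self.seen = set()  # ids of updates already applied

    def apply(self, update_id: str, delta: int) -> None:
        if update_id in self.seen:
            return  # duplicate delivery, nothing to do
        self.seen.add(update_id)
        self.value += delta


updates = [("dc1-1", +5), ("dc2-1", -2), ("dc1-2", +10)]

final_values = set()
for order in itertools.permutations(updates):
    replica = ReplicaCounter()
    for update_id, delta in order:
        replica.apply(update_id, delta)
    final_values.add(replica.value)

print(final_values)  # {13}: every delivery order converges to the same value
```

Addition and subtraction commute, which is why the order of delivery does not matter here; data types without that property need a rule such as Last Write Wins.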
Failover from the client point of view
To achieve failover at a cluster level, the client should connect to the Redis cluster through a DNS name, and not through the IP of any node, so the client doesn’t need to care about which node is the current master. This enables simple failover with Redis.
In our case, as we have two Redis clusters participating, we will have to consider using a third-party tool (e.g. HAProxy) that can handle failover between the two clusters at the DNS level.
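If you would rather handle this on the client side, the idea can be sketched as "try the local cluster’s DNS name first, then fall back to the other one". The code below is only an illustration of that idea, not a replacement for a proper proxy or load balancer. The host names, the port and the helper functions are hypothetical, and the probe is injectable so the logic can be exercised without a network.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Endpoint = Tuple[str, int]  # (DNS name, port)


def tcp_probe(endpoint: Endpoint, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def pick_endpoint(
    endpoints: Iterable[Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_probe,
) -> Optional[Endpoint]:
    """Return the first reachable endpoint, in order of preference."""
    for endpoint in endpoints:
        if probe(endpoint):
            return endpoint
    return None


# Hypothetical database endpoints, local cluster first.
ENDPOINTS = [
    ("redis-12000.dc1.example.com", 12000),
    ("redis-12000.dc2.example.com", 12000),
]


def dc1_is_down(endpoint: Endpoint) -> bool:
    """Fake probe that simulates an outage of the first data center."""
    return "dc1" not in endpoint[0]


print(pick_endpoint(ENDPOINTS, probe=dc1_is_down))
```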
Why Redis Enterprise?
By the way, you may ask: why should I go for a paid version of Redis? Well, the answer is that it works great for what we wanted to achieve here. With the open source version of Redis, you can achieve HA with Sentinels, but it’s not as powerful as Redis Cluster. There are also some nice Redis modules available, like RediSearch, ReJSON, RedisGraph and RediSQL (a few of them are available as open source too). Recently, Redis Enterprise released Streams, which seems promising. RedisLabs is likely to come up with more advanced features in the future, as it recently received a boost in the form of impressive funding.
Remember it’s a NoSQL
In the end, one thing worth mentioning is that Redis is a NoSQL database, so you shouldn’t expect the same behaviour as with SQL DBs.
Remember that it cannot be strongly consistent, as we are favouring high availability over consistency. Also remember that you get the best results if you limit your lookups to key-based access as much as you can.
If you are a Spring Data fan, then Spring has nice support for Redis to give you the same Spring JPA feeling. But of course it’s limited and not as rich in terms of APIs as it is for SQL. For some custom Redis operations don’t hesitate to use RedisTemplate.
You can also add new functionality to Redis using Lua scripting or Redis Modules (written in C), which is quite powerful.
Conclusion
Finally, to summarise: it was pretty straightforward to create this kind of complex architecture with Redis Enterprise, as it hides all the complexity of managing nodes from you.
Playing with HA in Redis is a different learning curve than other NoSQL databases, even though the concept is quite similar. Sometimes it’s hard to look for references (though Redis engineers are super nice and highly available!), so we tried to pen down our experience with it. Hope you’ll find it helpful!
Written by Aditi Phadke