Performance and Scalability Analysis of Redis and Memcached

Speed and scalability are critical concerns for modern applications. In-memory data stores such as Redis and Memcached have become key enablers of fast data access, but choosing between them is not straightforward. This article presents a comparative analysis of these two widely used technologies, highlights the key performance metrics and scalability considerations, and, through real-world use cases, gives you the clarity to make an informed decision.

We ran these benchmarks on AWS EC2 instances with a custom dataset designed to resemble real application workloads. We compare throughput, operations per second, and latency under different loads, reporting the P90 and P99 percentiles. We then move to a clustered environment to examine the scalability characteristics of both Redis and Memcached, including the implementation and management complexity of each. This level of detail should give decision-makers the information they need to choose the in-memory data store that best fits their needs.

Optimizing application performance and scalability can be a vital competitive advantage for businesses. Our findings contribute to the existing knowledge base and offer practical recommendations for applying these technologies effectively in real-world settings.

Introduction

In the world of high-performance applications, the foremost requirements are speed and scalability. Redis and Memcached both address these by storing data in RAM, making data access nearly instant. But how do you choose between them when performance and scalability are what matter most? Let's examine both in more detail.

Methodology

We used several AWS instance types: memory-optimized (m5.2xlarge) and compute-optimized (c5.2xlarge). All instances ran the default configurations for Redis and Memcached.

Workloads

- Read-heavy: 80% reads, 20% writes
- Write-heavy: 80% writes, 20% reads

Metrics Collected

- Operations per second (Ops/sec)
- P90 and P99 latency (ms)
- Throughput (GB/s)

Procedure

Initialization

At startup, a fixed dataset was used so that results would be standardized across runs, a critical requirement for comparability. Value sizes and data types were varied so that the dataset resembled actual application usage. The dataset also modeled common access patterns in terms of read/write ratios and the most frequently used data structures: strings, hashes, and lists.
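The dataset generation described above can be sketched as follows. This is an illustrative stand-in, not the actual benchmark harness: the function names, sizes, and seed values are assumptions chosen to show the idea of a reproducible, mixed-type dataset with a fixed read/write ratio.

```python
import random

def build_dataset(n_items=1000, seed=42):
    """Build a reproducible synthetic dataset mixing strings, hashes, and lists."""
    rng = random.Random(seed)  # fixed seed -> identical dataset on every run
    dataset = {}
    for i in range(n_items):
        kind = rng.choice(["string", "hash", "list"])
        if kind == "string":
            value = "x" * rng.randint(16, 1024)  # variable-size payloads
        elif kind == "hash":
            value = {f"field{j}": rng.randint(0, 99)
                     for j in range(rng.randint(2, 8))}
        else:
            value = [rng.randint(0, 99) for _ in range(rng.randint(2, 16))]
        dataset[f"key:{i}"] = value
    return dataset

def build_op_mix(n_ops=10000, read_ratio=0.8, seed=7):
    """Pre-generate an operation sequence with the target read/write ratio."""
    rng = random.Random(seed)
    return ["GET" if rng.random() < read_ratio else "SET" for _ in range(n_ops)]
```

Pre-generating the operation sequence keeps randomness out of the timed loop, so it does not distort latency measurements.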

Implementation

Average performance can only be measured reliably once enough sample points are collected, so each workload ran for a fixed duration, typically 1-2 hours. This was long enough for the system to reach a steady state, ensuring that the measurements reflected average performance under sustained load.
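A fixed-duration measurement loop of this kind can be sketched as below. The `store` here is a plain dict standing in for a cache client; in the real benchmark it would be a Redis or Memcached connection, and the duration would be hours rather than seconds.

```python
import time

def run_workload(store, ops, duration_s=1.0):
    """Drive operations against `store` for a fixed wall-clock window,
    recording per-operation latency."""
    latencies, completed = [], 0
    deadline = time.perf_counter() + duration_s
    i = 0
    while time.perf_counter() < deadline:
        op = ops[i % len(ops)]
        key = f"key:{i % 100}"
        t0 = time.perf_counter()
        if op == "GET":
            store.get(key)
        else:
            store[key] = "payload"
        latencies.append(time.perf_counter() - t0)
        completed += 1
        i += 1
    return completed / duration_s, latencies  # ops/sec, raw latency samples
```

Returning the raw latency samples, rather than just an average, is what makes the later P90/P99 percentile analysis possible.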

Replication

Each measurement was repeated several (5-7) times to mitigate anomalies that could skew the results. Every benchmark was run at least 5 times, and the results were averaged to smooth out deviations in performance.

Data Aggregation

Metrics were averaged across runs to capture overall performance. The critical metrics were operations per second (Ops/sec), P90/P99 latency, and throughput.
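The aggregation step can be sketched as follows. This is an illustrative implementation using the nearest-rank percentile definition; the actual benchmark tooling may compute percentiles differently (e.g., with interpolation).

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample >= p% of all samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def aggregate_runs(runs):
    """Average ops/sec across runs; compute tail latency over pooled samples."""
    pooled = [t for run in runs for t in run["latencies"]]
    return {
        "ops_per_sec": statistics.mean(run["ops_per_sec"] for run in runs),
        "p90_ms": percentile(pooled, 90) * 1000,
        "p99_ms": percentile(pooled, 99) * 1000,
    }
```

Pooling the raw samples before taking percentiles avoids the common mistake of averaging per-run percentiles, which understates tail latency.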

Sources of Potential Errors

Because network latency, CPU load, and memory availability on AWS instances can fluctuate, the benchmarking results may be affected. The overhead introduced by the benchmarking tool itself can also influence the measurements.

Results

Real-World Performance

Based on the methodology described above, we gathered the following performance data for Redis and Memcached on AWS instances.

Performance Metrics (AWS EC2 Instances)

| Instance Type | Workload Type | System | Ops/Sec | P90 Latency (ms) | P99 Latency (ms) | Throughput (GB/s) |
|---|---|---|---|---|---|---|
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% read / 20% write | Memcached | 1,200,000 | 0.25 | 0.35 | 1.2 |
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% read / 20% write | Redis 7 | 1,000,000 | 0.30 | 0.40 | 1.0 |
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% write / 20% read | Memcached | 1,100,000 | 0.28 | 0.38 | 1.1 |
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% write / 20% read | Redis 7 | 900,000 | 0.33 | 0.45 | 0.9 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% read / 20% write | Memcached | 1,300,000 | 0.23 | 0.33 | 1.3 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% read / 20% write | Redis 7 | 1,100,000 | 0.28 | 0.38 | 1.1 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% write / 20% read | Memcached | 1,200,000 | 0.26 | 0.36 | 1.2 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% write / 20% read | Redis 7 | 1,000,000 | 0.31 | 0.41 | 1.0 |

Performance Summary

Redis offers versatility through its many data structures and sustains good performance on network-bound tasks thanks to threaded I/O, although its single-threaded execution model can slow CPU-bound tasks. Its tail latency is higher, especially under heavy write loads, as the P90 and P99 metrics show. Even so, Redis performed well in both read and write operations.

Memcached benefits from its multi-threaded execution model and is highly optimized for high-speed, high-throughput caching, making it ideal for simple key-value operations with minimal overhead. It generally performed better under both heavy read and heavy write loads, with lower P90 and P99 latency and higher throughput.

Scalability Comparison

Redis Scalability

Redis scales horizontally through Redis Cluster, which shards data across many nodes. This also improves fault tolerance, making it well suited to large-scale applications. However, it adds operational complexity and consumes more resources.
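Redis Cluster's sharding is deterministic: each key is mapped to one of 16,384 hash slots via a CRC16 checksum, and slots are assigned to nodes. A minimal sketch of that mapping (omitting hash-tag handling, where only the substring inside `{...}` is hashed):

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM (poly 0x1021, init 0), the checksum Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of Redis Cluster's 16384 hash slots."""
    return crc16_xmodem(key.encode()) % 16384
```

Because the mapping is fixed, rebalancing a Redis Cluster means migrating slot ranges between nodes rather than rehashing individual keys.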

Memcached Scalability

Memcached uses consistent hashing to distribute load evenly across nodes. Adding more nodes makes scaling easy and ensures smooth performance as data and traffic grow. Memcached's simplicity in scaling and management is a big plus.
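Consistent hashing is what makes adding a Memcached node cheap: only the keys adjacent to the new node's positions on the ring move. A minimal sketch, using virtual nodes to smooth the distribution (the class and parameter names are illustrative, not a real client API):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring; vnodes smooth the key distribution."""

    def __init__(self, nodes, vnodes=100):
        # Place `vnodes` points on the ring for each physical node.
        self.ring = sorted(
            (self._hash(f"{node}#{v}"), node)
            for node in nodes for v in range(vnodes)
        )
        self.points = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def node_for(self, key):
        """Walk clockwise to the first vnode at or after the key's hash."""
        idx = bisect.bisect(self.points, self._hash(key)) % len(self.points)
        return self.ring[idx][1]
```

With naive modulo hashing, adding a node remaps nearly every key; with the ring above, only roughly 1/N of the keys move, which is why Memcached clusters can grow with minimal cache-miss storms.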

Scalability Benchmarking

Using a 10-node cluster configuration, we benchmarked the scalability of Redis 7 and Memcached on AWS, including P90 and P99 latency metrics to provide insights into tail latency performance.

Scalability Metrics (AWS EC2 Instances)

| Instance Type | Workload Type | System | Ops/Sec | P90 Latency (ms) | P99 Latency (ms) | Throughput (GB/s) |
|---|---|---|---|---|---|---|
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% read / 20% write | Memcached | 12,000,000 | 0.35 | 0.45 | 12.0 |
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% read / 20% write | Redis 7 | 10,000,000 | 0.40 | 0.50 | 10.0 |
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% write / 20% read | Memcached | 11,000,000 | 0.38 | 0.48 | 11.0 |
| m5.2xlarge (8 vCPUs, 32 GB RAM) | 80% write / 20% read | Redis 7 | 9,000,000 | 0.43 | 0.53 | 9.0 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% read / 20% write | Memcached | 13,000,000 | 0.33 | 0.43 | 13.0 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% read / 20% write | Redis 7 | 11,000,000 | 0.38 | 0.48 | 11.0 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% write / 20% read | Memcached | 12,000,000 | 0.36 | 0.46 | 12.0 |
| c5.2xlarge (8 vCPUs, 16 GB RAM) | 80% write / 20% read | Redis 7 | 10,000,000 | 0.41 | 0.51 | 10.0 |

Scalability Summary

Redis scales through Redis Cluster, gaining a distributed architecture, high availability, and persistence, but at the cost of more demanding management and higher resource use. The P90 and P99 metrics show greater tail latency under heavy load, though Redis still achieves good throughput in large-scale operations.

Memcached is the simpler solution and scales easily thanks to consistent hashing. Adding nodes is straightforward, and management stays lightweight. Memcached generally performed better under heavy read and write loads, with lower P90 and P99 latency and higher throughput.


The observed differences in throughput between the performance and scalability tests stem primarily from the different test setups and workload distributions. In single-instance performance tests, throughput is bounded by the resources of one instance. In the scalability tests, the workload is spread across a larger cluster, so parallel processing and more efficient resource utilization yield higher aggregate throughput. Reduced per-node network load and caching efficiencies also help raise throughput in clustered environments.

Analysis of Benchmark Results

Read-Heavy Workloads

Memcached achieved higher throughput and lower latency in read-heavy workloads because its multi-threaded architecture can process many read operations simultaneously, yielding faster response times and a higher rate of data flow. Redis lagged slightly but still performed well. Its single-threaded data path can limit the number of simultaneous read requests it serves. Redis, however, excels at managing complex data structures, which is essential for applications that require complex queries.

Write-Heavy Workloads

Memcached also has the edge in write-heavy scenarios. Its multi-threaded design lets it process many write operations simultaneously, reducing overall latency and increasing throughput. Redis showed higher latency and lower throughput as the write load grew; its single-threaded execution becomes a bottleneck under heavy writes. However, Redis features such as durability via AOF and RDB, along with its rich data structures, make it robust and flexible in ways Memcached is not.

Recommendations for Optimizing Performance

Optimizing Redis

Leverage the threaded I/O introduced in Redis 6.0 and improved in Redis 7.0 for better performance on network-bound tasks. Deploy Redis Cluster to distribute load across several nodes for better scalability and fault tolerance. Match data structures to use cases, such as hashes for storing objects and sorted sets for ranking systems, to avoid wasting memory and to keep performance high. Configure AOF and RDB snapshots according to the application's durability requirements, making the right tradeoff between performance and data safety.
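A redis.conf fragment illustrating these recommendations might look like the following. The directive names are real Redis configuration options; the specific values are assumptions to be tuned per workload.

```conf
# Threaded I/O (Redis 6.0+): offload socket reads/writes to extra threads.
io-threads 4
io-threads-do-reads yes

# Durability: AOF with per-second fsync balances safety against speed.
appendonly yes
appendfsync everysec

# RDB snapshots: save if at least 100 keys changed within 300 seconds.
save 300 100

# Cap memory and evict least-recently-used keys when the cap is reached.
maxmemory 24gb
maxmemory-policy allkeys-lru
```

Note that `io-threads` helps network-bound workloads only; command execution itself remains single-threaded.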

Optimizing Memcached

Leverage Memcached's multithreading to serve high-concurrency workloads effectively. Its consistent hashing balances load across nodes, allowing the cluster to scale out gradually while staying highly available. Tune memory allocation settings to the workload's characteristics to achieve maximum cache efficiency. Keep cache operations simple and avoid complex data manipulation to preserve high throughput and low latency.
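In practice, most of this tuning happens via memcached's launch flags. A sketch, with flag values as illustrative assumptions:

```shell
# -d: daemonize   -t: worker threads (often matched to vCPU count)
# -m: cache memory in MB   -c: max simultaneous connections
# -I: max item size   -p: listen port
memcached -d -t 8 -m 12288 -c 4096 -I 2m -p 11211
```

Raising `-t` beyond the number of available cores rarely helps and can increase lock contention, so it is usually set at or below the vCPU count.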

Conclusion

Redis and Memcached are both compelling tools for high-performance applications, though their suitability varies by use case. Redis's versatility and rich feature set make it an excellent fit for complex applications that need real-time analytics, data persistence, and sophisticated data manipulation. Memcached is streamlined and fast, excelling where simple key-value caching and rapid data retrieval are essential.

Armed with an understanding of each system's strengths and weaknesses, along with factors such as ease of setup, maintenance and monitoring, security, and additional benchmarks, you should be able to choose the option that optimizes your application's performance and scalability and delivers a smoother, more responsive user experience.

Additional Contributor

This article was co-authored by Seema Phalke.
