Problem
You want to boost application performance by serving data from a cache, avoiding the network trip to the persistent store (and the query execution). However, you want to load data into the cache on demand, i.e. lazily. You also want the application to control cache data management – loading, eviction, and retrieval.
Forces
- Improve performance by loading data from cache lazily.
- Application code controls cache data management.
- The underlying caching system does not provide read-through or write-through/write-behind strategies, so the application must manage the cache itself.
Solution
Use the cache-aside design pattern to solve the problems outlined above. It is one of several caching patterns/strategies. It is presumably named this way because, aside from managing the data store, the application code is also responsible for managing the cache.
Let's now try to understand how this caching technique works and then explore how it solves the problems.
Read
Cache Miss
- The application tries to get data from the cache.
- The data is not available in the cache.
- The application queries the data store.
- The data is retrieved from the datastore.
- The application adds the data to the cache (completing the lazy load).
Cache Hit
- The application tries to get data from the cache.
- The cache returns the data (performance improvement, yay!)
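The read path above can be sketched in plain Java. The names (`findEmployee`, `dataStore`) are illustrative, and a real application would query a database on a miss rather than a map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the cache-aside read path (hypothetical names).
public class CacheAsideRead {

    // Stand-in for the persistent store; a real app would run a query here.
    static final Map<Long, String> dataStore = new ConcurrentHashMap<>(Map.of(1L, "Alice"));

    static final Map<Long, String> cache = new ConcurrentHashMap<>();
    static int storeHits = 0; // counts round trips to the data store

    static String findEmployee(long id) {
        // 1. Try the cache first.
        String cached = cache.get(id);
        if (cached != null) {
            return cached; // cache hit: no trip to the data store
        }
        // 2. Cache miss: query the data store.
        storeHits++;
        String fromStore = dataStore.get(id);
        // 3. Populate the cache for future reads (completes the lazy load).
        if (fromStore != null) {
            cache.put(id, fromStore);
        }
        return fromStore;
    }

    public static void main(String[] args) {
        String first = findEmployee(1L);  // miss: hits the store, fills the cache
        String second = findEmployee(1L); // hit: served from the cache
        System.out.println(first + " " + second + " storeHits=" + storeHits);
    }
}
```

Note that the application code, not the cache, decides when the store is queried and when the cache is populated – the defining trait of cache-aside.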
Write/Update
Implementation
Spring cache abstraction makes it very easy to implement Cache-aside pattern. Also, it prevents code from being mixed between business logic and caching logic. It uses aspects to separate cross-cutting concerns and saves on reinventing the wheel. The default implementation for Spring cache is ConcurrentHashMap. However, it can transparently integrate with distributed caching systems like Ehcache, Hazelcast, Apache Ignite etc.
Spring cache documentation can be found here.
The important classes from the Employee example are below.
The @Cacheable annotation ensures that when the findOne method is called, first the cache is checked for an employee with the given id. If it is present then, the data is returned from the cache. If the employee record is missing in the cache, then the actual findOne method is called which in turn queries the repository to retrieve the data. The retrieved data is then added to the cache for future access.
The @CachePut annotation is used with the write methods - save and update. As a result, the new and modified data is added back into the cache. This ensures the data in the cache is not stale.
This class drives the quick tests. The logs below shows that the "employee" cache behaves in cache-aside mode.
Consequences
Benefits
- The application writes to the data store (new or updated data).
- The application then writes to the cache.
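The two write steps can be sketched in plain Java as well (hypothetical names; a real application would issue a database write in step 1):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the cache-aside write/update path (hypothetical names).
public class CacheAsideWrite {

    static final Map<Long, String> dataStore = new ConcurrentHashMap<>(); // stand-in store
    static final Map<Long, String> cache = new ConcurrentHashMap<>();

    static void saveEmployee(long id, String name) {
        // 1. Write to the data store first (the source of truth).
        dataStore.put(id, name);
        // 2. Then write to the cache so subsequent reads are not stale.
        cache.put(id, name);
    }

    public static void main(String[] args) {
        saveEmployee(1L, "Alice");
        saveEmployee(1L, "Alicia"); // an update refreshes both store and cache
        System.out.println(dataStore.get(1L) + " " + cache.get(1L));
    }
}
```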
Implementation
The Spring cache abstraction makes it very easy to implement the cache-aside pattern. It also keeps caching logic from being mixed into business logic: it uses aspects to separate this cross-cutting concern and saves you from reinventing the wheel. The default Spring cache implementation is backed by a ConcurrentHashMap, but it integrates transparently with distributed caching systems like Ehcache, Hazelcast, Apache Ignite, etc.
Spring cache documentation can be found here.
The important classes from the Employee example are below.
The @Cacheable annotation ensures that when the findOne method is called, the cache is first checked for an employee with the given id. If it is present, the data is returned from the cache. If the employee record is missing from the cache, the actual findOne method is called, which in turn queries the repository to retrieve the data. The retrieved data is then added to the cache for future access.
The @CachePut annotation is used with the write methods - save and update. As a result, the new and modified data is added back into the cache. This ensures the data in the cache is not stale.
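Since the full listings are not reproduced here, the following plain-Java sketch shows what Spring's caching proxy effectively does for the annotated methods – @Cacheable on findOne and @CachePut on save. The class and field names are illustrative, not the actual example code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Plain-Java sketch of the behavior Spring weaves in via aspects.
public class CachingEmployeeService {

    final Map<Long, String> repositoryTable = new ConcurrentHashMap<>(); // stand-in repository
    final Map<Long, String> cache = new ConcurrentHashMap<>();           // the "employee" cache
    int repositoryCalls = 0;

    // @Cacheable: check the cache first; on a miss, call the repository and cache the result.
    String findOne(long id) {
        return cache.computeIfAbsent(id, key -> {
            repositoryCalls++;
            return repositoryTable.get(key);
        });
    }

    // @CachePut: always execute the write, then put the result into the cache.
    String save(long id, String name) {
        repositoryTable.put(id, name);
        cache.put(id, name);
        return name;
    }

    public static void main(String[] args) {
        CachingEmployeeService svc = new CachingEmployeeService();
        svc.save(1L, "Alice");
        svc.findOne(1L); // cache hit: the repository is not called
        System.out.println(svc.cache.get(1L) + " repositoryCalls=" + svc.repositoryCalls);
    }
}
```

In the real example, none of this plumbing is hand-written – the annotations plus an enabled cache manager generate the equivalent proxy.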
This class drives the quick tests. The logs below show that the "employee" cache behaves in cache-aside mode.
Consequences
Benefits
- Low latency reads
- Reduced workload on the data store, which may indirectly mean a lower bill from your cloud provider :)
Drawbacks
- The application is responsible for managing the cache. This burden is accentuated if the framework does not provide a caching abstraction like Spring's: the cache-management code then has to be written by hand, and there is a chance the business logic gets polluted with caching logic. Since caching is a cross-cutting concern, aspect-oriented design can keep the code clean.
- The cache-aside pattern works best for read-heavy data that changes infrequently. If the data changes fast, the underlying data store may be stormed with requests.
- Keeping the cache synchronized with the underlying data store is a challenge, even with a local in-memory cache: one thread may read an employee record (a potentially stale or invalidated entry) while another thread updates the data store and then the cache. The problem is even greater with a distributed cache, where it takes a while (due to network latency etc.) to synchronize changes across nodes.
- As with any caching pattern, it is always a challenge to choose the eviction rule and expiration policy, or whether to pin data. These can be tuned after observing the usage pattern.