
How to implement Cache Aside Pattern with Spring?

Problem

You want to boost application performance by serving data from a cache, avoiding both the network trip to the persistent store and the query execution. However, you want to load data on demand (lazily). You also want the application to control cache data management: loading, eviction, and retrieval.

Forces

  • Improve performance by loading data from cache lazily.
  • Application code controls cache data management.
  • The underlying caching system does not provide read-through or write-through/write-behind strategies (which would be unusual).

Solution

Use the cache-aside design pattern to solve the problems outlined above. It is one of many caching patterns/strategies. I believe it is named this way because, aside from managing the data store, the application code is also responsible for managing the cache.

Let's now try to understand how this caching technique works and then explore how it solves the problems.

Read

Cache Miss
  1. The application tries to get data from the cache.
  2. The data is not available in the cache.
  3. The application queries the data store.
  4. The data is retrieved from the datastore.
  5. The application adds the data to the cache (this completes the lazy load).
Cache Hit
  1. The application tries to get data from the cache.
  2. The cache returns the data.
(Performance improvement, yay!!!)
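The two read paths above can be sketched by hand in plain Java. This is only a minimal illustration, not the Spring implementation: the class name CacheAsideReader, the loader function standing in for the data store, and the read counter are all assumptions made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical hand-written cache-aside reader; all names are illustrative.
class CacheAsideReader<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> dataStore;              // stands in for the database query
    private final AtomicInteger storeReads = new AtomicInteger();

    CacheAsideReader(Function<K, V> dataStore) {
        this.dataStore = dataStore;
    }

    V get(K key) {
        V cached = cache.get(key);                       // ask the cache first
        if (cached != null) {
            return cached;                               // cache hit: no data store trip
        }
        storeReads.incrementAndGet();
        V loaded = dataStore.apply(key);                 // cache miss: query the data store
        if (loaded != null) {
            cache.put(key, loaded);                      // lazy load: populate the cache
        }
        return loaded;
    }

    int storeReads() {
        return storeReads.get();
    }

    public static void main(String[] args) {
        CacheAsideReader<Long, String> reader =
                new CacheAsideReader<>(id -> "employee-" + id);
        System.out.println(reader.get(1L));              // miss: loaded from the data store
        System.out.println(reader.get(1L));              // hit: served from the cache
        System.out.println(reader.storeReads());         // prints 1
    }
}
```

Note that two threads missing on the same key at the same time may both query the data store in this sketch; that is usually acceptable for cache-aside, but worth knowing.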


Write/Update
  1. The application writes to the data store (new or updated data).
  2. The application then writes to the cache.
(Cache synchronized, yay!!!)
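The write path can be sketched the same way. Again this is a rough illustration under assumptions: CacheAsideWriter is a made-up class, and both maps are in-memory stand-ins (one for the data store, one for the cache).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical write path for cache-aside; both maps are in-memory stand-ins.
class CacheAsideWriter {
    private final Map<Long, String> dataStore = new ConcurrentHashMap<>(); // stands in for the database
    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    void save(Long id, String name) {
        dataStore.put(id, name);   // step 1: write to the data store first
        cache.put(id, name);       // step 2: then refresh the cache entry
    }

    String cached(Long id) {
        return cache.get(id);
    }

    String stored(Long id) {
        return dataStore.get(id);
    }

    public static void main(String[] args) {
        CacheAsideWriter writer = new CacheAsideWriter();
        writer.save(3L, "Paul");
        writer.save(3L, "Paul Labile");          // an update follows the same two steps
        System.out.println(writer.stored(3L));   // prints Paul Labile
        System.out.println(writer.cached(3L));   // prints Paul Labile
    }
}
```

Writing to the data store first means a failed write never leaves a value in the cache that the store does not have.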


Implementation

The Spring cache abstraction makes it very easy to implement the cache-aside pattern. It also keeps business logic and caching logic from being mixed in the same code: it uses aspects to separate cross-cutting concerns and saves on reinventing the wheel. The default cache store is a simple in-memory one backed by ConcurrentHashMap. However, Spring can transparently integrate with caching providers such as Ehcache, Hazelcast, and Apache Ignite.
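To get a feel for how caching can be kept out of the business code, here is a small sketch using a JDK dynamic proxy. It is not Spring's actual code; the NameLookup interface, the SlowNameLookup class, and the CachingProxy helper are invented for this illustration, and the cache key is simply the single method argument.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service interface; the business code knows nothing about caching.
interface NameLookup {
    String findName(Long id);
}

class SlowNameLookup implements NameLookup {
    final AtomicInteger calls = new AtomicInteger();   // counts real invocations

    @Override
    public String findName(Long id) {
        calls.incrementAndGet();
        return "employee-" + id;                       // stands in for a repository query
    }
}

class CachingProxy {
    // Wraps the target in a proxy that caches results keyed by the single argument.
    static NameLookup wrap(NameLookup target) {
        Map<Object, Object> cache = new ConcurrentHashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            boolean cacheable = method.getDeclaringClass() != Object.class
                    && args != null && args.length == 1 && args[0] != null;
            if (!cacheable) {
                return method.invoke(target, args);    // pass through equals(), toString(), etc.
            }
            Object cached = cache.get(args[0]);
            if (cached != null) {
                return cached;                         // hit: the target is not called
            }
            Object result = method.invoke(target, args);
            if (result != null) {
                cache.put(args[0], result);            // miss: remember the result
            }
            return result;
        };
        return (NameLookup) Proxy.newProxyInstance(
                NameLookup.class.getClassLoader(),
                new Class<?>[] {NameLookup.class},
                handler);
    }

    public static void main(String[] args) {
        SlowNameLookup target = new SlowNameLookup();
        NameLookup lookup = wrap(target);
        lookup.findName(7L);
        lookup.findName(7L);
        System.out.println(target.calls.get());        // prints 1
    }
}
```

The caller only sees NameLookup, so the caching behaviour can be added or removed without touching SlowNameLookup. Spring's annotations give the same separation declaratively.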

Spring cache documentation can be found here.
The important classes from the Employee example are below.
package com.hardcode.sample.service;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.annotation.CachePut;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import com.hardcode.sample.domain.Employee;
import com.hardcode.sample.repository.EmployeeRepository;

import lombok.extern.slf4j.Slf4j;

@Service
@Slf4j
public class EmployeeService {

    @Autowired
    private EmployeeRepository repository;

    /**
     * Returns an existing employee record, consulting the "employees" cache first.
     *
     * @param id the employee id, which is also the cache key
     */
    @Cacheable("employees")
    @Transactional(readOnly = true)
    public Employee findOne(Long id) {
        // Logged only on a cache miss; on a hit the method body is skipped.
        log.info("Find employee by id - {}", id);
        return repository.findOne(id);
    }

    // Key the entry by the saved employee's id so that findOne(id) can hit it.
    // Without an explicit key, the Employee argument itself would be the key.
    @CachePut(value = "employees", key = "#result.id")
    @Transactional
    public Employee save(Employee e) {
        return repository.save(e);
    }

    @CachePut(value = "employees", key = "#result.id")
    @Transactional
    public Employee update(Employee e) {
        return repository.save(e);
    }
}

The @Cacheable annotation ensures that when the findOne method is called, the cache is first checked for an employee with the given id. If the entry is present, the data is returned from the cache and the method body is not executed. If the employee record is missing from the cache, the actual findOne method is called, which in turn queries the repository to retrieve the data. The retrieved data is then added to the cache for future access.

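The @CachePut annotation is used with the write methods, save and update. The annotated method always runs, and its return value is placed in the cache, so new and modified data is written back and the cache does not go stale. For a later findOne(id) to hit that entry, the @CachePut key must resolve to the employee id; when no key attribute is given, Spring's default key for a single-argument method is the argument itself, here the Employee object.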

package com.hardcode.sample.startup;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

import com.hardcode.sample.domain.Employee;
import com.hardcode.sample.service.EmployeeService;

import lombok.extern.slf4j.Slf4j;

@Component
@Slf4j
public class EmployeeRunner implements CommandLineRunner {

    @Autowired
    private EmployeeService service;

    @Override
    public void run(String... args) throws Exception {
        log.info("=== Find employees ||| Cache HIT/MISS =====");
        log.info("emp#-1 --> {}", service.findOne(1L));
        log.info("emp#-2 --> {}", service.findOne(2L));
        log.info("emp#-1 --> {}", service.findOne(1L));
        log.info("emp#-2 --> {}", service.findOne(2L));
        log.info("emp#-1 --> {}", service.findOne(1L));
        log.info("emp#-2 --> {}", service.findOne(2L));

        // write
        log.info("=== Adding new employee ====");
        Employee e = new Employee();
        e.setId(3L);
        e.setFirstName("Paul");
        e.setLastName("Pogba");
        service.save(e);

        log.info("=== Find new joinee =====");
        e = service.findOne(3L); // this is a detached object
        log.info("emp#-3 --> {}", e);

        log.info("=== Update new employee ====");
        e.setFirstName("Paul Labile"); // this is his middle name, I believe
        service.update(e);

        log.info("=== Check :: Update new employee ====");
        e = service.findOne(3L); // check if update is successful
        log.info("emp#-3 --> {}", e);
    }
}

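This class drives the quick tests. The logs below show the "employees" cache behaving in cache-aside mode: the first lookups of employees 1 and 2 are misses that reach the service ("Find employee by id"), and the repeated lookups are served from the cache. The lookup of employee 3 right after save also reaches the service, which is consistent with @CachePut using its default key (the Employee argument) in this run; keying the entry by the employee id would turn that lookup into a hit.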
2017-03-31 18:10:35.987 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : === Find employees ||| Cache HIT/MISS =====
2017-03-31 18:10:36.007 INFO 19276 --- [main] c.h.sample.service.EmployeeService : Find employee by id - 1
2017-03-31 18:10:36.037 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-1 --> Employee(id=1, firstName=dhrubo, lastName=kayal)
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.service.EmployeeService : Find employee by id - 2
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-2 --> Employee(id=2, firstName=arjun, lastName=kayal)
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-1 --> Employee(id=1, firstName=dhrubo, lastName=kayal)
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-2 --> Employee(id=2, firstName=arjun, lastName=kayal)
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-1 --> Employee(id=1, firstName=dhrubo, lastName=kayal)
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-2 --> Employee(id=2, firstName=arjun, lastName=kayal)
2017-03-31 18:10:36.047 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : === Adding new employee ====
2017-03-31 18:10:36.067 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : === Find new joinee =====
2017-03-31 18:10:36.067 INFO 19276 --- [main] c.h.sample.service.EmployeeService : Find employee by id - 3
2017-03-31 18:10:36.067 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-3 --> Employee(id=3, firstName=Paul, lastName=Pogba)
2017-03-31 18:10:36.067 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : === Update new employee ====
2017-03-31 18:10:36.077 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : === Check :: Update new employee ====
2017-03-31 18:10:36.077 INFO 19276 --- [main] c.h.sample.startup.EmployeeRunner : emp#-3 --> Employee(id=3, firstName=Paul Labile, lastName=Pogba)

Consequences
Benefits
  1. Low latency reads
  2. Reduced workload on data store, which may indirectly lead to lower bill from the cloud provider :)
Concerns
  1. The application is responsible for managing the cache. The problem is worse when the application framework offers no Spring-like caching abstraction, because the cache management code must then be written by hand and the business logic risks being polluted with caching logic. Caching is a cross-cutting concern, and an aspect-oriented design can keep the code clean.
  2. The cache-aside pattern works best for read-only or rarely changing data. If the data changes fast, cached entries go stale or are replaced often, and the underlying data store may be stormed with requests.
  3. Keeping the cache synchronized with the underlying data store is a challenge, even with a local in-memory cache. One thread may be reading an employee record (potentially a stale or invalidated entry) while another thread is updating the data store and then the cache. The problem is even greater with a distributed cache, where it takes a while (due to network latency, etc.) to propagate changes across the nodes.
  4. As with any caching pattern, it is always a challenge to choose the eviction rule and expiration policy, or to decide whether to pin data. These are best set after observing the usage pattern.
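As one concrete example of an eviction rule, here is a minimal size-bounded cache that evicts the least recently used entry, built on the JDK's LinkedHashMap. The LruCache class and the limit of two entries are assumptions for this sketch; real cache providers ship their own, more capable eviction and expiration settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded cache that evicts the least recently used entry.
// Not thread-safe; wrap it or synchronize externally for concurrent use.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true);          // true = order entries by access, not insertion
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;      // evict once the bound is exceeded
    }

    public static void main(String[] args) {
        LruCache<Long, String> cache = new LruCache<>(2);
        cache.put(1L, "dhrubo");
        cache.put(2L, "arjun");
        cache.get(1L);                   // touch 1, so 2 becomes the eldest entry
        cache.put(3L, "Paul");           // exceeds the bound, evicts 2
        System.out.println(cache.keySet()); // prints [1, 3]
    }
}
```

Whether least-recently-used is the right rule depends on the access pattern, which is why observing real usage first is the safer route.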
