Avoiding Multiple Data Fetches Using the First-Level-Cache in Spring Data JPA

July 12, 2023

Author

Marcus Held

If you do backend development on the JVM, you will almost certainly come across the Java Persistence API (JPA). Hibernate is one of its best-known implementations. In this article, we’ll show you how to improve the performance of a Spring Data JPA application by using Hibernate’s First-Level-Cache to avoid fetching the same resources multiple times.

What is the First-Level-Cache?

Spring Data JPA works on top of a JPA provider, and in a typical Spring Boot setup that provider is Hibernate, which has a built-in First-Level-Cache. Each session has its own cache, which we refer to as the First-Level-Cache. Every time you retrieve an object (or more precisely, an entity instance) from the database, it is stored in the First-Level-Cache of the current session. When you retrieve the same entity again by its ID within that session, it is returned directly from the cache and not fetched from the database. This avoids redundant database queries and guarantees that an entity is represented by a single object instance within a session.
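To make the idea more tangible, here is a minimal plain-Java sketch of a per-session identity map. It is a simplified model and not Hibernate's actual implementation: ArticleSketch, FakeDatabase and SessionSketch are hypothetical names invented for this illustration, and the "database" is just a counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a JPA entity.
final class ArticleSketch {
    final long id;
    String title;

    ArticleSketch(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Hypothetical stand-in for the database; counts how often a row is read.
final class FakeDatabase {
    int selectCount = 0;

    ArticleSketch selectById(long id) {
        selectCount++;
        // Every read builds a new object, like mapping a fresh result row.
        return new ArticleSketch(id, "title-" + id);
    }
}

// Simplified model of a session that keeps one instance per ID.
final class SessionSketch {
    private final FakeDatabase db;
    private final Map<Long, ArticleSketch> firstLevelCache = new HashMap<>();

    SessionSketch(FakeDatabase db) {
        this.db = db;
    }

    ArticleSketch find(long id) {
        // Only reads from the database if this ID was not loaded in this session yet.
        return firstLevelCache.computeIfAbsent(id, db::selectById);
    }
}

final class FirstLevelCacheDemo {
    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        SessionSketch session = new SessionSketch(db);

        ArticleSketch first = session.find(1L);
        ArticleSketch second = session.find(1L);

        System.out.println(first == second); // prints true
        System.out.println(db.selectCount);  // prints 1
    }
}
```

Because both lookups return the same instance, a change made through one reference is visible through the other, which mirrors what the cache gives you inside one session.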

How @Transactional Makes a Difference

The First-Level-Cache always belongs to the current Hibernate session, which Spring by default binds to the running transaction. That is why the @Transactional annotation is crucial. Calling a method annotated with @Transactional starts a new transaction if none is active yet (with the default propagation, a call made inside a running transaction joins that transaction), and each transaction has its own First-Level-Cache.

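import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service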
public class MyService {
    private final MyRepository repository;

    public MyService(MyRepository repository) {
        this.repository = repository;
    }

    @Transactional
    public void demonstrateFirstLevelCache(Long id) {
        MyEntity entity1 = repository.findById(id).orElseThrow();
        // Modifications on entity1...
        // ...

        // Calling findById again for the same ID
        MyEntity entity2 = repository.findById(id).orElseThrow();
        // Since the object is already in the First-Level-Cache, it is not fetched from the database.
        // So, entity1 and entity2 are actually the same object (== would be true)

        // Modifications on entity2...
        // ...
    }
}

In this code example, the method demonstrateFirstLevelCache runs in its own transaction and hence has its own First-Level-Cache. Although findById is called twice with the same ID, the entity is fetched only once from the database. The second call returns the entity from the cache.

How to Prevent Repeated Fetching of the Same Resource

Sometimes, without realizing it, we may fetch the same resource multiple times in a use case. This can happen when we have multiple transactions in a single code path. Each transaction has its own First-Level-Cache, and therefore, the same entity is fetched from the database for each transaction. Let’s look at the following example:

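import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service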
public class MyService {
    private final MyRepository repository;

    public MyService(MyRepository repository) {
        this.repository = repository;
    }

    @Transactional
    public MyEntity retrieveEntity(Long id) {
        return repository.findById(id).orElseThrow();
    }

    public void processEntityTwice(Long id) {
        MyEntity entity1 = retrieveEntity(id);
        // Modifications on entity1...
        // ...
        
        // A second transaction is started
        MyEntity entity2 = retrieveEntity(id);
        // Modifications on entity2...
        // ...
    }
}

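In this example, retrieveEntity is called twice, and because processEntityTwice itself is not transactional, each lookup runs in its own transaction. Therefore, the same entity is fetched from the database once per transaction (assuming no session is kept open across the whole request). One detail to be aware of: since both calls are made from inside the same class, they bypass Spring's transaction proxy, so the @Transactional on retrieveEntity is not applied here and the separate transactions come from the repository's own findById. The effect is the same when retrieveEntity is called twice from another bean.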
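A detail worth checking in code like this is self-invocation: Spring applies @Transactional through a proxy, and calls that a bean makes to its own methods do not pass through that proxy. The following sketch illustrates the mechanism with a plain JDK dynamic proxy instead of Spring itself. EntityServiceSketch, EntityServiceSketchImpl and SelfInvocationDemo are hypothetical names, and the counter merely marks the place where a transaction interceptor would run.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service interface; JDK dynamic proxies require an interface.
interface EntityServiceSketch {
    String retrieveEntity(long id);

    void processEntityTwice(long id);
}

final class EntityServiceSketchImpl implements EntityServiceSketch {
    @Override
    public String retrieveEntity(long id) {
        return "entity-" + id;
    }

    @Override
    public void processEntityTwice(long id) {
        // Both calls go through "this", not through the proxy,
        // so the interceptor below never sees them.
        retrieveEntity(id);
        retrieveEntity(id);
    }
}

final class SelfInvocationDemo {
    // Wraps the target in a proxy that counts intercepted retrieveEntity calls.
    // The counter stands in for "a transaction interceptor ran here".
    static EntityServiceSketch proxyFor(EntityServiceSketch target, AtomicInteger intercepted) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("retrieveEntity")) {
                intercepted.incrementAndGet();
            }
            return method.invoke(target, args);
        };
        return (EntityServiceSketch) Proxy.newProxyInstance(
                EntityServiceSketch.class.getClassLoader(),
                new Class<?>[] {EntityServiceSketch.class},
                handler);
    }

    public static void main(String[] args) {
        AtomicInteger intercepted = new AtomicInteger();
        EntityServiceSketch service = proxyFor(new EntityServiceSketchImpl(), intercepted);

        service.processEntityTwice(1L);        // internal calls bypass the proxy
        System.out.println(intercepted.get()); // prints 0

        service.retrieveEntity(1L);            // external call goes through the proxy
        System.out.println(intercepted.get()); // prints 1
    }
}
```

In a real application this usually means placing @Transactional on the method that is called from outside the bean, or moving the inner method to a separate bean.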

A more efficient way is to fetch all the necessary data in a single transaction. Here is an improved example:

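import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service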
public class MyService {
    private final MyRepository repository;

    public MyService(MyRepository repository) {
        this.repository = repository;
    }
    
    @Transactional
    public void processEntityOnce(Long id) {
        MyEntity entity1 = repository.findById(id).orElseThrow();
        // Modifications on entity1...
        // ...
        
        // Calling findById again within the same transaction
        MyEntity entity2 = repository.findById(id).orElseThrow();
        // Since the object is already in the First-Level-Cache, it is not fetched from the database.
        // So, entity1 and entity2 are actually the same object (== would be true)
        
        // Modifications on entity2...
        // ...
    }
}

In this code example, the processEntityOnce method fetches the same entity only once from the database and stores it in the First-Level-Cache. On a repeated fetch within the same transaction, Hibernate fetches the object directly from the First-Level-Cache and not from the database. This reduces the database load and improves the application performance.
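The difference between separate transactions and one surrounding transaction can also be modeled without Spring. The sketch below is a simplified, single-threaded model under stated assumptions: TransactionScopeSketch and its methods are hypothetical, the cache lives exactly as long as the transaction, and a nested call joins the surrounding transaction instead of opening a new one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model: one identity map per transaction, joined by nested calls.
// Single-threaded only; a real framework binds this state to the current thread.
final class TransactionScopeSketch {
    private int selectCount = 0;            // how often the fake database was read
    private Map<Long, Object> currentCache; // null while no transaction is active

    int selectCount() {
        return selectCount;
    }

    // Runs the work in the active transaction if there is one. Otherwise opens
    // a new transaction with a fresh cache and discards that cache afterwards.
    <T> T inTransaction(Supplier<T> work) {
        if (currentCache != null) {
            return work.get();              // join the surrounding transaction
        }
        currentCache = new HashMap<>();
        try {
            return work.get();
        } finally {
            currentCache = null;            // the cache ends with the transaction
        }
    }

    // Repository-style lookup that opens its own transaction when none is active.
    Object findById(long id) {
        return inTransaction(() -> currentCache.computeIfAbsent(id, key -> {
            selectCount++;                  // simulated SELECT
            return new Object();
        }));
    }
}

final class TransactionScopeDemo {
    public static void main(String[] args) {
        // Two lookups without a surrounding transaction: two reads.
        TransactionScopeSketch separate = new TransactionScopeSketch();
        Object first = separate.findById(1L);
        Object second = separate.findById(1L);
        System.out.println(separate.selectCount()); // prints 2
        System.out.println(first == second);        // prints false

        // The same two lookups inside one transaction: one read, one instance.
        TransactionScopeSketch shared = new TransactionScopeSketch();
        boolean same = shared.inTransaction(() -> shared.findById(1L) == shared.findById(1L));
        System.out.println(shared.selectCount());   // prints 1
        System.out.println(same);                   // prints true
    }
}
```

The second half of the demo corresponds to processEntityOnce: the surrounding transaction keeps the cache alive across both lookups.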

Conclusion

In backend development with Spring Data JPA and Hibernate, it’s important to understand how the First-Level-Cache works and how to best utilize it to optimize the efficiency and performance of the application. With the correct use of @Transactional and a conscious approach to data fetches, you can achieve a significant improvement in application performance.

Don’t be afraid of “large” transactions. I’ve worked on very large projects where we opened the transaction as early as the controller. This is a very simple way to ensure consistency and avoid repeated fetching of entities.

Remember, there are still many aspects of Hibernate and Spring Data JPA to discover, and each project can pose unique challenges. But with the solid understanding of the caching mechanism and transaction management that you now have, you’re well equipped to master them!
