Distributed Task Synchronization: Leveraging ShedLock in Spring
In today's distributed computing landscape, coordinating tasks across multiple nodes while ensuring they execute without conflicts or duplication presents significant challenges. Whether managing periodic jobs, batch processes, or critical system tasks, maintaining synchronization and consistency is crucial for seamless operations.
The Problem
Let's say we need to run some tasks on a schedule, such as a database cleanup or a data generation job. The direct approach is the @Scheduled annotation included in Spring Framework, which lets you run code at fixed intervals or on a cron schedule. But what if more than one instance of our service is running? In that case, the task will be executed on every instance.
ShedLock
ShedLock makes sure that your scheduled tasks are executed at most once at the same time. The library implements a lock via an external store. When a task starts on one instance, the lock is set; the other instances do not wait for it and simply skip that execution. This gives "at most once" execution per scheduled run. The external store can be a relational database (PostgreSQL, MySQL, Oracle, etc.) accessed via JDBC, a NoSQL store (Mongo, Redis, DynamoDB), or one of many others (the full list can be found on the project page).
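The skip-instead-of-wait behavior can be illustrated with a minimal in-memory sketch. The class InMemoryLockStore and its tryLock method are hypothetical stand-ins for the external store, not ShedLock's API; a real provider is expected to perform an equivalent check against the shared store.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the external lock store; not a ShedLock class.
class InMemoryLockStore {
    // Lock name -> instant until which the lock is held.
    private final Map<String, Instant> heldUntilByName = new ConcurrentHashMap<>();

    // Tries to take the lock at time `now`. Returns false immediately (no waiting)
    // when another caller still holds it.
    boolean tryLock(String name, Instant now, Duration holdFor) {
        boolean[] acquired = {false};
        // compute() runs atomically per key, so two callers cannot both win.
        heldUntilByName.compute(name, (key, heldUntil) -> {
            if (heldUntil == null || !heldUntil.isAfter(now)) {
                acquired[0] = true;
                return now.plus(holdFor);
            }
            return heldUntil; // still locked: leave the record untouched
        });
        return acquired[0];
    }

    public static void main(String[] args) {
        InMemoryLockStore store = new InMemoryLockStore();
        Instant start = Instant.parse("2024-02-18T08:08:00Z");
        Duration holdFor = Duration.ofSeconds(50);
        System.out.println("instance A runs: " + store.tryLock("exampleTask", start, holdFor));
        System.out.println("instance B runs: " + store.tryLock("exampleTask", start.plusSeconds(1), holdFor));
        System.out.println("after expiry:    " + store.tryLock("exampleTask", start.plusSeconds(51), holdFor));
    }
}
```

In this sketch the first caller gets the lock, the second one is refused while the lock is still held and would skip its run, and a caller arriving after the hold time has passed can take the lock again.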
Let's consider an example of working with PostgreSQL. First, let's start the database using Docker:
docker run -d -p 5432:5432 --name db \
-e POSTGRES_USER=admin \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_DB=demo \
postgres:alpine
Next, create the lock table. The project page provides the SQL script for PostgreSQL:
CREATE TABLE shedlock(
    name VARCHAR(64) NOT NULL,
    lock_until TIMESTAMP NOT NULL,
    locked_at TIMESTAMP NOT NULL,
    locked_by VARCHAR(255) NOT NULL,
    PRIMARY KEY (name)
);
Here:
name - Unique identifier for the lock, typically representing the task or resource being locked
lock_until - Timestamp indicating until when the lock is held
locked_at - Timestamp indicating when the lock was acquired
locked_by - Identifier of the entity (e.g., application instance) that acquired the lock
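To make the meaning of the columns concrete, one row of the table can be modeled as a small value class. This is only a sketch: ShedlockRow and its methods are hypothetical names invented for illustration, not types from ShedLock, and the sample values in main are made up.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical value class mirroring one row of the shedlock table; not a ShedLock type.
class ShedlockRow {
    final String name;             // name: which task or resource is locked
    final LocalDateTime lockUntil; // lock_until: the lock counts as held before this instant
    final LocalDateTime lockedAt;  // locked_at: when the lock was acquired
    final String lockedBy;         // locked_by: who acquired it (e.g. a host name)

    ShedlockRow(String name, LocalDateTime lockUntil, LocalDateTime lockedAt, String lockedBy) {
        this.name = name;
        this.lockUntil = lockUntil;
        this.lockedAt = lockedAt;
        this.lockedBy = lockedBy;
    }

    // The lock is held as long as lock_until lies in the future.
    boolean isHeldAt(LocalDateTime now) {
        return lockUntil.isAfter(now);
    }

    // Time span between acquisition and the currently recorded expiry.
    Duration holdTime() {
        return Duration.between(lockedAt, lockUntil);
    }

    public static void main(String[] args) {
        // Illustrative values only.
        ShedlockRow row = new ShedlockRow(
                "exampleTask",
                LocalDateTime.parse("2024-02-18T08:08:50"),
                LocalDateTime.parse("2024-02-18T08:08:00"),
                "instance-1");
        System.out.println("held at 08:08:30? " + row.isHeldAt(LocalDateTime.parse("2024-02-18T08:08:30")));
        System.out.println("hold time in seconds: " + row.holdTime().getSeconds());
    }
}
```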
Next, create a Spring Boot project and add the necessary dependencies to build.gradle:
implementation 'net.javacrumbs.shedlock:shedlock-spring:5.10.2'
implementation 'net.javacrumbs.shedlock:shedlock-provider-jdbc-template:5.10.2'
Now describe the configuration:
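import javax.sql.DataSource;
import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.jdbctemplate.JdbcTemplateLockProvider;
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.EnableScheduling;

@Configuration
@EnableScheduling
// Default upper bound for locks that do not set lockAtMostFor themselves
@EnableSchedulerLock(defaultLockAtMostFor = "10m")
public class ShedLockConfig {
    @Bean
    public LockProvider lockProvider(DataSource dataSource) {
        return new JdbcTemplateLockProvider(
            JdbcTemplateLockProvider.Configuration.builder()
                .withJdbcTemplate(new JdbcTemplate(dataSource))
                // Use the database server's clock instead of each instance's clock
                .usingDbTime()
                .build()
        );
    }
}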
Let's create an ExampleTask that will start once a minute and perform some time-consuming action. For this purpose, we will use the @Scheduled annotation:
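import net.javacrumbs.shedlock.spring.annotation.SchedulerLock;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class ExampleTask {
    // Fires at second 0 of every minute
    @Scheduled(cron = "0 * * ? * *")
    @SchedulerLock(name = "exampleTask", lockAtMostFor = "50s", lockAtLeastFor = "20s")
    public void scheduledTask() throws InterruptedException {
        System.out.println("task scheduled!");
        // Simulate 15 seconds of work
        Thread.sleep(15000);
        System.out.println("task executed!");
    }
}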
Here, we use Thread.sleep for 15 seconds to simulate the execution time of the task. Once the application has started and the task begins executing, a record is inserted into the database:
docker exec -ti <CONTAINER ID> bash
psql -U admin demo
psql (12.16)
Type "help" for help.
demo=# SELECT * FROM shedlock;
name | lock_until | locked_at | locked_by
-------------+----------------------------+----------------------------+---------------
exampleTask | 2024-02-18 08:08:50.055274 | 2024-02-18 08:08:00.055274 | MacBook.local
If, at the same time, another application tries to run the task, it will not be able to get the lock and will skip the task execution:
2024-02-18 08:08:50.057 DEBUG 45988 --- [ scheduling-1] n.j.s.core.DefaultLockingTaskExecutor : Not executing 'exampleTask'. It's locked.
When the first application acquires the lock, a record is created in the database with lock_until set to the current time plus lockAtMostFor from the lock settings. This upper bound ensures that the lock is not held forever if the application crashes or terminates for some reason (for example, when a pod is evicted from one node to another in Kubernetes). After the task completes successfully, the application updates the database entry and shortens the lock to the current time. If the task finishes very quickly, however, the lock is still held until at least lockAtLeastFor has elapsed since it was acquired. This lower bound guards against clock differences between instances: without it, an instance whose clock runs slightly behind could start a very short task again after the first instance has already finished and released the lock. Together, and as long as lockAtMostFor is longer than the task's normal execution time, the two settings ensure that the task is not executed by more than one instance at the same time.
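The timing rules above amount to simple arithmetic on lock_until. The following sketch illustrates that arithmetic under the assumptions stated in this section; LockTiming and its methods are hypothetical helpers written for this example, not ShedLock's implementation.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper illustrating the lock timing rules; not part of ShedLock.
class LockTiming {
    // Value written to lock_until when the lock is acquired.
    static Instant lockUntilOnAcquire(Instant lockedAt, Duration lockAtMostFor) {
        return lockedAt.plus(lockAtMostFor);
    }

    // Value written to lock_until when the task finishes: the finish time,
    // but never earlier than lockedAt + lockAtLeastFor.
    static Instant lockUntilOnRelease(Instant lockedAt, Instant finishedAt, Duration lockAtLeastFor) {
        Instant earliestRelease = lockedAt.plus(lockAtLeastFor);
        return finishedAt.isAfter(earliestRelease) ? finishedAt : earliestRelease;
    }

    public static void main(String[] args) {
        Instant lockedAt = Instant.parse("2024-02-18T08:08:00Z");
        Duration lockAtMostFor = Duration.ofSeconds(50);
        Duration lockAtLeastFor = Duration.ofSeconds(20);
        Instant finishedAt = lockedAt.plusSeconds(15); // the task took 15 seconds

        System.out.println("on acquire: " + lockUntilOnAcquire(lockedAt, lockAtMostFor));
        System.out.println("on release: " + lockUntilOnRelease(lockedAt, finishedAt, lockAtLeastFor));
    }
}
```

With the settings from ExampleTask (lockAtMostFor = "50s", lockAtLeastFor = "20s") and a task that takes 15 seconds, a lock acquired at 08:08:00 initially expires at 08:08:50 and is shortened to 08:08:20 when the task finishes, because the finish time of 08:08:15 is earlier than the 20-second minimum.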
Conclusion
ShedLock is a useful tool for coordinating scheduled tasks in Spring applications that run as multiple instances. It ensures that a task is executed by at most one instance at a time. It is easy to set up and gives Spring applications reliable task-handling capabilities, making it a valuable tool for anyone dealing with distributed systems.
The project code is available on GitHub.