Paginating JOINs via jOOQ and DENSE_RANK()

2024-12-02

Working With DENSE_RANK()

DENSE_RANK() is a window function that assigns a rank to each row within a partition or result set with no gaps in ranking values. A simple example is shown here:

Let's assume that we want to rank employees (EMPLOYEE) in offices (OFFICE) by their salary (EMPLOYEE.SALARY). Expressing this via jOOQ and DENSE_RANK() can be done as follows:

     Java 
   
 
 
   ctx.select(EMPLOYEE.FIRST_NAME, EMPLOYEE.LAST_NAME, EMPLOYEE.SALARY,
           OFFICE.CITY, OFFICE.COUNTRY,
           OFFICE.OFFICE_CODE, 
           denseRank().over().partitionBy(OFFICE.OFFICE_CODE)
              .orderBy(EMPLOYEE.SALARY.desc()).as("salary_rank"))
   .from(EMPLOYEE)
   .innerJoin(OFFICE)
   .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE))
   .fetch(); 
  

An output fragment looks like this (notice that the employees having the same salary get the same rank):

Paginating JOINs via jOOQ and DENSE_RANK()

Next, let's use DENSE_RANK() for selecting the highest salary from each office, including duplicates. This time, let's use the QUALIFY clause as well. The code is illustrated in the following snippet:

     Java 
   
 
 
   select(EMPLOYEE.FIRST_NAME, EMPLOYEE.LAST_NAME,
       EMPLOYEE.SALARY, OFFICE.CITY, OFFICE.COUNTRY,
       OFFICE.OFFICE_CODE)
  .from(EMPLOYEE)
  .innerJoin(OFFICE)
  .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE))
  .qualify(denseRank().over().partitionBy(OFFICE.OFFICE_CODE)
           .orderBy(EMPLOYEE.SALARY.desc()).eq(1))
  .fetch(); 
  

Before going further, here is a nice read called "The Difference Between ROW_NUMBER(), RANK(), and DENSE_RANK()." You can check out these examples in the DenseRank bundled code.

Paginating JOINs via DENSE_RANK()

Let's assume that we want to paginate offices (OFFICE) with employees (EMPLOYEE). If we apply a classical offset or keyset pagination to the JOIN between OFFICE and EMPLOYEE, then the result is prone to be truncated. Therefore, an office can be fetched with only a subset of its employees. For instance, while we think of a result page of size 3 as containing three offices with all their employees, we instead get a single office with three employees (even if this office has more employees). The following figure reveals what we expect versus what we get from a page of size 3 (offices):

Paginating JOINs via jOOQ and DENSE_RANK()

In order to obtain a result set like the one on the left-hand side of the preceding figure, we can rely on the DENSE_RANK() window function, which assigns a sequential number to different values of a within each group b, as shown in the following jOOQ query:

     Java 
   
 
 
   Map<Office, List<Employee>> result = ctx.select().from(
  select(OFFICE.OFFICE_CODE, OFFICE...,
         EMPLOYEE.FIRST_NAME, EMPLOYEE...,
         denseRank().over().orderBy(OFFICE.OFFICE_CODE, OFFICE.CITY).as("rank"))
  .from(OFFICE)
  .join(EMPLOYEE)
  .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE)).asTable("t"))
  .where(field(name("t", "rank")).between(start, end))
  .fetchGroups(Office.class, Employee.class); 
  

The start and end variables represent the range of offices set via DENSE_RANK(). The following figure should clarify this aspect where start = 1 and end = 3 (the next page of the three offices is between start = 4 and end = 6):

Paginating JOINs via jOOQ and DENSE_RANK()

Here is a more compact version of the previous query, using the QUALIFY clause:

     Java 
   
 
 
   Map<Office, List<Employee>> result = 
  ctx.select(OFFICE.OFFICE_CODE, OFFICE...,
             EMPLOYEE.FIRST_NAME, EMPLOYEE...)
  .from(OFFICE)
  .join(EMPLOYEE)
  .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE))
  .qualify(denseRank().over().orderBy(OFFICE.OFFICE_CODE, OFFICE.CITY)
           .between(start, end))
  .fetchGroups(Office.class, Employee.class); 
  

You can check out the complete example named DenseRankPagination for MySQL. The returned Map<Office, List<Employee>> is serialized to JSON via a classical Spring Boot controller.

There are chances that you are not familiar with the QUALIFY clause, so here is a brief overview.

The QUALIFY Clause

Some databases (for instance, Snowflake) support a clause named QUALIFY. Via this clause, we can filter (apply a predicate) the results of window functions. Mainly, a SELECT … QUALIFY clause is evaluated after window functions are computed, so after Window Functions (step 6) and before DISTINCT (Step 8):

Paginating JOINs via jOOQ and DENSE_RANK()

The syntax of QUALIFY is QUALIFY <predicate>, and in the following screenshot, you can see how it makes the difference (this query returns every 10th product from the PRODUCT table via the ROW_NUMBER() window function):

Paginating JOINs via jOOQ and DENSE_RANK()

By using the QUALIFY clause, we eliminate the subquery and the code is less verbose. Even if this clause has poor native support among database vendors, jOOQ emulates it for all the supported dialects. Cool, right?!

Explore further in my book, jOOQ Masterclass.