Paginating JOINs via jOOQ and DENSE_RANK()

Working With DENSE_RANK()

DENSE_RANK() is a window function that assigns a rank to each row within a partition or result set with no gaps in ranking values. A simple example is shown here:

DENSE_RANK() example

Let's assume that we want to rank employees (EMPLOYEE) in offices (OFFICE) by their salary (EMPLOYEE.SALARY). Expressing this via jOOQ and DENSE_RANK() can be done as follows:

Java
 
ctx.select(EMPLOYEE.FIRST_NAME, EMPLOYEE.LAST_NAME, EMPLOYEE.SALARY,
           OFFICE.CITY, OFFICE.COUNTRY,
           OFFICE.OFFICE_CODE, 
           denseRank().over().partitionBy(OFFICE.OFFICE_CODE)
              .orderBy(EMPLOYEE.SALARY.desc()).as("salary_rank"))
   .from(EMPLOYEE)
   .innerJoin(OFFICE)
   .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE))
   .fetch();


An output fragment looks like this (notice that the employees having the same salary get the same rank):

Output: Salary rank

Next, let's use DENSE_RANK() for selecting the highest salary from each office, including duplicates. This time, let's use the QUALIFY clause as well. The code is illustrated in the following snippet:

Java
 
select(EMPLOYEE.FIRST_NAME, EMPLOYEE.LAST_NAME,
       EMPLOYEE.SALARY, OFFICE.CITY, OFFICE.COUNTRY,
       OFFICE.OFFICE_CODE)
  .from(EMPLOYEE)
  .innerJoin(OFFICE)
  .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE))
  .qualify(denseRank().over().partitionBy(OFFICE.OFFICE_CODE)
           .orderBy(EMPLOYEE.SALARY.desc()).eq(1))
  .fetch();


Before going further, here is a nice read called "The Difference Between ROW_NUMBER(), RANK(), and DENSE_RANK()." You can check out these examples in the DenseRank bundled code.

Paginating JOINs via DENSE_RANK()

Let's assume that we want to paginate offices (OFFICE) with employees (EMPLOYEE). If we apply a classical offset or keyset pagination to the JOIN between OFFICE and EMPLOYEE, then the result is prone to be truncated. Therefore, an office can be fetched with only a subset of its employees. For instance, while we think of a result page of size 3 as containing three offices with all their employees, we instead get a single office with three employees (even if this office has more employees). The following figure reveals what we expect versus what we get from a page of size 3 (offices):

What we expect versus what we get from a page of size 3 (offices)

In order to obtain a result set like the one on the left-hand side of the preceding figure, we can rely on the DENSE_RANK() window function, which assigns a sequential number to different values of a within each group b, as shown in the following jOOQ query:

Java
 
Map<Office, List<Employee>> result = ctx.select().from(
  select(OFFICE.OFFICE_CODE, OFFICE...,
         EMPLOYEE.FIRST_NAME, EMPLOYEE...,
         denseRank().over().orderBy(OFFICE.OFFICE_CODE, OFFICE.CITY).as("rank"))
  .from(OFFICE)
  .join(EMPLOYEE)
  .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE)).asTable("t"))
  .where(field(name("t", "rank")).between(start, end))
  .fetchGroups(Office.class, Employee.class);


The start and end variables represent the range of offices set via DENSE_RANK(). The following figure should clarify this aspect where start = 1 and end = 3 (the next page of the three offices is between start = 4 and end = 6):

Offices where start = 1 and end = 3

Here is a more compact version of the previous query, using the QUALIFY clause:

Java
 
Map<Office, List<Employee>> result = 
  ctx.select(OFFICE.OFFICE_CODE, OFFICE...,
             EMPLOYEE.FIRST_NAME, EMPLOYEE...)
  .from(OFFICE)
  .join(EMPLOYEE)
  .on(OFFICE.OFFICE_CODE.eq(EMPLOYEE.OFFICE_CODE))
  .qualify(denseRank().over().orderBy(OFFICE.OFFICE_CODE, OFFICE.CITY)
           .between(start, end))
  .fetchGroups(Office.class, Employee.class);


You can check out the complete example named DenseRankPagination for MySQL. The returned Map<Office, List<Employee>> is serialized to JSON via a classical Spring Boot controller.

There are chances that you are not familiar with the QUALIFY clause, so here is a brief overview.

The QUALIFY Clause

Some databases (for instance, Snowflake) support a clause named QUALIFY. Via this clause, we can filter (apply a predicate) the results of window functions. Mainly, a SELECT … QUALIFY clause is evaluated after window functions are computed, so after Window Functions (step 6)  and before DISTINCT (Step 8):

Flow From Join to Limit Fetch Top

The syntax of QUALIFY is QUALIFY <predicate>, and in the following screenshot, you can see how it makes the difference (this query returns every 10th product from the PRODUCT table via the ROW_NUMBER() window function):

Query returning every 10th product from the PRODUCT table via the ROW_NUMBER() window function

By using the QUALIFY clause, we eliminate the subquery and the code is less verbose. Even if this clause has poor native support among database vendors, jOOQ emulates it for all the supported dialects. Cool, right?!

Explore further in my book, jOOQ Masterclass.

 

 

 

 

Top