Oracle Partition Pruning
Oracle database optimizer analyzes your SQL statement, if there is a partitioned table and "appropriate" where
condition, it decides which partitions will be ignored and which partitions will be accessed and read. This is called partition pruning. It is an essential performance feature for data warehouses.
As you can guess, when you prune some of the partitions, the data you need to read from disks is reduced considerably, which leads to shorter processing times.
Types of Partition Pruning
There are two types of partition pruning:
- Static partition pruning: if the
where
condition has constant values. (Compile time) - Dynamic partition pruning: if the
where
condition has some calculations, functions, etc. (Run time)
Applicable Conditions
Type of Partition
First, the type of partition determines if partition pruning can be used or not.
Below you can find which predicates can be used in which partition type.
Range or List partitions:
- LIKE
- equality (=)
- range (between .. and ..)
- in-list (in)
Hash partitions:
- equality (=)
- in-list (in)
Functions Used or Not
Second, functions used on partition columns cancel the partition pruning.
Any functions including type conversions (hidden functions) cancel the partition pruning. For example:
where trunc(my_column) between sysdate - 30 and sysdate
Demonstration
Let's see how this works behind the scenes;
First, we create a range partitioned table called MY_TABLE
. This table has 29 partitions initially. The partition key is called PERIOD
and its data type is NUMBER
.
create table ODS.MY_TABLE
(PERIOD NUMBER,
STATUS VARCHAR2(4000),
PORTFOLIO_DEALER_NAME VARCHAR2(4000),
PORTFOLIO_DEALER_CODE VARCHAR2(4000),
SALES_REGION VARCHAR2(4000),
SALES_SUB_REGION VARCHAR2(4000),
SALES_CHANNEL_TYPE VARCHAR2(4000)
)
PARTITION BY RANGE (PERIOD)
(PARTITION p201801 VALUES LESS THAN (201802),
PARTITION p201802 VALUES LESS THAN (201803),
PARTITION p201803 VALUES LESS THAN (201804),
PARTITION p201804 VALUES LESS THAN (201805),
PARTITION p201805 VALUES LESS THAN (201806),
PARTITION p201806 VALUES LESS THAN (201807),
PARTITION p201807 VALUES LESS THAN (201808),
PARTITION p201808 VALUES LESS THAN (201809),
PARTITION p201809 VALUES LESS THAN (201810),
PARTITION p201810 VALUES LESS THAN (201811),
PARTITION p201811 VALUES LESS THAN (201812),
PARTITION p201812 VALUES LESS THAN (201901),
PARTITION p201901 VALUES LESS THAN (201902),
PARTITION p201902 VALUES LESS THAN (201903),
PARTITION p201903 VALUES LESS THAN (201904),
PARTITION p201904 VALUES LESS THAN (201905),
PARTITION p201905 VALUES LESS THAN (201906),
PARTITION p201906 VALUES LESS THAN (201907),
PARTITION p201907 VALUES LESS THAN (201908),
PARTITION p201908 VALUES LESS THAN (201909),
PARTITION p201909 VALUES LESS THAN (201910),
PARTITION p201910 VALUES LESS THAN (201911),
PARTITION p201911 VALUES LESS THAN (201912),
PARTITION p201912 VALUES LESS THAN (202001),
PARTITION p202001 VALUES LESS THAN (202002),
PARTITION p202002 VALUES LESS THAN (202003),
PARTITION p202003 VALUES LESS THAN (202004),
PARTITION p202004 VALUES LESS THAN (202005),
PARTITION p202005 VALUES LESS THAN (202006))
nologging
parallel 16
compress for query high;
Now, we run a simple select statement and examine the execution plan.
EXPLAIN PLAN FOR
select /*+ parallel */ *
from ODS.MY_TABLE;
SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12025 | 2 (0)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 12025 | 2 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 1 | 12025 | 2 (0)| 00:00:01 | 1 | 29 | Q1,00 | PCWC | |
| 4 | TABLE ACCESS STORAGE FULL| MY_TABLE | 1 | 12025 | 2 (0)| 00:00:01 | 1 | 29 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------
The part we need to focus on this execution plan is Pstart
and Pstop
columns. As you can see, when we don't apply any where
condition (in other words filtering the data), it reads all 29 partitions.
What if we apply a where
condition and filter the data using the partition key. Let's check how the execution plan changes.
EXPLAIN PLAN FOR
select /*+ parallel */ *
from ODS.MY_TABLE
where period between 201901 and 201905;
SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12025 | 2 (0)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 12025 | 2 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 1 | 12025 | 2 (0)| 00:00:01 | 13 | 17 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS STORAGE FULL| MY_TABLE | 1 | 12025 | 2 (0)| 00:00:01 | 13 | 17 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------
We check the Pstart
and Pstop
columns and clearly see that now it only reads the partitions between the 13th and 17th partitions. In other words, it prunes other partitions and prevents reading extra partitions unnecessarily.
Now, we have a basic idea of what partition pruning is. So, let's create a scenario of how a function cancels pruning. We'll apply a simple to_char
and to_number
function and check what will happen in the execution plan.
EXPLAIN PLAN FOR
select /*+ parallel */ *
from ODS.MY_TABLE
where to_number(to_char(period)) between 201901 and 201905;
SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12025 | 2 (0)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 12025 | 2 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 1 | 12025 | 2 (0)| 00:00:01 | 1 | 29 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS STORAGE FULL| MY_TABLE | 1 | 12025 | 2 (0)| 00:00:01 | 1 | 29 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------
As we can see, function usage cancels the partition pruning and it reads all the partitions again. You should keep in mind that if there is a hidden type casting in the where
condition, it still cancels partition pruning.
Conclusion
To sum up, partition pruning is a great way to increase your query performance in data warehouses. We passed over partition pruning in a simple way and I hope this creates a basic understanding of how pruning works.