Translating descriptive dates, like last-week, with in line SQL code to achieve Partition elimination.
One frequently used feature of BI tools is to present drop-down lists of values. The purpose is to provide limiting parameters used in the creation of a report or selection. Within the database, these parameters are used to limit the data selected and (hopefully) use partition elimination on the larger fact tables.
Most commonly the Fact tables are partitioned by transaction date. The reporting and selection that is run on these tables, either specifies a single date, or a date range.
When actual dates are provided, all types of selection are able to use partition elimination.
However when the parameter first needs a lookup to be translated to a date, like “last week”, then partition elimination can become more problematic.
If the selection results in a single value that can be compared to the partition column in an equal (=) test, then partition elimination works. The optimizer creates a temporary variable, where the actual value is inserted into this variable during execution. When looking in the explain plan, you will see this variable being used in the retrieve or join step on the table with partitioning. This is called Delayed partition elimination.
Unfortunately, this delayed partition elimination only works for the equal test. When we need to do the commonly used BETWEEN clause, partition elimination with lookup does not work.
In some cases we can create a lookup table, where a list of dates can be found for each combination of the selection parameters. Then this list is used in a product join, achieving Dynamic partition elimination (DPE). Unfortunately this does not always work as the filtering has to be done in a (costly) product join.
Fortunately, we have another option of achieving direct partition elimination for multiple parameter lookup, by translating the lookup into a date calculation.
For instance the below descriptive terms are easily translated into a date calculation.
-
YESTERDAY = CURRENTDATE – 1
-
CURRENT_MONTH_START = CURRENT_DATE -EXTRACT(DAY FROM CURRENT_DATE) +1
-
PREVIOUS_MONTH_START = ADD_MONTHS(CURRENT_DATE -EXTRACT(DAY FROM CURRENT_DATE) +1 ,-1)
-
PREVIOUS_MONTH_END = CURRENT_DATE -EXTRACT(DAY FROM CURRENT_DATE)
-
CURRENT_YEAR_START = ADD_MONTHS(CURRENT_DATE -EXTRACT(DAY FROM CURRENT_DATE) +1 ,-EXTRACT(MONTH FROM CURRENT_DATE) +1)
-
CURRENT_YEAR_End = ADD_MONTHS(CURRENT_DATE-EXTRACT(DAY FROM CURRENT_DATE)+1 ,-EXTRACT(MONTH FROM CURRENT_DATE) +13) -1
We can take advantage of Transitive Closure (elimination of code that is never true, like 1=2), to code multiple of these translations into a single filter condition.
For example the below SQL;
SEL FACT.* FROM Daily_transactions_table FACT
WHERE ( /* filter to set the FROM for the BETWEEN (>=) */
( @FROM_DATE = 'TODAY' AND FACT.Activity_DATE >= CURRENT_DATE )
OR ( @FROM_DATE = 'YESTERDAY' AND FACT.Activity_DATE >= CURRENT_DATE -1)
)
AND ( /* filter to set the TO for the BETWEEN (<=) */
( @TO_DATE = 'TODAY' AND FACT.Activity_DATE <= CURRENT_DATE )
OR ( @TO_DATE = 'TOMORROW' AND FACT.Activity_DATE <= CURRENT_DATE +1)
) ;
In this example, the FROM and TO parameters are linked with date calculations. When we run this query and substitute YESTERDAY and TOMORROW for the FROM and TO parameters, what remains for the system to execute is;
SEL FACT.* FROM Daily_transactions_table FACT
WHERE (( 'YESTERDAY' = 'YESTERDAY' AND FACT.Activity_DATE >= CURRENT_DATE -1))
AND (( 'TOMORROW' = 'TOMORROW' AND FACT.Activity_DATE <= CURRENT_DATE +1))
The @FROM_DATE = ‘TODAY’ filter is eliminated by Transitive Closure, as ‘YESTERDAY’ = ‘TODAY’ is not possible. As is the @TO_DATE = ‘TODAY’ filter. This leaves a simple BETWEEN logic with calculated dates, resulting in partition elimination.
Of course we should add some tests to make valid dates possible, and some more options to enter in the selection lists.
Below a complete working example with date validation;
SEL FACT.txt_typ_cd, COUNT(*) Cntr
FROM ProdDB.Large_Fact_Table FACT
INNER JOIN SYS_CALENDAR.CALENDAR CAL
ON CAL.Calendar_Date = FACT.activity_date
/***************************************************************************************/
/* Possible combinations for FROM - TO parameters are;
/* TODAY - TODAY
/* TODAY - MTD
/* TODAY - QTD
/* TODAY - YTD
/* <valid date> - TODAY
/* <valid date> - <valid date>
/* PRIOR - WEEK
/* PRIOR - 2WEEKS
/* PRIOR - MONTH
/* PRIOR - 2MONTHS
/* PRIOR - 3MONTHS
/* All selections are assumed to be based on yesterday (CURRENT_DATE-1).
/*************************************************************************************/
/* replace the parameters with location of “list of values” (LOV) for use in BO;
/* @FROM_PARM: @Prompt('Report Start:','A','Date LOVs\Relative Date LOV', MONO,FREE,not_persistent,{'TODAY'})
/* @TO_PARM : @Prompt('Report END:','A','Date LOVs\Month number LOV', MONO,FREE,not_persistent,{'MTD'})
/* note: do this substitution last, as it makes the code near to unreadable.
/*************************************************************************************/
AND ( ( /* for TODAY - TODAY */
@FROM_PARM = 'TODAY' AND @TO_PARM = 'TODAY' AND CAL.Calendar_Date = CURRENT_DATE-1
)
OR ( /* for TODAY - MTD */
@FROM_PARM = 'TODAY' AND @TO_PARM = 'MTD' AND CAL.Calendar_Date
BETWEEN CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1) +1
AND CURRENT_DATE-1
)
OR ( /* for TODAY - QTD */
@FROM_PARM = 'TODAY' AND @TO_PARM = 'QTD' AND CAL.Calendar_Date
BETWEEN ADD_MONTHS(CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1) +1
,-EXTRACT(MONTH FROM CURRENT_DATE-1)
+(((EXTRACT(MONTH FROM CURRENT_DATE-1)+2)/3-1)*3+1))
AND CURRENT_DATE-1
)
OR ( /* for TODAY - YTD */
@FROM_PARM = 'TODAY' AND @TO_PARM = 'YTD' AND CAL.Calendar_Date
BETWEEN ADD_MONTHS(CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1) +1
,-EXTRACT(MONTH FROM CURRENT_DATE-1) +1)
AND CURRENT_DATE-1
)
OR ( /* for TODAY - <valid date> */
@TO_PARM = 'TODAY' AND SUBSTR(@FROM_PARM,1,2) = '20' AND SUBSTR(@FROM_PARM,3,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@FROM_PARM,4,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@FROM_PARM,5,1) = '-' AND ( ( SUBSTR(@FROM_PARM,6,2) IN ('02')
AND SUBSTR(@FROM_PARM,9,2) <= '28' )
OR ( SUBSTR(@FROM_PARM,6,2) IN ('04','06','09','11')
AND SUBSTR(@FROM_PARM,9,2) <= '30' )
OR ( SUBSTR(@FROM_PARM,6,2) IN ('01','03','05','07','08','10','12')
AND SUBSTR(@FROM_PARM,9,2) <= '31' )
OR @FROM_PARM IN ('2004-02-29','2008-02-29','2012-02-29','2016-02-29','2020-02-29' ,'2024-02-29','2028-02-29','2032-02-29','2036-02-29','2040-02-29')
)
AND SUBSTR(@FROM_PARM,8,1) = '-' AND SUBSTR(@FROM_PARM,9,1) IN ('0','1','2','3')
AND SUBSTR(@FROM_PARM,10,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND CAL.Calendar_Date
BETWEEN CAST((CASE WHEN SUBSTR(@FROM_PARM,1,2) = '20' THEN @FROM_PARM ELSE '1000-01-01' END) AS DATE)
AND CURRENT_DATE-1
)
OR ( /* for <valid date> - <valid date> */
SUBSTR(@FROM_PARM,1,2) = '20' AND SUBSTR(@FROM_PARM,3,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@FROM_PARM,4,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@FROM_PARM,5,1) = '-' AND ( ( SUBSTR(@FROM_PARM,6,2) IN ('02')
AND SUBSTR(@FROM_PARM,9,2) <= '28' )
OR ( SUBSTR(@FROM_PARM,6,2) IN ('04','06','09','11')
AND SUBSTR(@FROM_PARM,9,2) <= '30' )
OR ( SUBSTR(@FROM_PARM,6,2) IN ('01','03','05','07','08','10','12')
AND SUBSTR(@FROM_PARM,9,2) <= '31' )
OR @FROM_PARM IN ('2004-02-29','2008-02-29','2012-02-29','2016-02-29','2020-02-29' ,'2024-02-29','2028-02-29','2032-02-29','2036-02-29','2040-02-29')
)
AND SUBSTR(@FROM_PARM,8,1) = '-' AND SUBSTR(@FROM_PARM,9,1) IN ('0','1','2','3')
AND SUBSTR(@FROM_PARM,10,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@TO_PARM,1,2) = '20' AND SUBSTR(@TO_PARM,3,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@TO_PARM,4,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND SUBSTR(@TO_PARM,5,1) = '-' AND ( ( SUBSTR(@TO_PARM,6,2) IN ('02')
AND SUBSTR(@TO_PARM,9,2) <= '28' )
OR ( SUBSTR(@TO_PARM,6,2) IN ('04','06','09','11')
AND SUBSTR(@TO_PARM,9,2) <= '30' )
OR ( SUBSTR(@TO_PARM,6,2) IN ('01','03','05','07','08','10','12')
AND SUBSTR(@TO_PARM,9,2) <= '31' )
OR @TO_PARM IN ('2004-02-29','2008-02-29','2012-02-29','2016-02-29','2020-02-29' ,'2024-02-29','2028-02-29','2032-02-29','2036-02-29','2040-02-29')
)
AND SUBSTR(@TO_PARM,8,1) = '-' AND SUBSTR(@TO_PARM,9,1) IN ('0','1','2','3')
AND SUBSTR(@TO_PARM,10,1) IN ('0','1','2','3','4','5','6','7','8','9')
AND CAL.Calendar_Date
BETWEEN CAST((CASE WHEN SUBSTR(@FROM_PARM,1,2) = '20' THEN @FROM_PARM ELSE '1000-01-01' END) AS DATE)
AND CAST((CASE WHEN SUBSTR(@TO_PARM,1,2) = '20' THEN @TO_PARM ELSE '1000-01-01' END) AS DATE)
)
OR ( /* for PRIOR - WEEK */
/* note: ((CURRENT_DATE - DATE'0001-01-07') MOD 7) gives 0 for Sun, 6 for Sat */
/* this calculation bases a week as Sun-Sat */
@FROM_PARM = 'PRIOR' AND @TO_PARM = 'WEEK' AND CAL.Calendar_Date
BETWEEN CURRENT_DATE-1 -((CURRENT_DATE - DATE'0001-01-07') MOD 7)-7
AND CURRENT_DATE-1 -((CURRENT_DATE - DATE'0001-01-07') MOD 7)
)
OR ( /* for PRIOR - 2WEEKS */
@FROM_PARM = 'PRIOR' AND @TO_PARM = '2WEEKS' AND CAL.Calendar_Date
BETWEEN CURRENT_DATE-1 -((CURRENT_DATE - DATE'0001-01-07') MOD 7)-14
AND CURRENT_DATE-1 -((CURRENT_DATE - DATE'0001-01-07') MOD 7)
)
OR ( /* for PRIOR - MONTH */
@FROM_PARM = 'PRIOR' AND @TO_PARM = 'MONTH' AND CAL.Calendar_Date
BETWEEN ADD_MONTHS(CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1) +1 ,-1)
AND CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1)
)
OR ( /* for PRIOR - 2MONTHS */
@FROM_PARM = 'PRIOR' AND @TO_PARM = '2MONTHS' AND CAL.Calendar_Date
BETWEEN ADD_MONTHS(CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1) +1 ,-2)
AND CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1)
)
OR ( /* for PRIOR - 3MONTHS */
@FROM_PARM = 'PRIOR' AND @TO_PARM = '3MONTHS' AND CAL.Calendar_Date
BETWEEN ADD_MONTHS(CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1) +1 ,-3)
AND CURRENT_DATE-1 -EXTRACT(DAY FROM CURRENT_DATE-1)
)
) /** end of OR list **/
GROUP BY 1;
Notes;
Starting with Teradata Database 12.0, static partition elimination also occurs for the built-in function CURRENT_DATE and DATE.
Dynamic partition elimination (DPE) for product joins may occur when there is an equality constraint (including IN and NOT IN starting with Teradata Database 13.0) between the partitioning column of a partitioning expression of one table and a column of another table, sub query or spool.
This easy solution for fitting on one page has some disadvantages:
The algorithm
First we execute for each column with statistics of the table to compress the "SHOW STATISTICS VALUES COLUMN". From this output we take the numbers of null and the values of the biased values block. From the number of occurences we decide which values come into the multi value compress list. At the moment each value has to have an estimation of more than 1% in the data. With this limit it could not happen that we have more than 100 compress values. In parallel we create a "DROP STATISTICS" and the "COLLECT STATISTICS COLUMN ... ON ... VALUES (...);" to put the statistics back. With this three files we first drop the statistics, perform the alter table statement and after that put the statistics back.
The process
The algorithm consists of a sql file and an awk script. The sql file gets the "SHOW STATISTICS VALUES COLUMN" for the columns for the tables in an useful ordering:
These commands have to be executed by bteq and stored in one file. The awk script takes this file and produces a larger file:
Executing these statements perform the compression. Finished.
The source code
SQL File
AWK File