Quantcast
Channel: Teradata Developer Exchange - All blogs
Viewing all articles
Browse latest Browse all 136

New StatsInfo query for TD14

$
0
0
AttachmentSize
stats_td14_20130830.zip212.2 KB

Beginning with TD14 statistics are no longer stored in dbc.TVFields and dbc.Indexes, they have been moved into dbc.StatsTbl to facilitate several enhancements. A new view dbc.StatsV returns much of the information previously extracted in my StatsInfo query.

But of course this is still not enough information, at least not for me ;-)

 dbc.StatsV vs. StatsInfo TD13/14

I tried to get the best of both worlds and so i wrote a new version for TD14 to extract as much additional data as possible. I had to remove a few columns (they're no longer needed or it's no longer possible to get that info) and i renamed some to match the new names in dbc.StatsV. Most of the new columns were simply not available before TD14.

Following table describes the new StatsInfo view and details the differences to the previous version and dbc.StatsV:

dbc.StatsV TD14

New StatsInfo TD14

StatsInfo TD13.10

Remarks

Description

 

(Column added/removed/changed)

 

 

DatabaseName

DatabaseName

DatabaseName

 

 

TableName

TableName

TableName

 

 

ColumnName

ColumnName

ColumnName

StatsId <> 0

List of comma-separated column names

FieldIdList

FieldIdList

FieldId

StatsId <> 0

List of semicolon-separated field ids

StatsName

StatsName

StatsName

StatsId <> 0

Alias name of the statistics (if assigned)

 

IndexName

IndexName

StatsId <> 0

Name of the index (if assigned)

 

DateIncluded

DateIncluded

StatsId <> 0

DATE or TIMESTAMP column included, Y/N

 

PartitionColumn

PartitionColumn

StatsId <> 0

Column included which is used in the table's partitioning expression: Y/N

 

PartitionLevels

PartitionLevels

 

Number of levels in the table's partitioning expression, zero means not partitioned

 

ColumnPartitioningLevel

 

 

Level number for the column partitioning level, > 0 indicates columnar table

 

PartitionsDefined

PartitionsDefined

 

The number of partitions defined

ExpressionCount

ExpressionCount

ColumnCount

StatsId <> 0

The number of columns in the statistics

StatsId

StatsId

 

 

StatsId = 0 → Summary Stats

StatsType

StatsType

 

 

Statistics collected on:
T → Table
I → Join Index
N → Hash Index
V → View 14.10?
Q → Query 14.10?
L → Link Row 14.10?

 

StatsTypeOld

StatsType

 

Statistics collected on:
Summ → Summary Statistic
UPI → Unique Primary Index
NUPI → Non-Unique Primary Index
USI → Unique Secundary Index
NUSI → Non-Unique Secondary Index
VOSI → Value Ordered NUSI
Part → Pseudo column PARTITION
Col → Single Column
MCol → Multiple columns

 

TableType

TableType

 

TempTbl → Global Temporary Table
Tbl → Table
JoinIdx → Join Index
HashIdx → Hash Index
NoPITbl → No Primary Index Table

StatsSource

StatsSource

 

 

The method this statistic is acquired:
I → Internally generated
S → User collected with COLLECT STATS (system built)
U → User collected with COLLECT STATS VALUES clause
C → Copied from other sources
T → Transferred with CREATE TABLE...AS statement

ValidStats

ValidStats

 

 

TD14.10: Indicates whether the statistics are valid or not: Y/N

DBSVersion

DBSVersion

 

 

Database version statistics collected on

SampleSizePct

SampleSizePct

SampleSize

StatsId <> 0

Sample size used for collect stats, NULL if not sampled

SampleSignature

SampleSignature

 

StatsId <> 0

Sample option encoded as a 10 character signature
USPnone → collected using NO SAMPLE
USP00nn.00 → collected using SAMPLE nn PERCENT
SDPxxxx.xx → sample size determined by system

ThresholdSignature

ThresholdSignature

 

StatsId <> 0

THRESHOLD options encoded as a 17 character signature (not used before TD14.10)
Characters 1 to 10 → THRESHOLD PERCENT
SCTxxxx.xx → System defined
UCT005.00 → User defined 5 percent
UCTnone → User defined no threshold
Characters 11 to 17 → THRESHOLD DAYS
STTxxxx → System defined
UTT0010 → User defined 10 days
UTTnone → User defined no threshold

MaxIntervals

MaxIntervals

 

StatsId <> 0

User-specified maximum number of intervals

StatsSkipCount

StatsSkipCount

 

StatsId <> 0

TD14.10 only: How many times the statistis collection has been skipped based on the THRESHOLD

MaxValueLength

MaxValueLength

 

StatsId <> 0

User-specified maximum value length

LastCollectTimestamp

LastCollectTimestamp

CollectTimestamp

 

Date and time when statistics were last collected

 

LastCollectDate

CollectDate

 

 

LastCollectTime

CollectTime

 

RowCount

RowCount

NumRows

 

The cardinality of the table, i.e. the number of rows

UniqueValueCount

UniqueValueCount

NumValues

StatsId <> 0

Distinct Values. Estimated when sampled

PNullUniqueValueCount

PNULLUniqueValueCount

 

StatsId <> 0

Number of unique values from rows with partial NULLs (multicolumn stats)
Estimated when sampled

NullCount

NULLCount

NumNULLs

StatsId <> 0

Number of partly NULL and all NULL rows, estimated when sampled

AllNullCount

AllNULLCount

NumAllNULLs

StatsId <> 0

Number of all NULL rows (multicolumn stats), estimated when sampled

HighModeFreq

HighModeFreq

ModeFreq

StatsId <> 0

Frequency of the most common value, estimated when sampled

PNullHighModeFreq

PNULLHighModeFreq

 

StatsId <> 0

Highest frequency of values having partial NULLs (for multicolumn stats), stimated when sampled

AvgAmpRPV

AvgAmpRPV

AvgAmpRPV

StatsId <> 0

Overall average of the average rows per value from each AMP, only for NUSIs, otherwise zero

 

MinValue

MinValue

StatsId <> 0

Minimum data value (only for single column numeric or datetime stats)

 

ModalValue

ModalValue

StatsId <> 0

Most common data value (only for single column numeric or datetime stats)

 

MaxValue

MaxValue

StatsId <> 0

Maximum data value (only for single column numeric or datetime stats)

 

OneAMPSampleEst

OneAMPSampleEst

StatsId = 0

Estimated cardinality based on a single-AMP sample

 

AllAmpSampleEst

AllAmpSampleEst

StatsId = 0

Estimated cardinality based on an all-AMP sample

DelRowCount

DelRowCount

 

StatsId = 0

Deleted rows count??? used in 14.10 ???

PhyRowCount

PhyRowCount

 

StatsId = 0

Seems to be the same as AllAMPSampleEst – used in 14.10 ???

AvgRowsPerBlock

AvgRowsPerBlock

 

StatsId = 0

Average number of rows per datablock???

AvgBlockSize

AvgBlockSize

 

StatsId = 0

Average datablock size???

BLCPctCompressed

BLCPctCompressed

 

StatsId = 0

Blockcompression in percent??? used in 14.10 ???

BLCBlkUcpuCost

BLCBlkUcpuCost

 

StatsId = 0

CPU cost for Blockcompression??? used in 14.10 ???

BLCBlkURatio

BLCBlkURatio

 

StatsId = 0

??? used in 14.10 ???

AvgRowSize

AvgRowSize

 

StatsId = 0

Average record size???

Temperature

Temperature

 

StatsId = 0

populated in 14.10???

NumOfAMPs

NumOfAMPs

NumAMPs

 

The number of AMPs from which statistics were collected, usually the number of AMPs in the system, 1 for an empty table

CreateTimeStamp

CreateTimeStamp

 

 

Statistics creation timestamp

LastAlterTimeStamp

LastAlterTimeStamp

LastAlterTimeStamp

 

Different meaningLast user updated timestamp, i.e. Collect stats was submitted but skipped by optimizer due to threshold not reached

 

LastAccessTimestamp

LastAccessTimestamp

 

The last time this column/index was used in queries, the same info is found in dbc.TablesV and dbc.IndicesV

 

AccessCount

AccessCount

 

How often this column/index was used in queries, the same info is found in dbc.TablesV and dbc.IndicesV

 

TableId

TableId

 

To facilitate additional joins to other system tables

 

IndexNumber

IndexNumber

StatsId <> 0

Index number of the index on which statistics are collected

 

FieldType

FieldType

 

Single column stats: dbc.TVFields.FieldType, NULL for multi-column

 

Version

StatsVersion

 

Internal version of statistics:
5 → TD14
6 → TD14.10

OriginalVersion

OriginalVersion

 

StatsId <> 0

Probably version when stats were migrated from older releases, but not yet recollected
4: pre-TD14
5: TD14.00
6: TD14.10

NumOfBiasedValues

NumOfBiasedValues

 

StatsId <> 0

Number of biased values in the histogram

NumOfEHIntervals

NumOfEHIntervals

 

StatsId <> 0

Number of equal height intervals in the histogram

NumOfRecords

NumOfRecords

 

 

Number of history records in the histogram

 

CollectStatement

CollectStatement

 

COLLECT STATS statement to collect the stats.
Two versions with or without double-quoted object names

 

ShowStatement

HelpStatement

 

SHOW STATS VALUES statement to get the stats details.
Two versions with or without double-quoted object names.

 

 

MissingStats

 

Was a side-product of the old query, too much overhead to add

 

 

NumIntervals

 

Replaced by NumOfBiasedValues & NumOfEHIntervals

 

 

CollectDuration

 

Not (yet) possible, i don't know if this is stored somewhere

 

 

NumericStats

 

No longer neccessary

 

 

DataSize

 

Too much overhead to calculate, not really needed as the limitation of 16 bytes is removed in TD14

Please report any issues or obviously wrong output to dnoeth@gmx.de.


Attached files:

StatsInfo_vs_StatsV.pdf

Describes the new StatsInfo view and details the differences to the previous version and dbc.StatsV - added, modified and removed columns (same as above table)

Teradata Statistics TD14.pdf

Partial description of the new internal stats format based on some reverse engeneering of the binary data stored in a BLOB in dbc.StatsTbl.Histogram. Luckily the internal storage maps almost 1:1 to the output of a SHOW STATISTICS VALUES :-)

stats_td14_yyyymmdd.sql

StatsInfo source code. To keep the code clean it's based on SQL-UDFs

ReverseBytes.sql ReverseBytes.c

Can be used to replace the ReverseBytes SQL-UDF with a C-UDF which uses way less CPU (but most DBA's don't like C).
Note: I'm not a C programmer, but this was so basic even i 

 

Ignore ancestor settings: 
0
Apply supersede status to children: 
0

Viewing all articles
Browse latest Browse all 136

Trending Articles