Dedicated to design and performance of databases and audio systems.
During Teradata code review sessions I often find a few of the following fundamentals overlooked. In that spirit, these are my top 10 core best practices.
SET
versus MULTISET
tables:MULTISET
with NO FALLBACK
unless there is a very specific reason to have a SET
and/or FALLBACK
protected definition. MULTISET
omits a unique row check, saving overhead, while NO FALLBACK
ensures there are not four copies of the table stored on disk. Teradata is RAID1/mirrored internally by default so a redundant copy already exists in case of drive failure.
PRIMARY INDEX
selection:PRIMARY INDEX
is the single most important physical design decision. The PRIMARY INDEX
is not necessarily the PRIMARY KEY
. If it is unique then it should be defined as a UNIQUE PRIMARY INDEX
. If the table does not have any good candidate field(s) for the PRIMARY INDEX
then defining the table as NO PRIMARY INDEX
is acceptable (e.g. some staging tables). Two important considerations for the PRIMARY INDEX
are NOT NULL
. Again, further efficiencies are realized. Examples of efficient data typing include defining code fields (e.g. USPS State Code: MA, NH), and other fixed-length character fields as CHARACTER
and not VARCHAR
. Today, we MVC compress low-cardinality fixed-length fields (prior to the introduction of VLC where varible-length fields can also be compressed). Additionally, date fields are defined as DATE
and not TIMESTAMP
nor INTEGER
. TIMESTAMP
is internally a concatenation of Year, Month, Day, Hour, Minute, and Second fields, and is not nearly as efficient as DATE
. Likewise, turning date values into a look-alike INTEGER
saves nothing in storage, but removes date math capabilities inherent with the DATE
data type. And, storing numerical values in fields much larger than is required (e.g. if DECIMAL(9,2)
suffices then don't use DECIMAL(18,2)
). Also, don't exceed DECIMAL(18,x)
without a specific reason to do so.
PRIMARY INDEX
PARTITION
ON
or WHERE
clauseINDEX
)FOREIGN KEY
s between tables allows the optimizer to understand intended relationships and offers inter-table demographics and extrapolate cardinality. FOREIGN KEY
constraints done on a SoftRI (i.e. without database referential integrity checks) basis also allow for Join Elimination to be leveraged. This means that the optimizer may omit joins from a query or view if it deems that the join with not affect the integrity of the query's outcome and the join process would otherwise be superfluous. Efficiency gained.
LOCKING TABLE table_name FOR ACCESS
, or the row version at a minimum. Access locks remove another level of overhead and mitigate contention with write locks. Another efficiency gain.
EXPLAIN
plan:EXPLAIN
plan defines the exact steps, and their sequence. It can provide valuable information about where a process may deviate from what was originally anticipated and help with debugging. It is a powerful tool that should be used to ensure efficiency and quality.
I hope this is helpful and serves as a guideline for you present and future Teradata development.
Cheers
-Brad