datavault4dbt is highly customizable by using many global variables. Since they are applied on multiple levels, a high rate of standardization across your data vault 2.0 solution is guaranteed.
Prerequisites
The default values of those variables are set inside the packages dbt_project.yml
under <your_dbt_project>/dbt_packages/datavault4dbt/dbt_project.yml
and should be copied to your own dbt_project.yml
. Copy all variables defined under the keyword vars
, and paste them under the keywordvars
in your file.
Variables and their usages
All the following variables are prefixed with datavault4dbt.
Column aliases
Name |
Usage |
Explanation |
ldts_alias |
Stage, DV entities |
The name of the load-date column in all DV entities. Is generated in the staging area. |
rsrc_alias |
Stage, DV entities |
The name of the record-source column in all DV entities. Is generated in the staging area. |
ledts_alias |
Version 1 Satellites |
The name of the load-end date column in version 1 Satellites and MA-Satellites. |
sdts_alias |
Snapshot Table, PITs |
The name of the snapshot-date column in the snapshot table (+view) and all PITs. |
snapshot_trigger_ column |
Snapshot Table, PITs, Post-Hook |
The name of the column that shows the activation state of single snapshots. |
stg_alias |
Record Tracking Satellite |
The name of the column, that holds info about the staging model of each record. |
is_current_col_alias |
Version 1 Satellites |
The name of the column that indicates the current row per hashkey. |
is_active_alias |
Effectivity Satellite v0 |
The name of the column that marks activity. Generated by the macro. |
Hash Configuration
Name |
Usage |
Explanation |
hash |
Stage |
What hash algorithm should be used for generating hash values. MD5, SHA1 or SHA2. |
hash_datatype |
Stage |
The datatype that hash columns should have. Needs to fit the output of the used hash algorithm. |
hashkey_input_ case_sensititve |
Stage |
Whether the input business keys for hashkey calculation should be case sensitive or not. |
hashdiff_input_ case_sensititve |
Stage |
Whether the input descriptive attributes for hashdiff calculation should be case sensitive or not. |
Stage Configuration
Name |
Usage |
Explanation |
copy_rsrc_ldts_ input_columns |
Stage |
Whether the columns that are used for ldts and rsrc should also be inside the stage, or not. If true, the stage would hold the ldts- & rsrc-alias columns, and the original columns. If false, only the aliased columns are kept. |
Satellite Configuration
Name |
Usage |
Explanation |
is_active_datatype |
Effectivity Satellite v0 |
Controls the datatype which is used for the is_active-column. Defaults to Bit for Fabric & Synapse, Number on Oracle and Boolean on the remaining adapters. Available from v1.10.0 |
Ghost Record and Zero Key Configuration
Name |
Usage |
Explanation |
beginning_of_all_times |
Stage, Satellites, PIT |
The timestamp that represents your earliest technical timestamp. |
end_of_all_times |
Stage, Version 1 Satellites, PIT |
The timestamp that represents your latest technical timestamp. We recommend to not use the maximum possible timestamp of your database. |
timestamp_format |
Stage, Version 1 Satellites, PIT |
The timestamp format of the two previous variables. |
beginning_of_all_times_date |
Stage |
The date that represents your earliest technical date. Used for ghost-record creation of columns with the date-datatype |
end_of_all_times_date |
Stage |
The date that represents your latest technical date. Used for ghost-record creation of columns with the date-datatype |
date_format |
Stage |
The date format of the two previous variables. |
default_unknown_rsrc |
Stage |
The default unknown value for the record source column you want to use. |
default_error_rsrc |
Stage |
The default error value for the record source column you want to use. |
rsrc_default_dtype |
Stage |
The default datatype that should be used for the two variables above. |
stg_default_dtype |
Record Tracking Satellite |
The default datatype for the ‘stg_alias’ column inside a record tracking satellite . |
derived_columns_ default_dtype |
Stage |
The default datatype for derived columns, if no other datatype can be detected automatically. |
Datatype specific default values
For each datatype there is a default unknown and error value defined. Additionally, an alternative, usually much shorter value is defined. See the applied default values in the table below.
Datatype |
Error Value |
Error Value alt. |
Unknown Value |
Unknown Value alt. |
STRING |
(error) |
e |
(unknown) |
u |
Numeric |
-2 |
|
-1 |
|
Those values are best changed by adding a global variable inside your dbt project, that is called following this pattern: `datavault4dbt._value_<(alt)>__`. If you want to change the default alternative error value for datatype STRING, you would need to set the global variable `datavault4dbt.error_value_alt__STRING` to your desired value.