Global Variables
datavault4dbt is highly customizable by using many global variables. Since they are applied on multiple levels, a high rate of standardization across your data vault 2.0 solution is guaranteed.
Prerequisites
The default values of those variables are set inside the packages dbt_project.yml
under <your_dbt_project>/dbt_packages/datavault4dbt/dbt_project.yml
and should be copied to your own dbt_project.yml
. Copy all variables defined under the keyword vars
, and paste them under the keyword vars
in your file.
Variables and their usages
All the following variables are prefixed with datavault4dbt.
Column aliases
Name | Usage | Explanation |
---|---|---|
ldts_alias | Stage, DV entities | The name of the load-date column in all DV entities. Is generated in the staging area. |
rsrc_alias | Stage, DV entities | The name of the record-source column in all DV entities. Is generated in the staging area. |
ledts_alias | Version 1 Satellites | The name of the load-end date column in version 1 Satellites and MA-Satellites. |
sdts_alias | Snapshot Table, PITs | The name of the snapshot-date column in the snapshot table (+view) and all PITs. |
snapshot_trigger_ column | Snapshot Table, PITs, Post-Hook | The name of the column that shows the activation state of single snapshots. |
stg_alias | Record Tracking Satellite | The name of the column, that holds info about the staging model of each record. |
is_current_col_alias | Version 1 Satellites | The name of the column that indicates the current row per hashkey. |
Hash Configuration
Name | Usage | Explanation |
---|---|---|
hash | Stage | What hash algorithm should be used for generating hash values. MD5, SHA1 or SHA2. |
hash_datatype | Stage | The datatype that hash columns should have. Needs to fit the output of the used hash algorithm. |
hashkey_input_ case_sensititve | Stage | Whether the input business keys for hashkey calculation should be case sensitive or not. |
hashdiff_input_ case_sensititve | Stage | Whether the input descriptive attributes for hashdiff calculation should be case sensitive or not. |
Stage Configuration
Name | Usage | Explanation |
---|---|---|
copy_rsrc_ldts_ input_columns | Stage | Whether the columns that are used for ldts and rsrc should also be inside the stage, or not. If true, the stage would hold the ldts- & rsrc-alias columns, and the original columns. If false, only the aliased columns are kept. |
Ghost Record and Zero Key Configuration
Name | Usage | Explanation |
---|---|---|
beginning_of_all_times | Stage, Satellites, PIT | The timestamp that represents your earliest technical timestamp. |
end_of_all_times | Stage, Version 1 Satellites, PIT | The timestamp that represents your latest technical timestamp. We recommend to not use the maximum possible timestamp of your database. |
timestamp_format | Stage, Version 1 Satellites, PIT | The timestamp format of the two previous variables. |
beginning_of_all_times_date | Stage | The date that represents your earliest technical date. Used for ghost-record creation of columns with the date-datatype |
end_of_all_times_date | Stage | The date that represents your latest technical date. Used for ghost-record creation of columns with the date-datatype |
date_format | Stage | The date format of the two previous variables. |
default_unknown_rsrc | Stage | The default unknown value for the record source column you want to use. |
default_error_rsrc | Stage | The default error value for the record source column you want to use. |
rsrc_default_dtype | Stage | The default datatype that should be used for the two variables above. |
stg_default_dtype | Record Tracking Satellite | The default datatype for the ‘stg_alias’ column inside a record tracking satellite . |
derived_columns_ default_dtype | Stage | The default datatype for derived columns, if no other datatype can be detected automatically. |
Datatype specific default values
For each datatype there is a default unknown and error value defined. Additionally, an alternative, usually much shorter value is defined. See the applied default values in the table below.
Datatype | Error Value | Error Value alt. | Unknown Value | Unknown Value alt. |
---|---|---|---|---|
STRING | (error) | e | (unknown) | u |
Numeric | -2 | -1 |
Those values are best changed by adding a global variable inside your dbt project, that is called following this pattern: `datavault4dbt._value_<(alt)>__`. If you want to change the default alternative error value for datatype STRING, you would need to set the global variable `datavault4dbt.error_value_alt__STRING` to your desired value.