datavault4dbt is highly customizable by using many global variables. Since they are applied on multiple levels, a high rate of standardization across your data vault 2.0 solution is guaranteed.
Prerequisites
The default values of those variables are set inside the packages dbt_project.yml under <your_dbt_project>/dbt_packages/datavault4dbt/dbt_project.yml and should be copied to your own dbt_project.yml. Copy all variables defined under the keyword vars, and paste them under the keywordvarsin your file.
Variables and their usages
All the following variables are prefixed with datavault4dbt.
Column aliases
| Name |
Usage |
Explanation |
| ldts_alias |
Stage, DV entities |
The name of the load-date column in all DV entities. Is generated in the staging area. |
| rsrc_alias |
Stage, DV entities |
The name of the record-source column in all DV entities. Is generated in the staging area. |
| ledts_alias |
Version 1 Satellites |
The name of the load-end date column in version 1 Satellites and MA-Satellites. |
| sdts_alias |
Snapshot Table, PITs |
The name of the snapshot-date column in the snapshot table (+view) and all PITs. |
| snapshot_trigger_ column |
Snapshot Table, PITs, Post-Hook |
The name of the column that shows the activation state of single snapshots. |
| stg_alias |
Record Tracking Satellite |
The name of the column, that holds info about the staging model of each record. |
| is_current_col_alias |
Version 1 Satellites |
The name of the column that indicates the current row per hashkey. |
| is_active_alias |
Effectivity Satellite v0 |
The name of the column that marks activity. Generated by the macro. |
Hash Configuration
| Name |
Usage |
Explanation |
| hash |
Stage |
What hash algorithm should be used for generating hash values. MD5, SHA1 or SHA2. |
| hash_datatype |
Stage |
The datatype that hash columns should have. Needs to fit the output of the used hash algorithm. |
| hashkey_input_ case_sensititve |
Stage |
Whether the input business keys for hashkey calculation should be case sensitive or not. |
| hashdiff_input_ case_sensititve |
Stage |
Whether the input descriptive attributes for hashdiff calculation should be case sensitive or not. |
Stage Configuration
| Name |
Usage |
Explanation |
| copy_rsrc_ldts_ input_columns |
Stage |
Whether the columns that are used for ldts and rsrc should also be inside the stage, or not. If true, the stage would hold the ldts- & rsrc-alias columns, and the original columns. If false, only the aliased columns are kept. |
Satellite Configuration
| Name |
Usage |
Explanation |
| is_active_datatype |
Effectivity Satellite v0 |
Controls the datatype which is used for the is_active-column. Defaults to Bit for Fabric & Synapse, Number on Oracle and Boolean on the remaining adapters. Available from v1.10.0 |
Ghost Record and Zero Key Configuration
| Name |
Usage |
Explanation |
| beginning_of_all_times |
Stage, Satellites, PIT |
The timestamp that represents your earliest technical timestamp. |
| end_of_all_times |
Stage, Version 1 Satellites, PIT |
The timestamp that represents your latest technical timestamp. We recommend to not use the maximum possible timestamp of your database. |
| timestamp_format |
Stage, Version 1 Satellites, PIT |
The timestamp format of the two previous variables. |
| beginning_of_all_times_date |
Stage |
The date that represents your earliest technical date. Used for ghost-record creation of columns with the date-datatype |
| end_of_all_times_date |
Stage |
The date that represents your latest technical date. Used for ghost-record creation of columns with the date-datatype |
| date_format |
Stage |
The date format of the two previous variables. |
| default_unknown_rsrc |
Stage |
The default unknown value for the record source column you want to use. |
| default_error_rsrc |
Stage |
The default error value for the record source column you want to use. |
| rsrc_default_dtype |
Stage |
The default datatype that should be used for the two variables above. |
| stg_default_dtype |
Record Tracking Satellite |
The default datatype for the ‘stg_alias’ column inside a record tracking satellite . |
| derived_columns_ default_dtype |
Stage |
The default datatype for derived columns, if no other datatype can be detected automatically. |
Datatype specific default values
For each datatype there is a default unknown and error value defined. Additionally, an alternative, usually much shorter value is defined. See the applied default values in the table below.
| Datatype |
Error Value |
Error Value alt. |
Unknown Value |
Unknown Value alt. |
| STRING |
(error) |
e |
(unknown) |
u |
| Numeric |
-2 |
|
-1 |
|
Those values are best changed by adding a global variable inside your dbt project, that is called following this pattern: `datavault4dbt._value_<(alt)>__`. If you want to change the default alternative error value for datatype STRING, you would need to set the global variable `datavault4dbt.error_value_alt__STRING` to your desired value.