Snapshot Control v0
This macro creates a snapshot table to control snapshot-based tables like PITs and Bridges. The snapshot table will hold daily snapshots starting at a specific start_date and has a configurable daytime. Usual application would involve creating one snapshot table per Data Vault environment, and therefore creating one dbt model using this macro. The model needs to be scheduled daily, with a execution time that matches your desired snapshot time, since the macro automatically inserts all snapshots until the current day, no matter which time it is, and a snapshot for 08:00:00 should be calculated at that time. So if the snapshot table is configured to have a ‘daily_snapshot_time’ of ’07:00:00′, all snapshots in the table will have the timestamp ’07:00:00′. Therefore you need to schedule the building of the snapshot table also to ’07:00:00′.
In addition to the actual snapshot-datetimestamp (sdts), the macro generates the following columns:
Column | Data Type | Explanation |
---|---|---|
replacement_sdts | timestamp | Allows users to replace a sdts with another one, without having to update the actual sdts column. By default this column is filled with the regular sdts. |
caption | string | Allows users to title their snapshots. Examples would be something like: ‘Christmas 2022’, or ‘End-of-year report 2021’. By default this is filled with ‘Snapshot {sdts}’, holding the respective sdts. |
is_hourly | boolean | Captures if the time of a sdts is on an exact hours, meaning minutes=0 and seconds=0. All sdts created by this macro are daily and therefor always hourly, but this column enables future inserts of custom, user-defined sdts. |
is_daily | boolean | Captures if the time of a sdts is on exactly midnight, meaning hours=0, minutes=0 and seconds=0. This depends on your desired daily_snapshot_time, but is not used by the downstream macros, and just generates additional metadata for potential future use. |
is_weekly | boolean | Captures if the day of the week of a sdts is monday. |
is_monthly | boolean | Captures if a sdts is the first day of a month. |
is_end_of_month | boolean | Captures if a sdts is the last day of a month. |
is_quarterly | boolean | Captures if a sdts is the first day of a quarter. |
is_yearly | boolean | Captures if a sdts is the first day of a year |
is_end_of_year | boolean | Captures if a sdts is the last day of a year |
comment | string | Allows users to write custom comments for each sdts. By default this column is set to NULL. |
Parameters | Data Type | Required | Default Value | Explanation |
---|---|---|---|---|
start_date | date | mandatory | – | Defines the earliest date that should be available inside the snapshot_table. The format of this date must be ‘YYYY-MM-DD’. |
daily_snapshot_time | time | mandatory | – | Defines the time that your daily snapshots should have. Usually this is either something right before daily business starts, or after daily business is over. Needs to be in the format ‘hh:mm:ss’. |
sdts_alias | string | optional | datavault4dbt. sdts_alias | Defines the name of the snapshot date timestamp column inside the snapshot_table. |
Example 1
{{ config(materialized='incremental') }}
{%- set yaml_metadata -%}
start_date: '2015-01-01'
daily_snapshot_time: '07:30:00'
{%- endset -%}
{%- set metadata_dict = fromyaml(yaml_metadata) -%}
{{ datavault4dbt.control_snap_v0(start_date=metadata_dict.get('start_date'),
daily_snapshot_time=metadata_dict.get('daily_snapshot_time')) }}
Description
- start_date: The parameter start_date has to be in the format ‘YYYY-MM-DD’. With the start_date set to ‘2015-01-01’, the snapshot table would hold daily snapshots beginning at 2015.
- daily_snapshot_time: ’07:30:00′ The snapshots inside this table would all have the time ’07:30:00′. Must be set in the format ‘hh:mm:ss’.