-
Notifications
You must be signed in to change notification settings - Fork 3
Non Historized Link
The "Datavault4Coalesce: Non-Historized Link" node creates a non-historized link entity, connecting two or more entities, or an entity with itself.
Config Name | Config Type | Explanation |
---|---|---|
Link Hashkey Column | string | In the Config section on the right there is a tab called 'Data Vault'. Expand this and you get a column selector list 'Link Hashkey Column'. Select the hashkey of this Link. |
Source is Single Batch? | Boolean | In the Config section on the right there is a tab called 'Data Vault'. Expand this and you get a toggle 'Source is Single Batch?'. Activate this if your source data always only hold one load_date. By activating this toggle, a QUALIFY() statement is disabled, which will then increase the loading performance. But activate only, when you can ensure your data holds only one load date! |
No Sys Columns are added inside a Link Node.
In this example the Non-Historized Link 'LINEITEM_NL' is being created. The source table is 'STG_LINEITEM'.
Create the Stage Node:
Create multiple Hash Columns, by selecting the following columns, and per selection, do 'right-click'->'Generate Hash Column'. Rename the output accordingly.
-
HK_LINEITEM_L:
- L_ORDERKEY
- L_PARTKEY
- L_SUPPKEY
- L_LINENUMBER
-
HK_ORDER_H:
- L_ORDERKEY
-
HK_PART_H:
- L_PARTKEY
-
HK_SUPPLIER_H:
- L_SUPPKEY
Additionally, rename the column L_LINEITEM to DCK_LINEITEM. DCK stands for Dependant Child Key. The Lineitem number is required to uniquely identify a row in the source object LINEITEM. That's why it is also part of the Link Hashkey calculation.
This is how your Stage Node should look like (not all columns are shown):
Create the node:
- Within your Node STG_LINEITEM, multi-select the following columns:
- HK_LINEITEM_L
- HK_ORDER_H
- HK_PART_H
- HK_SUPPLIER_H
- DCK_LINENUMBER
- LDTS
- RSRC
- L_QUANTITY
- L_EXTENDEDPRICE
- L_DISCOUNT
- L_TAX
- L_RETURNFLAG
- L_LINESTATUS
- Then do 'right-click' -> 'Add Node' -> 'Datavault by Scalefree: Non-Historized Link'
- Rename the new Node to LINEITEM_NL
- Rearrange the columns in the new Node, so that the Link Hashkey is at the top, followed by the Hub Hashkeys, the Dependant Child Key, LDTS and RSRC:
- In the Config section on the right there is a tab called 'Data Vault'
- Expand this and you get a column selector 'Link Hashkey Column'
- Select the hashkey of this Link, HK_LINEITEM_L.
- Select the hashkey of this Link, HK_LINEITEM_L.
- Create and run the node
CREATE OR REPLACE TABLE
"COALESCE_WORKSHOP"."CORE"."LINEITEM_NL" (
"HK_LINEITEM_L" STRING,
"HK_ORDER_H" STRING,
"HK_PART_H" STRING,
"HK_SUPPLIER_H" STRING,
"DCK_LINENUMBER" NUMBER (38, 0) NOT NULL,
"LDTS" TIMESTAMP
COMMENT 'The Load Date Timestamp (LDTS) describes when this data first arrived in the Data Warehouse.',
"RSRC" STRING
COMMENT 'The Record Source (RSRC) describes the source of this data.',
"L_QUANTITY" NUMBER (12, 2) NOT NULL,
"L_EXTENDEDPRICE" NUMBER (12, 2) NOT NULL,
"L_DISCOUNT" NUMBER (12, 2) NOT NULL,
"L_TAX" NUMBER (12, 2) NOT NULL,
"L_RETURNFLAG" VARCHAR (1) NOT NULL,
"L_LINESTATUS" VARCHAR (1) NOT NULL
)
COMMENT = 'Lineitem data as defined by TPC-H'
INSERT INTO
"COALESCE_WORKSHOP"."CORE"."LINEITEM_NL"
WITH
incoming AS (
SELECT DISTINCT
"STG_LINEITEM"."HK_LINEITEM_L" AS "HK_LINEITEM_L",
"STG_LINEITEM"."HK_ORDER_H" AS "HK_ORDER_H",
"STG_LINEITEM"."HK_PART_H" AS "HK_PART_H",
"STG_LINEITEM"."HK_SUPPLIER_H" AS "HK_SUPPLIER_H",
"STG_LINEITEM"."DCK_LINENUMBER" AS "DCK_LINENUMBER",
"STG_LINEITEM"."LDTS" AS "LDTS",
"STG_LINEITEM"."RSRC" AS "RSRC",
"STG_LINEITEM"."L_QUANTITY" AS "L_QUANTITY",
"STG_LINEITEM"."L_EXTENDEDPRICE" AS "L_EXTENDEDPRICE",
"STG_LINEITEM"."L_DISCOUNT" AS "L_DISCOUNT",
"STG_LINEITEM"."L_TAX" AS "L_TAX",
"STG_LINEITEM"."L_RETURNFLAG" AS "L_RETURNFLAG",
"STG_LINEITEM"."L_LINESTATUS" AS "L_LINESTATUS"
FROM
"COALESCE_WORKSHOP"."CORE"."STG_LINEITEM" "STG_LINEITEM"
),
new_records AS (
SELECT
"SRC".*
FROM
incoming "SRC"
WHERE
NOT EXISTS (
SELECT
1
FROM
"COALESCE_WORKSHOP"."CORE"."LINEITEM_NL" "TGT"
WHERE
"SRC"."HK_LINEITEM_L" = "TGT"."HK_LINEITEM_L"
)
AND "SRC"."LDTS" > (
SELECT
COALESCE(
MAX("LDTS"),
TO_TIMESTAMP('0001-01-01T00:00:01', 'YYYY-MM-DDTHH24:MI:SS')
)
FROM
"COALESCE_WORKSHOP"."CORE"."LINEITEM_NL"
WHERE
"LDTS" != TO_TIMESTAMP('8888-12-31T23:59:59', 'YYYY-MM-DDTHH24:MI:SS')
) QUALIFY ROW_NUMBER() OVER (
PARTITION BY
"HK_LINEITEM_L"
ORDER BY
"LDTS"
) = 1
)
SELECT
*
FROM
new_records
Table of Content
- Staging
- DV-Entities
- Hubs
- Links
- Satellites
- Standard Satellite
- Multi-Active Satellite
- Non-Historized Satellite
- Record-Tracking Satellite
- Business Vault
- PIT
- Snapshot Control
- Global Variables