Skip to content

Hash value comparison: new record is not loaded, if the same hash value exist in older record. #13

@nnikolov84

Description

@nnikolov84

When using hash value comparison, new record is not loaded, if the same hash value exist in older record.

Input data:

CUSTOMER_ID CUSTOMER_ADDRESS UPDATE_ON Comment
1 "ADDRESS_1" 1/1/2000 Initial address.
1 "ADDRESS_2" 1/20/2000 Customer changed address.
1 "ADDRESS_1" 1/22/2000 Customer changed back the address to the value from line 1.

The third value in the table will not be processed. This is happening, because when generating "ids_to_update", the historical records are not filtered.

with ids_to_update as 
   (select col1, HASH_COLUMN from `....TEST.TEST_TABLE_SRC_01`        
     except distinct         
   (select col1, HASH_COLUMN from `....TEST.source_data_scd_updates`)

In order for the process to work properly, only most recent record per KEY should be compared to the input data.
Possible solution is:

with ids_to_update as 
   (select col1, HASH_COLUMN from `....TEST.TEST_TABLE_SRC_01`        
     except distinct         
   (select col1, HASH_COLUMN from `....TEST.source_data_scd_updates`
      qualify row_number() over (partition by col1 order by updated_at desc) = 1)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions