BEST UPSERT TECHNIC

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

RodBarnes wrote:Correct?
Yes, for a minimum out of effort. I'm in the camp of spooling out the entire row (all columns) and staging into a sequential file first. This allows you to have a "before" image of a row that may be potentially updated.

Then, another job loads that sequential file into two hashed files: one that is a straight copy of the row using the surrogate primary key, the other that is the natural key to surrogate primary key cross reference, where the natural key is the hashed file primary key. This allows for a high-performance surrogate key lookup during foreign surrogate key substitution in child table processing. Also, by staging the entire row in its unadulterated form in a staging hashed file, you can update the existing contents of the hashed file with newer data during transformation processing.

During transformation processing, you can lookup against the natural to surrogate key cross-reference hashed file for insert or update determination and stage into the staging hashed files appropriately. During update processing, you would then reference the staging copy of the row and differentiate the newer row from the existing row and decide if an update is worthwhile. In the case of a slowly changing dimension, you could opt to generate the newer variant and potentially "retire" the older variant.

Using the hashed files as a staging mechanism allows you to handle the most difficult transformation exercises in data warehousing: SCD's, surrogate key assignment, foreign surrogate key substitution, repeating natural keys in a dataset, and milestoned/recoverable processing.

There's a few documents on my website that talk about this in detail. Take a peek.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply