Implementing SCD Type-2 in DataStage

vimali balakrishnan · Wed Dec 29, 2004 10:03 pm

Hi,

Assume my source table has data in the following format:

Source table:
1110 1111 22222
1111 2222 33333
1111 3333 44444

Target table:
1110 1111 22222 'Active' 10-01-2003 10-01-2004
1111 2222 33333 'Active' 11-01-2003 11-01-2004

Now my job should see if 1110 is present in the target and the other two columns are same,so it is the same in the target,so it should skip that record.

It starts with the next record,checks if 1111 is present in the target,
and the other records are same,so it is the same in the target,so it should skip that record.

It starts with the next record checks if 1111 is already present,so it is already present in the target but the other values are not same so
my job should update the status of the current record(1111) in the target to 'Inactive' and insert the new record from the source with status as 'Active' and the End date of the previous record as start_date of this new record.

The problem i face is how to identify the record to be updated and then insert the new record.

Can u explain me with the constraints that should be used.

ray.wurlod · Post by **ray.wurlod** » Wed Dec 29, 2004 10:43 pm

Only if you post your job design to show us how you are detecting existence of the row in the target table.

Wrap your design in Code tags (which gives a fixed width font) and play with Preview until it's obvious that your design is clear to someone else. For example:

Code: Select all

             HashedFile1  HashedFile2
                     |     ^
                     |     |
                     V     |
  SeqFile  ----->  Transformer  ----->  DRS
                       |
                       |
                       V
                   SeqFile
                  (Rejects)

ramesh_dw · Post by **ramesh_dw** » Thu Dec 30, 2004 12:40 pm

Hi,

you can handle this with two lookups to the target.

1st lookup:
To ignore the duplicate records.
have the first 3 columns of the target in the hash file.
the constraint for the outgoing link shud be "Lookup is False"

1st lookup hash:
1110 1111 22222
1111 2222 33333

2nd lookup:
To findout the surrogate key of the target column to update the existing record.
Have the first column and the surrogate key in the hash file
there shud be 3 outgoing links.
2 with "lookup true" 1 with "lookup false"

2nd lookup hash:
sk1 1110
sk2 1111

2 Lookup true (existing record with new values)
1-fetch the surrogate key for the record "1111" and hard code "Inactive" and date field and update target table
2- Generate new surrogate key for the record "1111" and insert the new values in the target table.

1 lookup false (new record)
generate surrogate key and insert the new records

hope this helps

DSXchange

Implementing SCD Type-2 in DataStage

Implementing SCD Type-2 in DataStage

Re: Implementing SCD Type-2 in DataStage