Implementing SCD Type-2 in DataStage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
vimali balakrishnan
Participant
Posts: 60
Joined: Mon Dec 27, 2004 3:38 am

Implementing SCD Type-2 in DataStage

Post by vimali balakrishnan »

Hi,

Assume my source table has data in the following format:

Source table:
1110 1111 22222
1111 2222 33333
1111 3333 44444

Target table:
1110 1111 22222 'Active' 10-01-2003 10-01-2004
1111 2222 33333 'Active' 11-01-2003 11-01-2004

Now my job should see if 1110 is present in the target and the other two columns are same,so it is the same in the target,so it should skip that record.

It starts with the next record,checks if 1111 is present in the target,
and the other records are same,so it is the same in the target,so it should skip that record.

It starts with the next record checks if 1111 is already present,so it is already present in the target but the other values are not same so
my job should update the status of the current record(1111) in the target to 'Inactive' and insert the new record from the source with status as 'Active' and the End date of the previous record as start_date of this new record.

The problem i face is how to identify the record to be updated and then insert the new record.

Can u explain me with the constraints that should be used.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Only if you post your job design to show us how you are detecting existence of the row in the target table.

Wrap your design in Code tags (which gives a fixed width font) and play with Preview until it's obvious that your design is clear to someone else. For example:

Code: Select all

             HashedFile1  HashedFile2
                     |     ^
                     |     |
                     V     |
  SeqFile  ----->  Transformer  ----->  DRS
                       |
                       |
                       V
                   SeqFile
                  (Rejects)
Last edited by ray.wurlod on Thu Dec 30, 2004 3:19 pm, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ramesh_dw
Participant
Posts: 21
Joined: Tue Mar 23, 2004 10:44 pm
Location: Chennai

Re: Implementing SCD Type-2 in DataStage

Post by ramesh_dw »

Hi,

you can handle this with two lookups to the target.

1st lookup:
To ignore the duplicate records.
have the first 3 columns of the target in the hash file.
the constraint for the outgoing link shud be "Lookup is False"

1st lookup hash:
1110 1111 22222
1111 2222 33333

2nd lookup:
To findout the surrogate key of the target column to update the existing record.
Have the first column and the surrogate key in the hash file
there shud be 3 outgoing links.
2 with "lookup true" 1 with "lookup false"

2nd lookup hash:
sk1 1110
sk2 1111

2 Lookup true (existing record with new values)
1-fetch the surrogate key for the record "1111" and hard code "Inactive" and date field and update target table
2- Generate new surrogate key for the record "1111" and insert the new values in the target table.

1 lookup false (new record)
generate surrogate key and insert the new records


hope this helps
Post Reply