Implementing SCD2 functionality the old fashioned way!!

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Implementing SCD2 functionality the old fashioned way!!

Post by vskr72 »

I modified the job and implemented SCD2 in the old fashion way not using the SCD stage provided datastage. Below is the functionility:
- if a new record is found it inserts
- if an existing record is found, it end dates the existing record in the DIM and
- and open a new record .

So, the job has 2 inserts and 1 update Oracle stage. The issue I am facing with this is if I just have 1 Insert and 1 update and send the other insert to a Peek or file, it runs OK. But, if I add all three Oracle stages in the TGT, then it hangs and doesnt progress.

For now, I am doing the 3rd insert into a flat file and then dumping it to the table in a different job.

Is there something that I am missing. Thanks,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sounds like you may be generating a deadlock, with the same key on more than one link into the target. How are your array size and commit levels set?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Post by vskr72 »

Ray - is this set at a job level or project level. Can you pls help me to understand where these are set?

Satish
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Post by vskr72 »

FYI - I was working with 10 records.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Why do you have two insert links - can you do it with one? Check that the inserts are inserts and not upserts. It could be you are doing two updates to the same row. Are there two rows in your sample trying to update the same dimension row?
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Post by vskr72 »

No - there will not be 2 rows which will update the same dimension row. Let me try that option and see if it works.
bobbysvr
Premium Member
Premium Member
Posts: 16
Joined: Fri Mar 26, 2004 2:07 pm

SCD the old fashioned way

Post by bobbysvr »

vskr72 wrote:No - there will not be 2 rows which will update the same dimension row. Let me try that option and see if it works.
You can do this by collecting all the inserts in a dataset in the first job. This is necessary because if you insert and update in the same job you could potentially be updating the newly inserted record.

In the first job collect all your inserts, both for brand new inserts and update based inserts in to a DataSet. In the same job ,update the old record.


In the second job insert the funneled inserts in to the table. Also if you are generating surrogate key, use a db sequence or surrogate key generator appropriately.

Thanks
Bobby.
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

or add a contraint in update sql to exclude the newly inserted records (if you have something like Run ID in table).
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
Post Reply