Page 1 of 1

Implementing SCD2 functionality the old fashioned way!!

Posted: Tue Jun 22, 2010 5:19 pm
by vskr72
I modified the job and implemented SCD2 in the old fashion way not using the SCD stage provided datastage. Below is the functionility:
- if a new record is found it inserts
- if an existing record is found, it end dates the existing record in the DIM and
- and open a new record .

So, the job has 2 inserts and 1 update Oracle stage. The issue I am facing with this is if I just have 1 Insert and 1 update and send the other insert to a Peek or file, it runs OK. But, if I add all three Oracle stages in the TGT, then it hangs and doesnt progress.

For now, I am doing the 3rd insert into a flat file and then dumping it to the table in a different job.

Is there something that I am missing. Thanks,

Posted: Tue Jun 22, 2010 9:02 pm
by ray.wurlod
Sounds like you may be generating a deadlock, with the same key on more than one link into the target. How are your array size and commit levels set?

Posted: Wed Jun 23, 2010 8:52 am
by vskr72
Ray - is this set at a job level or project level. Can you pls help me to understand where these are set?

Satish

Posted: Wed Jun 23, 2010 8:53 am
by vskr72
FYI - I was working with 10 records.

Posted: Wed Jun 23, 2010 1:16 pm
by vmcburney
Why do you have two insert links - can you do it with one? Check that the inserts are inserts and not upserts. It could be you are doing two updates to the same row. Are there two rows in your sample trying to update the same dimension row?

Posted: Thu Jun 24, 2010 4:59 pm
by vskr72
No - there will not be 2 rows which will update the same dimension row. Let me try that option and see if it works.

SCD the old fashioned way

Posted: Tue Jun 29, 2010 11:48 pm
by bobbysvr
vskr72 wrote:No - there will not be 2 rows which will update the same dimension row. Let me try that option and see if it works.
You can do this by collecting all the inserts in a dataset in the first job. This is necessary because if you insert and update in the same job you could potentially be updating the newly inserted record.

In the first job collect all your inserts, both for brand new inserts and update based inserts in to a DataSet. In the same job ,update the old record.


In the second job insert the funneled inserts in to the table. Also if you are generating surrogate key, use a db sequence or surrogate key generator appropriately.

Thanks
Bobby.

Posted: Wed Jun 30, 2010 9:38 am
by priyadarshikunal
or add a contraint in update sql to exclude the newly inserted records (if you have something like Run ID in table).