Implementing SCD2 functionality the old fashioned way!!

vskr72 · Post by **vskr72** » Tue Jun 22, 2010 5:19 pm

I modified the job and implemented SCD2 in the old fashion way not using the SCD stage provided datastage. Below is the functionility:
- if a new record is found it inserts
- if an existing record is found, it end dates the existing record in the DIM and
- and open a new record .

So, the job has 2 inserts and 1 update Oracle stage. The issue I am facing with this is if I just have 1 Insert and 1 update and send the other insert to a Peek or file, it runs OK. But, if I add all three Oracle stages in the TGT, then it hangs and doesnt progress.

For now, I am doing the 3rd insert into a flat file and then dumping it to the table in a different job.

Is there something that I am missing. Thanks,

ray.wurlod · Post by **ray.wurlod** » Tue Jun 22, 2010 9:02 pm

Sounds like you may be generating a deadlock, with the same key on more than one link into the target. How are your array size and commit levels set?

vskr72 · Post by **vskr72** » Wed Jun 23, 2010 8:52 am

Ray - is this set at a job level or project level. Can you pls help me to understand where these are set?

Satish

vskr72 · Post by **vskr72** » Wed Jun 23, 2010 8:53 am

FYI - I was working with 10 records.

vmcburney · Post by **vmcburney** » Wed Jun 23, 2010 1:16 pm

Why do you have two insert links - can you do it with one? Check that the inserts are inserts and not upserts. It could be you are doing two updates to the same row. Are there two rows in your sample trying to update the same dimension row?

vskr72 · Post by **vskr72** » Thu Jun 24, 2010 4:59 pm

No - there will not be 2 rows which will update the same dimension row. Let me try that option and see if it works.

bobbysvr · Post by **bobbysvr** » Tue Jun 29, 2010 11:48 pm

vskr72 wrote:No - there will not be 2 rows which will update the same dimension row. Let me try that option and see if it works.

You can do this by collecting all the inserts in a dataset in the first job. This is necessary because if you insert and update in the same job you could potentially be updating the newly inserted record.

In the first job collect all your inserts, both for brand new inserts and update based inserts in to a DataSet. In the same job ,update the old record.

In the second job insert the funneled inserts in to the table. Also if you are generating surrogate key, use a db sequence or surrogate key generator appropriately.

Thanks
Bobby.

priyadarshikunal · Post by **priyadarshikunal** » Wed Jun 30, 2010 9:38 am

or add a contraint in update sql to exclude the newly inserted records (if you have something like Run ID in table).

DSXchange

Implementing SCD2 functionality the old fashioned way!!

Implementing SCD2 functionality the old fashioned way!!

SCD the old fashioned way