Duplicates while inserting in to target

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
120267
Participant
Posts: 30
Joined: Tue Jun 07, 2005 12:27 am

Duplicates while inserting in to target

Post by 120267 »

Hi,

We are facing a peculiar problem in DS. We are using SCD2 job. From the source we are loading in to hash file and from the hash file we are inserting in to target. Its a row by row commit and before inserting we are having oracle stage lookup to generate surrogate key.

Our problem is some times the same records are inserted in to target with different surrogate keys. The insert date timestamp is also same for that records, the only difference is the surrogate key.

We are using link collectors, interprocess and hash file stages in the job. This scenario is happening once in a while. Will you please help me to resolve this issue?
With Love,

«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
ajith
Participant
Posts: 86
Joined: Thu Nov 10, 2005 11:10 pm

Re: Duplicates while inserting in to target

Post by ajith »

Probably you can do one thing, split the job into two, one for writing the insert records to a dataset and one for loading it into a oracle target. Then schedule the second job after the first. That will reduce the database hits too .

Secondly, make sure that your datatypes match, there are no leading or trailing spaces in the columns used for looking up. More importantly, find the reason for look up failure.

Again, these are valid only if my understanding is correct, try to explain the job flow with a simplified ascii diagram so that everyone can understand the job flow
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

How many keys do you have defined in your target table? SCD2 means you will have to update too, are you doing that, how exactly are you doing that?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

How can you update based on the surrogate key when your generating those as you insert? You need to update based on the natural key. Also, if the surrogate key is the only key column defined, how are you getting duplicates? How many columns have unique indices on them? I am confused :roll:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
120267
Participant
Posts: 30
Joined: Tue Jun 07, 2005 12:27 am

Post by 120267 »

Hi,

My source will have left outer join with target to get the target surrogate key.If the CDC columns are changed for that Natural Key ..I am inserting one New record with New surrogate Key and Updating the old Record based on the Sk column.As i mentioned earlier it's a row by row commit and Grouped transaction.
With Love,

«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
120267
Participant
Posts: 30
Joined: Tue Jun 07, 2005 12:27 am

Post by 120267 »

Hi,

My source will have left outer join with target to get the target surrogate key.If the CDC columns are changed for that Natural Key ..I am inserting one New record with New surrogate Key and Updating the old Record based on the Sk column.As i mentioned earlier it's a row by row commit and Grouped transaction.
With Love,

«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Check the logic on how you update the records. It might be a case that even update go in as Insert with new surrogate key.
You can give use more specific cases that on which case the duplicates occurs. Is it all the records from the source bee populated as duplicate?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

You might end up having duplicates if you have not defined your key's properly. What are your keys?
Kris

Where's the "Any" key?-Homer Simpson
Post Reply