Hi,
We are facing a peculiar problem in DataStage. We have an SCD Type 2 job: the source loads into a hashed file, and from the hashed file we insert into the target. It commits row by row, and before inserting we use an Oracle stage lookup to generate the surrogate key.
The problem is that the same records are sometimes inserted into the target with different surrogate keys. The insert timestamp is identical for those records; the only difference is the surrogate key.
We are using link collector, inter-process, and hashed file stages in the job. This happens only once in a while. Can you please help me resolve this issue?
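One way this symptom can arise with inter-process stages and row-by-row commits is a race: two parallel streams each look up the same natural key against a stale copy of the lookup, both miss, and each mints its own surrogate key. This is a hypothetical Python sketch of that race, not the DataStage job itself; the names and data are made up for illustration.

```python
# Hypothetical sketch: two parallel streams each check a (stale) lookup
# snapshot for a natural key; if both miss before either commits, the
# same record is minted two different surrogate keys.
import itertools

surrogate_seq = itertools.count(1)     # stand-in for the surrogate-key source
target = {}                            # natural_key -> surrogate_key

def load_row(natural_key, snapshot):
    """Process one row against a possibly stale lookup snapshot."""
    if natural_key in snapshot:        # lookup hit: reuse the existing key
        return snapshot[natural_key]
    sk = next(surrogate_seq)           # lookup miss: mint a new key
    target[natural_key] = sk           # row-by-row commit
    return sk

# Both streams read the lookup *before* either commits, so the same
# natural key "CUST-42" receives two different surrogate keys.
snapshot_a = dict(target)
snapshot_b = dict(target)
sk1 = load_row("CUST-42", snapshot_a)
sk2 = load_row("CUST-42", snapshot_b)
```

The fix in this sketch would be to make lookup-and-assign atomic per natural key (or force all rows with the same key through one stream), which is why the partitioning across the link collector is worth checking.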
Duplicates while inserting into target
With Love,
«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
Re: Duplicates while inserting into target
One thing you could do is split the job into two: one job that writes the insert records to a dataset, and a second that loads that dataset into the Oracle target. Schedule the second job after the first; that will also reduce the database hits.
Secondly, make sure that your data types match and that there are no leading or trailing spaces in the columns used for the lookup. More importantly, find the reason for the lookup failure.
Again, these suggestions are valid only if my understanding is correct. Please describe the job flow with a simplified ASCII diagram so that everyone can follow it.
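To make the second point concrete, here is a minimal hypothetical sketch (the key names are invented) of how a single trailing space in the lookup column makes an existing record look new, which would then get a fresh surrogate key:

```python
# Hypothetical sketch: a trailing space in the source key makes the
# lookup miss even though the "same" record already exists in the target.
existing = {"CUST-42": 101}            # natural key -> surrogate key (target)

raw_key = "CUST-42 "                   # source value with a trailing space
missed = raw_key not in existing       # lookup fails: looks like a new record

clean_key = raw_key.strip()            # normalize the key before the lookup
hit_sk = existing[clean_key]           # now the lookup hits: no duplicate
```

Trimming (and case-normalizing, if applicable) the lookup columns on both sides before the join is a cheap way to rule this cause out.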
How can you update based on the surrogate key when you're generating those as you insert? You need to update based on the natural key. Also, if the surrogate key is the only key column defined, how are you getting duplicates? How many columns have unique indexes on them? I am confused.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Hi,
My source has a left outer join with the target to get the target surrogate key. If the CDC columns have changed for that natural key, I insert one new record with a new surrogate key and update the old record based on the SK column. As I mentioned earlier, it is a row-by-row commit in a grouped transaction.
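For anyone following along, the flow described above can be sketched as follows. This is a hypothetical Python illustration of the SCD Type 2 logic (left-outer lookup on the natural key, insert a new version with a new surrogate key, expire the old version by its SK); the column names and data are invented.

```python
# Hypothetical SCD2 sketch: left-outer lookup by natural key, then insert
# a new version (new SK) and close the old version by SK when the tracked
# CDC column ("city" here) has changed.
import itertools

next_sk = itertools.count(1)
target = {}                            # sk -> {"nk", "city", "current"}
by_nk = {}                             # natural key -> current sk (the lookup)

def apply_scd2(nk, city):
    old_sk = by_nk.get(nk)             # left outer join with the target
    if old_sk is not None and target[old_sk]["city"] == city:
        return old_sk                  # CDC columns unchanged: nothing to do
    if old_sk is not None:
        target[old_sk]["current"] = False   # update the old record by its SK
    new_sk = next(next_sk)             # insert a new record with a new SK
    target[new_sk] = {"nk": nk, "city": city, "current": True}
    by_nk[nk] = new_sk
    return new_sk

apply_scd2("CUST-42", "Chennai")       # first load: insert version 1
apply_scd2("CUST-42", "Chennai")       # unchanged: no new version
apply_scd2("CUST-42", "Bangalore")     # CDC change: expire old, insert new
```

Note that the whole correctness of this flow hinges on the natural-key lookup (`by_nk` here) seeing every prior commit; if it can be stale or bypassed, the "unchanged" branch never fires and duplicates appear.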
With Love,
«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
Check the logic for how you update the records. It might be the case that even updates go in as inserts with a new surrogate key.
Could you give us more specific cases of when the duplicates occur? Are all the records from the source being populated as duplicates?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'