Duplicate Surrogate Keys

Posted: Mon Jan 05, 2009 2:33 pm
by Raftsman
While running a job on 2 nodes, I encountered an issue with duplicate surrogate keys. A previous colleague created all job stages using sequential processing. In order to take advantage of parallel processing, I switched all stages back to Default (parallel). I used the Transformer stage (NextSurrogateKey()) to create the keys. For some reason, a record on each node was assigned the same surrogate key. Is this a known bug, or do I need to structure the function differently?

I was thinking about throwing out the Transformer stage and replacing it with the Surrogate Key Generator stage. Would this eliminate the problem?

Thanks in advance

Posted: Mon Jan 05, 2009 4:34 pm
by ray.wurlod
The Surrogate Key Generator stage will, like the Transformer stage, do exactly what you tell it to do, though its defaults are more likely to be well behaved.

In a Transformer stage construct your expression using @PARTITIONNUM (plus any initial constant) as the initial value, and increment by @NUMPARTITIONS. This will necessarily yield a unique sequence of numbers.
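To illustrate why this scheme cannot collide, here is a rough sketch in Python rather than Transformer syntax. It only simulates the arithmetic: each partition starts at the initial constant plus its zero-based partition number and steps by the partition count. The function names and the `initial` parameter are made up for this example and are not part of DataStage.

```python
from itertools import count, islice


def partition_keys(partition_num, num_partitions, initial=1):
    """Keys for one partition (hypothetical helper, not a DataStage API).

    Starts at initial + partition_num and increments by num_partitions,
    mirroring the @PARTITIONNUM / @NUMPARTITIONS idea.
    """
    return count(initial + partition_num, num_partitions)


def simulate(num_partitions, rows_per_partition, initial=1):
    """Collect the keys every partition would hand out for its rows."""
    keys = []
    for p in range(num_partitions):
        keys.extend(
            islice(partition_keys(p, num_partitions, initial), rows_per_partition)
        )
    return keys


# Two nodes, three rows each: partition 0 gets 1, 3, 5; partition 1 gets 2, 4, 6.
keys = simulate(num_partitions=2, rows_per_partition=3)
print(sorted(keys))  # [1, 2, 3, 4, 5, 6]
```

If the partitions receive different row counts, the sequence will have gaps, but the keys stay unique because each partition only ever produces values in its own residue class modulo the number of partitions.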

Posted: Mon Jan 05, 2009 4:51 pm
by shankar_iyer
I am not able to see Ray's full reply because of "premium content". However, this can be solved by using @PARTITIONNUM and @NUMPARTITIONS.

Posted: Mon Jan 05, 2009 5:47 pm
by Raftsman
What is the purpose of the NextSurrogateKey function if it can't handle multiple nodes? Since this is a version 8.0 function, I would have assumed it works correctly. If I understand the internal mechanism, should it not assign unique numbers even when multiple nodes are being used? I know it works correctly if I use one node or sequential processing. Can anyone elaborate on why this function doesn't work correctly? Is there a patch?

Thanks

Posted: Mon Jan 05, 2009 8:37 pm
by ray.wurlod
Read again that the job was first run on a single node. I'm guessing, therefore, that something "sequential" has happened, maybe in the mechanism that initializes the state file or something within the function itself.

Posted: Tue Jan 06, 2009 5:52 am
by Mike
It works fine on multiple nodes if the state file is good. In my initial testing, I found that the mechanism to update a state file didn't seem to work. I didn't pursue what looked to me to be a bug because it was just as easy to delete and recreate the state file.

Mike

Posted: Tue Jan 06, 2009 7:43 am
by verify
I searched for the NextSurrogateKey() function in "datastage help" and "datastage manuals", but I didn't find any information about it.
Can anyone please tell me the syntax, or where I can find this function?

Any help will be appreciated..

Posted: Tue Jan 06, 2009 7:48 am
by Mike
Parallel Job Developer Guide, Appendix B under Utility functions.

Mike

Posted: Tue Jan 06, 2009 8:20 am
by verify
I am using DataStage 7.5 parallel edition.
Under Parallel Job Developer Guide --> Appendix B --> Utility functions, only one function is present, and that's "GetEnvironment()".

Is it present in 8.0 edition?

Please help me out..

Posted: Tue Jan 06, 2009 8:38 am
by Mike
Yes, we're talking about the 8.x release here.

Mike