unique key calculation
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 42
- Joined: Thu Dec 11, 2008 11:07 am
unique key calculation
hi,
i have a parallel job, (@PARTITIONNUM+1): @ INROWNUM function is used to gererate a unique key for further processing (inserting in database).
the job is on 8.5, running on 8 node. records are failing while inserting into datase using (@PARTITIONNUM+1): @ INROWNUM as unique key.
volume is 5000, observation is record number 464 onwards on partition number 4 are duplicated.
well i tried to think of all the combinations but can't come up with duplicate using (@PARTITIONNUM+1): @ INROWNUM.
could you please help me.
i have a parallel job, (@PARTITIONNUM+1): @ INROWNUM function is used to gererate a unique key for further processing (inserting in database).
the job is on 8.5, running on 8 node. records are failing while inserting into datase using (@PARTITIONNUM+1): @ INROWNUM as unique key.
volume is 5000, observation is record number 464 onwards on partition number 4 are duplicated.
well i tried to think of all the combinations but can't come up with duplicate using (@PARTITIONNUM+1): @ INROWNUM.
could you please help me.
Thanks,
Aruna
Aruna
This sounds suspiciously like the same issue as stated in this thread. Perhaps you can answer the same questions which were asked of Chandra234 in that thread?
Or maybe this (the thought just occurred to me): Is this an interview question?
Regards,
Or maybe this (the thought just occurred to me): Is this an interview question?
Regards,
Last edited by jwiles on Wed Aug 01, 2012 10:30 pm, edited 1 time in total.
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.
If it's not the same then there is something spooky going on... two people with the exact same issue down to the row number... [cue]twilight zone[/cue]jwiles wrote:This sounds suspiciously like the same issue as stated in this thread. Perhaps you can answer the same questions which were asked of DSRajesh in that thread?
-
- Participant
- Posts: 42
- Joined: Thu Dec 11, 2008 11:07 am
thanks for sharing the link, yes guys it does looks spooky to me ...
well i would say a coincident, or IBM no more supporting 7.x from 2013 forcing everyone to migrate to 8.x enviornment replicating similar issues.
well you guys sure have good memory and imagination
jwiles , reply makes sense to me for more nodes, it can explain the duplicates.
Code:
Partition 0 Record 1464: 0+1 : 1464 = 1 : 1464 = 11464
Partition 10 Record 464: 10+1 : 464 = 11 : 464 = 11464
but in 8 node first digit always should be unique. doesn't explain the duplicates yet.
I will post if i find something logical.
Thanks guys for replying.
well i would say a coincident, or IBM no more supporting 7.x from 2013 forcing everyone to migrate to 8.x enviornment replicating similar issues.
well you guys sure have good memory and imagination
![Wink :wink:](./images/smilies/icon_wink.gif)
jwiles , reply makes sense to me for more nodes, it can explain the duplicates.
Code:
Partition 0 Record 1464: 0+1 : 1464 = 1 : 1464 = 11464
Partition 10 Record 464: 10+1 : 464 = 11 : 464 = 11464
but in 8 node first digit always should be unique. doesn't explain the duplicates yet.
I will post if i find something logical.
Thanks guys for replying.
Thanks,
Aruna
Aruna
It was suggested in the other thread to do the following:
- peek the partition number, row number and your "unique" key value for the duplicate rows
- When you have duplicates, what was the generated key and what do you believe it SHOULD have been?
- Did duplicates occur within the SAME partition (partition numbers are equal), different partitions or a combination of both?
We never received answers to those questions and therefore were unable to assist further in that thread.
Also, what is the data type of the generated key column (both stage variables and output link columns as appropriate)? PaulVL showed a good example of how a duplicate could occur when using a non-string key column (some sort of numeric data type).
Gathering this data helps in identifying a potential pattern or condition that leads to the duplicates.
Regards,
- peek the partition number, row number and your "unique" key value for the duplicate rows
- When you have duplicates, what was the generated key and what do you believe it SHOULD have been?
- Did duplicates occur within the SAME partition (partition numbers are equal), different partitions or a combination of both?
We never received answers to those questions and therefore were unable to assist further in that thread.
Also, what is the data type of the generated key column (both stage variables and output link columns as appropriate)? PaulVL showed a good example of how a duplicate could occur when using a non-string key column (some sort of numeric data type).
Gathering this data helps in identifying a potential pattern or condition that leads to the duplicates.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.
-
- Participant
- Posts: 42
- Joined: Thu Dec 11, 2008 11:07 am
please find below my response....
peek the partition number, row number and your "unique" key value for the duplicate rows
i tried to simulate in other test enviornment , but to be honest can't replicate the same. this is production job which is showing duplicates intermittengly.
yes, if i could get this data it would be helpful. i am trying to move a change in prod but not sure whether i will get the approvals. let you know.
- When you have duplicates, what was the generated key and what do you believe it SHOULD have been?
the pattern is with 464 row onwards duplicates are getting generated until 520 row. and the partition is 4 always.
so 5464, 5464,5465,5465,5466,5466......5520,5521 it should have been
5464,5465,5466.....
- Did duplicates occur within the SAME partition (partition numbers are equal), different partitions or a combination of both?
yes in same partition mostly 4 , combination of both. both row number and partition number is duplicated
Also, what is the data type of the generated key column (both stage variables and output link columns as appropriate)?
i think it might be culprit in this behaviour , the data type is smallint.
peek the partition number, row number and your "unique" key value for the duplicate rows
i tried to simulate in other test enviornment , but to be honest can't replicate the same. this is production job which is showing duplicates intermittengly.
yes, if i could get this data it would be helpful. i am trying to move a change in prod but not sure whether i will get the approvals. let you know.
- When you have duplicates, what was the generated key and what do you believe it SHOULD have been?
the pattern is with 464 row onwards duplicates are getting generated until 520 row. and the partition is 4 always.
so 5464, 5464,5465,5465,5466,5466......5520,5521 it should have been
5464,5465,5466.....
- Did duplicates occur within the SAME partition (partition numbers are equal), different partitions or a combination of both?
yes in same partition mostly 4 , combination of both. both row number and partition number is duplicated
Also, what is the data type of the generated key column (both stage variables and output link columns as appropriate)?
i think it might be culprit in this behaviour , the data type is smallint.
Thanks,
Aruna
Aruna
-
- Participant
- Posts: 42
- Joined: Thu Dec 11, 2008 11:07 am
Ah...smallint (unsigned), which is 2 bytes long, has a maximum value of 65535. This is possibly the primary cause of the duplicates, but I'm uncertain of how this would not have appeared in DS 7x unless the row quantities were not sufficient.
I would recommend either an unsigned Integer or unsigned BigInt for storing the generated key, regardless of the logic used to create it.
Regards,
I would recommend either an unsigned Integer or unsigned BigInt for storing the generated key, regardless of the logic used to create it.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.
-
- Participant
- Posts: 42
- Joined: Thu Dec 11, 2008 11:07 am