Issue with Random() function

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Issue with Random() function

Post by mouthou »

Hi All,

I recently used the Random() function in a Transformer to generate random numbers, but it seems to be producing duplicate values, even though I set the Transformer to run in sequential mode. This is quite strange. Please let me know if anyone has faced this or knows of a workaround.

The problem with the duplicate values is that they are loaded as the primary key into the DB, and the job fails with a unique constraint violation.

Thanks
v2kmadhav
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

If I were you, for anything that contributes to uniqueness I would rather use something like a Surrogate Key Generator or equivalent logic instead.
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Post by mouthou »

I am aware of that too. But the customer has a strange requirement for unpredictable, dynamic values in that field, so random was the perfect pick for them!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There's no reason a random number would not generate duplicates. They're random, after all. You can even work out the probability of it occurring, if there is a finite domain of values from which the random number can be selected.
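
To put a number on that probability, here is a minimal Python sketch of the standard "birthday problem" calculation. It assumes the values are drawn uniformly and independently from a finite domain; the function name and the domain sizes used with it are illustrative only, and nothing here claims to describe the actual range or distribution of DataStage's Random().

```python
import math


def duplicate_probability(n_rows: int, domain_size: int) -> float:
    """Chance that at least two of n_rows uniform draws from
    domain_size possible values are equal (birthday problem)."""
    if n_rows > domain_size:
        return 1.0  # pigeonhole: more rows than distinct values
    # P(all unique) = product over i of (1 - i / domain_size).
    # Summing logarithms avoids underflow for large row counts.
    log_p_unique = sum(math.log1p(-i / domain_size) for i in range(n_rows))
    return 1.0 - math.exp(log_p_unique)


if __name__ == "__main__":
    # Hypothetical example: 100,000 rows drawn from 2**31 possible values.
    print(duplicate_probability(100_000, 2**31))
```

The point for a primary key column is that the collision probability grows roughly with the square of the row count, so duplicates show up far sooner than the size of the domain might suggest.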
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
v2kmadhav
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

If you can be flexible with the length/solution and it's all about generating a random number, try generating a unique value from a combination such as inrownum||systime||random(), or something else that makes it unique.
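
The idea can be sketched in Python rather than in Transformer syntax. Everything named here (`make_key`, `run_stamp`, the field widths) is hypothetical. The sketch assumes the row number is unique within a run and the run timestamp differs between runs; the random suffix then only adds unpredictability and is not what guarantees uniqueness.

```python
import random
import time


def make_key(row_num: int, run_stamp: int, rng: random.Random) -> str:
    """Build a key as run_stamp || row_num || random suffix.

    Fixed-width, zero-padded fields keep the concatenation unambiguous
    (otherwise 1 followed by 23 and 12 followed by 3 would both give "123").
    """
    suffix = rng.randrange(10**6)  # unpredictable part, may repeat
    return f"{run_stamp:010d}{row_num:010d}{suffix:06d}"


if __name__ == "__main__":
    rng = random.Random()          # seeded from the OS by default
    run_stamp = int(time.time())   # one timestamp per run (assumption)
    for row_num in range(1, 4):
        print(make_key(row_num, run_stamp, rng))
```

If the Transformer ran on more than one partition, the row number alone would likely not be unique across the job, so a partition number would probably need to be part of the key as well.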
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted, random <> unique.
-craig

"You can never have too many knives" -- Logan Nine Fingers
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Post by mouthou »

Thanks Craig/Ray/v2kmadhav for the inputs. Working out the probability of repeated numbers as Ray suggests would take time. Since this random function in DataStage was making my head spin, I had to go with an alternative implementation involving the database. I will close this topic.

In general terms, random<>unique. Nevertheless, in my case, random is being exploited for uniqueness of the values.

Surprisingly, it also generates the same duplicate random numbers on consecutive runs. I am not sure why, and it is strange. There should be some place in DataStage where it needs to be reset so that it doesn't come up with the same numbers. Can anyone try this with a larger number of rows (where the probability of duplicate numbers is higher)?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You can seed the random number generator in DataStage. I'd recommend something based on time (seconds since midnight, for example), possibly multiplied by numeric date components.
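
As a rough illustration of why the seed matters, the following Python sketch uses Python's own `random` module, not DataStage; how the seed is actually set in a DataStage job is not shown in this thread. The helper names are made up for the example. A fixed seed reproduces the same sequence on every run, while a seed derived from the time of day and date components changes from run to run.

```python
import random
from datetime import datetime


def time_based_seed(now: datetime) -> int:
    """Seconds since midnight multiplied by numeric date components."""
    seconds_since_midnight = now.hour * 3600 + now.minute * 60 + now.second
    # +1 so that a run at exactly midnight does not collapse to seed 0
    return (seconds_since_midnight + 1) * now.year * now.month * now.day


def first_values(seed: int, count: int = 5) -> list:
    """First few values from a generator started with the given seed."""
    rng = random.Random(seed)
    return [rng.randrange(1_000_000) for _ in range(count)]


if __name__ == "__main__":
    # Same seed, same sequence: this is what "consecutive runs" look like.
    print(first_values(42) == first_values(42))
    # A time-based seed gives a different starting point on each run.
    print(first_values(time_based_seed(datetime.now())))
```

A different seed changes which numbers come out, but it does not make them unique within a run; duplicates remain possible either way.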
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

mouthou wrote: And also surprisingly it generates the same duplicate random numbers for the consecutive runs.
Not sure why that's surprising; that's the way all "random" number generators work given the same seed value.
-craig

"You can never have too many knives" -- Logan Nine Fingers