Generating random numbers in a specified range


ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

kirankota79 wrote: It is working, but it is repeating some numbers. How can I avoid this?

You specified random. Random numbers can repeat.

You did not specify unique random numbers.

Guaranteeing uniqueness is not possible in the generator itself. You will need a Remove Duplicates or Sort stage (with "Unique" checked) downstream.
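
As a rough illustration of the "generate, then remove duplicates downstream" idea, here is a minimal Python sketch. It is not DataStage code; the function names `generate` and `remove_duplicates`, the range 1 to 100, and the seed are all assumptions made for the example.

```python
import random

def generate(low, high, count, seed=None):
    """Draw `count` random integers in [low, high]; repeats are possible."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(count)]

def remove_duplicates(values):
    """Keep the first occurrence of each value, preserving arrival order."""
    seen = set()
    unique = []
    for value in values:
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return unique

draws = generate(1, 100, 50, seed=42)   # hypothetical range and row count
unique = remove_duplicates(draws)
print(len(draws), "rows drawn,", len(unique), "rows kept")
```

Note that the de-duplicated output will usually contain fewer rows than were generated, so the upstream row count has to allow for the rows that get dropped.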
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kirankota79
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

No, it has millions of records.
kirankota79
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

I already specified at the start of the discussion that the random numbers should not be repeated.

Also, I can see only the first two lines of your reply; I do not have a membership that lets me see the rest. Please allow me to see it.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Let me just ask a philosophical question. If you have a range of numbers, say 1 to 100, and want to assign random numbers in that range and never repeat, how would that be any different from just assigning from 1 to 100?

It's conceivable that a random number generator could return random numbers in order from 1 to 100 anyway. By keeping track of random numbers between 1 and 100, and eliminating repeats, you eventually will assign values from 1 to 100.

Whatever you're doing makes ZERO logical sense. Mathematically, randomly assigning values between 1 and 100 without repeats works out the same as simply assigning 1 to 100. If you just want a "scrambled" row order when finished, then re-partition to "randomize" the row order.

You're making something harder than it should be.
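
To make the point concrete: unique random numbers that cover a whole range are just that range in a shuffled order. A minimal Python sketch, assuming a hypothetical helper `scrambled_range` and the 1 to 100 range from the example above (this is not DataStage code):

```python
import random

def scrambled_range(low, high, seed=None):
    """Return every integer in [low, high] exactly once, in random order."""
    values = list(range(low, high + 1))
    random.Random(seed).shuffle(values)  # in-place Fisher-Yates shuffle
    return values

scrambled = scrambled_range(1, 100, seed=7)
# Sorting the result gives back the plain sequence 1..100.
print(sorted(scrambled) == list(range(1, 101)))
```

Assigning the sequence and then randomizing the row order gives the same set of values with no need to track or reject repeats.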
Kenneth Bland
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Write a sequence of numbers as keys to a hashed file (UniVerse table) on the conductor node, then read it back. The hashing algorithm will give you some measure of scrambling, and your generated sequence will be unique.

You could use an External Target stage or a server job shared container containing a Hashed File stage to write to the hashed file, and an External Source stage or a server job shared container containing a Hashed File stage to read from the hashed file.
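
The scrambling effect of writing sequential keys to a hashed structure and reading them back can be sketched in Python as follows. This only imitates the idea: the MD5-based hash, the modulus of 7, and the function name `scramble_by_hash` are assumptions for illustration and are not the hashing algorithm that a UniVerse hashed file actually uses.

```python
import hashlib

def scramble_by_hash(keys, modulus=7):
    """Place each key in a group chosen by a hash, then read group by group."""
    groups = [[] for _ in range(modulus)]
    for key in keys:
        digest = hashlib.md5(str(key).encode("ascii")).digest()
        group = int.from_bytes(digest[:4], "big") % modulus
        groups[group].append(key)
    # Reading back visits the groups in order, not the keys in order.
    return [key for group in groups for key in group]

scrambled_keys = scramble_by_hash(range(1, 101))
print(scrambled_keys[:10])
```

Every key comes back exactly once, so uniqueness is preserved; the order is scrambled but deterministic, which matches the "some measure of scrambling" described above rather than true randomness.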