Help regarding RANDOM data selecting


ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Use the Mod() function with the Random() function. For example, Mod(Random(), 100) returns an integer between 0 and 99. For a 10% random sample, use Mod(Random(), 100) < 10 in a constraint expression. The constraint expression can also include a restriction on @OUTROWNUM to cap the number of rows output.
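As a rough illustration outside DataStage, the same modulo test can be sketched in Python. This is only an analogy: `keep_row`, the fixed seed, and the row list are made up for the example, and Python's generator stands in for Random().

```python
import random

def keep_row(rng: random.Random, percent: int = 10) -> bool:
    # Mirrors Mod(Random(), 100) < percent: draw a non-negative integer,
    # reduce it to the range 0..99, and keep the row when it is below the threshold.
    return rng.randrange(2**31) % 100 < percent

rng = random.Random(42)          # fixed seed only so the sketch is repeatable
rows = list(range(10_000))       # hypothetical input rows
sample = [r for r in rows if keep_row(rng)]
print(len(sample))               # roughly 10% of the input, not exactly 1,000
```

Note that the test gives an approximate sample size: each row is kept independently, so the count varies around 10% rather than hitting it exactly.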
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Krishna, since you require an equal number of records per key value to be selected, how is that number determined? Is it arbitrarily chosen at runtime or does it need to be calculated somehow? If calculated, what is the calculation?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
krishna14
Participant
Posts: 24
Joined: Mon Jan 31, 2011 6:43 pm

Post by krishna14 »

It has to be chosen arbitrarily at runtime.
krishna
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

A job parameter seems to be indicated, then. For example:


Mod(Random(),100) < 10 And @OUTROWNUM <= #jpMaxRowsPerNode#
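To see how the two conditions interact, here is a small Python sketch. It assumes @OUTROWNUM behaves as a per-partition counter of rows written to the output link, which is what the parameter name suggests; `sample_partition`, the partition contents, and the cap of 5 are hypothetical.

```python
import random

def sample_partition(rows, rng, percent=10, max_rows=5):
    """Keep about `percent`% of rows, but never more than `max_rows`."""
    out = []
    for row in rows:
        out_row_num = len(out) + 1  # number this row would get if written
        # Both halves of the constraint must hold for the row to pass.
        if rng.randrange(2**31) % 100 < percent and out_row_num <= max_rows:
            out.append(row)
    return out

# Two hypothetical partitions ("nodes"), each with its own row counter.
partitions = [list(range(0, 1000)), list(range(1000, 2000))]
rng = random.Random(7)
result = [sample_partition(p, rng, max_rows=5) for p in partitions]
print([len(r) for r in result])
```

One consequence worth knowing: once a partition reaches the cap, no later row in it can be selected, so the sample leans toward rows that arrive early.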
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

I think Ray and I need "Dueling Banjos" playing in the background :wink:
- james wiles


krishna14
Participant
Posts: 24
Joined: Mon Jan 31, 2011 6:43 pm

Post by krishna14 »

Thanks, Ray and jwiles.

The job parameter approach given by Ray is working fine for me. Thanks for your help.
krishna
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Good! I was about to ask which of us was solving the right problem... :)

Regards,
- james wiles


krishna14
Participant
Posts: 24
Joined: Mon Jan 31, 2011 6:43 pm

Post by krishna14 »

Thanks for your support, jwiles and Ray.

I really appreciate your help.
krishna
krishna14
Participant
Posts: 24
Joined: Mon Jan 31, 2011 6:43 pm

Post by krishna14 »

I have a question: does RANDOM mean that every time I run the job against the same input file, I should get different random records? Please clarify.
krishna
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Since computers can only ever do pseudo-random, the answer depends totally on whether or not you use the same seed for the random number generator each time, or whether you use a different seed. A seed based on current date and time is fairly "random".
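The effect of the seed is easy to demonstrate outside DataStage. The Python sketch below is only an analogy; `sample_ids` and the seed values are invented for the example, and Python's generator is not the one DataStage uses.

```python
import random
import time

def sample_ids(ids, seed, percent=10):
    # A generator created with a given seed always yields the same sequence,
    # so the same rows pass the modulo test on every run.
    rng = random.Random(seed)
    return [i for i in ids if rng.randrange(2**31) % 100 < percent]

ids = list(range(1000))

# Same seed on both "runs": identical selection.
fixed_a = sample_ids(ids, seed=12345)
fixed_b = sample_ids(ids, seed=12345)
print(fixed_a == fixed_b)  # True

# Seed taken from the clock: the selection changes from run to run.
time_seed = time.time_ns()
varied = sample_ids(ids, seed=time_seed)
print(time_seed, len(varied))
```

Logging the seed that was used, as the last line does, makes it possible to reproduce a particular run later if needed.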
krishna14
Participant
Posts: 24
Joined: Mon Jan 31, 2011 6:43 pm

Post by krishna14 »

According to my requirement, I have to select different random records on every run using the same input DB2 file.

How can I do that? Please help me out.
krishna
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Have you bothered to read the Parallel Job Developer's Guide section on the Random() function yet?
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Krishna,

Since you've marked this as resolved, would you share what your final solution was?

I was going to suggest using a Column Generator stage to gain access to the seed option of its random generation. That can be parameterized as well. I think it's also possible within DB2.

Regards,
- james wiles


krishna14
Participant
Posts: 24
Joined: Mon Jan 31, 2011 6:43 pm

Post by krishna14 »

jwiles,

I apologize for the delayed reply. I used the column generator with the seed property as suggested, and I'm able to get the desired output.

I have a question: if I use the RND() or RANDOM() function, how can I seed it with the system time so that I get different random numbers on every run?
krishna
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

There's no way to do so within the Transformer stage interface in Designer. You would need to write the logic in a transform outside of the Designer GUI (in Unix), using the native Transform operator language and functions, which I believe include seed functions for rand() and random().

I think the way you've done it is sufficient to meet your needs and is easily supportable by others in the future.

Regards,
- james wiles

