Random number generation
Hi,
How can I generate unique random numbers for each run of a sequence?
Thanks,
Phani
Pseudo-random numbers can be generated in both PX and Server versions. But as Craig has already pointed out, these are probably not the best solution to your problem, particularly as they can (and do) repeat values.
By their very nature, generated random numbers are not unique.
The approach differs depending on whether the numbers must be unique within a single run or across several runs.
One approach is to create a table with just a numeric key and fill it with 1,000,000 records. Then use the random number generator to deliver a number between 1 and 1,000,000 and read that record. If the read is successful, then you have a unique random number to use and you delete the record from the table. If the read is not successful, then you have to repeat the random number generation and read process until you have a successful read. Note that this gets less and less efficient as you remove records from the pool.
In Server jobs, where long strings are processed efficiently, you can build a long delimited list and simply remove an element each time it is used, which makes the process much quicker.
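A minimal sketch of the pool idea, written in Python rather than DataStage purely for illustration. It is a variation on the description above, not the same procedure: instead of retrying after a failed read, it picks a random position among the elements that remain, so every draw succeeds on the first attempt. The names `make_pool` and `draw_unique` are hypothetical, and the pool size and draw count are taken from the figures mentioned in this thread.

```python
import random


def make_pool(size):
    """Build the pool of candidate keys 1..size (the 'table' of numeric keys)."""
    return list(range(1, size + 1))


def draw_unique(pool, rng):
    """Remove and return one randomly chosen element from the pool.

    The chosen element is swapped with the last one and popped, so the
    removal is cheap and no retry loop is needed.
    """
    if not pool:
        raise ValueError("pool is exhausted")
    i = rng.randrange(len(pool))
    pool[i], pool[-1] = pool[-1], pool[i]
    return pool.pop()


rng = random.Random(1)            # any seed works; uniqueness does not depend on it
pool = make_pool(1_000_000)
abc_numbers = [draw_unique(pool, rng) for _ in range(1000)]
```

Because each value is removed once it has been handed out, the numbers are unique within the run, and they stay unique across runs only if the reduced pool is persisted between runs.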
Do you know how many elements you need?
Every run will generate unique random numbers.
For every run we are generating 1000 random numbers for ABC and 500 random numbers for XYZ.
Job design:
RowgenStage(ABC):
Fields:
Cust_Name(varchar)
RanNum(Integer) --> Type=random
Limit=1000000
Seedval=#seed_Val_ABC#
RowgenStage(XYZ):
Cust_Name(varchar)
RanNum(Integer) --> Type=random
Limit=300000
Seedval=#seed_Val_XYZ#
seed_Val_ABC and seed_Val_XYZ are parameters; I am assigning their values from the Sequence.
seed_Val_ABC-->KeyMgtGetNextValue(1)
seed_Val_XYZ-->KeyMgtGetNextValue(1)
Will the KeyMgtGetNextValue function work for this scenario or not?
Your randomly generated numbers in the Row Generator stage will not be unique and the KeyMgtGetNextValue won't work either.
The Row Generator stage is producing unique random numbers based on the random algorithm (Type=random).
We are passing the Limit and Seed values as parameters.
If I provide a unique seed value for every run, I get unique random numbers.
seed_Val_ABC-->KeyMgtGetNextValue(1)
seed_Val_XYZ-->KeyMgtGetNextValue(1)
Here we want to constrain the seed value to <9999.
How can I achieve this using the KeyMgtGetNextValue function?
No, your generated numbers are not going to be unique. The "seed" just gives a starting point, and the pseudo-random series will always be the same when using the same seed.
I just did a quick test: generating 1,000,000 rows of random integers produced only 999,759 unique IDs (I used a seed of "1", so you can reproduce this as well).
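The same kind of check can be sketched outside DataStage. The Python below is only an illustration: its generator is not the one used by the Row Generator stage, so the counts will not match the 999,759 figure, and the `generate` helper, row counts and limit are assumptions chosen for the example. It shows the two points made above: the same seed reproduces the same series, and a series of random values contains repeats.

```python
import random


def generate(seed, rows, limit):
    """Return `rows` pseudo-random integers in 0..limit-1 for the given seed."""
    rng = random.Random(seed)
    return [rng.randrange(limit) for _ in range(rows)]


# Same seed, same series: the seed only fixes the starting point.
first = generate(1, 1000, 1_000_000)
second = generate(1, 1000, 1_000_000)
print(first == second)  # True

# Count how many of the generated values are repeats of an earlier one.
values = generate(1, 100_000, 1_000_000)
duplicates = len(values) - len(set(values))
print(duplicates, "values were repeats")
```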
Yes, I agree with your answer.
If the "seed" value is the same for multiple runs, it will produce the same random numbers.
That's why in our job we are passing the "seed" from the job Sequence using the KeyMgtGetNextValue() function.
For every run it will generate a new sequence number.
seed_Val_ABC-->KeyMgtGetNextValue(1)
seed_Val_XYZ-->KeyMgtGetNextValue(1)
For example:
1st run:
seed_Val_ABC-->KeyMgtGetNextValue(1)----OutputVal '1'
seed_Val_XYZ-->KeyMgtGetNextValue(1)----OutputVal '2'
2nd run:
seed_Val_ABC-->KeyMgtGetNextValue(1)----OutputVal '3'
seed_Val_XYZ-->KeyMgtGetNextValue(1)----OutputVal '4'
3rd run:
seed_Val_ABC-->KeyMgtGetNextValue(1)----OutputVal '5'
seed_Val_XYZ-->KeyMgtGetNextValue(1)----OutputVal '6'
Here we want to constrain the seed value (i.e., the result of KeyMgtGetNextValue(1)) to <9999.
How can I achieve this using the KeyMgtGetNextValue function in the job Sequence?
The total number of random numbers for ABC is 1000.
The total number of random numbers for XYZ is 500.
We are not producing 10 lakh (1,000,000) or 3 lakh (300,000) unique random numbers in each run.
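One possible way to keep an ever-increasing sequence value below 9999 is to wrap it with a modulo. The sketch below shows the arithmetic in Python only; `bounded_seed` is a hypothetical helper, not a DataStage function, and how the expression would be written in the job Sequence is left open here.

```python
def bounded_seed(counter, upper=9999):
    """Map a sequence value 1, 2, 3, ... onto 1..upper-1, wrapping around."""
    if counter < 1:
        raise ValueError("counter must start at 1")
    return (counter - 1) % (upper - 1) + 1


print(bounded_seed(1))      # 1
print(bounded_seed(9998))   # 9998
print(bounded_seed(9999))   # 1 (wrapped around)
```

The wrap-around means the seed repeats after 9998 values, and a repeated seed reproduces the same pseudo-random series.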
You are missing the fundamental issue: random does not equal unique.
Yes, using a unique seed each run will ensure that the sequence of random numbers generated is always different but there's no guarantee that they will be unique within a run and certainly not across runs. It seems that the latter is not an issue for you but the former will be... it may look like it is working but in reality it won't be. If that is truly important / critical to your processing, you'll need to find another mechanism.
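To put a rough number on "no guarantee", the chance of at least one repeat within a run can be estimated with the standard birthday-problem calculation. The sketch below assumes the generator draws uniformly and independently from the Limit range, which is an idealisation and may not describe the Row Generator stage exactly; `collision_probability` is a hypothetical helper name.

```python
def collision_probability(draws, limit):
    """Probability that at least two of `draws` uniform picks from `limit` values coincide."""
    if draws > limit:
        return 1.0  # pigeonhole: more draws than distinct values
    p_all_distinct = 1.0
    for k in range(draws):
        p_all_distinct *= (limit - k) / limit
    return 1.0 - p_all_distinct


p_abc = collision_probability(1000, 1_000_000)   # ABC: 1000 numbers, Limit=1000000
p_xyz = collision_probability(500, 300_000)      # XYZ: 500 numbers, Limit=300000
print(f"ABC: {p_abc:.2f}  XYZ: {p_xyz:.2f}")     # roughly 0.39 and 0.34
```

Under that assumption, a duplicate within a single run is far from rare: it would show up in roughly one run in three for either stage.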
-craig
"You can never have too many knives" -- Logan Nine Fingers