Unique random number generation
Dear all,
I would like to ask for suggestions on generating a "unique" random number. By their nature, "rand()" and "random()" are not designed to produce unique random numbers.
As the customer list is on the scale of millions, it is not ideal to use the row number (@OUTROWNUM) as the unique factor. Also, the most important problem is the same results for same customers by random function.
Does this need to be done in a routine? I am also thinking about supplying my own seed...
Any suggestions?
Thanks.
Regards,
Sam
[Note: Changed topic title from Lucky Draw to be more specific - helps with searches later - Andy]
Learning is a daily assignment.
Re: Lucky Draw
aspiresam wrote: Also, the most important problem is the same results for same customers by random function.
OK. Instead of starting with your 'random' discussion, why don't we open the bidding with what exactly this means and what exactly it is you are trying to accomplish, the more details the better. I suspect the word 'random' will not come into play in the solution.
-craig
"You can never have too many knives" -- Logan Nine Fingers
I don't know about anyone else but that tells me absolutely nothing and I prefer not to guess. Sounds like some kind of contest and I have no idea why an ETL tool would be involved in something like that.
Still looking for some of those pesky 'details'...
Last edited by chulett on Wed Aug 13, 2014 8:11 am, edited 1 time in total.
-craig
"You can never have too many knives" -- Logan Nine Fingers
To generate truly random numbers, especially from one job run to the next, you can combine the random() function result with an analog value such as the 6-digit microseconds from the CurrentTimestampMS() function. Microseconds are always changing.
To ensure that the randomly generated number is unique, which was your question, then you have to keep track of the previously generated numbers and compare the current value against the list. If already used then try, try again.
For your "most important problem of same results for same customers," like Craig suggested, the more details you provide, the better answers you will get. As of yet, the meaning is left up to imagination.
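The "keep track and try again" idea above can be sketched in C roughly as follows. This is only an illustration, not DataStage code: the function name draw_unique, the `used` flag array and the `remaining` counter are all hypothetical, and it assumes the range of values is small enough to hold one flag per value in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: draw a value in [0, range) that has not been drawn before.
 * used[] has `range` entries and records the previously generated values.
 * *remaining counts how many values are still free.
 * Returns -1 once every value has been handed out. */
long draw_unique(bool *used, long range, long *remaining)
{
    assert(used != NULL && remaining != NULL);
    assert(range > 0 && range <= RAND_MAX);

    if (*remaining <= 0)
        return -1;

    for (;;) {
        /* rand() % range has a slight modulo bias; acceptable for a sketch. */
        long candidate = rand() % range;
        if (!used[candidate]) {
            used[candidate] = true;   /* remember it so it is never returned again */
            (*remaining)--;
            return candidate;
        }
        /* Already used: try, try again. */
    }
}
```

The retry loop gets slower as the list fills up, so this approach suits cases where only a small fraction of the range is ever drawn.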
Choose a job you love, and you will never have to work a day in your life. - Confucius
Thanks all.
Actually, I have around 300,000 customers. If I just use the random function and customer A (say) wins the prize, then it could be A again in the next draw. We hold a monthly draw for VIPs (frequent buyers).
So that is why I am trying to get a unique random number. Sorry that my earlier post was not clear.
I would like to put my own seed, such as the timestamp in microseconds, into the random function. However, I am not sure whether I can do something like:
random(<decimal of timestamp>)
Thanks in advance again.
Learning is a daily assignment.
Create a key-only table called WINNERS and copy keys into there when they win. Apply your random selection to a DIFFERENCE set of the two tables (those in CUSTOMER but not in WINNERS).
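Outside the database, the same idea (pick only from those in CUSTOMER but not in WINNERS) might look something like the C sketch below. The function names and the in-memory arrays are hypothetical stand-ins for the two tables, keys are assumed to be non-negative, and the single-pass selection is reservoir sampling, which is my own choice here rather than anything suggested in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Linear search is enough for a sketch; a real WINNERS lookup would be indexed. */
static bool is_winner(long key, const long *winners, size_t n_winners)
{
    for (size_t i = 0; i < n_winners; i++)
        if (winners[i] == key)
            return true;
    return false;
}

/* Pick one key that is in customers[] but not in winners[].
 * Returns -1 if no eligible customer is left. */
long pick_from_difference(const long *customers, size_t n_customers,
                          const long *winners, size_t n_winners)
{
    assert(customers != NULL || n_customers == 0);

    size_t eligible = 0;
    long chosen = -1;
    for (size_t i = 0; i < n_customers; i++) {
        if (is_winner(customers[i], winners, n_winners))
            continue;                      /* already won, skip */
        eligible++;
        /* Reservoir sampling: keep this key with probability 1/eligible
         * (ignoring the small modulo bias of rand()). */
        if ((size_t)rand() % eligible == 0)
            chosen = customers[i];
    }
    return chosen;
}
```

After each draw the chosen key would be appended to the winners list, which is the extra bookkeeping this approach requires.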
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks, Ray.
That is one possible workaround. However, it is not ideal, because keeping a list of winners means extra maintenance of a table / file.
At the moment I am thinking about doing it with a routine, but it still returns duplicate results even when seeded by time.
I have written a stored procedure in DB2 to do something similar. However, I would like to solve it in DataStage.
-- My working routine function - still in trial & error stage --
Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>

int main(void)
{
    struct timeval start, end;
    long mtime, seconds, useconds;

    // time a short sleep; the elapsed microseconds vary a little from run to run
    gettimeofday(&start, NULL);
    usleep(12000);
    gettimeofday(&end, NULL);

    seconds = end.tv_sec - start.tv_sec;
    useconds = end.tv_usec - start.tv_usec;
    mtime = (1000000 * seconds) + useconds;
    if (mtime <= 0)
        mtime = 1;   // guard the modulo below in case the clock was adjusted

    // seed by microseconds
    srand((unsigned int)mtime);

    // keep the result as an integer; printing a long double with %g is undefined
    long Ans = rand() % mtime;

    printf("Random by Time: %ld microseconds\n", mtime);
    printf("%ld.\n", Ans);
    return 0;
}
Learning is a daily assignment.
After a number of tests, here is the final version I would like to share, which gives a reasonably fair random number...
Previously, I was using a print function in C++ to observe the result
(NOTE: that will not work with DataStage because it calls the object / library only)
With this, I don't need to maintain another customer list.
Code:
#include <stdlib.h>
#include <sys/time.h>

double myCustomRandom()
{
    struct timeval now;
    long mtime;

    gettimeofday(&now, NULL);
    mtime = now.tv_usec;   // microsecond part of the current time, 0..999999
    if (mtime == 0)
        mtime = 1;         // avoid modulo and division by zero

    // seed by microseconds
    srand((unsigned int)mtime);
    int TmpAns = rand() % mtime;

    // divide as double: integer division would always give 0 here
    double Ans = (double)TmpAns / (double)mtime;
    return Ans;   // value in the range [0, 1)
}
Learning is a daily assignment.
Thanks for sharing. If you run the routine at exactly the same time each day then it should produce the same result each time.
If it also takes the date as a number into the seed, then the result should vary each day, which I am guessing is what you really want.
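Folding the date into the seed, as suggested above, might look something like this in C. The helper names make_seed and seed_from_now, the yyyymmdd date format and the multiplier constant are all my own assumptions for illustration; any mixing that lets both the date and the microseconds affect the seed would do.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: mix a date number (e.g. 20140813) with the
 * microsecond part of a timestamp into one seed value. */
unsigned int make_seed(long yyyymmdd, long usec)
{
    assert(usec >= 0 && usec < 1000000);
    /* Multiply by an odd constant so neighbouring dates land far apart,
     * then XOR in the microseconds. */
    return (unsigned int)((unsigned int)yyyymmdd * 2654435761u) ^ (unsigned int)usec;
}

/* Build the seed from the current local date and time of day. */
unsigned int seed_from_now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);

    time_t secs = tv.tv_sec;
    struct tm *lt = localtime(&secs);
    assert(lt != NULL);

    long yyyymmdd = (lt->tm_year + 1900) * 10000L
                  + (lt->tm_mon + 1) * 100L
                  + lt->tm_mday;
    return make_seed(yyyymmdd, (long)tv.tv_usec);
}
```

A routine would then call srand(seed_from_now()) once before drawing, so that running at the same time of day on different dates no longer gives the same seed.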
Choose a job you love, and you will never have to work a day in your life. - Confucius