
Why Rnd() function generates same number twice?

Posted: Wed Jun 22, 2005 12:55 pm
by saadmirza
Hi,
Here is my code for generating a unique random number, written on the assumption that the Rnd() function will not generate the same number twice. I have a DistinctIds file which contains the numbers already used; if a number is present in the file, this routine doesn't generate that number again.
But if the DistinctIds file is empty, the routine generates random numbers that are not unique. Please help me if some modification is required in the code below.



$IFNDEF JOBCONTROL.H
$INCLUDE DSINCLUDE JOBCONTROL.H
$ENDIF
$DEFINE TESTING
* Declare "bitmap" strings to be in named COMMON area of memory, so that their
* values persist across multiple rows processed.
COMMON /FP1/UniqueIndex
If Len(UniqueIndex) = 1
Then
   GoSub BuildBitmaps
End
GoSub GenerateRandom
GoSub CheckExistingRndNo

* Initialize return variable.
Ans = 0
* Generate Unique Number between 1 and 9
**********************************************************************************************
GenerateRandom:
**********************************************************************************************
   BaseNumber = 1
   UniqueIndexTmp = Rnd(4)
   UniqueIndexNo=BaseNumber +UniqueIndexTmp
   Ans = UniqueIndexNo
   GoSub CheckExistingRndNo
Return(Ans)

**********************************************************************************************
CheckExistingRndNo:
**********************************************************************************************
   UniqueIndexFound = UniqueIndex[UniqueIndexNo,1]
   If UniqueIndexFound
   Then
      Message = "Results:"
      Message<-1> = "UniqueIndexFound = " :UniqueIndexNo
      Message<-1> = "Number Exists Generate New One"
      Call DSLogInfo(Message, " ** TESTING ** ")
      GoSub GenerateRandom
   End
   Else
      Message = "Results:"
      Message<-1> = "UniqueIndexFound = " :UniqueIndexNo
      Message<-1> = "Number Does not Exists Pass the number to the Job"
   End
   Ans= UniqueIndexNo
Return(Ans)



**********************************************************************************************
BuildBitmaps:
**********************************************************************************************
*
* Here we construct the "bitmaps" in COMMON variables. This step is executed only if required
* (that is, usually, only on the first row processed).

UniqueIndex= Str("0", 8)
FileError = 1
OpenSeq "D:\files\DistinctIds.txt" To InputFvar
On Error
   Message = 'Error (code ' : Status() : ') opening "DistinctIds.txt" file.'
End
Locked
   Message = 'File "DistinctIds.txt" locked by another process.'
End
Then
   FileError = 0
   Message = 'File "DistinctIds.txt" opened successfully.'
   Loop
   While ReadSeq Line From InputFvar
      UniqueIndex[Line,1] = "1"
   Repeat
   CloseSeq InputFvar
End
Else
   Message = 'Unable to open "DistinctIds.txt" file for reading.'
End ; * end of OpenSeq statement

Re: Why Rnd() function generates same number twice?

Posted: Wed Jun 22, 2005 1:49 pm
by chulett
saadmirza wrote: keeping in mind that Rnd() function will not generate random number twice...
What makes you think that statement is true? :? Kind of flies in the face of what the term 'random' means.

Posted: Wed Jun 22, 2005 4:29 pm
by PilotBaha
For that please use LightngTwce() function :) Sorry, couldn't help it :)

Posted: Wed Jun 22, 2005 6:11 pm
by ray.wurlod
The base assumption - that Rnd() cannot generate the same number twice - is incorrect. That's like saying that tossing a coin can never generate two heads or two tails - what happens on the third toss? Edge? How about the fourth toss?

Your logic (somewhat familiar - are you at Reliance?) suggests that you don't really need Rnd() at all. You could use LOCATE to determine an unused place in the array. Can you document (in English, not in code) what you are trying to achieve?

Posted: Thu Jun 23, 2005 1:09 am
by saadmirza
Hi,
OK. Can anyone guide me on how to generate random, non-duplicate numbers using the Rnd() or Randomize() functions available in DS?

Regards,
Saad

Posted: Thu Jun 23, 2005 1:19 am
by ArndW
The only way to guarantee that you don't repeat a number when you use a RND function is to keep track of the numbers already used. If you seed your pseudo-random number generator at the same point you will get a reproducible series, but it will, by definition, have duplicates.
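
To illustrate the "keep track of the numbers already used" idea, here is a minimal sketch in Python rather than DataStage BASIC. The function name `draw_unique`, the range 1..9 and the starting set of used values are all assumptions made for the example; in a job, the used values would come from something like the DistinctIds file.

```python
import random


def draw_unique(upper, used, rng=random):
    """Return a value in 1..upper that is not in `used`, and record it."""
    if len(used) >= upper:
        # Without this guard the loop below would never terminate.
        raise ValueError("every value in the range has already been used")
    while True:
        candidate = rng.randint(1, upper)  # inclusive on both ends
        if candidate not in used:
            used.add(candidate)
            return candidate


# Hypothetical starting state: 2 and 5 were handed out earlier.
used = {2, 5}
picked = [draw_unique(9, used) for _ in range(7)]
```

The generator is still free to repeat itself; it is the membership check against `used` that rejects the repeats, and the guard that stops the loop once the range is exhausted.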

Another approach is to create a table with keys from 1 to {YourMax}. Then use the RND function to shuffle those values around the array/table. If you then read these sequentially you will get pseudo-random amounts and no duplicates.
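
The shuffle approach could look roughly like the following Python sketch (not DataStage BASIC). It uses a Fisher-Yates shuffle, which is one common way to do the shuffling and is my choice here, not something the approach requires; `shuffled_ids` and the maximum of 9 are hypothetical.

```python
import random


def shuffled_ids(max_id, seed=None):
    """Return the keys 1..max_id in pseudo-random order, each exactly once."""
    rng = random.Random(seed)
    ids = list(range(1, max_id + 1))
    # Fisher-Yates: walk from the end, swapping each slot with a
    # randomly chosen slot at or before it.
    for i in range(len(ids) - 1, 0, -1):
        j = rng.randint(0, i)
        ids[i], ids[j] = ids[j], ids[i]
    return ids


ids = shuffled_ids(9)
```

Reading `ids` from front to back then yields values in pseudo-random order with no duplicates, because the shuffle only rearranges the keys and never draws a new one.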

Seriously, the concept of random numbers is one that doesn't belong in the business end (meaning data storage) of a Data Warehouse. I can imagine a number of statistical uses but very few others.

Posted: Thu Jun 23, 2005 6:25 am
by chulett
You aren't doing all of this to generate a surrogate key, are you? :?

Posted: Thu Jun 23, 2005 7:08 am
by ArndW
Craig,

This looks like a continuation of a thread from last week, and if that is the case then the answer to your question is "yes".

Posted: Tue Jun 28, 2005 2:11 am
by saadmirza
Hi Ray,
As you said, I am trying to use a bitmap array index to identify whether the random number generated is already present in UniqueIndex.
If it is, then I should generate another random number until I find one that is not an existing number. But I wrote this algorithm on the assumption that the Rnd() function will not generate the same number twice within a single load run. The help document says that Rnd() generates a non-repeatable sequence of numbers, but I don't know what exactly is happening. Some say that I should use Randomize in conjunction with Rnd() to generate non-duplicate random numbers, but I don't know how. Can you please guide me in this regard? The client's requirement is that I generate new Employee IDs and that they be randomly generated. Please suggest.

Thanks,
Saad Mirza

Posted: Tue Jun 28, 2005 4:11 am
by ArndW
Saad,

Another way to look at whether the RND() function can avoid duplicating results: what does RND(10) return when called for the 11th time? Or, to use a physical example, how can a die be rolled 100 times without duplicating results?
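
The 11th-call question can be checked mechanically. This Python sketch (not DataStage BASIC) assumes a generator with ten possible outcomes, standing in for RND(10); `has_duplicate` is a hypothetical helper.

```python
import random


def has_duplicate(draws):
    """True if any value appears more than once in `draws`."""
    return len(set(draws)) < len(draws)


# Eleven draws from only ten possible values (0..9).
draws = [random.randrange(10) for _ in range(11)]
repeated = has_duplicate(draws)
```

`repeated` is always true here, whatever the generator does: eleven results cannot fit into ten distinct values.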

There are two popular algorithms that generate pseudo-random numbers that are in common use; which one your operating system uses is not relevant. Both algorithms use a seed number to choose the starting location in a series of numbers; this way you can run a program using pseudo-random numbers with the same seed (randomize) number and get identical results each run. The Randomize/Seed does nothing for your particular problem.
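
A small Python sketch (not DataStage BASIC) of what seeding does and does not buy you. The seed value 42, the range of four values and the helper name `sequence` are arbitrary assumptions.

```python
import random


def sequence(seed, count=10, upper=4):
    """Return `count` draws from 0..upper-1 using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(upper) for _ in range(count)]


first = sequence(42)
second = sequence(42)   # same seed, so the same series again
```

`first` and `second` are identical, which is the reproducibility that seeding gives. Each series still contains repeated values, since ten draws from four possibilities must repeat, so seeding does not help with uniqueness.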

If you store the previous maximum employee number in a table then you can generate a unique new employee number by adding 1 to the high value and incrementing it afterwards. If you have a requirement that the employee number shouldn't reflect the hiring date then you can either use a (secret to the user) algorithm or, if you really, really wish to stick with random numbers you could populate a file with all possible (and unused) employee numbers and then use RND() in a loop trying to find unused records until successful. I prefer not using RND().
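
As a rough Python sketch (not DataStage BASIC) of the first two options: take the stored maximum plus one, and, if the number must not reveal hiring order, pass it through a fixed one-to-one mapping. The modular multiplication below is only one possible "secret" algorithm and is my assumption; `MODULUS`, `MULTIPLIER` and both function names are hypothetical.

```python
MODULUS = 10_000     # hypothetical: employee numbers fall in 1..9999
MULTIPLIER = 7_919   # hypothetical constant; must share no factor with MODULUS


def next_sequential_id(current_max):
    """Next unused number: one more than the stored maximum."""
    return current_max + 1


def scramble(counter):
    """Map a counter in 1..MODULUS-1 to a distinct number in the same range."""
    if not 0 < counter < MODULUS:
        raise ValueError("counter out of range")
    # Multiplying by a constant coprime with MODULUS is one-to-one,
    # so two different counters can never collide.
    return (counter * MULTIPLIER) % MODULUS


current_max = 41                      # would be read from the table
new_id = next_sequential_id(current_max)
public_id = scramble(new_id)
```

Uniqueness comes from the counter, not from any random draw; the mapping only hides the order in which numbers were issued.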

Posted: Tue Jun 28, 2005 7:52 am
by ray.wurlod
"Nonrepeatable sequence" means exactly that; it is the sequence that is not repeatable; not any one number in the sequence. Indeed, there is a small but finite possibility that every number in the sequence will be the same, but this will not be true the next time a sequence is generated unless you deliberately seed the generator.

You can get guaranteed unique values by beginning with the current maximum and incrementing it. If you have SQL Server, you can also generate a GUID (Globally Unique Identifier).
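
For comparison, this is what generating such identifiers looks like outside the database, sketched in Python with the standard `uuid` module (in SQL Server itself the usual route is the `NEWID()` function). The helper name `new_guid` is hypothetical.

```python
import uuid


def new_guid():
    """Return a random (version 4) GUID as a 36-character string."""
    return str(uuid.uuid4())


guids = [new_guid() for _ in range(1000)]
```

Strictly speaking a random GUID is unique with overwhelming probability rather than by construction; incrementing a stored maximum is the option that cannot collide.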