Page 1 of 1

Generate Random Number

Posted: Wed Jun 01, 2005 5:05 am
by divine
Dear All,

Can any one guide me how to generate Random Number in Datastage without using Key Management Function as it only create the sequence Number. I need to generate a random number within the range of 111111 to 999999 .Can any one please help me in this regard as it is urgent.

Posted: Wed Jun 01, 2005 5:11 am
by ArndW
Divine,

you can always use the RND(x) function - this returns a pseudo-random number between 0 and (x-1). You can ensure you get either the same or different number sequences by also using the RANDOMIZE() function.

Posted: Wed Jun 01, 2005 5:36 am
by divine
Dear Anad,

I also thought of using RND(x). But the problem is that what ever the ramdom number is being generate it'll do look up with the database to find out the unique ness and if the number is laready there than another number will generate(but how?). In the begining there will be no problem but in the latter part as the size of the data increases than the performace will go down as it will comapre with the values present in the database and chances are that you may hit the same record many time as you are generating the number randomly .So can you please any other suggestion to do the same thing but also it will not harm the performance.

Posted: Wed Jun 01, 2005 5:52 am
by ArndW
Divine,

my only two suggestions are to:

(a) Use the Key Management functions - this is what they have been designed for

or

(b) Use the Database's in-built sequence generation facilities.

Using a random number to populate surrogate keys is not a good idea. The quickest method in terms of performance is the nextval() from the databases, assuming you are using Oracle.

Posted: Wed Jun 01, 2005 6:00 am
by divine
Dear ArndW,

Thanks for the tips. Actually I am also thinking about this and using Oracle.

Posted: Wed Jun 01, 2005 6:18 am
by chulett
Do not consider using any kind of a 'random' number generator for surrogate key handling. Any perceived perfomance gain from doing so will quickly degenerate as the size of your target table grows. :?

The methodologies Arnd mentions are the tried-and-true ones and have been discussed to death here, so plenty of fodder for the search cannon.

Posted: Wed Jun 01, 2005 11:46 am
by saadmirza
Hi,
Yes also if the performance is hampering, can I get a solution using DataStage...How do we design such a job without considering performance aspects...
Please provide me with your valuable suggestions.

Thanks,
Saad

Posted: Wed Jun 01, 2005 5:04 pm
by ray.wurlod
Avoid random, for reasons already given. Use a sequence. All you are needing is uniqueness. Random numbers detract from performance when checking for uniqueness.
You can generate the sequence using an Oracle sequence, a DataStage routine (such as KeyMgtNextValue), a stage variable and so on. Search the forum for techniques.
Basically, you need to preload the start point, and increment that value.

If you insist on random numbers then you must perform a lookup to determine whether the number already exists then (somehow) implement a strategy to keep generating "random" (?) numbers until an unused one has been generated. Don't do it. It's definitely sub-optimal.

Posted: Thu Jun 02, 2005 1:25 am
by saadmirza
Thanks Ray,
I understand your solution...but the requirement is that any employee Re-hired would be required to give his old employee number and not a newly generated unique number...how do I resolve this requirement if I use sequence since i cannot use already used sequence again...correct me If I am wrong..

Thanks
Saad

Posted: Thu Jun 02, 2005 5:05 am
by ray.wurlod
A re-hired employee using a previously allocated number would be an UPDATE instead of an INSERT. Use the sequence to generate keys only for INSERT.
Otherwise, if you need to preserve history, then your table needs a new design, featuring a completely artificial key that contains no information whatsoever, and the employee number then becomes a non-key column (though with a UNIQUE constraint). The surrogate key gives uniqueness, nothing else. This design allows any one employee to have multiple hire and separation dates. A suitable CHECK constraint can ensure that no employee is hired while currently hired, or fired while not currently employed.