oracle sequence or surrogate key generator

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

oracle sequence or surrogate key generator

Post by dnat »

HI,

I am in a position to decide whether to use the surrogate key generator or oracle sequence for creating unique ids.

My requirement is to generate sequence numbers based on the existing max number we have in the table.

I have done that using oracle seuqence--i.e in the insert statement, i have used user-defined sql and in the values i have passed <sequence.nextval). it was simple and had no problems.

Now in a different project i am being asked to evaluate the pros and cons of using oracle sequence and surrogate key generator as surrogate key generator is being used in different ETL projects which are already existing.

can anyone help..
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

One con is that sequence use in the insert statement requires user-defined sql. It also cannot be used in that manner if you need to know what the sequence will be ahead of time for RI purposes - you'd need to do a lookup of some sort and go across the network for every record processed, it seems to me.
-craig

"You can never have too many knives" -- Logan Nine Fingers
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

Post by dnat »

Are there any performance issues if we use oracle sequences?..I heard that, it was one of the disadvantages of using an oracle sequence.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Depends. The Surrogate Key Generator can, itself, be set up to use a database sequence, and retrieve huge blocks of keys (and why not?).

If you're processing 1000000 records, and request blocks of, say, 250000 keys at a time, performance will be amazingly quick. Of course, the same is true if the block size requested from a state file is also large.

Using a database sequence should be the preferred option where other applications might also be inserting records into the table while DataStage (or a process under its control) is doing so. Otherwise, it's probably preferred to use a state file. This can be initialized from a query against the database.

Code: Select all

SELECT MAX(SK) + 1 FROM tablename;
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply