Surrogate Key generation in Oracle 9i

Post questions here relating to DataStage Enterprise/PX Edition for such areas as parallel job design, parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dhiren
Premium Member
Posts: 43
Joined: Thu Aug 11, 2005 12:24 am

Surrogate Key generation in Oracle 9i

Post by dhiren »

Hi,
We are using DataStage 7.5 on Unix. The current requirement is to load an Oracle table in a DataStage parallel job.
The column whose value must be generated as the next higher number is an indexed primary key column. There is also a sequence defined on this column in Oracle.

How do I insert rows into this Oracle table while taking care of this surrogate key column?
One way, I think, is to use a stage variable in the Transformer that holds the maximum value of that surrogate key column and is increased by 1 for each row.

As an alternative, I tried mapping all the columns except this one in the Transformer and passing the remaining columns on to the Oracle bulk loader, but this didn't work.

Is there a better way with the Oracle bulk loader or the Enterprise stage? Either stage will do, as the number of records to be inserted is in the range of a few thousand.

Please guide. Syntax would be very useful. :)

Thanks in advance.

Regards,
Dhiren
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Why not use the Surrogate Key stage?

The problem with increasing by 1 is that you will generate duplicate key values as soon as you start using more than one processing node. Search the forum for alternatives (driven, for example, by the partition number).
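One commonly posted derivation is the following sketch, assuming a job parameter #StartValue# holds the starting key; @PARTITIONNUM, @INROWNUM and @NUMPARTITIONS are the standard system variables available in a parallel Transformer:

Code:

#StartValue# + @PARTITIONNUM + (@INROWNUM - 1) * @NUMPARTITIONS

Each partition then generates an interleaved, non-overlapping series of numbers, no matter how many processing nodes the job runs on.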

In the Enterprise stage change the SQL so that you do not deliver the key column from DataStage but, instead, access the NEXTVAL property of the sequence.

Code:

INSERT INTO table(keycol, colA, colB, ...) VALUES (sequence.NEXTVAL, :1, :2, ...)
I did see one site that created a trigger to replace the key, whether supplied or not, with NEXTVAL from a sequence. It wasn't fast.
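For reference, such a trigger would look something like the minimal sketch below; the table, column and sequence names are placeholders, and note that on Oracle 9i the sequence must be read via SELECT ... FROM dual rather than assigned directly:

Code:

CREATE OR REPLACE TRIGGER trg_keycol
BEFORE INSERT ON table
FOR EACH ROW
BEGIN
  -- Overwrite whatever key value the load supplies (or NULL)
  -- with the next value from the sequence
  SELECT sequence.NEXTVAL INTO :NEW.keycol FROM dual;
END;
/

The FOR EACH ROW context switch on every inserted row is largely why it wasn't fast.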

If you're using bulk loader, then you do need to provide key values.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dhiren
Premium Member
Posts: 43
Joined: Thu Aug 11, 2005 12:24 am

Capturing the maximum value of the surrogate key

Post by dhiren »

Hi Ray,
Thanks very much.
Well, I did use a Surrogate Key stage.

I want to insert the records into the table. The table can be empty or may already contain some records,
so the Start Value of the column should be the current maximum value + 1.
How do I incorporate this in the job?
Can you please tell me the trick, if any?

Regards,
Dhiren

ray.wurlod wrote:Why not use the Surrogate Key stage?

The problem with increasing by 1 is that you will generate duplicate key values as soon as you start using more than one processing node. Search the forum for alternatives (driven, for example, by the partition number).

In the Enterprise stage change the SQL so that you do not deliver the key column from DataStage but, instead, access the NEXTVAL property of the sequence.

Code:

INSERT INTO table(keycol, colA, colB, ...) VALUES (sequence.NEXTVAL, :1, :2, ...)
I did see one site that created a trigger to replace the key, whether supplied or not, with NEXTVAL from a sequence. It wasn't fast.

If you're using bulk loader, then you do need to provide key values.
cmmurari
Participant
Posts: 34
Joined: Sun Jan 02, 2005 9:55 am
Location: Singapore

Post by cmmurari »

The Surrogate Key stage has a "Start Value" property that you need to parameterize through Job Properties. In your job design you need to write another utility job that gets the maximum ID and passes its output value to the current job in the sequence design.
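For example, the utility job could run a query like this (a sketch; the table and column names are placeholders) and pass the result into the Start Value parameter:

Code:

SELECT NVL(MAX(keycol), 0) + 1 AS start_value FROM table;

NVL handles the empty-table case, so the keys start from 1 when no rows exist yet.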