Getting surrogate keys back from a db2 sequence

HSBCdev · Post by **HSBCdev** » Wed Mar 23, 2005 4:01 am

I've got a job which uses a lookup to a db2 stage. The db2 stage is set as a sparse lookup and uses the user defined sql

SELECT (NEXTVAL for SEQUENCE1) as SURROGATEKEY
FROM SYSIBM.SYSDUMMY1

I've outputed the results to a file and found that I get the same value for the surrogate key on all the records on a particular node going into the lookup.

There are 3 nodes on database so I only get 3 different values returned over all the records going into the lookup.

How do I set this up so I get a unique value for each record?

HSBCdev · Post by **HSBCdev** » Wed Mar 23, 2005 6:41 am

Has anyone managed to use a database sequence in a parallel job to generate a unique value for each record being processed? Am I misunderstanding what a lookup to a db2 sequence should achieve in a parallel job?

HSBCdev · Post by **HSBCdev** » Thu Mar 24, 2005 10:37 am

I've now changed my lookup SQL to be as follows:

SELECT (NEXTVAL for SEQUENCE1) as SURROGATEKEY
FROM SYSIBM.SYSDUMMY1
WHERE ORCHESTRATE.INPUTFIELD > -1

This works fine when my input to the job is a row generator but gives me the original problem when I use a sequential file.

Any ideas why this is?

HSBCdev · Post by **HSBCdev** » Fri Apr 01, 2005 7:20 am

I've ended up using a server shared container within a PX job to do the lookup on the db2 sequence.

T42 · Post by **T42** » Sun Apr 03, 2005 7:33 pm

Why not pull a single value with a lookup dummy field, join it, use a surrogate key generator stage to generate surrogate key, and do the math (surrkey + generated_key).

This will be a MUCH more performant solution than what you are doing originally, and even far more so than what you are doing now.

HSBCdev · Post by **HSBCdev** » Thu May 26, 2005 2:17 am

There could be other jobs creating new surrogate keys on the same table at the same time. I presume the solution you suggest relies on this being the only job creating the key.

ray.wurlod · Post by **ray.wurlod** » Thu May 26, 2005 7:04 am

WHERE ORCHESTRATE.INPUTFIELD > -1

Because you're working on a lookup, you have to have a WHERE clause, one that will always return a row. Elsewhere on the Forum (perhaps you found it in a search) I posted a similar solution for Oracle, from memory WHERE ROWNUM = :1 and provided 1 as the reference key expression.

Titto · Post by **Titto** » Tue Apr 11, 2006 3:44 pm

HSBCdev wrote:There could be other jobs creating new surrogate keys on the same table at the same time. I presume the solution you suggest relies on this being the only job creating the key.

Hi HSBCDev,

I have same kind of requirement and i need the sequence number to write to a file so that there are other processes using this number down the line. I created container as you said but still it is creating the same number for all the row. I am using Oracle as a database.
Could you please share how did you achive this?

Any help from other is appreciated!

Thanks

vmcburney · Post by **vmcburney** » Tue Apr 11, 2006 7:25 pm

I don't like this design as the sparse lookup is a very expensive overhead on your job. You've got to ask what advantage you are getting from using a DB2 sequence when you have to retrieve it for every row. I think you would be better off generating keys within your job using a parallel counter, retrieve the maximum value from the DB2 sequence and pass it into your job as a parameter, update this sequence after you have loaded all rows.

We currently retrieve values from a DB2 sequence as we insert rows into tables by putting the NEXTVAL into the insert statement.

Titto · Post by **Titto** » Tue Apr 11, 2006 10:52 pm

vmcburney wrote:I don't like this design as the sparse lookup is a very expensive overhead on your job. You've got to ask what advantage you are getting from using a DB2 sequence when you have to retrieve it for every row. I think you would be better off generating keys within your job using a parallel counter, retrieve the maximum value from the DB2 sequence and pass it into your job as a parameter, update this sequence after you have loaded all rows.

We currently retrieve values from a DB2 sequence as we insert rows into tables by putting the NEXTVAL into the insert statement.

Hi,

as the same table been used by other modules that is why we planned to use sequence from Oracle database.

but, could you please eloborate your last statement - "We currently retrieve values from a DB2 sequence as we insert rows into tables by putting the NEXTVAL into the insert statement" - how do you get the the sequence number while inserting, i am also expecting the same.
I need to write sequence number along with other fields to a seq file to use as input in other jobs.

your help is appreciated.

Thanks,

ray.wurlod · Post by **ray.wurlod** » Wed Apr 12, 2006 12:45 am

When using NEXTVAL in the INSERT statement you don't get the value back. That's the whole point. It's faster.

Titto · Post by **Titto** » Wed Apr 12, 2006 9:07 am

ray.wurlod wrote:When using NEXTVAL in the INSERT statement you don't get the value back. That's the whole point. It's faster.

Hi Ray,

but i need the latest Sequence number to write out to a file, can i use CURRVAL after the NEXTVAL (Insert statement) - in next stage, can it give the recently inserted value? please suggest me is it best way to do?

Thanks

ray.wurlod · Post by **ray.wurlod** » Wed Apr 12, 2006 4:21 pm

Then you can't use NEXTVAL in the INSERT statement, and need to use some other mechanism.

CURRVAL is not available within the job, which is probably (a) using NEXTVAL on all partitions, (b) implementing row buffering, (c) not committing every row.

Probably the most efficient mechanism for your particular requirement is to lock the table, generate the keys within the job, and reset the sequence to the new "next value" when done before releasing the table-level lock.

Or you can use NEXTVAL in the INSERT statement provided that you have some other means to identify the rows (such as run batch identifier and/or timestamp) so that you can retrieve those rows from target for ongoing processing.