Page 1 of 1

Surrogate id generation - multiple inst jobs at same time.

Posted: Sun Jul 29, 2007 9:49 am
by videsh77
We are using DB2 database has dpf (dynamically partitioned feature) enabled. Our Px job say JobA has multiple invocation enabled.

For an incoming record if surrogate id is created based on if previously generated. In the next stage this id generated gets stored into the database.

When job is called at same time 2 or more instances, all its instances create same surrogate id. When these jobs attempts to hit the database at the same time we face warnings in the log attempt to insert duplicate ids.

We could not find any isolation level settings with DB2 EE stage. As well we attempted to use DB2 API with isolation level setting as Repeatable Reads, still it did not work as expected.

I am sure DataStage must have solution to address such issues. Can someone please enlighten me, the steps or approach how we can achieve this?

Posted: Sun Jul 29, 2007 2:12 pm
by ray.wurlod
While we think about this, can you please put yourself in the position of DataStage - doing simultaneous inserts from multiple processes all of which have to generate unique values of the same column. How would you do it? Essentially that's the same way that DataStage would have to do it.

Posted: Sun Jul 29, 2007 2:52 pm
by videsh77
When we perform 'select nextval...' on the DB2 table for particular column, we need to have DB2 table locked, which we can mention in Open command. So no other process should get max id until this recently generated id is inserted. Once this record is inserted via next DB2 EE stage then in the close command of the EE stage we should unlock the table, letting other waiting resources to carry out same operation.

Let me know, If my approach could be correct.

Posted: Sun Jul 29, 2007 5:52 pm
by fridge
not sure if this is of use , but watch out for doing a sparse lookup using the db2 enterprise stage (I presume this is how u are doing the next val on the db2 sequence)

We found out that if u pump the same lookup values down the lookup stage (i.e. ones used in where a=orchastrate.value) the lookup uses the same value obtained on previous lookup (the assumption being that as the predicate is the same then the returned value will be also)

Did raise this with IBM but that suggest is not a bug , but suggested documentation should be change to highlight this feature

Posted: Sun Jul 29, 2007 9:31 pm
by ray.wurlod
You can use .nextval in user-defined insert SQL (which is what I think you may also be asking) but the same single-threading will apply, which somewhat counteracts your use of multi-instance jobs - or even parallel jobs, for that matter

Posted: Mon Jul 30, 2007 12:13 am
by chulett
There's a difference between selecting a 'max id' from a table (which the OP shows as 'select nextval') and the use of 'dot nextval' in user-defined sql. Unless it works totally differently from its counterpart in Oracle, there are no single threaded issues with the latter. :?