DB2 Identity Column Issues with Parallel Jobs

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
krishobby
Participant
Posts: 20
Joined: Sun Dec 04, 2005 3:23 pm

DB2 Identity Column Issues with Parallel Jobs

Post by krishobby »

Hi All,

I had a requirement in my current project to generate id's using identity column in the DB2 table, but I have given my recommendation to generate it from Data Stage and which works fine. Out of curiosity I tried populating the table with Identify column as I have never attempted that. But I am getting the following errors, so if anyone had attempted that would appreciate your help on this.

Here is the situation, I have my parallel job loading into the DB2 table from a sequential file using DB2 Enterprise stage. If I also have the identity column mapped in the job or default a zero it throws an error like this

EPC_Abc,0: Error Idx = 5;
DB2 Driver Embedded SQL message: SQL0798N A value cannot be specified for column "Abc_ID" which is defined as
GENERATED ALWAYS. SQLSTATE=428C9
;
sqlcode = -798;
sqlstate = 428C9

If I remove that identify column completely from the job, it throws an error like this,

main_program: Missing field: Abc_id in the input dataset.
EPC_Abc: When preparing operator: When binding partitioner interface: Could not find input field "Abc_id".

RCP is disabled and I tried using Write Method=Upsert and as well as Write.

Thanks in advance.
sud
Premium Member
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Re: DB2 Identity Column Issues with Parallel Jobs

Post by sud »

An identity column that is defined as "generated always" means that no application is allowed to provide values for it. You should not pass any value for that column nor menion it in the metadata. Insert values for the rest of the columns and DB2 will automatically generate values for that column.
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
krishobby
Participant
Posts: 20
Joined: Sun Dec 04, 2005 3:23 pm

Re: DB2 Identity Column Issues with Parallel Jobs

Post by krishobby »

sud, If you read the last five lines of the post, thats what I am saying, if I remove the identity column completely from the job without passing any values, it throws an error again saying

main_program: Missing field: Abc_id in the input dataset.
EPC_Abc: When preparing operator: When binding partitioner interface: Could not find input field "Abc_id".
sud
Premium Member
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Re: DB2 Identity Column Issues with Parallel Jobs

Post by sud »

Apologies for overlooking that, are you using insert/update or load?
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
krishobby
Participant
Posts: 20
Joined: Sun Dec 04, 2005 3:23 pm

Re: DB2 Identity Column Issues with Parallel Jobs

Post by krishobby »

sud, again if you look at the last two lines, I tried write method = upsert as well as write with this excercise, but still did not work.

I also read in one of the previous posts where substituting 0 for these identity column will be sufficient, I tried that and didn't work.
sud
Premium Member
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Re: DB2 Identity Column Issues with Parallel Jobs

Post by sud »

The zero value will work only for "generated by default" whereas in your case it is "always generated". And, did you check the select statement generated for the upsert method, does it have the column mentioned?
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
krishobby
Participant
Posts: 20
Joined: Sun Dec 04, 2005 3:23 pm

Re: DB2 Identity Column Issues with Parallel Jobs

Post by krishobby »

I had already checked the SQL this column doesn't exists, infact when Write Method=Upsert and Auto generated SQL automatically removes that column from the SQL.
charles.frenette
Participant
Posts: 1
Joined: Thu Nov 01, 2007 2:19 pm

Re: DB2 Identity Column Issues with Parallel Jobs

Post by charles.frenette »

Hi,
I wonder if you have a resolution for this problem? I am facing the same problem and I belive it has something to do with the fact that the table is in a partionned tablespace.
Regards and thanks.
stefanfrost1
Premium Member
Premium Member
Posts: 99
Joined: Mon Sep 03, 2007 7:49 am
Location: Stockholm, Sweden

Post by stefanfrost1 »

The problem here is that you are probably partitioning you target table based on the identity column, hence the parallel engine will try to partition your input stream based on the same. And since that column is not present in the input stream any partitioning is impossible.

The solution to your problem is either to remove identity columns and add the surrogate key in the input stream before loading the target making partitioning possible. The other solution is to disable parrallism in the db2 enterprise stage and demote it to sequential. You do that under stage properties-> advanced. A third solution is to revise your partition key to something in the stream, but be careful of the consequences of other usage of the table.

My recommendation is to try to do it using parrallism and by feeding your surrogate key generator stage with the highest available sequence from your target at runtime.
-------------------------------------
http://it.toolbox.com/blogs/bi-aj
my blog on delivering business intelligence using agile principles
Post Reply