I'm reading a sequential file, passing it thru a link splitter and generating 4 streams into a DB2 CLI stage.
The DB2 stage does an 'update existing rows only' and each input stream is updating the same table. I'm trying this in an attempt to speed up the update process.
However, I keep on getting unique key violations from the database. It's as though DB2 is doing an Insert.
The 'where' clause refers to whichever columns you have marked as Keys - so regardless of the actual keys in the table, only mark the columns you want to be used in the where clause for this statement. All of the columns not marked as keys will be updated.
Make sure you've set the Update Action on all four of the Input links to the DB2 stage. Classic problem is to only set the 'first' one and leave the rest at the default action - Insert.
Unique Key violations don't just come from inserts! If you are updating primary key columns (or values in any unique index) and the value(s) you are trying to set the row to already exist in another record you will generate that error.
-craig
"You can never have too many knives" -- Logan Nine Fingers
I've always thought of it like this.
In any SQL, columns referred to in a WHERE clause refer to the search (constraining, filtering) keys (as opposed to the primary keys).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks guys. I fiddled with the key columns and now it works.
Interesting though, I tried the job using the splitter and then created another job using the IPC stage without the splitter. The IPC stage beats the splitter every time.
A splitter will not do much for your job and may be making it slower because the entire job is still only running on a single CPU. To get the most from parallel processing you need to make it a multiple instance job and run your four streams as different instances of the job that partition the data.
I'm not sure I understand what you mean when you say the splitter will only run one cpu. From my understanding, each link out of the splitter will force a new 'thread' which, in turn, will mean the cpu's will be better utilised .
If this is not the case, then from a performance enhancing viewpoint, is the splitter worth anything?