Primary Key Query

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Primary Key Query

Post by chandra.shekhar@tcs.com »

Hi,
I have a small query, one of my table doesn't has any PK.
Initially the mode of writing was only Insert. Now due to a new requirement updates can also be possible. So I asked my DBA not to define any PK column as I thought I can handle in my job itself.
So in the target stage, I tick marked the column upon which update should happen. I thought this will work.
But to my surprise, I am seeing duplicate records in the target table(DB2 database). I checked in my job logs too, the update statement is getting generated correctly.
What wrong have I done here or is my funda of doing update wrong ?
Thanx and Regards,
ETL User
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The column you choose for a 'key column' i.e. the column(s) that should be part of the where clause, have nothing whatsoever to do with the PK (or lack thereof) on the table. So what are you doing besides updates? How have you configured the target stage? What stage are you using?

Just as an FYI, the lack of a PK or any unique index will allow "duplicates" to be loaded... however, you should be able to prevent that from happening with a proper design.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If your Upsert specifies "Insert before Update" the insert will always work because there is nothing preventing it. And you get duplicate keys. Try changing the order to "Update before Insert".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

They didn't mention "upserts" yet but was hoping that would come out in the details they followed up with if they were. And yes, in that case doing them in that order would be... problematical.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Craig
Not able to view your whole response, but I am doing what you are saying. My target is DB2 Connector and I have selected auto generated query.
The query i see in the director logs is correct and is according to my requirement.
Still its not working, I m out of options now.
By the way, who is "they" ?? Didn't understand your second response.. :oops:

@Ray
I am using "Insert then Update" option and even tried with the other one too.
Thanx and Regards,
ETL User
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You is they. :wink:

You have to understand that with either "upsert" the first action must fail before the second is triggered. An insert only fails when it violates a unique key constraint, so without one all you'll ever get is inserts.

For "Update else Insert" the update fails if it updates zero records.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply