highest surrogate key

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi ,

i am doing look up based on the business keys and and using the same surrogate key for the updates and inserting new surrogate key for the inserts.

thanks in advance
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And all of your lookups are failing? You didn't explain how you implemented them - hashed file, direct database, what? Have you debugged the job to see why your keys aren't matching?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

1. Pre-load a hashed file with the CURRENT records from the target, with the business, or natural, key as the key column, the surrogate key as a non-key column, and any columns needed for change detection also as non-key columns.

2. In a Transformer stage check for existence. If not found, generate a new surrogate key value and INSERT the row into the target. Also update the hashed file with the new surrogate key value for that business key.

3. If the business key does exist in the target then check whether any of the other columns reflects a change. If so generate a new surrogate key valkue and INSERT the new row into the target and UPDATE the existing target row to make it non-current. Also update the hashed file with the new surrogate key value for that business key.

All of the above processing (steps 2 and 3) should be performed in the one Transformer stage.

Note that this simple mechanism does not take into account records in the target but no longer in the source. This requres a separate pass.
Last edited by ray.wurlod on Mon Oct 04, 2010 1:12 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The single argument to the key management routines is the NAME OF A SEQUENCE. The sequence is stored in the DataStage repository, however, not in the target database.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi everyone,

first of all i would like to thanks everyone for your valuable suggestions.first i did lookup with direct database it didnt work for me.but when i Pre-load a hashed file with the CURRENT records from the target , it is working fine.now its resolved.But can you guys expplain me as to why its not working if i do lookup directly with the database and its working only when i Pre-load a hashed file with the CURRENT records from the target and do a look up.


thanks in advance.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Right - it is a label, key for a record in a hashed file and one I typically name after the target it will be used in. All this would be revealed by a simple double-click on the routine itself and an inspection of the code. You would then learn the name of the hashed file and could thus query or source it if needed.
Last edited by chulett on Mon Oct 04, 2010 6:48 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

rsunny wrote:But can you guys expplain me as to why its not working if i do lookup directly with the database
Not possible to say without all of the gory details we'd need or actually seeing how you implemented it. All I can guess is that your lookup query wasn't keyed correct. [shrug]

The hashed file is the better choice here, regardless.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Natural keys occur more than once each in the target database. So, unless you use complex logic (constrain also on the largest associated surrogate key, which would involve a correlated subquery), you're probably not going to get the correct result except by chance. You never did explain what "not working" meant.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi everyone ,

when i use hash file and do a lookup , everything is working fine .But here if i use my key as a business key , then if the keys are repeated , hash file is going to pick any one record with the same key.so how do i do lookup based on the hash file as i might get a chance of hash file picking up only one record repeating same business key and i might get a chance of getting the record with cuurent indicator as 'N'i.e. expired but i have to compare the recorsd with the current indicator as 'Y' only.I couldnt take my key as aprimary key in hash file as my source doesnt have the primary key(Surrogate id).Can any one tell what is the best solution for this , if i use direct database for lookup , the records are not updating only inserts are working.


thanks in advance
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Only populate the hashed file with "current" records i.e. ones where your current indicator = 'Y', something I thought you said you were already doing. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

chulett wrote:Only populate the hashed file with "current" records i.e. ones where your current indicator = 'Y', something I thought you said you were already doing. :?
hi craig,

Then we get only one record which is set as current indicator as 'Y' and one record which is set as current indicator as 'N', if we keep the current indicator as Key.i tried taking the key for hash file input as primary key and output hash file as business key, but it didnt work for me. So i think whatever key you take that should be same for input and output hash file.So what might be the best possible solution for this?

thanks in advance
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The hashed file only ever contains the current record. If there becomes a new current record, this OVERWRITES the previous one in the hashed file. Correct inserts and updates are directed to the target table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi ,

So i have my target database and connecting to hash file and doing lookup .so now if the record is repeated more than three or four times having the same business ky , then hash file will take in the order what is ther in the target database.so what if the last record in the target might be a record with the bussinees key having current indicator as 'N', hash file is going to take that record, but i want hash file to be taken the latest record in thetarget if that record is repeated more than two or three times of the same businees key.so in my case if i do repeated updates , its not working as hash file is taking different records , may be one time its working fine as the hash file is picking correct record to do look up but some times not.how to overcome this solution.can any one help me out?


thanks in advance
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Completely wrong. The hashed file (and not it's "hashed", not "hash") only ever contains current records. It is pre-loaded with current records and receives replacements to these.

Tip: This will happen any time you need to generate a new surrogate key for the target table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It would help were you to obtain a premium membership so as to be able to see the totality of Craig's and my replies.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply