Lookup to Hashfile and Writing to same Hash File

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
dylsing
Participant
Posts: 35
Joined: Thu May 04, 2006 9:56 pm

Lookup to Hashfile and Writing to same Hash File

Post by dylsing »

I have a Transformer stage which uses a hashed file for lookup purposes. The contents of the hashed file are written by the transformer itself.

Every time I have new content, I write it to the hashed file; later I do a lookup to see if the content already exists so I won't let duplicates through.

To summarise, the transformer does its lookup against a hashed file that the transformer itself creates.

The problem is that for the first row, the transformer will want to do a lookup, but since it hasn't created the hashed file yet, the access to the non-existent hashed file will fail.
- Is there a way I can create an empty hashed file up front so that the lookup check can run against it?
- Or can I avoid the lookup for the first row?


Any suggestions will be greatly appreciated. Thank you.
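The pattern the question describes - writing keys to a persistent store and looking them up in the same pass to drop duplicates - can be sketched outside DataStage. A minimal Python analogue using the standard `dbm` module as a stand-in for the hashed file (the file name `seen_keys.db` and the function names are made up for illustration):

```python
import dbm

STORE = "seen_keys.db"  # stands in for the hashed file


def dedupe(rows, key_field):
    """Yield only rows whose key has not been seen before.

    Opening the store with flag "c" creates it if it is missing,
    which is exactly the problem in the job: the first lookup
    happens before anything has been written, so the store must
    exist (even empty) before reading starts.
    """
    with dbm.open(STORE, "c") as seen:   # "c" = create if absent
        for row in rows:
            key = str(row[key_field])
            if key in seen:              # lookup against what we wrote
                continue                 # duplicate - drop it
            seen[key] = "1"              # write so later rows see it
            yield row
```

Because `dbm.open(..., "c")` creates the store on first use, this sketch sidesteps the "file does not exist yet" failure that the hashed file stage hits.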
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The hashed file must be opened for writing first in order to create it before it is opened for reading in the job. Accomplish this by linking a Transformer stage to the hashed lookup. Yes, just a Transformer.

Create a bogus stage variable so the job will compile. Match the metadata to the hashed file and set all column derivations to anything (@NULL, for example; they won't be used). Set the constraint on the link to @FALSE. Set the actions on the Input Link in the hashed file stage to whatever you need - clear, delete / create, etc.

When the job starts, the link will run and create / clear the hashed file but process no rows. Then the 'main' portion of your job will run.

Easy Peasy. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
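The trick above - a write link that opens the file but passes no rows - has a simple analogue. A hedged sketch, again using Python's `dbm` as the stand-in hashed file (`lookup_store.db` and `precreate` are made-up names):

```python
import dbm

STORE = "lookup_store.db"  # stands in for the reference hashed file


def precreate(clear=True):
    """Analogue of the bogus-transformer link: open the store for
    writing so that it exists (and optionally clear it), but write
    nothing - just like a link whose constraint is @FALSE.
    """
    # "n" always recreates an empty store (clear / delete-create);
    # "c" creates it only if absent, leaving existing contents alone
    flag = "n" if clear else "c"
    with dbm.open(STORE, flag):
        pass  # no rows written
```

Running `precreate()` before any lookup guarantees the store exists, so the first read never fails, mirroring the write-before-read ordering the post describes.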
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Temporarily change the name of the "read from" hashed file to one that exists (for example VOCLIB). Validate the job, which will create the hashed file in the Hashed File stage that has the input link. Then change the name of the "read from" hashed file back to what it needs to be (the one you have just created).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That's all well and good... and clever. However, that only solves the initial creation issue. Ongoing, I'm sure the hashed file needs to be cleared in one fashion or another each run, and hanging the transformer off the reference hashed file solves both issues.
-craig

"You can never have too many knives" -- Logan Nine Fingers
dylsing
Participant
Posts: 35
Joined: Thu May 04, 2006 9:56 pm

Post by dylsing »

I don't understand the @FALSE constraint. Will it make the link never process any rows, so that no values are written into the hashed file, yet the hashed file is still created?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly. :D
-craig

"You can never have too many knives" -- Logan Nine Fingers
dylsing
Participant
Posts: 35
Joined: Thu May 04, 2006 9:56 pm

Post by dylsing »

Fascinating, the ways to get round these issues. Thank you. :D
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

I'm afraid I may not have understood the issue correctly :oops:
If the job is validated, the hashed files will be created in their respective paths. Since the transformer is looking up the same hashed file that it creates, it shouldn't give any issue at run time.
I don't have server access, so I could not test it either. :?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

"Create" is clever enough not to create if it already exists.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

So validating the job once and then running it should work smoothly, am I right?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
sb_akarmarkar
Participant
Posts: 232
Joined: Fri Sep 30, 2005 4:52 am
Contact:

Post by sb_akarmarkar »

Use the routine UtilityHashLookup, which returns "file or table not found" if there is no file. With this routine you can validate whether the file is present or not.

Thanks,
Anupam
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Totally unnecessary to use the routine like that. :?

While Validating can be used to create hashed files amongst other things, I can't remember the last time I actually validated anything now that we no longer have to do that to precreate them. :lol: Yes, boys and girls, once upon a time that was The Way to create hashed files.

The point of hanging the transformer off the reference hashed file is two-fold:

1) It allows a write operation to happen first in the job, so that the hashed file is created the first time the job runs and no one has to remember to validate it in every new environment. And as noted, nothing is actually written during this phase; the hashed file is merely opened and then closed.

2) More importantly, it allows the job to clear / reset the hashed file at the proper point in the job for each run. You can't simply set the 'clear' option on the target hashed file stage, as by then it is 'too late' - a lookup has already been done against a non-empty reference hashed file.

Hope that helps explain the why of this.
-craig

"You can never have too many knives" -- Logan Nine Fingers
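The second point - why the clear must happen before the first lookup rather than on the target stage - can be demonstrated with a small experiment. A hedged Python sketch (again using `dbm` as a stand-in hashed file; `run_job` and `seen.db` are made-up names):

```python
import dbm

STORE = "seen.db"  # stands in for the hashed file


def run_job(rows, clear_first):
    """One 'job run': optionally reset the store up front, then dedupe.

    Clearing on the target side of the dedupe loop would be too late:
    the first lookup would already have seen last run's keys.
    """
    if clear_first:
        dbm.open(STORE, "n").close()   # recreate empty, like clear / delete-create
    out = []
    with dbm.open(STORE, "c") as seen:
        for r in rows:
            k = str(r)
            if k not in seen:
                seen[k] = "1"
                out.append(r)
    return out
```

Without the up-front clear, a second run is blocked by the previous run's keys; with it, each run deduplicates only within itself - which is the behaviour the bogus-transformer link provides.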
Post Reply