Lookup to Hashfile and Writing to same Hash File
I have a Transformer stage that uses a hashed file for lookup purposes. The contents of the hashed file are written by the Transformer itself.
Every time I have new contents, I write them to the hashed file; later rows do a lookup to see whether the contents already exist, so duplicates are not let through.
To summarise, the Transformer uses a lookup that the Transformer itself creates.
The problem is that for the first row the Transformer wants to do a lookup, but since the hashed file has not yet been created, the access to the non-existent hashed file fails.
- Is there a way to create an empty hashed file up front so the lookup check can access it?
- Or can I avoid the lookup for the first row?
Any suggestions will be greatly appreciated. Thank you.
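(For readers outside DataStage: the failure mode can be sketched with Python's `dbm` module standing in for the hashed file — this is only an analogy, not the DataStage mechanism itself. A key/value store that has never been created cannot be opened for reading, exactly like the first-row lookup; pre-creating it empty makes the read safe.)

```python
import dbm, os, tempfile

# A store that has never been created cannot be opened read-only; this
# mirrors the first-row lookup failing against the missing hashed file.
path = os.path.join(tempfile.mkdtemp(), "seen.db")
try:
    dbm.open(path, 'r')              # read-only open, but store does not exist
except dbm.error as exc:
    print("lookup store missing:", exc)

# Creating the store empty up front ('c' = create if absent) fixes it.
dbm.open(path, 'c').close()
with dbm.open(path, 'r') as seen:    # now the read-only open succeeds
    print("keys at start:", list(seen.keys()))
```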
The hashed file must be opened for writing first in order to create it before it is opened for reading in the job. Accomplish this by linking a Transformer stage to the hashed lookup. Yes, just a Transformer.
Create a bogus stage variable so the job will compile. Match the metadata to the hashed file and set all column derivations to anything (@NULL, for example; they won't be used). Set the constraint on the link to @FALSE. Set the actions on the input link to the hashed file to whatever you need: clear, delete/create, etc.
When the job starts, the link will run and create / clear the hashed file but process no rows. Then the 'main' portion of your job will run.
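The same shape — a write-open that creates/clears the store but sends zero rows, followed by the main lookup-and-write loop — can be sketched in Python with `dbm` as a stand-in (the `precreate`/`dedupe` names are mine, purely for illustration):

```python
import dbm

def precreate(path):
    """Analogue of the dummy Transformer link with the @FALSE constraint:
    open the store for writing with create/clear semantics, pass no rows
    through, and close. The store now exists and is empty."""
    dbm.open(path, 'n').close()       # 'n' = always create a new, empty store

def dedupe(rows, path):
    precreate(path)                   # runs first, like the initialisation link
    out = []
    with dbm.open(path, 'w') as seen: # open the now-existing store for update
        for key, value in rows:
            if key not in seen:       # lookup in the store we maintain
                seen[key] = value     # write the key so later duplicates match
                out.append((key, value))
    return out
```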
Easy Peasy.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Temporarily change the name of the "read from" hashed file to one that exists (for example VOCLIB). Validate the job, which will create the hashed file in the Hashed File stage that has the input link. Then change the name of the "read from" hashed file back to what it needs to be (the one you have just created).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
That's all well and good... and clever. However, that only solves the initial creation issue. Ongoing, I'm sure the hashed file needs to be cleared in one fashion or another each run, and hanging the transformer off the reference hashed file solves both issues.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
I'm afraid I may not have understood the issue correctly.
If the job is validated, the hashed files will be created in their respective paths. Since the Transformer looks up the same hashed file that it creates, it shouldn't cause any issue at run time.
I don't have server access, so I could not test this either.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Totally unnecessary to use the routine like that.
While Validating can be used to create hashed files amongst other things, I can't remember the last time I actually validated anything now that we no longer have to do that to precreate them. Yes, boys and girls, once upon a time that was The Way to create hashed files.
The point of hanging the transformer off the reference hashed file is two-fold:
1) It allows a write operation to happen first in the job, so that the hashed file is created the first time the job runs and one doesn't have to remember to validate it in every new environment. And as noted, nothing is actually written during this phase; the hashed file is merely opened and then closed.
2) More importantly, it allows the job to clear / reset the hashed file at the proper point in the job for each run. You can't simply set the 'clear' option on the target hashed stage, as it is 'too late' then: a lookup has already been done against a non-empty reference hashed file.
Hope that helps explain the why of this.
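The timing point in (2) is easy to demonstrate outside DataStage. In this sketch (again using Python's `dbm` as a stand-in, with a hypothetical `run_job` helper), skipping the up-front clear leaves stale keys from a previous run visible to the lookups, so fresh rows get wrongly dropped as "duplicates":

```python
import dbm

def run_job(rows, path, clear_first):
    # clear_first mimics the dummy-link approach: reset the store before
    # any lookup. When False, stale keys from a previous run are still
    # visible, so a clear that happens later comes 'too late'.
    if clear_first:
        dbm.open(path, 'n').close()   # 'n' = recreate as a new, empty store
    out = []
    with dbm.open(path, 'c') as seen:
        for key in rows:
            if key not in seen:
                seen[key] = '1'
                out.append(key)
    return out
```

Running this twice against the same store: a second run without the clear drops rows that merely appeared in the previous run, while clearing first lets them through.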
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers