Hash file lookup and simultaneous update
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 24
- Joined: Wed Dec 01, 2004 11:12 am
Hash file lookup and simultaneous update
Good afternoon.
I'm not sure if this is even possible, but I want to perform a lookup against a hash file. The tricky part is that for each record in the hash file that is used during the lookup, I want to update a column named "PROCESSED" to TRUE. That way I can keep track of which records from the hash file were used during the run.
Is this possible?
Many thanks to all that lend a helping hand. :D
-
- Premium Member
- Posts: 385
- Joined: Wed Jun 16, 2004 12:43 pm
- Location: Virginia, USA
- Contact:
Yes.
Do NOT enable the Pre-load file into memory option in your reference lookup hash file stage, and do NOT enable Write cache in your output hash file stage.
Remember, the hash file stage does a row-level write. Therefore, your lookup and output stages must read and then write all columns in the hash file row.
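In other words, the same hashed file can be both the reference input and an output of the same transformer. A rough sketch of that single-job layout (stage and link names here are only illustrative, not from your job):
Code:
OCI_IN --> XFM --> OCI_OUT
            ^ |
            | |   reference lookup (up) and write-back (down);
            | v   the write-back carries all columns, with PROCESSED = 'TRUE'
            HASH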
Chuck Smith
www.anotheritco.com
-
- Participant
- Posts: 24
- Joined: Wed Dec 01, 2004 11:12 am
Chuck, thanks for the response. I'm not sure I understand you, so let me recap. I already have both of those options turned off. My job looks like this:
OCI1 loads the hash file, then OCI2 goes to the transformer and does a lookup against the hash file. I then write out the result to a table using the OCI_BULK. But I want the hash file to now reflect which of its records were used by the lookup, for another job that is going to use that same hash file.
The second job is going to check for the records that were not used in the lookup from the first job. Something like:
Code:
OCI1 --> HASH
           |
           |
OCI2 --> XFM --> OCI_BULK
Code:
UV1(With HASH) --> XFM --> OCI_BULK
Code:
Select * from HASH where PROCESSED = FALSE
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
If they're separate jobs, then they're not simultaneous and you shouldn't have any problem.
(If you wanted to do it all in one job, then you should enable lock for update when reading from the hashed file.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 24
- Joined: Wed Dec 01, 2004 11:12 am
I have tried updating the hash file as shown below, but as you all know there is no updating feature in the hash file stage. What stage do I use to update a hash file?
Code:
                 HASH_UPDATE(HASH Stage)
                /
OCI1 --> HASH  /
           |  /
           | /
OCI2 --> XFM --> OCI_BULK
-
- Participant
- Posts: 24
- Joined: Wed Dec 01, 2004 11:12 am
It WORKED!!!!!
Thank you all for your help. Below is the final set-up for JOB 1:
Code:
                 HASH_UPDATE(UV Stage)
                /
OCI1 --> HASH  /
           |  /
           | /
OCI2 --> XFM --> OCI_BULK
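For anyone finding this later: the HASH_UPDATE UV stage is just doing a keyed update of the PROCESSED column, with generated or user-defined SQL along these lines (KEY_COL is only a placeholder for the real key column, and this assumes the hashed file was created in the project account so the UV stage can see it):
Code:
UPDATE HASH SET PROCESSED = 'TRUE' WHERE KEY_COL = ?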
I don't know why you're using the UV stage. Simply use the hash file stage and map all reference columns to the output hash file. Using the UV stage simply hides the fact that under the covers, DataStage is reading the whole row, updating the column of interest in memory, then writing the whole row back.
Your reference lookup HAS ALREADY READ THIS ROW; simply write it back out. Your current solution DOUBLE READS THE SAME ROW. Change the UV stage to a Hash stage and all is well. Just don't use any caching if repeating driving rows have to "see" the prior row.
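For example, the extra output link from the XFM to the hash stage could carry derivations roughly like this (link and column names are placeholders only, not from the actual job):
Code:
KEY        <-  lnkHashRef.KEY       (key column, straight from the reference link)
COL_A      <-  lnkHashRef.COL_A     (every other reference column carried through unchanged)
PROCESSED  <-  'TRUE'               (mark the row as used)
Whether you also constrain that link on Not(lnkHashRef.NOTFOUND) depends on how you want missed lookups handled.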
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
-
- Participant
- Posts: 24
- Joined: Wed Dec 01, 2004 11:12 am
Update to Hash instead of UV
Well, I tried changing it to a hash file instead of the UV stage, but now my hash file only contains the records that went through the XFM as Not(NOTFOUND)'s. I wanted the hash file to contain the same number of records that it had when it began. Since not all the records in the hash file are used by the lookup, I want to update only those that have been used and leave the others as they were. So the ones that were used by the lookup would have PROCESSED = TRUE, but the ones that were not used would be left alone, with PROCESSED = FALSE. The UV stage accomplishes this fine. The hash stage, on the other hand, appears to delete the hash file completely and then load it with the records from the XFM output constraint of Not(NOTFOUND), which is not what I want.
Kc, I'm not sure how a record would be read twice. Does the UV stage's UPDATE perform a read?
Just set the value of the column PROCESSED to either true or false depending on the result of the lookup (the link variable is lnk.NOTFOUND, found in the context menu for derivations). All hash writes are destructive, so you basically have to build the record the way you want it, but the outcome of using the hash would be the same as if you had used the UV stage.
Also, make sure you are not constraining your output by .NOTFOUND. Make sure all rows flow through the process.
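For example, the PROCESSED derivation could be written like this (lnkHashRef is only a placeholder for your reference link name):
Code:
PROCESSED = If lnkHashRef.NOTFOUND Then 'FALSE' Else 'TRUE'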
I think this might help you.
Regards,
Mike Hester
mhester@petra-ps.com