Lock For Update For HashFile Stage - Am I missing something?
Posted: Wed May 19, 2004 1:43 pm
The answer is probably yes, but I will ask the question anyway.
I have a job with a stream link into Transformer1 that does a lookup against a hashed file stage (HFstg1); if a record is not found, I pass the data out of Transformer1 into Transformer2 to do some jiggery-pokery and insert the record into HFstg2. It's a bit more complicated than that (at least that's what I tell my boss), but that is the pertinent logic.
All works well if the initial stream link has distinct key values for the hash file. However, if it has (say) two records with the same key (e.g. custno), it seems not to find them in the HFstg1 stage (presuming the key is not there initially), passes both down to Transformer2, and adds both to HFstg2. This causes a destructive update, as the 'jiggery-pokery' is assigning surrogate keys.
So the steps are:
1) Record one - custno = 12345
2) Not in HFstg1; send to Transformer2, add surrogate key 1
3) Insert 12345,1 into HFstg2
4) Record two - custno = 12345
5) Not in HFstg1; send to Transformer2, add surrogate key 2
6) Update HFstg2, so the record 12345,1 becomes 12345,2
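The steps above can be sketched as follows (a hypothetical Python stand-in for the job, not DataStage BASIC). The key assumption here is that the HFstg1 lookup reads from a snapshot that does not see the write made downstream, so both duplicate rows miss:

```python
# Stand-in for the shared hashed file on disk.
hash_file = {}

# The lookup side reads through a snapshot taken when the job starts,
# so writes made further down the job are not visible to later lookups.
lookup_cache = dict(hash_file)

next_surrogate = 1
for custno in ["12345", "12345"]:
    if custno not in lookup_cache:          # Transformer1: lookup misses both times
        hash_file[custno] = next_surrogate  # Transformer2 + HFstg2: write surrogate
        next_surrogate += 1

print(hash_file)   # {'12345': 2} -- the second write destroyed surrogate 1
```

This reproduces the destructive update: surrogate 1 for custno 12345 is lost because the second row's lookup never saw the first row's insert.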
Obvious things first:
a) Although I am using different hashed file stages, they all point to the same hash file
b) I have 'Disabled, Lock for updates' set on HFstg1
c) I haven't got write stage caching on
d) I am an amateur (though talented - not)
My understanding is that 'Disabled, Lock for updates' causes a failed lookup to take a lock and wait for the keyed record to be inserted, which is indeed done downstream.
I have got round this for now by placing an aggregator stage between the two transformers to 'uniqueify' the natural keys before they hit the second transformer - but this seems a bit silly.
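The workaround can be sketched like this (again a hypothetical Python stand-in): collapse duplicate natural keys before surrogate assignment, which is what the aggregator stage achieves, so each key is written exactly once:

```python
# Incoming stream with a duplicate natural key.
incoming = ["12345", "12345", "67890"]

# 'Uniqueify' the natural keys, preserving arrival order
# (the aggregator stage's job in the real flow).
unique_keys = list(dict.fromkeys(incoming))

hash_file = {}
for surrogate, custno in enumerate(unique_keys, start=1):
    hash_file[custno] = surrogate   # each key is written exactly once

print(hash_file)   # {'12345': 1, '67890': 2}
```

Each natural key now maps to a single surrogate, so no later row can overwrite an earlier assignment.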
I would be grateful for any pointers, solutions, or derisory remarks, as this is quite a crucial piece of logic for our application.
Thanks in advance
fridge