same hash file usage in multiple batch jobs
Moderators: chulett, rschirm, roy
Hi All,
Are there any limitations if I use a hashed file with more than one job at the same time?
The size of the hashed file would be more than 2 GB.
Can anyone please help me understand whether there are any limitations, and possible solutions for the same?
RD
No "limit" per se on looking up against the same hashed file from multiple jobs simultaneously; your normal resource constraints would apply just as if they were X different hashed files. There is a way to enable (I don't recall the exact name) system-level caching of hashed files such that one cached copy can be leveraged by multiple jobs... but setting that up is not for the faint of heart, IMHO.
-craig
"You can never have too many knives" -- Logan Nine Fingers
A hashed file of that size is quite large. Just using it as a lookup in one job would require that you tune the building of that hashed file for best performance; of course, this depends on the structure of the record, the number of columns, and their size.
If you have X jobs all reading that same slow file, you will definitely have some performance issues.
So I would start with the one job and make sure it performs at its best before having multiple jobs read the same hashed file.
Depending on your system, how much memory you have will influence this as well. If you have enough memory to load the hashed file into memory, then it will be shared.
"Don't let the bull between you and the fence"
Thanks
Gregg J Knight
"Never Never Never Quit"
Winston Churchill
greggknight wrote: If you have enough memory to load the hashed file into memory, then it will be shared.
Not true, I'm afraid, if you simply mean the 'Preload file to memory' option in the stage. As I noted earlier, there are steps required to leverage system caching, which are described in the "Hash Stage Disk Caching" technical bulletin PDF that may (or may not) still ship with the product.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Well, I believe the original question was about the basic use of hashed files, so I wasn't going into any details of tuning them or utilizing cache.
If the question had been how to tune a hashed file and the jobs using it, I might have elaborated a little more, like so:
In order to pre-load a file into cache, we have to ensure that the same process used by the DataStage Transformer also pre-loads the files into cache. This can be achieved with the following procedure:
Use the ExecTCL Before Stage routine and enter the following:
COUNT FileName
FileName should be the name of the hashed file used for reference lookups. Remember to specify the correct case.
If more than one file needs to be referenced, a paragraph must be created at DataStage TCL level using the editor and the name of the paragraph entered as the ExecTCL Before Stage routine. This can be accomplished by invoking a Telnet session to the DataStage Server. From the > prompt enter the following:
ED VOC ParagraphName (Substitute a descriptive name for ParagraphName)
The Editor will output status info indicating that this is a New Record. If this is not the case type Q, to exit the editor and select a different name for your paragraph.
Type I to enter input mode
Type PA to specify that this entry is a paragraph
Type COUNT FileName1 (substitute the actual filename for FileName1)
Type COUNT FileName2 (substitute the actual filename for FileName2)
Enter as many files as are required, then press the Enter key on a blank line to return to command mode, then type FILE to file the paragraph.
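For what it's worth, a session following the steps above would look roughly like this. This is only a sketch: the paragraph name (PRELOAD.LOOKUPS) and hashed file names (CustLookup, ProdLookup) are made up for illustration, and the exact editor prompts may differ by release.

```
>ED VOC PRELOAD.LOOKUPS
New record.
----: I
0001= PA
0002= COUNT CustLookup
0003= COUNT ProdLookup
0004=
----: FILE
```

You would then enter PRELOAD.LOOKUPS as the ExecTCL Before Stage routine's input value, in place of a single COUNT command.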
"Don't let the bull between you and the fence"
Thanks
Gregg J Knight
"Never Never Never Quit"
Winston Churchill
Incidentally, using the COUNT verb probably doesn't load anything into memory, much less the DataStage cache. COUNT is typically resolved from the record count stored in the hashed file's header (unless a write is currently open on the hashed file, which can be determined from the group lock table).
Loading into public shared cache is the topic of an entirely separate DataStage manual - this is not a feature of UniVerse, only of DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.