Hi All,
I am developing a parallel job which needs to perform a lookup on a huge database table. The lookup involves 2 key columns and 1 data column (about 30 bytes in total) and about 10 million rows.
What kind of lookup mechanism is advisable for this job? How can I decide whether to use an ODBC lookup stage or a hashfile in this job?
Also, first of all, can hashfiles be used in parallel jobs in DataStage, or is that not advisable/available?
Thanks in advance
Venkatesh
Use of Hashfiles in a parallel job
Re: Use of Hashfiles in a parallel job
avenki77 wrote: Also, first of all, can hashfiles be used in parallel jobs in datastage or is it not advisable/available?
First of all, can they be used? Technically yes, in a Server Shared Container. Should they be used? No.
You've still got your Server Thinking Cap on. It needs to go back in the closet and you need to approach these problems with a different mindset. There are specific 'lookup' stages in PX you should be using - Join, Merge, Lookup - all of which are discussed in the Parallel Job Developer's Guide pdf.
-craig
"You can never have too many knives" -- Logan Nine Fingers
There is no such thing as a hash file in DataStage. A hashed file is a popular way to store lookup reference data in server jobs. They should not be used in parallel jobs, as to do so will thwart the automatic scaling capability of these jobs.
Stop thinking like a server job developer and investigate parallel alternatives such as Lookup File Sets or normal lookups via a Lookup stage, or a Join stage or Merge stage where appropriate.
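One rough way to weigh a Lookup stage (which holds its reference data in memory) against a Join stage (which works on key-sorted inputs) is to size the reference data first. The sketch below uses only the figures from the original question (about 10 million rows at about 30 bytes each). The function names and the 4-partition figure are illustrative assumptions, not DataStage settings, and the real in-memory footprint will be larger than the raw size because of per-row and indexing overhead.

```python
# Back-of-envelope sizing of the reference data described in the question.
# ROWS and BYTES_PER_ROW come from the original post; PARTITIONS is a
# hypothetical degree of parallelism chosen only for illustration.
ROWS = 10_000_000
BYTES_PER_ROW = 30
PARTITIONS = 4  # assumption, not taken from the thread


def raw_reference_size_bytes(rows: int, bytes_per_row: int) -> int:
    """Raw payload size, ignoring any storage or indexing overhead."""
    return rows * bytes_per_row


def per_partition_bytes(total_bytes: int, partitions: int) -> int:
    """Share per partition if rows are spread evenly on the key (rounded up)."""
    return -(-total_bytes // partitions)  # ceiling division


if __name__ == "__main__":
    total = raw_reference_size_bytes(ROWS, BYTES_PER_ROW)
    share = per_partition_bytes(total, PARTITIONS)
    print(f"raw reference data: {total / 1024**2:.0f} MiB")
    print(f"per partition ({PARTITIONS} partitions, even split): {share / 1024**2:.0f} MiB")
```

If the estimate, plus overhead, fits comfortably in the memory available to the job, an in-memory lookup is worth trying; if not, a Join on sorted inputs is the usual fallback.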
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.