How to use Hash Files (server Jobs) in Parallel Jobs?

nitin376
Charter Member
Posts: 40
Joined: Tue Apr 11, 2006 9:38 am

How to use Hash Files (server Jobs) in Parallel Jobs?

Post by nitin376 »

Hi
I have hashed files in my server jobs that are used as lookups. I need to use them for lookups in parallel jobs, but Parallel Extender doesn't support hashed files. Do I need to convert the hashed files to a different format to access them from parallel jobs, and if so, how?
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

If you absolutely have to use the hashed files, you can use a server job, which is not advisable unless the job is small. The other way is to build a Lookup File Set the same way you built the hashed file.
You can also dump the contents of the hashed file to a sequential file in a server job and then load that sequential file into a Lookup File Set.
So you have a few options.
nitin376
Charter Member
Posts: 40
Joined: Tue Apr 11, 2006 9:38 am

How to build a lookup file set?

Post by nitin376 »

Hi,
Thanks for the prompt reply!
I don't know how to build a Lookup File Set or a hashed file. I would appreciate it if you could describe the procedure for building either one.

Thanks
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

By "build the hashed files", he means that you need to develop a job that loads the data into Lookup File Sets the same way your server job loads the hashed files.
Kris
avi21st
Charter Member
Posts: 135
Joined: Thu May 26, 2005 10:21 am
Location: USA

Re: How to build a lookup file set?

Post by avi21st »

Using hashed files through a server shared container in a parallel job is not advisable. Instead, create a job that extracts from the database and loads a dataset. Then use this dataset as the reference input in another job and match it on the join keys using a Lookup stage.

If the data volume is large (> 2 GB), use a Join stage instead.
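
To illustrate why volume matters here, the following is a rough Python sketch of the two strategies, not DataStage code. It assumes a hypothetical two-column (key, value) row layout and unique keys on the reference side. The function names `lookup_join` and `merge_join` are made up for this example. An in-memory lookup holds the whole reference input at once, while a sort-merge join only needs both inputs ordered on the key.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, str]  # (key, value); hypothetical two-column layout


def lookup_join(stream: Iterable[Row], reference: Iterable[Row]) -> List[Tuple[str, str, str]]:
    """In-memory lookup: the whole reference input is held in a dict."""
    ref: Dict[str, str] = dict(reference)  # memory use grows with reference size
    return [(k, v, ref[k]) for k, v in stream if k in ref]


def merge_join(left: Iterable[Row], right: Iterable[Row]) -> Iterator[Tuple[str, str, str]]:
    """Sort-merge join: both inputs are ordered on the key and read once.

    Assumes keys are unique on the right-hand (reference) side.
    sorted() is used for brevity and sorts in memory; a real engine
    would sort on disk or per partition.
    """
    right_iter = iter(sorted(right))
    current = next(right_iter, None)
    for k, v in sorted(left):
        # Advance the reference side until it catches up with the stream key.
        while current is not None and current[0] < k:
            current = next(right_iter, None)
        if current is not None and current[0] == k:
            yield (k, v, current[1])


stream_rows = [("b", "2"), ("a", "1"), ("c", "3")]
reference_rows = [("a", "X"), ("b", "Y")]
via_lookup = lookup_join(stream_rows, reference_rows)
via_merge = list(merge_join(stream_rows, reference_rows))
```

Both functions return the same matched rows. The difference is that the lookup keeps stream order and needs the reference data to fit in memory, while the merge emits rows in key order.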
Avishek Mukherjee
Data Integration Architect
Chicago, IL, USA.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

The easiest way is to write the hashed file to a sequential file and then use a PX job to read that sequential file into any of the lookup mechanisms available in PX. Consider putting the hashed-file-to-sequential-file portion in a container and embedding it inside the PX job. You want to make sure the data is moved into the parallel framework for optimal processing.
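
As a conceptual illustration only (plain Python, not DataStage), the export-then-lookup idea might look like the sketch below. The pipe delimiter, the column names `cust_id` and `cust_name`, and the helper names `load_lookup` and `enrich` are all assumptions made up for this example.

```python
import csv
import io
from typing import Dict, Iterable, List, Optional


def load_lookup(lines: Iterable[str], key_field: str) -> Dict[str, dict]:
    """Build an in-memory lookup keyed on one column of a delimited export."""
    reader = csv.DictReader(lines, delimiter="|")
    return {row[key_field]: row for row in reader}


def enrich(rows: Iterable[dict], lookup: Dict[str, dict], key_field: str) -> List[dict]:
    """Attach the reference column to each input row; unmatched rows get None."""
    out = []
    for row in rows:
        ref: Optional[dict] = lookup.get(row[key_field])
        out.append({**row, "cust_name": ref["cust_name"] if ref else None})
    return out


# Hypothetical pipe-delimited dump of the hashed file (first line is the header).
export = io.StringIO("cust_id|cust_name\n101|Acme\n102|Globex\n")
lookup = load_lookup(export, "cust_id")

orders = [{"order_id": "1", "cust_id": "102"}, {"order_id": "2", "cust_id": "999"}]
result = enrich(orders, lookup, "cust_id")
```

The flat file is the hand-off point: once the keyed data is in a plain delimited format, any lookup mechanism on the parallel side can load it.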
Kenneth Bland

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: How to use Hash Files (server Jobs) in Parallel Jobs?

Post by ray.wurlod »

Instead of the Hashed File stage, use the Lookup File Set stage. You provide the name of the File Set control file (for example, blah.fs), and you use a Lookup stage rather than a Transformer stage to perform the lookup.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.