
How to use Hash Files (server Jobs) in Parallel Jobs?

Posted: Mon Oct 02, 2006 10:52 am
by nitin376
Hi
I have hashed files in my server jobs, where they are used as lookups. I need to use them for lookups in parallel jobs, but Parallel Extender doesn't support hashed files. Do I need to convert the hashed files to a different format to access them in parallel jobs, and if so, how?

Posted: Mon Oct 02, 2006 11:20 am
by DSguru2B
If you absolutely have to use the hashed files, you can use a server job, which is not advisable unless the job is small. The other way is to build a lookup file set, just as you built the hashed file.
You can also write the contents of the hashed file to a sequential file in a server job and then load that sequential file into a lookup file set.
So you have a few options.
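
To make the sequential-file route a bit more concrete, here is a rough Python sketch of the idea only. It is not DataStage code: in a real job you would do this with stages, not a script. The pipe delimiter, the header row, and the column names (CUST_ID, CUST_NAME, AMOUNT) are all made up for illustration.

```python
import csv
import io


def load_lookup(seq_file, key_column, delimiter="|"):
    """Read a delimited file with a header row into a dict keyed on key_column."""
    reader = csv.DictReader(seq_file, delimiter=delimiter)
    lookup = {}
    for row in reader:
        # In this dict, a later row with the same key replaces the earlier one.
        lookup[row[key_column]] = row
    return lookup


def enrich(stream_rows, lookup, key_column, default=None):
    """Pair each input row with its matching reference row, or with default."""
    for row in stream_rows:
        yield row, lookup.get(row[key_column], default)


# Hypothetical sample data standing in for the exported hashed file.
reference_dump = io.StringIO(
    "CUST_ID|CUST_NAME\n101|Acme\n102|Globex\n101|Acme Corp\n"
)
lookup = load_lookup(reference_dump, "CUST_ID")

# Hypothetical input rows standing in for the main data stream.
stream = [{"CUST_ID": "101", "AMOUNT": "50"}, {"CUST_ID": "999", "AMOUNT": "10"}]
results = list(enrich(stream, lookup, "CUST_ID"))
```

The point is only that the reference data has to be exported once into a flat form and then reloaded into a keyed structure that the lookup can probe. Rows with no match come back with the default value, so you still have to decide what to do with them.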

How to build lookup file set?

Posted: Mon Oct 02, 2006 11:29 am
by nitin376
Hi,
Thanks for the prompt reply!
I don't know how to build a lookup file set or a hashed file. I would appreciate it if you could tell me the procedure for building either one.

Thanks

Posted: Mon Oct 02, 2006 11:39 am
by kris007
By "build the hashed file", he means that you need to develop a job that loads the data into the lookup file set the same way you load the hashed files in your server job.

Re: How to build lookup file set?

Posted: Mon Oct 02, 2006 11:41 am
by avi21st
Using hashed files through a server shared container in a parallel job is not advisable. Instead, create a job that extracts from the database and loads a dataset. Use this dataset in another job and match it on the join keys using a Lookup stage.

If the data volume is large (> 2 GB), use a Join stage instead.
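
To show why volume matters when choosing between the two, here is a small Python sketch of the two strategies. It is a conceptual illustration, not how either stage is actually implemented. The function names and sample rows are made up, and the merge join assumes the reference keys are unique.

```python
def hash_lookup(stream, reference, key):
    """Lookup style: hold the whole reference input in an in-memory table."""
    table = {row[key]: row for row in reference}
    return [(row, table.get(row[key])) for row in stream]


def merge_join(left, right, key):
    """Join style: sort both inputs on the key, then walk them in step.

    Inner join; assumes each key appears at most once in `right`.
    """
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    matched = []
    j = 0
    for left_row in left:
        # Skip reference rows whose key is smaller than the current key.
        while j < len(right) and right[j][key] < left_row[key]:
            j += 1
        if j < len(right) and right[j][key] == left_row[key]:
            matched.append((left_row, right[j]))
    return matched


# Hypothetical sample rows.
stream = [
    {"id": 2, "v": "b"},
    {"id": 1, "v": "a"},
    {"id": 3, "v": "c"},
    {"id": 2, "v": "d"},
]
reference = [{"id": 1, "name": "one"}, {"id": 2, "name": "two"}]

looked_up = hash_lookup(stream, reference, "id")
joined = merge_join(stream, reference, "id")
```

The lookup approach is fast as long as the reference table fits in memory. The merge approach pays for sorting but only needs to look at the current row from each side, which is why it copes better with large reference data.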

Posted: Mon Oct 02, 2006 12:31 pm
by kcbland
The easiest way is to write the hashed file to a sequential file and then use a PX job to read that sequential file into any of the lookup mechanisms available in PX. Consider putting the hashed-file-to-sequential-file portion in a container and embedding it inside the PX job. You want to make sure the data is moved into the parallel framework for optimal processing.

Re: How to use Hash Files (server Jobs) in Parallel Jobs?

Posted: Mon Oct 02, 2006 3:06 pm
by ray.wurlod
Instead of a Hashed File stage, use a Lookup File Set stage. You provide the name of the File Set control file (for example, blah.fs), and you use a Lookup stage rather than a Transformer stage to perform the lookup.