Using Hashed Files

sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

Using Hashed Files

Post by sh.bangash »

Hi Gurus,

This may sound very basic to you, but as the saying goes, a journey of a thousand miles starts with a single leap, and so it is with me:

By now, I know what hashed files are and what we use them for. But I need to know:
1. How do we populate a hashed file the first time? For example, if I have a text file containing a list of countries with their dialing codes as the primary key, how do I declare in ADS Designer that this is a hashed file that serves as the source for all lookups of dialing codes against the country name?
2. Secondly, how do I use it for lookups in a server job?

I hope you will help me take that single leap towards covering the thousand-mile journey of ADS.

Regards,
Shahid.
Akumar1
Participant
Posts: 48
Joined: Tue May 22, 2007 3:38 am
Location: bangalore

Re: Using Hashed Files

Post by Akumar1 »

You need to create the hash file before you can populate it with data.
Once the hash file is created, you can use it as a lookup (specifying the same file name on the input and the output). Then, based on the key value, you can pass the corresponding value to the destination database.

Regards,
Akumar1

sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

Re: Using Hashed Files

Post by sh.bangash »

Hi Kumar,
Thank you for the response, but my question is: what steps do I need to take to convert a text file into a hashed file, and then use it as a source for lookups?

Thanks,
Shahid.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

1. DataStage will automatically create the hashed file, or you can specify that it does so by checking the "Create file" check box in the hashed file stage that populates it.

2. The Hashed File stage is on the upstream end of a reference link that is input to a Transformer stage. The Transformer stage includes a reference key expression that supplies the value to be looked up as the key of the hashed file (this can be more than one column, in which case there is more than one reference key expression in the Transformer stage).
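As a rough analogy only (this is not DataStage code), the reference lookup described in point 2 behaves much like a keyed dictionary lookup. The sketch below assumes a Python dict standing in for the hashed file; the names `country_by_code`, `lookup` and the sample rows are hypothetical and not taken from any DataStage API.

```python
# Analogy only: a Python dict stands in for the hashed file.
# All names and sample values here are hypothetical.

# "Hashed file": key column -> remaining columns of the record.
country_by_code = {
    "PK": {"country_name": "Pakistan"},
    "AU": {"country_name": "Australia"},
}


def lookup(hashed_file, key):
    """Return the record stored under key, or None when the key is absent."""
    return hashed_file.get(key)


# Rows arriving on the stream (primary) input of the "Transformer".
authors = [
    {"author": "A. Writer", "country_code": "PK"},
    {"author": "B. Author", "country_code": "ZZ"},  # no matching key
]

enriched = []
for row in authors:
    # The "reference key expression": the stream value used as the key.
    ref = lookup(country_by_code, row["country_code"])
    name = ref["country_name"] if ref else None
    enriched.append({**row, "country_name": name})

# A multi-column key needs one key expression per key column;
# here that is modelled as a tuple key.
rate_by_country_and_year = {("PK", 2007): 0.15}
rate = lookup(rate_by_country_and_year, ("PK", 2007))
```

In this sketch a missed lookup simply yields `None` for the looked-up column; how a real job should treat unmatched rows is a design decision made in the Transformer.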

Akumar1, it's hashed file, not hash file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

Post by sh.bangash »

Hi,
Here are the steps I have taken to create a server job that uses a hashed file, but the job is not working properly.

1. I dropped an ODBC stage and configured it to get data from the SQL Server database table authors.
2. I linked it to a Transformer, which is meant to look up the author's country code and get the country name by matching the country code between the authors table and the author_country hashed file.
3. I dropped a Hashed File stage on the canvas and linked it to the Transformer stage.
I configured it on the Output tab as follows:
a. Use directory path : d:\acensial\mydata
b. File Name: Authors_codes.csv
4. I configured the Transformer stage to look up the author's country name from the hashed file on the basis of Author.Country_Code=Country.Country_Code

5. I linked the output of the Transformer to a sequential file.

Now I am facing the following problem:

1. When I execute the job, it fails with a "job aborted" error.

Am I missing some basic point here?

Regards,
Shahid.
Akumar1
Participant
Posts: 48
Joined: Tue May 22, 2007 3:38 am
Location: bangalore

Re: Using Hashed Files

Post by Akumar1 »

Please follow the steps below to create the hash file and reuse it as a lookup:
1. Take either a text file or a relational database as the source, and add a hash file stage as the intermediate target that you want to use for lookups.
2. Check the create file option.
3. Specify the path where the file is to be written, then execute the job. The hash file will be created in that directory.
4. Now design another job. Take your source and add the hash file stage for the lookup. Specify the same path where you created the hash file, and make sure the input and output both use the same file name. You should be able to view the hash file data once you specify the file name. Connect this stage to a Transformer.
5. Add the stage the data is to be written to, and map the required columns to the target on the output.

Based on the key value of the lookup (hash file) and your condition, it will filter the records and pass them to the target DB or file.
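The two-job pattern above can be sketched in Python as a loose analogy, assuming a dict stands in for the hash file and an in-memory string stands in for the source text file. The function names `job1_load` and `job2_lookup`, the column names and the sample data are all hypothetical; none of this is DataStage syntax.

```python
import csv
import io

# Hypothetical sample data standing in for the source text file.
COUNTRY_TEXT = "country_code,country_name\nPK,Pakistan\nIN,India\n"


def job1_load(text, key_column):
    """'Job 1': read delimited text and write each row to a keyed store."""
    store = {}
    for rec in csv.DictReader(io.StringIO(text)):
        store[rec[key_column]] = dict(rec)
    return store


def job2_lookup(store, rows, key_column):
    """'Job 2': pass each source row through, adding the looked-up name."""
    for row in rows:
        match = store.get(row[key_column])
        name = match["country_name"] if match else None
        yield {**row, "country_name": name}


# Run "job 1" first so the keyed store exists, then run "job 2" against it.
store = job1_load(COUNTRY_TEXT, "country_code")
source = [{"author": "A. Writer", "country_code": "IN"}]
result = list(job2_lookup(store, source, "country_code"))
```

The point of the split is ordering: the load has to complete before the lookup job reads the same file, which is why the steps describe two separate jobs pointing at the same path and file name.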

Hope it will work.

Regards,
Akumar1


sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

Post by sh.bangash »

Hi Kumar,

Thank you for the guidance; it worked!
I created a hashed file, used it to look up the country name based on the country code, and wrote the results to the target ODBC table.
While working on this I observed the following:
1. When I created the hashed file, it resulted in the creation of a directory containing three files: .Type30, DATA.30 and OVER.30.
2. One question came to mind: currently I have 100 records in the text file that I used to create the hashed file. In future there will be additional records in the text file. Do I have to recreate the hashed file each time to include the new records?

Thanks again and Regards.
Shahid.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

1. Yes, the default type of hashed file is a 'Dynamic' or Type 30 hashed file. It is stored in a directory of the same name with the three files you've noted. Of course, there are other 'static' types as well but the vast majority of what you do will be with dynamic hashed files.

2. That depends. If you need to 'add' the new records to the old data, simply write them to the hashed file. Keep in mind that writes are 'destructive overwrites' by key: there's no such thing as duplicate records per key in a hashed file, and 'last one in wins'. This also implies that no 'update' is done; a new record completely replaces the old record with the same key. If you need to replace the old contents with new content, ensure the 'Clear' check-box is checked on the Input link.
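For anyone who finds this easier to see in code, here is a minimal Python analogy of the write behaviour described above, assuming a dict stands in for the hashed file. The `write_rows` helper, its `clear` flag and the sample records are hypothetical illustrations, not DataStage options or API calls.

```python
def write_rows(hashed_file, rows, key, clear=False):
    """Write rows into a dict keyed on `key`, optionally clearing it first."""
    if clear:
        hashed_file.clear()  # analogous to ticking 'Clear' on the input link
    for row in rows:
        # Destructive overwrite: the whole record is replaced, never merged.
        hashed_file[row[key]] = dict(row)
    return hashed_file


hf = {}
write_rows(hf, [{"code": "PK", "name": "Pakistan", "dial": "92"}], "code")

# Same key written again: last one in wins, and the old 'dial' value is gone.
write_rows(hf, [{"code": "PK", "name": "Pakistan (new)"}], "code")
after_overwrite = dict(hf["PK"])

# New key without clearing: the record is added alongside the existing one.
write_rows(hf, [{"code": "IN", "name": "India"}], "code")
keys_after_add = sorted(hf)

# With clear=True the old contents are discarded before the new rows go in.
write_rows(hf, [{"code": "AU", "name": "Australia"}], "code", clear=True)
keys_after_clear = sorted(hf)
```

So for the 100-record case, writing only the additional records without clearing is enough, provided that replacing any existing record with the same key is acceptable.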
-craig

"You can never have too many knives" -- Logan Nine Fingers
Akumar1
Participant
Posts: 48
Joined: Tue May 22, 2007 3:38 am
Location: bangalore

Re: Using Hashed Files

Post by Akumar1 »

Hi,

I hope the information provided by chulett is more than enough. I would like to add one more thing: whenever you create a hash file, by default it creates the .Type30, DATA.30 and OVER.30 files, which you can't view directly because their format is binary.

Regards,
Akumar1