Hash files and equijoin

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
babbu9
Premium Member
Premium Member
Posts: 75
Joined: Tue Jun 01, 2004 9:44 am

Hash files and equijoin

Post by babbu9 »

Hi
I am retrieving 3 columns into a hash file and would like to do a equijoin with 3 columns coming from a flat file in the transformer stage.
1.I would like to know what happens if I define all three columns as keys in the hash file. How is this interpreted by Datastage. Is it that each column is primary key or that the combination of three columns is a primary key.

2. The equijoin in transformer is it equal to SQL equijoin i.e
is it same as

select *
from table a, table b
where a.col1=b.col1 and a.col2=b.col2 and c.col3 =c.col3


Please inform.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The metadata used to write a hash file has to be the metadata used to reference the hash. The values defined as the keys are written to a different area in a hash file, therefore, using a different metadata definition does not correctly find rows in the hash file. If you used 3 columns as a multi-part key, then those 3 columns have to be used to find the row when referencing the hash file.

If you "uncheck" a column as part of the multi-part key, then NO rows can ever be found. Unlike a SQL join, a hash file does not reference a row using matching criteria and can find 0, 1, or many rows. A hash reference finds either the exact match (there can only be one) or none.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
babbu9
Premium Member
Premium Member
Posts: 75
Joined: Tue Jun 01, 2004 9:44 am

More Info

Post by babbu9 »

Thank you for the promt reply.Can you eloborate the statement "The metadata used to write a hash file has to be the metadata used to reference the hash". Do the datatypes have to exactly the same including the lengths. In my case the datatype to create hash file is a char for the 3 cols and a varchar for referencing those 3 cols.

My problem is like this. I have a table(reference table) with carrier, account and group ids, all characters. I have to make sure that the records coming in through a flat file have exactly the same values for these 3 cols as in the reference table else reject the record.

How will we implement this.

Please inform.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

For a hash file, the order of the columns (hash files are really delimited strings of data, therefore "positional") and the key designations are what matters. The data types and widths currently don't matter to hash files.

Make sure you compare apples to apples. When referencing a hash file, you must make sure that the key expression in the reference link has the EXACT value it that should match the lookup row. This means making sure you have TRIMmed if necessary, and performed a case alignment (UPCASE,etc). "FRED" = "FRED" is the match, never "fred" = " FRED". The most common mistake is trailing/leading spaces or case inconsistencies, followed by numerics having trailing zeroes or not ("10.2" = "10.20" is not TRUE).
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply