Page 1 of 1

Hash files and equijoin

Posted: Thu Feb 10, 2005 8:56 pm
by babbu9
Hi
I am retrieving 3 columns into a hash file and would like to do a equijoin with 3 columns coming from a flat file in the transformer stage.
1.I would like to know what happens if I define all three columns as keys in the hash file. How is this interpreted by Datastage. Is it that each column is primary key or that the combination of three columns is a primary key.

2. The equijoin in transformer is it equal to SQL equijoin i.e
is it same as

select *
from table a, table b
where a.col1=b.col1 and a.col2=b.col2 and c.col3 =c.col3


Please inform.

Posted: Thu Feb 10, 2005 9:07 pm
by kcbland
The metadata used to write a hash file has to be the metadata used to reference the hash. The values defined as the keys are written to a different area in a hash file, therefore, using a different metadata definition does not correctly find rows in the hash file. If you used 3 columns as a multi-part key, then those 3 columns have to be used to find the row when referencing the hash file.

If you "uncheck" a column as part of the multi-part key, then NO rows can ever be found. Unlike a SQL join, a hash file does not reference a row using matching criteria and can find 0, 1, or many rows. A hash reference finds either the exact match (there can only be one) or none.

More Info

Posted: Thu Feb 10, 2005 9:24 pm
by babbu9
Thank you for the promt reply.Can you eloborate the statement "The metadata used to write a hash file has to be the metadata used to reference the hash". Do the datatypes have to exactly the same including the lengths. In my case the datatype to create hash file is a char for the 3 cols and a varchar for referencing those 3 cols.

My problem is like this. I have a table(reference table) with carrier, account and group ids, all characters. I have to make sure that the records coming in through a flat file have exactly the same values for these 3 cols as in the reference table else reject the record.

How will we implement this.

Please inform.

Posted: Thu Feb 10, 2005 9:42 pm
by kcbland
For a hash file, the order of the columns (hash files are really delimited strings of data, therefore "positional") and the key designations are what matters. The data types and widths currently don't matter to hash files.

Make sure you compare apples to apples. When referencing a hash file, you must make sure that the key expression in the reference link has the EXACT value it that should match the lookup row. This means making sure you have TRIMmed if necessary, and performed a case alignment (UPCASE,etc). "FRED" = "FRED" is the match, never "fred" = " FRED". The most common mistake is trailing/leading spaces or case inconsistencies, followed by numerics having trailing zeroes or not ("10.2" = "10.20" is not TRUE).