Hash File (mis)behaviour
Posted: Tue Aug 30, 2005 1:27 am
Hi,
We have a job which looks as follows:
Ora9i ----> Trans_1 ----> Hash1 -----> (Further Stages)
The output Link of Hash1 is a reference link to further stages in the job.
Output row counts:
---------------------
Ora9i to Trans_1 = 5150800
Trans_1 to Hash1 = 5150800
Hash1 to Further Stages = 5150811
Please note that Trans_1 sends 5150800 rows to Hash1, whereas Hash1 outputs 5150811 rows, 11 more than it received. This is quite strange because we expect the input and output row counts to be equal. The file names and column definitions in the 'Input' and 'Output' tabs of Hash1 are exactly the same, and only the first field (column) of Hash1 is defined as the Key in both tabs.
We then created a new job that reads this hash file (HashFile1) and writes it to a sequential file. In that job, both the input link and the output link show a row count of 5150800.
Can someone tell us why HashFile1 would output more rows than it receives as input?
Regards
Shrikanth