Link Collector with Hashed file giving Empty Data set

Posted: Wed Mar 29, 2006 1:58 pm
by gateleys
Hi,
I never faced this problem before. I have a job where 3 sequential files are passed through a Link Collector, and the output is a hashed file (essentially to remove duplicate rows).

Code:

SeqFile1--------\
SeqFile2--------LinkCollector----------HashedFile
SeqFile3--------/

Each file has 2 input columns with the same metadata. Both fields are set as keys in the Hashed file (it's a requirement). The Collector is set to the Round-Robin collection algorithm.

When I run the job, the output link from the LinkCollector shows 125,000 rows passed to the hashed file. But when I try to view the data in the hashed file, it tells me the data set is empty. When I replace the Hashed file with a sequential file, it works fine. So the problem seems to be an incompatibility between the LinkCollector (algorithm) and the Hashed file, probably due to some wrong setting on my part.

I have also tried the Sort/Merge algorithm with both columns as sort keys, and I have tried using a single field as the key, etc. The hashed file still shows an empty set. Please let me know where I could have gone wrong.

Thanks.
gateleys
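
For readers unfamiliar with the stages involved, here is a rough Python sketch of what the job above is expected to do: take rows from three inputs in round-robin order and write them to a keyed store where a repeated key overwrites the earlier row. It only illustrates the intended logic and is not DataStage code; the function names `round_robin_collect` and `write_keyed` and the sample rows are made up for this example.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking an exhausted input


def round_robin_collect(*inputs):
    """Yield one row from each input in turn until every input is exhausted."""
    for group in zip_longest(*inputs, fillvalue=_MISSING):
        for row in group:
            if row is not _MISSING:
                yield row


def write_keyed(rows):
    """Store rows under a key built from both columns; a repeated key overwrites."""
    store = {}
    for col1, col2 in rows:
        store[(col1, col2)] = (col1, col2)
    return store


# Hypothetical sample data standing in for the three sequential files.
seq1 = [("a", 1), ("b", 2)]
seq2 = [("a", 1), ("c", 3)]
seq3 = [("b", 2), ("d", 4), ("e", 5)]

collected = list(round_robin_collect(seq1, seq2, seq3))
hashed = write_keyed(collected)

print(len(collected), "rows collected,", len(hashed), "distinct keys stored")
```

If the job behaved like this sketch, the hashed file would hold one row per distinct key pair, so an empty result suggests the rows never reached the file rather than being removed as duplicates.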

Re: Link Collector with Hashed file giving Empty Data set

Posted: Wed Mar 29, 2006 2:17 pm
by gateleys
I have now changed my earlier design to:

Code:

SeqFile1--------\ 
SeqFile2--------LinkCollector----->SeqFile------>HashedFile 
SeqFile3--------/ 

And it works fine. Of course, the job is done, but I am still curious about the behavior of the HashedFile when used at the output of a LinkCollector.

gateleys

Re: Link Collector with Hashed file giving Empty Data set

Posted: Thu Mar 30, 2006 7:17 am
by gateleys
I was just hoping someone could give me some information on this.
Thanks,
gateleys

Posted: Thu Mar 30, 2006 7:21 am
by kumar_s
I don't think duplicates are the issue either, because SeqFile------>HashedFile worked without any error.

Posted: Sun Apr 02, 2006 8:31 pm
by kcbland
Try replacing the intermediate sequential file with a transformer stage. You'll get the ability to add a reject link, which could trap out the rows being rejected from the hashed file stage.
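
The idea behind a reject link can be sketched in Python as follows: attempt each write, and set aside any row the writer refuses along with the reason. This is only an illustration of the concept, not DataStage code. The names `write_with_rejects` and `keyed_write` are hypothetical, and the failure condition (an empty key column) is invented for the example; the thread does not establish why rows might be rejected.

```python
def write_with_rejects(rows, write_row):
    """Try to write each row; collect the rows the writer refuses."""
    written = 0
    rejected = []
    for row in rows:
        try:
            write_row(row)
            written += 1
        except Exception as err:
            # Keep the row and the reason so it can be inspected later.
            rejected.append((row, str(err)))
    return written, rejected


store = {}


def keyed_write(row):
    """Hypothetical writer: refuses rows whose key columns are empty."""
    key = (row[0], row[1])
    if any(part in (None, "") for part in key):
        raise ValueError("empty key column")
    store[key] = row


rows = [("a", 1), ("", 2), ("b", None), ("a", 1)]
written, rejected = write_with_rejects(rows, keyed_write)

print(written, "written,", len(rejected), "rejected")
for row, reason in rejected:
    print("rejected:", row, "-", reason)
```

Comparing the written and rejected counts against the 125,000 rows reported on the link would show whether rows are being refused at the hashed file or lost somewhere earlier.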

Posted: Sun Apr 02, 2006 9:17 pm
by rleishman
Make sure that the Link Collector (LC) inputs and outputs have the same KEY columns checked. I've had similar problems when they don't match.