
How to funnel multiple links into a single output link?

Posted: Sat Mar 24, 2007 12:31 am
by coehcl
Hi,
I have a Transformer stage which provides 5 output links, each having the same metadata.
I need to combine all the records into a single link. Basically I am trying to achieve the Funnel stage functionality of the PX flavour.

I tried
Transformer ==> Link Collector ==> Seq. File
but obviously it will not work, as both are active stages.

Could someone give me a pointer on this?

Warm Regards,
Coehcl

Posted: Sat Mar 24, 2007 1:04 am
by us1aslam1us
Check this reply by Ray.Wurlod from a previous post:

The Link Collector stage creates a process of its own; therefore to communicate with other active stages some form of inter-process communication mechanism is required - inter-process row buffering must be enabled (which is made very clear in the manual). If you want visibility of the buffering, you can add Inter Process Communication (IPC) stages to the job design. 
IHTH
sAM

Posted: Sat Mar 24, 2007 1:14 am
by coehcl
Hi,
Thanks for the response.
But an IPC stage cannot take more than one input link, hence
TXF ==> IPC (cannot take more than 1 input) ==> Link Collector ==> Seq. File
will not work.

Any other suggestions, please?

Thanks

Posted: Sat Mar 24, 2007 2:15 am
by rejith.ry
Use hashed files before and after the Link Collector: one hashed file per input link and one per output link of the Link Collector.

Posted: Sat Mar 24, 2007 2:20 am
by coehcl
Hi Rejith, thanks for your thoughts. I am not sure if it will work, but I feel that using hashed files (which will occupy physical memory) is not an apt or optimal solution for this scenario.

Can't we derive the output on the fly?

Posted: Sat Mar 24, 2007 5:18 am
by ray.wurlod
Five IPC stages. But these also take physical memory (for the buffers). Why can't your design take physical memory?

Posted: Sat Mar 24, 2007 12:56 pm
by coehcl
We have these jobs for almost 40 countries (hence 40 jobs per module; we have a number of modules and the design is generic). Considering the volume of records, I personally felt it is better to have a design which occupies less physical space; please correct me if I am wrong in my thoughts. Even though we can purge these hashed files, the housekeeping activity becomes an overhead.

Thanks for the response.

Posted: Sat Mar 24, 2007 1:03 pm
by ray.wurlod
There will always be an overhead. You cannot process large (or even small) volumes of data without demanding some resources to do so.
Simple economics.

Try a number of techniques, benchmark and measure them in your environment, and make your choice based upon those results.

"There ain't no such thing as a free lunch."

Posted: Sat Mar 24, 2007 1:51 pm
by chulett
I'd be curious what 'housekeeping' you are worried about. And if you really were worried about hashed files, they can be easily purged automagically - heck, you could even make that the trailing end of your job stream if you so desired.

Posted: Sat Mar 24, 2007 9:32 pm
by DSguru2B
You can also load the 5 links into 5 different flat files and, in an after-job subroutine, cat them all together.
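For illustration, the after-job concatenation could be as simple as the sketch below. The file names are hypothetical (substitute whatever paths your job actually writes to); the first loop only simulates the five per-link extracts so the snippet is self-contained.

```shell
#!/bin/sh
# Simulate the five per-link flat files the job would produce.
# (Hypothetical names - replace with your job's real output paths.)
for i in 1 2 3 4 5; do
  printf 'row from link %d\n' "$i" > "link$i.txt"
done

# The after-job step: concatenate all five parts into one output file.
cat link1.txt link2.txt link3.txt link4.txt link5.txt > combined.txt

# One row per link in this toy setup.
wc -l < combined.txt
```

Since `cat` simply appends the files in the order given, the relative order of rows from each link is preserved, which a Link Collector does not guarantee.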

Posted: Sun Mar 25, 2007 7:52 am
by chulett
More housekeeping. :wink:

Posted: Sun Mar 25, 2007 9:10 am
by coehcl
Thank you everyone for your expert suggestions.

I am closing this thread with the below solution as the optimal one:
TXF (5 o/p links) ==> 5 IPC stages ==> Link Collector ==> Seq. File
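Whichever design is chosen, it is worth confirming that the funnel drops no rows. A throwaway check is to compare the output row count against the sum of the per-link counts; the sketch below uses hypothetical file names and fabricates tiny inputs purely so it can run on its own.

```shell
#!/bin/sh
# Sanity check: the funneled output must hold exactly as many rows
# as all the inputs combined. File names here are hypothetical.
for i in 1 2 3 4 5; do
  printf 'record %d.1\nrecord %d.2\n' "$i" "$i" > "link$i.dat"
done
cat link1.dat link2.dat link3.dat link4.dat link5.dat > funneled.dat

# Sum the per-link row counts.
expected=0
for f in link?.dat; do
  expected=$((expected + $(wc -l < "$f")))
done
actual=$(wc -l < funneled.dat)

if [ "$expected" -eq "$actual" ]; then
  echo "OK: row counts match"
else
  echo "MISMATCH: expected $expected, got $actual" >&2
  exit 1
fi
```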
Warm regards,