Inter-process

admiral69 · Post by **admiral69** » Mon Nov 08, 2004 8:38 am

I've a very simple job: SeqFile->Transformer->SeqFile.
I want to improve its performance.
It runs on a multi-processor system.
When I use the IPC stage I get great performance.
But when I specify it implicitly by turning inter-process
row buffering on via DataStage Administrator,
the data is read at about the same speed as in a single-process job,
even though I can see that all of the processors are working.

Thanks,
George

ogmios · Post by **ogmios** » Mon Nov 08, 2004 10:46 am

From sequential file to sequential file there's not much you can do to parallelize it... it's named sequential file for a reason. What you could do is split the job to output to several files and then recombine them afterwards, but I'm not sure whether this would really be worth it.

Ogmios

shawn_ramsey · Post by **shawn_ramsey** » Mon Nov 08, 2004 12:21 pm

admiral69 wrote: I've a very simple job: SeqFile->Transformer->SeqFile.
I want to improve its performance.
It runs on a multi-processor system.
When I use the IPC stage I get great performance.
But when I specify it implicitly by turning inter-process
row buffering on via DataStage Administrator,
the data is read at about the same speed as in a single-process job,
even though I can see that all of the processors are working.
Thanks,
George

If I recall correctly, turning on the automatic buffering in DataStage will only insert the IPC between two active stages. In your case the only active stage that you have is the one Transformer, so no IPC stage was inserted.

shawn_ramsey · Post by **shawn_ramsey** » Mon Nov 08, 2004 12:29 pm

Ogmios,

We use the IPC stage quite frequently in this type of scenario and have seen some significant performance benefits. It has less to do with parallelization of the processing of rows and more to do with splitting the processing of the single stream across multiple processors. The biggest benefit we have seen is where the source is a complex flat file and the destination is sequential: CFF -> IPC -> Xfrm -> Sequential.
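The idea of splitting a single stream across processors can be illustrated with a small Python sketch. It is not DataStage code: `run_pipeline` and its parameters are hypothetical names, and threads with a bounded queue stand in for the separate processes and row buffer that an IPC stage would sit between. The reader and the transformer run concurrently, and the stream itself is never partitioned.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the row stream


def run_pipeline(rows, transform, buffer_size=4):
    """Read rows on one thread and transform them on another.

    The bounded queue plays the role of the row buffer between the two
    sides: the reader blocks when it is full, the transformer blocks
    when it is empty.
    """
    buffer = queue.Queue(maxsize=buffer_size)
    output = []

    def reader():
        for row in rows:
            buffer.put(row)  # blocks while the buffer is full
        buffer.put(_END)

    def transformer():
        while True:
            row = buffer.get()  # blocks while the buffer is empty
            if row is _END:
                break
            output.append(transform(row))

    threads = [threading.Thread(target=reader), threading.Thread(target=transformer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output


result = run_pipeline(["a", "b", "c"], str.upper)
```

With one producer and one consumer, row order is preserved, which matches the "single stream" description above.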

admiral69 · Post by **admiral69** » Tue Nov 09, 2004 1:51 am

shawn_ramsey wrote: Ogmios,

We use the IPC stage quite frequently in this type of scenario and have seen some significant performance benefits. It has less to do with parallelization of the processing of rows and more to do with splitting the processing of the single stream across multiple processors. The biggest benefit we have seen is where the source is a complex flat file and the destination is sequential: CFF -> IPC -> Xfrm -> Sequential.

We also use it in more complicated jobs, but the question is: why is the performance different when I use the implicit option via DataStage Administrator?

ray.wurlod · Post by **ray.wurlod** » Tue Nov 09, 2004 2:40 pm

Because, as Shawn rightly said, implicit row buffering only occurs on a link that joins two active stages. Your job design only has one active stage (the Transformer stage).
Explicit IPC stages force a process boundary to exist.
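The rule can be modelled in a few lines of Python. This is only an illustrative sketch, not DataStage code: the `Stage` type and the `links_with_implicit_buffering` function are hypothetical names, and the job is assumed to be a simple linear chain of stages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    active: bool  # True for active stages (e.g. Transformer), False for passive ones


def links_with_implicit_buffering(stages):
    """Return the (upstream, downstream) name pairs that join two active stages.

    Models the rule from the thread: implicit inter-process row buffering
    applies only to a link with an active stage at both ends.
    """
    return [
        (a.name, b.name)
        for a, b in zip(stages, stages[1:])
        if a.active and b.active
    ]


# George's job: the Transformer is the only active stage, so no link qualifies.
simple_job = [
    Stage("SeqFile_in", active=False),
    Stage("Transformer", active=True),
    Stage("SeqFile_out", active=False),
]

# A hypothetical job with two Transformers in a row: one link qualifies.
two_transformer_job = [
    Stage("SeqFile_in", active=False),
    Stage("Transformer_1", active=True),
    Stage("Transformer_2", active=True),
    Stage("SeqFile_out", active=False),
]

print(links_with_implicit_buffering(simple_job))
print(links_with_implicit_buffering(two_transformer_job))
```

For the simple job the list of buffered links is empty, which is why turning the option on in Administrator changes nothing there.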

ewartpm · Post by **ewartpm** » Fri Nov 12, 2004 6:11 am

Hi George

Have you tried using the LINK PARTITIONER stage? It allows a single input stream to be split up to 64 ways, thereby utilising the SMP architecture (you need to have the Inter-Process option selected).
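As a rough picture of what splitting one stream several ways means, here is a small Python sketch. It is not how the stage is implemented: `partition_round_robin` is a hypothetical helper, and round-robin distribution is assumed purely for illustration. The only detail taken from the post is the limit of 64 partitions.

```python
MAX_PARTITIONS = 64  # upper limit mentioned in the post above


def partition_round_robin(rows, n_partitions):
    """Distribute rows from a single stream across n_partitions lists in turn."""
    if not 1 <= n_partitions <= MAX_PARTITIONS:
        raise ValueError(f"n_partitions must be between 1 and {MAX_PARTITIONS}")
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions


parts = partition_round_robin(range(10), 4)
print(parts)
```

Each partition could then be handed to its own downstream process, and the results collected again afterwards.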

admiral69 · Post by **admiral69** » Sun Nov 14, 2004 1:40 am

ewartpm wrote: Hi George

Have you tried using the LINK PARTITIONER stage? It allows a single input stream to be split up to 64 ways, thereby utilising the SMP architecture (you need to have the Inter-Process option selected).

Yep, I also use this stage.
Thank you all for your posts.

DSXchange
