Inter-process

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

admiral69
Participant
Posts: 3
Joined: Sun Aug 29, 2004 5:52 am

Inter-process

Post by admiral69 »

I have a very simple job: SeqFile -> Transformer -> SeqFile.
I want to improve its performance.
It runs on a multi-processor system.
When I use the IPC stage explicitly, I get great performance.
But when I specify it implicitly by turning on inter-process
row buffering via the DataStage Administrator,
the data is read at the same speed as in a single-process job,
even though I can see that all of the processors are working. Why is that?
Thanks,
George
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Inter-process

Post by ogmios »

Going from sequential file to sequential file, there's not much you can do to parallelize it... it's named a sequential file for a reason. What you could do is split the job to output to several files and then recombine them afterwards, but whether this would really be worth it is another question.

Ogmios
In theory there's no difference between theory and practice. In practice there is.
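
A rough illustration of the split-and-recombine idea from the post above, written in Python rather than as a DataStage job. The function name `split_transform_recombine` and its parameters are hypothetical. In a real setup each part would be handled by its own job running concurrently; this sketch processes the parts one after another and only shows that contiguous chunks keep the original row order once the part files are concatenated.

```python
import os
import tempfile


def split_transform_recombine(rows, parts, transform):
    """Write transformed rows to several part files, then concatenate them.

    rows      -- list of input rows (strings)
    parts     -- desired number of part files (hypothetical parameter)
    transform -- function applied to each row
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Ceiling division so that at most `parts` contiguous chunks are produced.
    chunk = max(1, -(-len(rows) // parts))
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for start in range(0, len(rows), chunk):
            path = os.path.join(tmp, f"part_{len(paths)}.txt")
            with open(path, "w", encoding="utf-8") as out:
                for row in rows[start:start + chunk]:
                    out.write(transform(row) + "\n")
            paths.append(path)
        # Recombine: reading the parts in creation order restores row order.
        combined = []
        for path in paths:
            with open(path, encoding="utf-8") as src:
                combined.extend(line.rstrip("\n") for line in src)
    return combined
```

The extra file writes and the final concatenation are the overhead that makes the benefit uncertain for a simple job.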
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Re: Inter-process

Post by shawn_ramsey »

If I recall correctly, turning on automatic buffering in DataStage will only insert an IPC between two active stages. In your case the only active stage you have is the one Transformer, so no IPC stage was inserted.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Re: Inter-process

Post by shawn_ramsey »

Ogmios,

We use the IPC stage quite frequently in this type of scenario and have seen some significant performance benefits. It has less to do with parallelizing the processing of rows and more to do with splitting the processing of the single stream across multiple processors. The biggest benefit we have seen is where the source is a complex flat file and the destination is sequential: CFF -> IPC -> Xfrm -> Sequential.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
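
The kind of single-stream pipelining described in the post above can be sketched as a producer and a consumer joined by a bounded buffer. This is a Python analogy, not DataStage code: threads stand in for the separate processes that sit on either side of an IPC stage, and `run_pipeline` and `buffer_size` are hypothetical names.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the row stream


def run_pipeline(rows, transform, buffer_size=4):
    """Read rows in one thread and transform them in another.

    The bounded queue plays the role of the buffer between the two sides:
    the reader blocks when it is full, the transformer blocks when it is empty.
    """
    buf = queue.Queue(maxsize=buffer_size)
    out = []

    def reader():
        for row in rows:
            buf.put(row)  # blocks while the buffer is full
        buf.put(_DONE)

    def transformer():
        while True:
            row = buf.get()  # blocks while the buffer is empty
            if row is _DONE:
                break
            out.append(transform(row))

    workers = [threading.Thread(target=reader), threading.Thread(target=transformer)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return out
```

Rows are still handled one at a time and in order; the gain comes from reading and transforming overlapping in time, not from processing several rows at once.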
admiral69
Participant
Posts: 3
Joined: Sun Aug 29, 2004 5:52 am

Re: Inter-process

Post by admiral69 »

We also use the IPC stage in more complicated jobs, but the question is: why is the performance different when I enable it implicitly via the option in DataStage Administrator?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Because, as Shawn rightly said, implicit row buffering only occurs on a link that joins two active stages. Your job design only has one active stage (the Transformer stage).
Explicit IPC stages force a process boundary to exist.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
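
The rule stated in the post above can be modelled in a few lines of Python. This is only a toy model of a linear job, not anything DataStage exposes: the `Stage` class and `implicitly_buffered_links` function are hypothetical, and the active/passive flags follow the thread (the Transformer is the only active stage in the original job).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    active: bool  # True for active stages such as a Transformer


def implicitly_buffered_links(stages):
    """Return (upstream, downstream) pairs for links joining two active stages.

    Assumes a simple linear job where each stage feeds the next one.
    """
    return [
        (up.name, down.name)
        for up, down in zip(stages, stages[1:])
        if up.active and down.active
    ]


# The job from the original question: only one active stage, so no link qualifies.
original_job = [
    Stage("SeqFile_in", False),
    Stage("Transformer", True),
    Stage("SeqFile_out", False),
]

# A hypothetical variant with two Transformers back to back: one link qualifies.
two_transformer_job = [
    Stage("SeqFile_in", False),
    Stage("Transformer_1", True),
    Stage("Transformer_2", True),
    Stage("SeqFile_out", False),
]
```

Under this model the original job gets no implicit buffering at all, which matches the single-process read speed reported in the question.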
ewartpm
Participant
Posts: 97
Joined: Wed Jun 25, 2003 2:15 am
Location: South Africa

Post by ewartpm »

Hi George

Have you tried using the Link Partitioner stage? It allows a single input stream to be split up to 64 ways, thereby utilising the SMP architecture (you need to have the Inter-Process option selected).
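
As a rough picture of what splitting one stream several ways means, here is a minimal Python sketch. It assumes round-robin distribution, which is a choice made for this illustration and not a claim about how the stage is configured in any given job; `round_robin_partition` is a hypothetical name, and the limit of 64 is taken from the post above.

```python
MAX_PARTITIONS = 64  # upper limit mentioned in the post above


def round_robin_partition(rows, ways):
    """Distribute rows over `ways` output partitions in round-robin order."""
    if not 1 <= ways <= MAX_PARTITIONS:
        raise ValueError(f"ways must be between 1 and {MAX_PARTITIONS}")
    partitions = [[] for _ in range(ways)]
    for index, row in enumerate(rows):
        partitions[index % ways].append(row)
    return partitions
```

Each partition could then be handled by its own process, with the results collected again downstream.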
admiral69
Participant
Posts: 3
Joined: Sun Aug 29, 2004 5:52 am

Post by admiral69 »

Yep, I also use this stage.
Thank you all for your posts :)