Hi,
I have a server job that reads a fixed-width file and separates the columns using a Transformer. Another basic Transformer then does the validations. If I do all of this in a single Transformer, will the performance improve?
This is an existing design, and I am trying to reduce the time the job takes to run.
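To make the question concrete, here is a minimal Python sketch (not DataStage BASIC) of what "two Transformers versus one" means for a single row. The column layout, the field names, and the validation rules are all hypothetical and only stand in for whatever the real job does.

```python
# Hypothetical layout: (column name, start offset, width). Not taken from the real job.
LAYOUT = [("id", 0, 5), ("name", 5, 10), ("amount", 15, 8)]


def split_columns(line):
    """First step: cut a fixed-width record into named, trimmed fields."""
    return {name: line[start:start + width].strip() for name, start, width in LAYOUT}


def validate(row):
    """Second step: return a list of validation errors (empty if the row is clean)."""
    errors = []
    if not row["id"].isdigit():
        errors.append("id must be numeric")
    if not row["name"]:
        errors.append("name is empty")
    try:
        float(row["amount"])
    except ValueError:
        errors.append("amount is not a number")
    return errors


def split_and_validate(line):
    """Combined step: both pieces of logic applied to a row in one pass."""
    row = split_columns(line)
    return row, validate(row)


# Build sample records by padding, so the widths always match LAYOUT.
good = "00042" + "Widget".ljust(10) + "19.99".rjust(8)
bad = "00A42" + "".ljust(10) + "n/a".rjust(8)
```

Calling `split_and_validate(good)` yields the parsed row with no errors, while `split_and_validate(bad)` reports one error per field. The work done per row is the same either way; what changes between the two designs is only how the row is handed from one step to the next.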
transformer question
I am not a pro at server jobs, but let me give it a shot.
If you think about it, even with pipeline parallelism in place for your two Transformers, each row is still processed by Transformer1 and then by Transformer2 in sequence. The logic of two consecutive Transformers can always be put into one, so I am not sure that moving the logic from two Transformers into one will really help; you may not see much difference in performance. If you want to improve performance, you can instead use a Link Partitioner to split the data into two streams, process each stream with its own Transformer, and then merge the streams back together with a Link Collector.
Server gurus, correct me if I am wrong.
Minhajuddin
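The partition, process, collect idea suggested above can be sketched in Python as follows. This is only an analogy for the data flow, not for how DataStage runs the stages: the round-robin scheme, the thread pool, and the `transform` function are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest


def partition_round_robin(rows, n):
    """Deal rows out to n streams in turn (hypothetical partitioning scheme)."""
    return [rows[i::n] for i in range(n)]


def transform(row):
    """Stand-in for the Transformer logic (hypothetical: upper-case the row)."""
    return row.upper()


def run_stream(stream):
    """Apply the same transform to every row of one stream."""
    return [transform(row) for row in stream]


def collect_round_robin(streams):
    """Interleave the streams again, which restores the original row order."""
    missing = object()  # padding marker for streams that ran out of rows
    return [row
            for group in zip_longest(*streams, fillvalue=missing)
            for row in group
            if row is not missing]


def process(rows, n=2):
    """Partition rows into n streams, process the streams concurrently, merge."""
    streams = partition_round_robin(rows, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(run_stream, streams))
    return collect_round_robin(results)
```

For example, `process(["a", "b", "c", "d", "e"])` splits the rows into `["a", "c", "e"]` and `["b", "d"]`, transforms each stream, and merges them back in order. Whether splitting the work this way helps in a real job depends on whether the machine has spare capacity for the extra stream.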
Minhajuddin wrote: Server Gurus, Correct me if I am wrong.
I am not a Guru, but you stand corrected.
Minhajuddin wrote: Even with the Pipeline parallelism in place for your two transformers, The rows are going to be processed by Transformer1 and then by Transformer2 in a sequential fashion.
However, if you use inter-process buffering between the two Transformers, the second one runs as a separate process in pipeline fashion, which can give you a performance gain.
gateleys
I am a server job guru (!) and wish to clarify as follows.
If two Transformer stages - or any kind of active stages - are directly connected by a link then, by default, they will run in the same process.
Enabling inter-process row buffering, whether at the job level or by placing an IPC stage between the two active stages, will cause them to run in separate processes.
Whether or not this results in improved throughput will depend on a number of factors, primarily the total load on the machine. If it's already maxed out, then no gain is possible by adding processes.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
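The difference described in the post above, between directly linked stages sharing one process and stages separated by a row buffer, can be illustrated with a small Python sketch. It is a structural analogy only: threads and `queue.Queue` stand in for separate processes and the inter-process buffer, and the names and the `buffer_rows` parameter are hypothetical, not DataStage settings.

```python
import queue
import threading

SENTINEL = object()  # marks end of data on a link


def stage(func, inbox, outbox):
    """One active stage: read rows from inbox, apply func, write to outbox."""
    while True:
        row = inbox.get()
        if row is SENTINEL:
            outbox.put(SENTINEL)  # pass end-of-data downstream
            return
        outbox.put(func(row))


def run_same_process(rows, first, second):
    """Directly linked stages: each row passes through both in one thread."""
    return [second(first(row)) for row in rows]


def run_pipeline(rows, first, second, buffer_rows=128):
    """Buffered stages: each stage has its own worker, joined by bounded queues."""
    link_in = queue.Queue(maxsize=buffer_rows)
    link_mid = queue.Queue(maxsize=buffer_rows)  # the buffer between the stages
    link_out = queue.Queue()  # unbounded, so it can be drained after the workers end
    workers = [
        threading.Thread(target=stage, args=(first, link_in, link_mid)),
        threading.Thread(target=stage, args=(second, link_mid, link_out)),
    ]
    for worker in workers:
        worker.start()
    for row in rows:
        link_in.put(row)  # blocks while the first buffer is full
    link_in.put(SENTINEL)
    for worker in workers:
        worker.join()
    results = []
    while True:
        row = link_out.get()
        if row is SENTINEL:
            return results
        results.append(row)
```

Both functions return the same rows in the same order; only the execution structure differs. In the buffered version the second stage can start on row 1 while the first stage is already working on row 2, which helps only if there is spare capacity to run both workers at once.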