Hi,
We are on DataStage version 8.0. I've just got through the pain of writing a nice little job that reads from a database and then creates an XML document using multiple XML Output stages.
I've now been told by someone on my team that it is not best practice to use the XML Output stage in a parallel job, because it launches a sequence job for each instance of the stage and hence degrades performance.
Is this correct? I have my doubts, because the XML Output stage is documented in the Parallel Job Developer Guide.
Thanks
XML Output stage in Parallel Job - bad performance
-
- Premium Member
- Posts: 27
- Joined: Tue Jan 05, 2010 12:04 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
I don't believe it is correct, but I am struggling to remember back to 8.0. As a general rule, though, a stage that is marked as parallel executes in multiple processes, one per processing node defined in the configuration file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
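For readers who want a picture of the "one process per processing node" rule described above, here is a toy Python sketch. It is not DataStage code and does not reflect how the engine is implemented: the node names, the sample rows, the round-robin partitioning, and the helper functions `partition_round_robin` and `rows_to_xml` are all assumptions made for illustration. It only shows rows being dealt out to two hypothetical nodes, with each partition turned into its own XML fragment independently of the other.

```python
import xml.etree.ElementTree as ET


def partition_round_robin(rows, node_count):
    """Deal rows out to node_count partitions in turn (illustrative only)."""
    partitions = [[] for _ in range(node_count)]
    for index, row in enumerate(rows):
        partitions[index % node_count].append(row)
    return partitions


def rows_to_xml(rows, node_name):
    """Build one XML fragment from the rows of a single partition."""
    root = ET.Element("customers", {"node": node_name})
    for row in rows:
        customer = ET.SubElement(root, "customer", {"id": str(row["id"])})
        customer.text = row["name"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-node configuration and sample rows (assumptions).
node_names = ["node1", "node2"]
rows = [{"id": i, "name": f"customer{i}"} for i in range(1, 6)]

# Each partition is handled on its own; in a real parallel job each
# would be processed by a separate operating-system process.
partitions = partition_round_robin(rows, len(node_names))
fragments = [rows_to_xml(part, name) for part, name in zip(partitions, node_names)]

for fragment in fragments:
    print(fragment)
```

With two nodes, the stage logic runs twice, once per partition, and no extra job is launched for either instance.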
What? XML brings in a Sequence Job?
No. The xmlOutput Stage is a regular Stage like any other: it can be parallelized and behaves like any other Stage. It's not a screamer (XML processing usually isn't), but it certainly doesn't launch a Sequence.
As you move up in releases, you will be happy to start using the new XML Stage (8.5+), which dramatically reduces the number of steps/stages needed to write a complex, multi-node XML document.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
The xmlOutput Stage was written a long time ago, before Enterprise Edition, and like all the stages of that time it had to be adapted to parallel jobs. That meant a special wrapper had to be created so that those stages could run in parallel, but even so, it didn't mean launching a server job or anything like that. The closest the architecture comes to that is the BASIC Transformer Stage, but we're talking here about xmlOutput. The xmlOutput Stage, like the xmlInput Stage, uses XSLT in its processing, whether in an EE Job or a Server Job; perhaps that is where the initial confusion got started.
Ernie
Ernie Ostic
Hi,
Yes, what Ernie describes sounds very much like what was suggested to me: a wrapper that lets the stage run in a parallel job. However, judging by all the responses, I don't think it is a concern, so I'll go back to my parallel job and continue with that.
Thanks all for the answers. I'm going to mark this thread as resolved.