
Efficient DS Job design to avoid errors

Posted: Sat Oct 08, 2011 10:03 pm
by ppp
I have a job with nearly 100 stages; it has multiple instance enabled and runs 8x8. Until now it was working fine, but today the job failed with the message below:
APT_PMPlayer::APT_PMPlayer: fork() failed, Resource temporarily unavailable

It seems the above is due to resource contention, but I also want to understand whether the way my job is built (the number of stages) has contributed to this error.
Can you please tell me whether there are best practices for the number of stages in a DS job.

Thank you

Posted: Sat Oct 08, 2011 11:28 pm
by pandeesh
This has been discussed already.
Please search here.

Posted: Mon Oct 10, 2011 8:53 am
by ppp
Pandeesh,

I searched this forum and found answers related to the error, but nothing about best practices for the number of stages per DS job and how that could negatively impact a job's performance.

Thanks

Posted: Mon Oct 10, 2011 11:17 am
by pandeesh
Don't worry! Ray or Craig will assist you!

Posted: Mon Oct 10, 2011 2:08 pm
by ray.wurlod
Nice to have that vote of confidence, but we're far from being the only helpful posters on this site.

Getting this kind of error occasionally suggests that you're working your server close to its limits, and very occasionally exceeding them. It's simple supply and demand: you need to increase the supply of resources (CPU, memory, wherever the bottleneck is) or to schedule tasks more cleverly so that demand at any particular time is reduced. I'm not sure what you mean by 8x8; but it looks like you may have 16 other hours to play with.
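For what it's worth, "Resource temporarily unavailable" from fork() is normally EAGAIN, which on most UNIX/Linux systems means the DataStage user's process limit (ulimit -u) or available memory was exhausted at that instant. A minimal sketch for reading that ceiling from Python is below; it assumes RLIMIT_NPROC is the constraint in play, which is an assumption rather than a diagnosis.

import resource

# Per-user process ceiling (what "ulimit -u" reports). fork() starts failing
# with EAGAIN once this user's process count reaches the soft limit.
# Assumption: RLIMIT_NPROC is the limit being hit; kernel-wide pid limits
# and memory/swap exhaustion can produce the same error and are not checked here.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("max user processes: soft=%s hard=%s" % (soft, hard))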

Posted: Mon Oct 10, 2011 2:30 pm
by FranklinE
What Ray said. :wink:

General design principle: never do in one job what you can logically break into two or more jobs. In COBOL development slang, that 100-stage job easily qualifies as "spaghetti code".

The primary caution behind that principle is not execution performance, it's coding maintenance "performance". If I can spend (say) one hour searching through six jobs for the one where my coding changes need to be made, I can easily spend multiples of that time searching through one job that is as large as or larger than those six jobs combined. If the original coders paid any sort of attention to naming conventions, that hour might be much less.

Posted: Mon Oct 10, 2011 3:35 pm
by lstsaur
Subdivide your 100-stage job's logic into smaller, more manageable pieces, then provide annotation for each stage.

Posted: Mon Oct 10, 2011 4:15 pm
by ppp
Thank you all for your suggestions.

Posted: Mon Oct 10, 2011 8:05 pm
by jwiles
In addition to the option of breaking the job apart, examine it from this viewpoint: does the job really require 100 stages to do what it's doing? And does it really need to run 8x8 (does your job run with 64 logical nodes per instance)?
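To put rough numbers on that: each operator runs as a player process on every partition of every concurrently running instance, so the totals multiply quickly. The sketch below is only a back-of-the-envelope model under my assumptions (the framework combines some operators and inserts others such as sorts and buffers, so the real count differs), but it shows why trimming any one factor helps.

# Back-of-the-envelope player-process estimate. Assumption: roughly one player
# per stage per partition per concurrent instance; operator combination and
# framework-inserted operators mean the real figure will differ.
def estimated_players(stages, partitions_per_instance, concurrent_instances):
    return stages * partitions_per_instance * concurrent_instances

# 100 stages at 8 partitions with 8 concurrent instances (one reading of "8x8")
print(estimated_players(100, 8, 8))   # 6400
# Halving both factors cuts the process demand by a factor of four
print(estimated_players(100, 4, 4))   # 1600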

If you haven't already, I suggest reading the IBM Redbook on DataStage Parallel Framework Standard Practices (Google it for the free PDF download). While it doesn't cover some of the newer functions such as transformer looping, you may find suggestions applicable to your situation.

Regards,

Posted: Mon Oct 10, 2011 8:08 pm
by chulett
jwiles wrote: If you haven't already, I suggest reading the IBM Redbook on DataStage Parallel Framework Standard Practices
http://www.redbooks.ibm.com/abstracts/S ... SEF&mync=E