Efficient DS Job design to avoid errors

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ppp
Participant
Posts: 21
Joined: Mon Aug 31, 2009 11:53 am

Post by ppp »

I have a job with nearly 100 stages. It is multiple-instance enabled and runs 8x8. Until now it worked fine, but today the job failed with the message below:
APT_PMPlayer::APT_PMPlayer: fork() failed, Resource temporarily unavailable

It seems this is due to resource contention.
But I also want to understand whether the way my job is built (the number of stages) contributed to this error.
Can you please tell me if there are best practices for the number of stages in a DS job?

Thank you
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

This has been discussed already.
Please search the forum.
pandeeswaran
ppp
Participant
Posts: 21
Joined: Mon Aug 31, 2009 11:53 am

Post by ppp »

Pandeesh,

I searched this forum and found answers about the error itself, but nothing on best practices for the number of stages per DS job or on how a high stage count could hurt a job's performance.

Thanks
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Don't worry! Ray or Craig will assist you!
pandeeswaran
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Nice to have that vote of confidence, but we're far from being the only helpful posters on this site.

Getting this kind of error occasionally suggests that you're working your server close to its limits, and very occasionally exceeding them. It's simple supply and demand: you need to increase the supply of resources (CPU, memory, wherever the bottleneck is) or to schedule tasks more cleverly so that demand at any particular time is reduced. I'm not sure what you mean by 8x8; but it looks like you may have 16 other hours to play with.
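
One common cause of "fork() failed, Resource temporarily unavailable" is a per-user process limit, although memory or swap pressure can produce similar failures. The sketch below is only an illustration, assuming the limit is the bottleneck; it reads the soft process limit of the account running it and compares it with a planned process count. The function names and the planned figure are hypothetical, and `RLIMIT_NPROC` is not defined on every platform, so the code falls back to `None` where it is missing.

```python
import resource


def headroom(soft_limit, planned):
    """Processes left under a per-user limit, or None if the limit is unlimited."""
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - planned


def current_nproc_soft_limit():
    """Soft per-user process limit, or None where RLIMIT_NPROC is not defined."""
    limit_id = getattr(resource, "RLIMIT_NPROC", None)
    if limit_id is None:
        return None
    soft, _hard = resource.getrlimit(limit_id)
    return soft


if __name__ == "__main__":
    planned = 500  # hypothetical number of processes the job run will start
    limit = current_nproc_soft_limit()
    if limit is None:
        print("RLIMIT_NPROC is not available on this platform")
    else:
        # A negative result means the plan already exceeds the limit.
        print("soft limit:", limit, "headroom:", headroom(limit, planned))
```

Run it as the account that executes the jobs, since the limit is per user. The limit counts every process that user already owns, so the real headroom is smaller than the number printed.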
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

What Ray said. :wink:

General design principle: Never do in one job what you can logically break into two or more jobs. In Cobol development slang, that 100-stage job qualifies easily as "spaghetti code".

The primary caution in that principle is not execution performance, it's coding maintenance "performance". If I can spend (say) one hour searching through six jobs for the one where my coding changes need to be done, I can easily spend multiples of that time searching through one job that is as large as or larger than those six jobs combined. If the original coders paid any sort of attention to naming conventions, that hour might be much less.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Subdivide your 100-stage job's logic into smaller, more manageable pieces, then annotate each stage.
ppp
Participant
Posts: 21
Joined: Mon Aug 31, 2009 11:53 am

Post by ppp »

Thank you all for your suggestions.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

In addition to the options of breaking apart the job, examine it from the viewpoint of: Does the job really require 100 stages to do what it's doing? And, does it really need to run 8x8 (does your job run with 64 logical nodes per instance)?
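
To see why both questions matter, here is a rough back-of-the-envelope sketch of how many operating system processes such a job could start. It assumes one conductor per job instance, one section leader per node, and one player per stage per node, and it ignores operator combination, sequential stages, and any operators the framework inserts, so treat the result as an order-of-magnitude figure only. The function name is hypothetical, and both readings of "8x8" are guesses.

```python
def estimate_processes(stages, nodes, instances=1):
    """Rough upper bound on OS processes started by a parallel job.

    Assumes, per instance: 1 conductor, 1 section leader per node,
    and 1 player per stage per node (no operator combination).
    """
    per_instance = 1 + nodes + stages * nodes
    return per_instance * instances


# Reading "8x8" as 8 nodes with 8 concurrent instances (an assumption).
print(estimate_processes(stages=100, nodes=8, instances=8))   # 6472

# Reading "8x8" as 64 logical nodes in a single instance (also an assumption).
print(estimate_processes(stages=100, nodes=64, instances=1))  # 6465
```

Either way the count lands in the thousands, which is the kind of demand that can run into process limits. Halving the stage count or the node count roughly halves the figure.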

If you haven't already, I suggest reading the IBM Redbook on DataStage Parallel Framework Standard Practices (Google it for the free PDF download). While it doesn't cover some of the newer functions such as transformer looping, you may find suggestions applicable to your situation.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

jwiles wrote: If you haven't already, I suggest reading the IBM Redbook on DataStage Parallel Framework Standard Practices
http://www.redbooks.ibm.com/abstracts/S ... SEF&mync=E
-craig

"You can never have too many knives" -- Logan Nine Fingers