Hi Team,
Advance wishes to you all - Happy New Year.
I have a scenario where I've not found any solution so far and I seek your expert help.
IIS (InfoSphere Information Server) version - 8.7.0.1
OS - Red Hat Enterprise Linux 6.7 (Santiago)
CPUs - 8
I have a sequencer that, when run, consumes 98% CPU. I ran mpstat and found that it is using just 1 CPU while the remaining 7 CPUs sit idle.
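For anyone reproducing this, here is a quick way to confirm the single-CPU pattern (a sketch; it assumes a standard RHEL box, and mpstat comes from the sysstat package):

```shell
# How many CPUs does the OS actually see?
cpus=$(getconf _NPROCESSORS_ONLN)
echo "CPUs online: $cpus"

# Per-CPU utilization, two 1-second samples: a single-threaded job
# pins one CPU near 100% %usr while the others stay essentially idle.
mpstat -P ALL 1 2 2>/dev/null || echo "mpstat not found (install sysstat)"
```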
Can you please let me know what steps / configuration changes need to be made to utilize all the available CPUs?
Thanks in advance.
Datastage job uses 1 CPU while remaining 7 are idle
Moderators: chulett, rschirm, roy
Thanks,
Umesh
DataStage sequences are of the "server job" type, not parallel... in case you don't have any parallel jobs in that sequence.
What's the value of the APT_CONFIG_FILE environment variable? Are you including that environment variable as a parameter in your sequence?
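For reference, a minimal 2-node configuration file looks like the one below (the hostname and resource paths are placeholders; your file will differ):

```
{
  node "node1" {
    fastname "etlserver"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node2" {
    fastname "etlserver"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
}
```

The number of node entries sets the default degree of parallelism, so a 1-node file will keep a parallel job on a single CPU no matter how many the box has.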
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
I know my reply sounded a bit facetious but it really wasn't. DataStage doesn't have any control over the number of CPUs that get leveraged - your operating system does. And unless you want to start worrying about processor affinity I would suggest you not worry about it all that much and let the O/S do its job.
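To see what the O/S is already allowed to do, you can inspect the process affinity mask (a sketch: `$$` stands in for your job's PID, and taskset ships with util-linux):

```shell
# An unrestricted mask on an 8-CPU box is ff: the scheduler may place
# the process on any CPU. A single-threaded job still only uses one.
if command -v taskset >/dev/null 2>&1; then
    affinity=$(taskset -p $$)
else
    affinity="taskset not installed (util-linux package)"
fi
echo "$affinity"
```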
-craig
"You can never have too many knives" -- Logan Nine Fingers
Make sure the parallel jobs are set to run in parallel.
You can force the issue by splitting the data stream and forking the chunks into repeated copies of the same job (container, parallel job, or even a particular stage) if you must. I dislike the practice, but it works when nothing else will. This technique will cause your job to get an unfair share of the available resources when many jobs are running at once; the others may "starve" if you overdo it. I highly recommend that you never manually fork beyond half of your available CPUs, so if you have 8, fork it 4 times at most, or maybe try just 3.
I assume that your server and OS are set up and configured to actually run things in parallel correctly. If you are set up for single-threaded only, the above won't help a bit.
---- HAH, I remember the excitable boy song. Classic alternative stuff
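The forking approach above might look like this with the dsjob CLI and a multi-instance job (a sketch, not the poster's exact method: PROJECT, LoadJob, and the PartNum parameter are placeholders, and dsjob is stubbed here so the sketch is self-contained; on a real server use $DSHOME/bin/dsjob):

```shell
# Stub standing in for the real DataStage CLI; replace with
# $DSHOME/bin/dsjob on an actual server.
dsjob() { echo "dsjob $*"; }

# Three forks on 8 CPUs, per the "never beyond half your CPUs" advice.
# LoadJob.N runs invocation N of a multi-instance job, each handling
# one partition of the data.
for i in 1 2 3; do
    dsjob -run -param PartNum=$i -wait PROJECT LoadJob.$i &
done
wait
```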
Umesh,
What happens when you run one of the parallel jobs in that DS sequence by itself from DS Director? Did it execute on one node or four?
Could you post here the value of the environment variable APT_CONFIG_FILE from the logs? (I know you stated that it is set to 4 nodes by default; we just need to see the value to determine what's causing the issue.)
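One way to read the degree of parallelism straight off the config file (a sketch; the 2-node sample file is written locally just so the sketch is self-contained -- on the server, point CFG at whatever path APT_CONFIG_FILE resolves to):

```shell
# Write a 2-node sample config just for this sketch.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
{
  node "node1" { fastname "etl" pools "" }
  node "node2" { fastname "etl" pools "" }
}
EOF

# Each 'node "..."' entry is one partition the engine can run on.
nodes=$(grep -c 'node "' "$CFG")
echo "degree of parallelism: $nodes"
rm -f "$CFG"
```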
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
When your sequencer runs, are your underlying jobs set to run sequentially or at the same time? Note that this is NOT a question of "Are they parallel canvas jobs?" I am asking whether your sequencer is kicking them off at the same time or chaining them together sequentially.
when you run "ps -ef | grep DSD.RUN" do you see multiple jobs running?
Are you executing all on one host or does your APT file send them to other hosts (SMP/Grid setup)?
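The process check above can be made grep-safe with the bracket trick (a sketch; on a box with no DataStage jobs running, the count is simply 0):

```shell
# [D] keeps grep from matching its own command line. Several DSD.RUN
# entries at once means the sequence really is firing jobs in
# parallel; exactly one means they are chained sequentially.
n=$(ps -ef | grep -c '[D]SD.RUN' || true)
echo "DSD.RUN processes: $n"
```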