Hi Team,
Advance wishes to you all - Happy New Year.
I have a scenario where I've not found any solution so far and I seek your expert help.
IIS (InfoSphere Information Server) version - 8.7.0.1
OS - Red Hat Enterprise Linux 6.7 (Santiago)
CPUs - 8
I have a sequencer that, when run, consumes 98% CPU. I ran mpstat and found that it is using just 1 CPU while the remaining 7 CPUs sit idle.
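For anyone reproducing this, here is a quick way to confirm the single-CPU pattern (a sketch; it assumes a standard RHEL box, and mpstat comes from the sysstat package):

```shell
# How many CPUs does the OS actually see?
cpus=$(getconf _NPROCESSORS_ONLN)
echo "CPUs online: $cpus"

# Per-CPU utilization, two 1-second samples: a single-threaded job
# pins one CPU near 100% %usr while the others stay essentially idle.
mpstat -P ALL 1 2 2>/dev/null || echo "mpstat not found (install sysstat)"
```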
Can you please let me know what steps / configuration changes need to be made to utilize all the available CPUs?
Thanks in advance.
Datastage job uses 1 CPU while remaining 7 are idle
Moderators: chulett, rschirm, roy
Thanks,
Umesh
DataStage sequences are of the "server job" type, not parallel... in case you don't have any parallel jobs in that sequence.
What's the value of the APT_CONFIG_FILE environment variable? Are you including that environment variable as a parameter in your sequence?
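For reference, a minimal 2-node configuration file looks like the one below (the hostname and resource paths are placeholders; your file will differ):

```
{
  node "node1" {
    fastname "etlserver"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node2" {
    fastname "etlserver"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
}
```

The number of node entries sets the default degree of parallelism, so a 1-node file will keep a parallel job on a single CPU no matter how many the box has.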
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
I know my reply sounded a bit facetious but it really wasn't. DataStage doesn't have any control over the number of CPUs that get leveraged - your operating system does. And unless you want to start worrying about processor affinity I would suggest you not worry about it all that much and let the O/S do its job.
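To see what the O/S is already allowed to do, you can inspect the process affinity mask (a sketch: `$$` stands in for your job's PID, and taskset ships with util-linux):

```shell
# An unrestricted mask on an 8-CPU box is ff: the scheduler may place
# the process on any CPU. A single-threaded job still only uses one.
if command -v taskset >/dev/null 2>&1; then
    affinity=$(taskset -p $$)
else
    affinity="taskset not installed (util-linux package)"
fi
echo "$affinity"
```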
-craig
"You can never have too many knives" -- Logan Nine Fingers
Make sure the parallel jobs are set to run in parallel.
You can force the issue by splitting the data stream and forking the chunks into repeated copies of the same job (container, parallel job, or even a particular stage) if you must. I dislike the practice, but it works when nothing else will. This technique will cause your job to get an unfair share of the available resources when many jobs are running at once; the others may "starve" if you overdo it. I highly recommend that you never manually fork beyond half of your available CPUs, so if you have 8, fork it 4 times at most, or maybe try just 3.
I assume that your server and OS are set up and configured to actually run things in parallel correctly. If you are set up for single-threaded only, the above won't help a bit.
---- HAH, I remember the excitable boy song. Classic alternative stuff
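The forking approach above might look like this with the dsjob CLI and a multi-instance job (a sketch, not the poster's exact method: PROJECT, LoadJob, and the PartNum parameter are placeholders, and dsjob is stubbed here so the sketch is self-contained; on a real server use $DSHOME/bin/dsjob):

```shell
# Stub standing in for the real DataStage CLI; replace with
# $DSHOME/bin/dsjob on an actual server.
dsjob() { echo "dsjob $*"; }

# Three forks on 8 CPUs, per the "never beyond half your CPUs" advice.
# LoadJob.N runs invocation N of a multi-instance job, each handling
# one partition of the data.
for i in 1 2 3; do
    dsjob -run -param PartNum=$i -wait PROJECT LoadJob.$i &
done
wait
```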
Umesh,
What happens when you run one of the parallel jobs in that DS sequence by itself from DS Director? Did it execute on one node or four?
Could you post here the value of the environment variable APT_CONFIG_FILE from the logs? (I know you stated that it is set to 4 nodes by default; we just need to see the value to determine what's causing the issue.)
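One way to read the degree of parallelism straight off the config file (a sketch; the 2-node sample file is written locally just so the sketch is self-contained -- on the server, point CFG at whatever path APT_CONFIG_FILE resolves to):

```shell
# Write a 2-node sample config just for this sketch.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
{
  node "node1" { fastname "etl" pools "" }
  node "node2" { fastname "etl" pools "" }
}
EOF

# Each 'node "..."' entry is one partition the engine can run on.
nodes=$(grep -c 'node "' "$CFG")
echo "degree of parallelism: $nodes"
rm -f "$CFG"
```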
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
When your sequencer runs, are your underlying jobs set to run sequentially or at the same time? Note that this is NOT a question of "Are they parallel canvas jobs?" I am asking whether your sequencer is kicking them off at the same time or chaining them together sequentially.
when you run "ps -ef | grep DSD.RUN" do you see multiple jobs running?
Are you executing all on one host or does your APT file send them to other hosts (SMP/Grid setup)?
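The process check above can be made grep-safe with the bracket trick (a sketch; on a box with no DataStage jobs running, the count is simply 0):

```shell
# [D] keeps grep from matching its own command line. Several DSD.RUN
# entries at once means the sequence really is firing jobs in
# parallel; exactly one means they are chained sequentially.
n=$(ps -ef | grep -c '[D]SD.RUN' || true)
echo "DSD.RUN processes: $n"
```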