
How to find out the processes forked by a DS job thru Unix?

Posted: Thu Sep 23, 2010 7:54 am
by synsog
Hi,

We have a project requirement to find all the processes forked by a DataStage job and to check whether any of those processes are using excessive CPU or IO. This will be done with a shell script.

Please help me find a way to -

1. List all the processes created by running a DS job.
2. Find the CPU and IO utilization of those processes (I am using vmstat for CPU and iostat for IO. Please suggest any more efficient ways to do that.)

We have to check these things via Unix only. No changes can be made to the jobs.

Thanks in advance.

Posted: Thu Sep 23, 2010 1:03 pm
by kduke
I tried to do this once. It is hard to do. You can set a parameter to show the PIDs in the log. You need to capture the processes along with their PIDs to trace these. You would need to capture the CPU utilization at the same time. There are public domain tools to do this. You need to load all of these into separate tables and do joins across the times. These are difficult joins because you are sampling the CPU information every 10 seconds or whatever time interval you choose. So a job starts up and maybe 80 or more processes kick off. You need to find the average CPU utilization for each process based on these samples.

10:00:00 Job1 starts
10:00:10 subprocess1 from job1 starts
10:10:00 subprocess1 from job1 ends
10:01:10 subprocess2 from job1 starts
10:08:00 subprocess2 from job1 ends
and so on

10:00:00 CPU 50%
10:01:00 CPU 90%
10:02:00 CPU 75%

Now average the CPU percentages for each subprocess over the samples that fall within its run time. Not easy.
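
A rough sketch of that time-window join, using awk from a shell function. The function name avg_cpu_per_process and both file layouts are assumptions made for illustration: the first file holds one line per subprocess (name, start, end) and the second holds one line per sample (time, CPU percent). Times are HH:MM:SS on the same day, and the samples are the overall CPU figures from the example above, not true per-process numbers.

```shell
# avg_cpu_per_process PROCS_FILE SAMPLES_FILE
#   PROCS_FILE   lines: name start end      (hypothetical layout)
#   SAMPLES_FILE lines: time cpu_percent    (hypothetical layout)
# Prints "name average" for every subprocess, averaging only the samples
# taken between its start and end times.
avg_cpu_per_process() {
    awk '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }

        # First file: remember each subprocess and its time window.
        NR == FNR { n++; name[n] = $1; s[n] = secs($2); e[n] = secs($3); next }

        # Second file: add each sample to every window that contains it.
        {
            t = secs($1)
            for (i = 1; i <= n; i++)
                if (t >= s[i] && t <= e[i]) { sum[i] += $2; cnt[i]++ }
        }

        END {
            for (i = 1; i <= n; i++) {
                if (cnt[i] > 0) printf "%s %.1f\n", name[i], sum[i] / cnt[i]
                else            printf "%s no-samples\n", name[i]
            }
        }
    ' "$1" "$2"
}
```

With many subprocesses and a short sampling interval the inner loop gets slow, which is one reason loading the data into tables may still be the better route.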

Posted: Thu Sep 23, 2010 5:53 pm
by ray.wurlod
Use the Monitor in Director.
Double click on any component - its PID should be displayed amongst the other information.

Posted: Fri Sep 24, 2010 7:35 am
by synsog
Hi,

We need to do this via unix only. Can't make any changes to the existing jobs or use the director client.

We ran ps aux to get the top 10 CPU-consuming processes, and to find the running DS jobs we grepped for DSD.RUN. Now the problem is how to associate the job PIDs with the top 10 process PIDs that we got from ps aux.

There is a possibility that the job forked some processes and one of those processes might be consuming more CPU. So what we want is the processes forked by that particular job.

The commands that I tried for finding the subprocesses are -

proctree
ps -fL

Please let me know if there is any other way for finding the subprocesses of a DS job in Unix.

Posted: Fri Sep 24, 2010 8:34 am
by chulett
Wouldn't all of the subprocesses show their parent's PID, i.e. their PPID? Am I missing something or couldn't you just simply walk that chain of pids? :?
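
Walking that chain could look something like the sketch below. The function name list_descendants is made up for illustration, and it assumes the input is "PID PPID" pairs such as those printed by ps -eo pid,ppid. It only finds processes that are still linked to the job through their PPID, so anything that has been re-parented away from the job will not show up.

```shell
# list_descendants ROOT_PID
# Reads "PID PPID" pairs on stdin and prints every descendant of ROOT_PID,
# one PID per line, breadth first.
list_descendants() {
    awk -v root="$1" '
        # Numeric rows only, so the "PID PPID" header line is skipped.
        $1 ~ /^[0-9]+$/ { kids[$2] = kids[$2] " " $1 }

        END {
            queue[1] = root; head = 1; tail = 1
            seen[root] = 1
            while (head <= tail) {
                pid = queue[head++]
                m = split(kids[pid], c, " ")
                for (i = 1; i <= m; i++) {
                    if (!(c[i] in seen)) {
                        seen[c[i]] = 1      # guard against loops in the data
                        print c[i]
                        queue[++tail] = c[i]
                    }
                }
            }
        }
    '
}

# Example use, where job_pid is the PID found by grepping for DSD.RUN:
#   ps -eo pid,ppid | list_descendants "$job_pid"
```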

Posted: Fri Sep 24, 2010 10:41 am
by kduke
chulett wrote:Wouldn't all of the subprocesses show their parent's PID, i.e. their PPID? Am I missing something or couldn't you just simply walk that chain of pids? :?
I thought so too, but that is not how they implemented it. They phantom some, which means they have no parent. I think the section leaders are that way. I am pretty sure that means they now become the parent, not the original job process.

Posted: Fri Sep 24, 2010 2:42 pm
by ray.wurlod
synsog wrote:We need to do this via unix only.
Why?

Why did you bother buying the DataStage tool?

Posted: Mon Sep 27, 2010 3:55 am
by synsog
chulett wrote:Wouldn't all of the subprocesses show their parent's PID, i.e. their PPID? Am I missing something or couldn't you just simply walk that chain of pids? :?
We tried using proctree <job_pid> and ps -fL <job_pid> to get the subprocesses created by the job process, but it's not working. It just shows me two processes -

1. The running job
2. Osh Monitor for that job

Posted: Mon Sep 27, 2010 4:03 am
by synsog
ray.wurlod wrote:
synsog wrote:We need to do this via unix only.
Why?

Why did you bother buying the DataStage tool?
Ray,

We need a monitoring script in Unix which we would be using to find out the CPU and IO usage by the running DS jobs - so that we can isolate the bad jobs (ones causing High Disk and CPU usage).

For this we just need to find out a way in which we can capture all the sub process ids created by a job process and then we are going to check the CPU & IO usage - all in just one script run.

Posted: Mon Sep 27, 2010 4:07 am
by synsog
kduke wrote:
chulett wrote:Wouldn't all of the subprocesses show their parent's PID, i.e. their PPID? Am I missing something or couldn't you just simply walk that chain of pids? :?
I thought so too, but that is not how they implemented it. They phantom some, which means they have no parent. I think the section leaders are that way. I am pretty sure that means they now become the parent, not the original job process.
Kim,

Is there any way in which we can list the process ids of the sub processes created by the DS job? Or is it just a wild goose chase :( ? Please suggest what approach we should take if we really need to find out the pids...

Posted: Mon Sep 27, 2010 5:58 am
by ray.wurlod
The Performance Monitoring tool within DataStage will give you all of the things you mentioned.

Posted: Mon Sep 27, 2010 3:44 pm
by kduke
Not sure what Ray is talking about. Sounds like you need to try it.

My way is to turn on the parameter to show PIDs. Run one job. Get the PIDs from the log file. While the job is running, run ps -ef >outfile1.txt several times, incrementing the 1 to 2, then 3, and so on. Load the text files into a table or spreadsheet. Find all the PIDs and add their sizes together. That will give you the RAM used.
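
The adding-up step could also be done straight from a snapshot file, along these lines. This is only a sketch under two assumptions: the function name sum_vsz_for_pids is made up, and the snapshot was taken with something like ps -eo pid,vsz >outfile1.txt so that it has a size column (plain ps -ef output normally does not). vsz is the virtual size in 1024-byte units, so the total is an upper bound rather than exact RAM.

```shell
# sum_vsz_for_pids SNAPSHOT_FILE PID...
# Adds up the second column (size) of SNAPSHOT_FILE for the listed PIDs.
sum_vsz_for_pids() {
    snapshot=$1
    shift
    awk -v pids="$*" '
        # Build a lookup of the PIDs taken from the job log.
        BEGIN { n = split(pids, p, " "); for (i = 1; i <= n; i++) want[p[i]] = 1 }

        # Add the size of every snapshot row whose PID is wanted.
        ($1 in want) { total += $2 }

        END { print total + 0 }
    ' "$snapshot"
}

# Example use with PIDs copied from the job log:
#   sum_vsz_for_pids outfile1.txt 101 102
```

Running it against each outfile in turn shows how the total changes while the job runs.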