I/O throughput

Posted: Tue Oct 18, 2011 11:46 am
by pavan_test
I am having a performance issue with a job I am currently running. The source is a fileset, and the job reads around 3 million records. When I run a single instance, it completes in 2 minutes. When I run 6 instances of the same job at the same time, each instance takes around 25 minutes, of which 23 minutes is spent just reading the source file.

Can someone please explain where I should look to find out why a single instance takes 2 minutes, while each of 6 concurrent instances spends 23 minutes on I/O?

Thanks
Mark

I/O throughput

Posted: Tue Oct 18, 2011 11:58 am
by pavan_test
When I ran a single instance, 20,306 records were processed per second. When 6 instances of the same job were running concurrently, 1,685 rows/sec were processed. The jobs were running with a 2x1 configuration file.
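
A quick back-of-the-envelope check of these figures, as a sketch only. It assumes the 1,685 rows/sec is the rate of each individual instance (not the total across all six), and the function and key names are made up for illustration:

```python
# Figures reported in this thread.
SINGLE_RATE = 20306      # rows/sec, one instance running alone
CONCURRENT_RATE = 1685   # rows/sec, ASSUMED to be per instance with six running
INSTANCES = 6


def summarize(single_rate, concurrent_rate, instances):
    """Compare the observed concurrent rate with a simple even split."""
    aggregate = concurrent_rate * instances
    return {
        # Total rows/sec delivered to all instances together.
        "aggregate_rate": aggregate,
        # What each instance would get if the single-instance rate were
        # simply divided evenly with no extra overhead.
        "fair_share_rate": single_rate / instances,
        # How many times slower each instance is than when running alone.
        "slowdown_per_instance": single_rate / concurrent_rate,
        # Combined throughput as a fraction of the single-instance rate.
        "aggregate_vs_single": aggregate / single_rate,
    }


if __name__ == "__main__":
    for name, value in summarize(SINGLE_RATE, CONCURRENT_RATE, INSTANCES).items():
        print(f"{name}: {value:.2f}")
```

If that assumption holds, the six instances together read fewer rows per second than one instance does alone, so the slowdown is more than just an even split of the same bandwidth.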

Thanks
Mark

Posted: Tue Oct 18, 2011 3:58 pm
by ray.wurlod
In a word, contention.

I/O throughput

Posted: Wed Oct 19, 2011 8:33 am
by pavan_test
Thank you. Where do I find this word contention?

I/O throughput

Posted: Wed Oct 19, 2011 9:09 am
by pavan_test
I could not find the word resource or contention anywhere in the DataStage log.

Posted: Wed Oct 19, 2011 2:50 pm
by jwiles
You won't. You are likely facing I/O contention issues, caused by many processes trying to read the same files at the same time. You will need to work with your system administrators to look at system reports (iostat, vmstat, sar, etc.) to see where there may be a bottleneck.
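
As a rough sketch of how those reports could be captured while the six instances are running: the helper names below are hypothetical, and the flags and the sampling choice (5-second interval, 3 samples) assume the Linux sysstat/procps versions of these tools. Other platforms take different options, so check with your administrators first.

```python
import shutil
import subprocess

# ASSUMED sampling: 5-second interval, 3 samples.
# Flags follow the Linux sysstat/procps tools; other platforms differ.
CANDIDATE_COMMANDS = [
    ["iostat", "-x", "5", "3"],  # extended per-device I/O statistics
    ["vmstat", "5", "3"],        # memory, swap, and CPU wait summary
    ["sar", "-d", "5", "3"],     # block device activity
]


def available_commands(candidates, which=shutil.which):
    """Return only the commands whose executable is found on PATH."""
    return [cmd for cmd in candidates if which(cmd[0]) is not None]


def collect(candidates=CANDIDATE_COMMANDS):
    """Run each available tool and return {tool name: captured stdout}."""
    reports = {}
    for cmd in available_commands(candidates):
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        reports[cmd[0]] = result.stdout
    return reports
```

Calling `collect()` once with a single instance running and once with all six gives two sets of reports to compare side by side, for example device utilization and wait times.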

Regards,

Re: I/O throughput

Posted: Wed Oct 19, 2011 11:16 pm
by deeplind07
You were supposed to look up the word 'contention' in a dictionary... Just a joke :) Sorry if you find this comment rude.
But as jwiles said, you are facing I/O contention: multiple processes are reading the same file, so they have to share the same resources, which is causing the slowdown.

Posted: Wed Oct 19, 2011 11:28 pm
by vishal_rastogi
Use the top command in a separate window while the job is running to check CPU utilization. I think it will be high in your case.