Job performance

Raghavendra · Post by **Raghavendra** » Wed Dec 06, 2006 10:52 am

I have two extract jobs with the same design and the same options,
but I see a huge difference in performance between them.

Job design is as follows:

DB2 database ------> Hashed file ("Use account name" is enabled)

Job1 SQL:

SELECT LTRIM(RTRIM(Column1)),LTRIM(RTRIM(Column2)),Column3,Column4,Column5,Column6
FROM #Schema#.Table1
WHERE START_DT <= Processing Date
AND END_DT >= Processing Date

Result for Job1 is as follows:

Number of rows extracted: 180076
CPU seconds used:185.250
seconds elapsed:224.328

Job2 SQL:

SELECT Column1,LTRIM(RTRIM(Column2)),Column3
FROM #Schema#.Table2
WHERE START_DT <= Processing Date
AND END_DT >= Processing Date

Result for Job2 is as follows:

Number of rows extracted: 171420
CPU seconds used:723.300
seconds elapsed:746.108

My concern is that we have set all options the same for both jobs, yet Job2 still takes longer to finish.
We don't have any before-job or after-job subroutine for Job2.
In fact, Job2 extracts fewer records and fewer columns than Job1.

What could be the reason the second job takes more time than the first one?
I have run the job three times today and the result is always the same.

Can you give me some pointers on what to check to find out why the second job takes more time?

Raghavendra · Post by **Raghavendra** » Wed Dec 06, 2006 10:58 am

The total number of records is greater in Table1 (used in Job1) than in Table2 (used in Job2).

ArndW · Post by **ArndW** » Wed Dec 06, 2006 11:09 am

Table1 and Table2 are different tables in your database. They have different numbers of records, and perhaps different internal table setups / indices as well.

narasimha · Post by **narasimha** » Wed Dec 06, 2006 12:03 pm

How long do these queries take to return results outside DataStage?
That could be your first check.

ray.wurlod · Post by **ray.wurlod** » Wed Dec 06, 2006 7:18 pm

It would be very unusual for both jobs to take exactly the same amount of time. There are so many factors that can cause variation.

Is it the same hashed file? Loading an empty hashed file is quicker than loading additional records into a populated hashed file. Are you using write cache? Perhaps in one job and not in the other?

sb_akarmarkar · Post by **sb_akarmarkar** » Thu Dec 07, 2006 1:01 am

Hi,

I think it also depends on the data types of the columns you are selecting in the two SELECT statements, and on whether the columns in the WHERE clause are indexed.

Thanks,
Anupam

kumar_s · Post by **kumar_s** » Thu Dec 07, 2006 1:57 am

As mentioned, the cause may not lie only within DataStage; you may need to look at the database side as well. It also depends on the two tables. Are they the same? Are they both indexed on similar columns? Were they both analyzed recently? Do the queries against each table take the same time?

Raghavendra · Post by **Raghavendra** » Sun Dec 10, 2006 11:04 am

My initial assumption was that the indices on the two tables were the same, since the two tables are of the same type.
I had a look at the two tables and found that their indices are different.
That is the reason for the difference.
Thanks for your valuable pointers.
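
For anyone hitting the same problem, here is a minimal sketch of the effect, not the actual DB2 setup from this thread. It uses Python's built-in sqlite3 module as a stand-in database, and the table, column, and index names (table1, table2, start_dt, end_dt, ix_table1_dates) are hypothetical. It builds two identical tables, indexes the date columns on only one of them, and asks the database how it plans the same date-range query against each.

```python
import sqlite3

# SQLite is only a stand-in for DB2 here; all names below are hypothetical.
conn = sqlite3.connect(":memory:")

# Two tables with the same layout and the same data.
rows = []
for i in range(1000):
    month = (i % 12) + 1
    rows.append((f"row{i}", f"2006-{month:02d}-01", f"2006-{month:02d}-28"))

for table in ("table1", "table2"):
    conn.execute(f"CREATE TABLE {table} (column1 TEXT, start_dt TEXT, end_dt TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)

# Only table1 gets an index on the columns used in the WHERE clause.
conn.execute("CREATE INDEX ix_table1_dates ON table1 (start_dt, end_dt)")


def query_plan(table, processing_date):
    """Return SQLite's plan description for the date-range query on `table`."""
    sql = (
        f"EXPLAIN QUERY PLAN SELECT COUNT(*) FROM {table} "
        "WHERE start_dt <= ? AND end_dt >= ?"
    )
    plan_rows = conn.execute(sql, (processing_date, processing_date)).fetchall()
    # The last column of each plan row holds the human-readable detail text.
    return " | ".join(row[-1] for row in plan_rows)


plan1 = query_plan("table1", "2006-12-06")
plan2 = query_plan("table2", "2006-12-06")
print("table1:", plan1)
print("table2:", plan2)
```

The plan for table1 should mention the index, while table2 falls back to a full scan. On DB2 itself the equivalent check would be done with the database's own explain facility and catalog views rather than with this script.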