Page 1 of 1

Job is running very slow

Posted: Tue Dec 20, 2011 4:29 pm
by ravij
Hi,

I have a simple one to one mapping job is taking long time to run. My job design is OraOci -> Trf -> OraOCI.

To Process 4440 records it took 30mins and the other day it took 58 mins to procsess just 125 records in Production. We have a user defined source extract query to pick the data based on dates. The query is

Code: Select all

select * from (select LZ.*,row_number() over (partition by incident_id order by incident_id) rn from lz_incident lz where lz.LAST_UPDATED_DATE >=  to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')  and 
lz.LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS')  and  lz.last_updated_date in
 (select max(last_updated_date) from lz_incident group by incident_id)) where rn=1
and
LAST_UPDATED_DATE >=  to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS') 
and LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS')
The target Write Method is 'Load'. I really don't understand the behaviour of this job. Can anybody help me where do I look into the issue.

Thanks in advance.
RJ

Re: Job is running very slow

Posted: Tue Dec 20, 2011 4:38 pm
by samdsx
Split the job into 2 jobs (extracts to dataset) and (dataset to load) and see if it improves.

Posted: Tue Dec 20, 2011 6:33 pm
by qt_ky
I ran into that on 7.5 some years back with same job design taking 11 hours. After splitting the job exactly as samdsx suggested here, the total time went to 30 minutes on millions of rows. I never found the root cause for performance issue within a single job.

Re: Job is running very slow

Posted: Tue Dec 20, 2011 7:06 pm
by SURA
OraOci -> peek-->Trf ->peek-->OraOCI

Try this.

DS User

Re: Job is running very slow

Posted: Tue Dec 20, 2011 11:12 pm
by ray.wurlod
SURA wrote:OraOci -> peek-->Trf ->peek-->OraOCI

Try this.

DS User
Why? What do you believe it will accomplish in terms of the original question?

Posted: Tue Dec 20, 2011 11:23 pm
by pandeesh
How long it takes, if you run the below query in toad or any other clients?

Code: Select all

select * from (select LZ.*,row_number() over (partition by incident_id order by incident_id) rn from lz_incident lz where lz.LAST_UPDATED_DATE >=  to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')  and
lz.LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS')  and  lz.last_updated_date in
 (select max(last_updated_date) from lz_incident group by incident_id)) where rn=1
and
LAST_UPDATED_DATE >=  to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')
and LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS')

Re: Job is running very slow

Posted: Tue Dec 20, 2011 11:34 pm
by SURA
Ray

1) If an issue in the source and it took time(!) to pull the data from the table.

In that case the option what i said will not help.

Whereas something wrong while process in or after tfm on that case if peek is used, you can find the data flow without hanging somewhere is the whole aim.

In other way to narrow down...

DS User

Posted: Wed Dec 21, 2011 12:31 am
by jwiles
Is this a parallel or server job? You have OracleOCI listed as the source stage....

Is the table lz_incident indexed on LAST_UPDATED_DATE?

Posted: Wed Dec 21, 2011 1:58 am
by manoj_23sakthi
Hi,
Could You please give the stats of this extract query in both Datastage and Toad ?
Please index columns for Joins
I am not sure about this table and it key column
so i blindly giving this query..
WITH VAL AS ( KEYCOLUMNS,rn )
(SELECT KEYCOLUMNS ,row_number() over (partition by incident_id order by
incident_id) rn from
lz_incident lz
where
lz.LAST_UPDATED_DATE >= to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')
and lz.LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS') )

SELECT LZ.*,A.rn FROM lz_incident LZ ,VAL A
WHERE
L2 KEYCOLUMNS = A.KEYCOLUMNS
AND rn= 1 AND LAST_UPDATED_DATE >= to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')
and LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS')

after this extract use sort stage and remove dups stage to get max(LAST_UPDATED_DATE) by its grouping column



Please let me know if i am wrong

Posted: Wed Dec 21, 2011 1:59 am
by manoj_23sakthi
Hi,
Could You please give the stats of this extract query in both Datastage and Toad ?
Please index columns for Joins
I am not sure about this table and it key column
so i blindly giving this query..
WITH VAL AS ( KEYCOLUMNS,rn )
(SELECT KEYCOLUMNS ,row_number() over (partition by incident_id order by
incident_id) rn from
lz_incident lz
where
lz.LAST_UPDATED_DATE >= to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')
and lz.LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS') )

SELECT LZ.*,A.rn FROM lz_incident LZ ,VAL A
WHERE
L2 KEYCOLUMNS = A.KEYCOLUMNS
AND rn= 1 AND LAST_UPDATED_DATE >= to_date('#pFROM_DATE#', 'YYYY-MM-DD-HH24:MI:SS')
and LAST_UPDATED_DATE < to_date('#pTO_DATE#', 'YYYY-MM-DD-HH24:MI:SS')

after this extract use sort stage and remove dups stage to get max(LAST_UPDATED_DATE) by its grouping column



Please let me know if i am wrong