Performance issue in funnel

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dr.murthy
Participant
Posts: 224
Joined: Sun Dec 07, 2008 8:47 am
Location: delhi

Performance issue in funnel

Post by dr.murthy »

HI ,

I designed with 6 OCI stages and combines data for all 6 stages using funnel and loaded into a final table, how ever this job took 50 mins to complete execution. Actually here out of the six OCI stages initially first 2 stages fetches data renaming stages wait till completes that then next two OCI stages start processing once these completes then next two stages srarts processing data. All these six stages connected to one funnel but these are not fetching data parallel.

Am running this job in two node configuration, Please suggest me how to fetch data from all 6 OCI stages parallely.
D.N .MURTHY
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Re: Performance issue in funnel

Post by ray.wurlod »

dr.murthy wrote:... but these are not fetching data parallel.
How do you know that?

What style of funnel are you performing?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Run a test job where you replace the the final table stage with a Copy stage and compare the run times.
Choose a job you love, and you will never have to work a day in your life. - Confucius
dr.murthy
Participant
Posts: 224
Joined: Sun Dec 07, 2008 8:47 am
Location: delhi

Post by dr.murthy »

Yes , I ran that by putting copy stage i could see some improvement in elapsed time but its not fetching data parallel from all the six OCI stages,
By looking at the performance stats i could see OCI stages not fetches data parallel y.

All these six stages we have once common table called TXN_HDR used in join it has 30 million records .
D.N .MURTHY
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Any chance the 6 select statements have varying complexity in the joins, if any, or the where clauses? It could be that each statement takes a different amount of time to for the database to start returning results.
Choose a job you love, and you will never have to work a day in your life. - Confucius
dr.murthy
Participant
Posts: 224
Joined: Sun Dec 07, 2008 8:47 am
Location: delhi

Post by dr.murthy »

Yes , all the six Sequels has different joins with different tables on TXN table , if all starts picks the records at a same time it can reduce 30 % of run time.
D.N .MURTHY
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

It does not sound like the problem is within the DataStage server. My guess is that the DataStage job may be spending 90% of the time just waiting on the database to provide results from one or more queries, so you need to focus on that aspect.

Run each of the 6 queries separately and see how long each one takes. Then work with another developer and/or DBA to tune the longest running query first, 2nd longest 2nd, optimize each one, make sure they're hitting indexes and not doing full table scans, see if any queries may be combined, etc.

Also have a DBA monitor the database while the normal job run all 6 queries to make sure the database is actually working on each query in parallel and to make sure that enough database server resources are available.
Choose a job you love, and you will never have to work a day in your life. - Confucius
Post Reply