Poor performance at sequential file stage while reading

csc.datastage
Premium Member
Posts: 23
Joined: Wed Sep 10, 2008 4:46 pm
Location: USA

Poor performance at sequential file stage while reading

Post by csc.datastage »

Hello:

While processing an end-to-end run in my project, one of the DS server jobs is showing poor performance. Here is the job design:

Code:

Sequential File Stage 
     |
     V
Transformer <-- OraOCI9 Stage as Lookup1
   Stage    <-- OraOCI9 Stage as Lookup2
     |
     V
Sequential File Stage

We've been running this process for five years and had no problem until now; suddenly it started running with slow performance. The source sequential file has only 34,500 records. The before-dependent job for this source ran successfully without any problem, and the post-dependent jobs are also working fine; only this particular process is failing to complete. Both lookups extract 28 million records. Earlier we used hashed file lookups but, due to some development issues, we changed those lookups to ORAOCI9.

Normally this job takes only 5-15 minutes, and the average is 30K-50K records per run. Currently the sequential file stage is not reading from the source: it has been showing 0 records for the last 15 minutes.

Job Properties --> Performance tab --> Active-to-active link performance: Enable row buffer (checked), Inter process (selected), Buffer size = 128KB, Timeout = 600 seconds.

Please respond with your advice, ideas, or suggestions.
Focus on ABC: Audit, Balance and Control.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You don't have a "sequential file reading" issue, you have a direct database lookup issue. Think about the change you made to the job: from a local, possibly cached, direct hashed file lookup to sending one database request at a time over the network. And I'll wager you have user-defined lookups in the OCI stages. Enabling row buffering is no magic bullet, either.
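
To make that concrete: a user-defined reference lookup in an OCI stage is typically a singleton select along the lines of the sketch below. The table and column names here are placeholders, but the shape is the point: the bind variable is filled from the key column of each incoming row, so the statement round-trips to Oracle once per source record.

Code:

-- Hypothetical user-defined reference SQL for an ORAOCI9 lookup;
-- :1 is bound from the incoming row's key column on every single row.
SELECT veh_id, veh_desc
FROM   vehicle_master
WHERE  veh_id = :1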

I'm curious what the "development issues" were that made you remove the hashed files. You need to work them out and get them back in the job to have any hope of decent performance here. IMHO.
-craig

"You can never have too many knives" -- Logan Nine Fingers
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

I agree with Craig.

Considering you said 28 million records, I'd guess you hit the 2GB limit on the hashed file. Bear in mind that a ranged lookup could also be an issue.

To make a long story short, I think you can simply load your 30K records into a temporary table and use a simple select to get the join you need.
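
Something along these lines, with purely hypothetical table and column names:

Code:

-- Scratch table for the ~30K driver rows (all names are placeholders)
CREATE GLOBAL TEMPORARY TABLE stg_source_keys
(
    veh_id VARCHAR2(17)
) ON COMMIT PRESERVE ROWS;

-- Load the sequential file rows into stg_source_keys, then do one
-- set-based join instead of ~30K singleton lookups:
SELECT s.veh_id,
       m.veh_desc
FROM   stg_source_keys s
JOIN   vehicle_master  m ON m.veh_id = s.veh_id;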

IHTH(I Hope This Helps)...
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
nvkuriseti
Premium Member
Posts: 38
Joined: Sat Jan 17, 2004 12:29 am
Location: Hamburg, Germany

Post by nvkuriseti »

Thanks for your response. Basically this job was designed a long time ago, and the developer at that time replaced the hashed file lookups with Oracle lookups due to run-window issues. But now the issue has come back :)

The business rule: we call the same table as two lookups, with the 17-character Veh_ID as the key field for Lookup1 and the 11-character Veh_ID as the key field for Lookup2. If the source record has no match in Lookup1, we check Lookup2 using the source Veh_ID[1,11]. If it matches neither, we create a new Veh_ID. That is what the existing job does, so in this situation I cannot do a single join that checks both the 17-character and the 11-character values.
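
Just to document the rule, here is a rough SQL rendering; it mirrors the transformer logic rather than being something we actually run, and source_recs, vehicle_master, and veh_id are all assumed placeholder names (whether a query like this would fit the run window is a separate question):

Code:

-- Two-level match: prefer the full 17-character key, fall back to the
-- first 11 characters; NULL in matched_id means "create a new Veh_ID".
-- Assumes the table carries an 11-character form of the key for Lookup2.
SELECT s.veh_id,
       COALESCE(l1.veh_id, l2.veh_id) AS matched_id
FROM   source_recs s
LEFT JOIN vehicle_master l1
       ON l1.veh_id = s.veh_id                   -- 17-char key (Lookup1)
LEFT JOIN vehicle_master l2
       ON l2.veh_id = SUBSTR(s.veh_id, 1, 11)    -- 11-char key (Lookup2)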

The option is to go back to the hashed file process for the lookups. Please let me know if you have any other ideas or can suggest another design.

PS: Got a chance to test with hashed files as lookups instead of ORAOCI9 and it worked fine. All the source records loaded successfully into the target sequential file.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Are any other jobs in your environment running slow too?
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: Poor performance at sequential file stage while reading

Post by SURA »

Check the index
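
For example, assuming the lookup table is something like vehicle_master keyed on veh_id (both placeholder names), a query like this in Oracle shows whether the lookup key is indexed:

Code:

-- List indexed columns on the lookup table (name is a placeholder):
SELECT index_name, column_name, column_position
FROM   all_ind_columns
WHERE  table_name = 'VEHICLE_MASTER'
ORDER  BY index_name, column_position;

-- If nothing covers the key, a hypothetical index such as this would
-- stop each singleton lookup from scanning 28 million rows:
CREATE INDEX ix_vehicle_master_key ON vehicle_master (veh_id);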

DS User