
Poor performance at sequential file stage while reading

Posted: Mon Jul 25, 2011 5:40 am
by csc.datastage
Hello:

While processing end-to-end run in my project one of the DS server job is demonstrating poor performance. Here is the job design:

Code:

Sequential File Stage 
     |
     V
Transformer <-- OraOCI9 Stage as Lookup1 
   Stage        <-- OraOCI9 Stage as Lookup2
     |
     V
Sequential File Stage
We're been running this process for 5 yrs and have not had a problem until now. Suddenly it started to run with slow performance. The sequential file has only 34500 records. The before dependent job for this source ran successfully without having any problem and also post-dependent jobs are also working fine. Only this particular process is not working with successfully. Both lookups are extracting 28 million records. Earlier we used hashed-lookup but due to some development issues, we're changed those lookups into ORAOCI9.

Normally this job takes only 5-15 minutes and processes 30K-50K records. Currently the sequential file stage is not reading from the source; it has shown 0 records for the past 15 minutes.

Job Properties --> Performance tab --> Active-to-active link performance --> Enable row buffer (checked) --> Inter process (selected) --> Buffer size = 128KB and Timeout = 600 seconds

Please respond with your advice, ideas or suggestions.

Posted: Mon Jul 25, 2011 6:49 am
by chulett
You don't have a "sequential file reading" issue, you have a direct database lookup issue. Think about the change you made to the job, from a local possibly cached direct hashed file lookup to sending one database request at a time over the network. And I'll wager you have user-defined lookups in the OCI stages. And enabling row buffering is no magic bullet.

I'm curious what the "development issues" were that made you remove the hashed files. You need to work them out and get them back in the job to have any hope of decent performance here. IMHO.

Posted: Mon Jul 25, 2011 9:09 am
by roy
I agree with Craig.

Considering you said 28 million records, I'd guess you hit the 2GB limit on the hashed file. Bear in mind a ranged lookup could also be an issue.

To make a long story short, I think you can simply load your 30K records into a temporary table and use a simple select to get the join you need.
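The idea above can be sketched in miniature. This is a hedged illustration only, using Python's sqlite3 in place of Oracle, and hypothetical table and column names (vehicle_master, src, veh_id) that do not appear in the thread: the point is that one set-based join replaces tens of thousands of singleton lookups fired over the network.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-in for the 28M-row Oracle lookup table (tiny here, names assumed).
cur.execute("CREATE TABLE vehicle_master (veh_id TEXT PRIMARY KEY, descr TEXT)")
cur.executemany("INSERT INTO vehicle_master VALUES (?, ?)",
                [("V001", "sedan"), ("V002", "truck")])

# Load the ~30K source rows into a temp table instead of probing row by row.
cur.execute("CREATE TEMP TABLE src (veh_id TEXT)")
cur.executemany("INSERT INTO src VALUES (?)", [("V001",), ("V999",)])

# One set-based join; LEFT JOIN keeps source rows that find no match.
cur.execute("""
    SELECT s.veh_id, m.descr
    FROM src s LEFT JOIN vehicle_master m ON m.veh_id = s.veh_id
""")
rows = cur.fetchall()
print(rows)  # misses come back with descr = None
```

In Oracle the equivalent would be a global temporary table (or a staging table) plus a single SELECT, rather than one OCI lookup per source row.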

IHTH (I Hope This Helps)...

Posted: Mon Jul 25, 2011 9:32 am
by nvkuriseti
Thanks for your response. This job was designed a long time ago, and the developer at that time replaced the hashed-file lookups with Oracle lookups due to run-window issues. But now the issue has come back :)

The business rule is that we call the same table as two lookups: Lookup1 is keyed on the 17-character Veh_ID, and Lookup2 is keyed on the 11-character Veh_ID. If the source record finds no match in Lookup1, we check Lookup2 using the source Veh_ID[1,11]. If neither matches, we create a new Veh_ID. That is what the existing job does, so I cannot use a single join to check both the 17-character and 11-character values.
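The fallback rule described above can be sketched as follows. This is an illustrative sketch only, with dicts standing in for the two hashed-file lookups; the key values, the row payloads, and the "NEW-n" surrogate scheme are all assumptions, not the job's actual logic.

```python
# Dicts stand in for the two hashed-file lookups against the same table.
lookup1 = {"ABCDEFGHIJK123456": "row-A"}   # keyed on 17-char Veh_ID
lookup2 = {"ABCDEFGHIJK": "row-B"}         # keyed on 11-char Veh_ID

def resolve(veh_id: str, next_id: int):
    """Resolve one source row; return (value, next available surrogate)."""
    if veh_id in lookup1:                  # 1) try the full 17-char key
        return lookup1[veh_id], next_id
    if veh_id[:11] in lookup2:             # 2) fall back to Veh_ID[1,11]
        return lookup2[veh_id[:11]], next_id
    return f"NEW-{next_id}", next_id + 1   # 3) neither matched: new Veh_ID
```

Because step 2 keys on a substring of the step-1 key, the two probes cannot be collapsed into one equi-join, which matches the poster's point that a single join won't cover both cases.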

The option is to go back to the hashed-file lookups. Please let me know if you have any other ideas or can suggest another design.

PS: I got a chance to test with hashed files as lookups instead of ORAOCI9, and it worked fine. All the source records loaded successfully into the target sequential file.

Posted: Mon Jul 25, 2011 1:10 pm
by PaulVL
Are any other jobs in your environment running slow too?

Posted: Mon Jul 25, 2011 2:45 pm
by SURA
Check the index.

DS User