Poor performance using sequential file stage

DStage_Developer
Participant
Posts: 15
Joined: Fri Mar 09, 2007 8:22 am

Poor performance using sequential file stage

Post by DStage_Developer »

I have a simple job that reads a sequential file, passes it through a Transformer, and then deletes those records from a table. The sequential file has about 250 million rows. I'm seeing very poor performance reading the file (71 rows/sec according to the stats in Designer).

I've verified that the AIX box isn't anywhere near capacity. There are only a couple of MicroStrategy reports running at the same time.

Is there a more efficient way to delete such a large amount of data using DataStage?
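
For context, the target database stage in this job is effectively issuing a keyed delete for every row read from the file, along the lines of the hypothetical statement below (the table and key column names are placeholders, not my actual schema):

    -- Hypothetical per-row delete issued by the target stage;
    -- target_table and key_col are placeholder names for illustration only.
    DELETE FROM target_table
     WHERE key_col = :key_value;  -- bind value taken from the current file row

At 250 million input rows that is a huge number of single-row operations unless they are batched on the target stage.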
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

How are you reading the sequential file? Try using multiple readers per node (the 'number of readers per node' option on the Sequential File stage) ...
Nag
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

I guess the bottleneck is not in the sequential file but in the Oracle stage ... you can test this by replacing your Oracle stage with another Sequential File stage.


What's your commit strategy?

A delete operation is database intensive and time consuming, mainly because it forces the database to generate and write to disk significant amounts of redo and undo data ... so you might need to adjust the logging behaviour of your database.
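
As a rough sketch of what a batched commit strategy can look like on the Oracle side (the table name and predicate are placeholders, not your actual objects):

    -- Illustrative PL/SQL only: delete in batches and commit after each batch
    -- to limit the undo/redo held by any single transaction.
    BEGIN
      LOOP
        DELETE FROM big_table            -- placeholder table name
         WHERE delete_flag = 'Y'         -- placeholder predicate
           AND ROWNUM <= 50000;          -- batch size, tune to your environment
        EXIT WHEN SQL%ROWCOUNT = 0;      -- nothing left to delete
        COMMIT;
      END LOOP;
      COMMIT;
    END;
    /

Whether this helps depends on how the DataStage target stage is configured (transaction size / array size), so treat it as a starting point to discuss with your DBA.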
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses
DStage_Developer
Participant
Posts: 15
Joined: Fri Mar 09, 2007 8:22 am

Post by DStage_Developer »

I tested for the bottleneck by replacing the database stage with another Sequential File stage. It read roughly 2,500 rows per second. I'm working with the DBA to look at the statistics on this table, as it has been inserted into and deleted from numerous times.

I was reading the file sequentially. I'll have to change the file to a fixed-length format before I can test the multiple readers per node option.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Are you trying to delete 250 million rows from the table? How many rows are in the table prior to starting?

Would sure hope this is a one-off job!

You may be better off identifying the records you don't want to delete, moving them to a temporary table, truncating the existing table, and moving the retained records back.
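
A rough SQL sketch of that approach, with placeholder names and a made-up "keep" predicate; indexes, constraints, grants and statistics on the real table would need to be handled as well:

    -- Illustrative only: keep the wanted rows, truncate, reload.
    CREATE TABLE target_table_keep AS
      SELECT * FROM target_table
       WHERE status <> 'TO_DELETE';      -- placeholder predicate for rows to retain

    TRUNCATE TABLE target_table;         -- fast; generates minimal undo/redo

    INSERT /*+ APPEND */ INTO target_table
      SELECT * FROM target_table_keep;   -- direct-path reload of retained rows
    COMMIT;

    DROP TABLE target_table_keep;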