I have a simple job that reads a sequential file, passes it through a Transformer, and then deletes those records from a table. The sequential file has about 250 million rows. I'm experiencing very low performance reading the file (71 rows/sec, according to the stats in Designer).
I've verified that the AIX box isn't anywhere near capacity. There are only a couple of MicroStrategy reports being run at the same time.
Is there a more efficient way to delete such a large amount of data using DataStage?
Poor performance using sequential file stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 15
- Joined: Fri Mar 09, 2007 8:22 am
-
- Premium Member
- Posts: 425
- Joined: Sat Nov 19, 2005 9:26 am
- Location: New York City
- Contact:
I guess that the bottleneck is not in the sequential file but in the Oracle stage. You can test this by replacing your Oracle stage with another Sequential File stage.
What's your commit strategy?
A delete operation is database intensive and time consuming, mainly because it forces the database to generate and write to disk significant amounts of redo and undo data, so you might need to adjust your database's logging behavior.
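To make the commit-strategy point concrete, here is a minimal sketch of batched deletes with periodic commits. It uses Python's built-in sqlite3 purely as a stand-in for the real Oracle table; the table name, key column, and batch size are illustrative assumptions. In the actual job you would control this through the Oracle stage's transaction/array size settings rather than hand-written code.

```python
import sqlite3

def batched_delete(conn, keys, batch_size=10000):
    """Delete rows matching `keys`, committing every `batch_size` rows
    instead of once per row or once at the very end. This bounds the
    undo/redo the database must hold per transaction."""
    cur = conn.cursor()
    pending = 0
    for key in keys:
        cur.execute("DELETE FROM target WHERE id = ?", (key,))
        pending += 1
        if pending >= batch_size:
            conn.commit()   # release the accumulated transaction work
            pending = 0
    conn.commit()           # flush the final partial batch

# Demo with an in-memory database standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO target VALUES (?)", [(i,) for i in range(100)])
conn.commit()

batched_delete(conn, range(0, 100, 2), batch_size=25)  # delete the even ids
remaining = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(remaining)  # 50
```

Committing too often adds overhead; committing too rarely inflates undo. The batch size is the tuning knob between those extremes.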
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
-
- Participant
- Posts: 15
- Joined: Fri Mar 09, 2007 8:22 am
I tested the bottleneck by replacing the database stage with another sequential file. It read roughly 2,500 rows per second. I'm working with the DBA to look at the stats on this table, as it has been inserted into and deleted from numerous times.
I was reading the file sequentially. I'll have to change the file to a fixed-length format before I'm able to test the multiple readers per node option.
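The reason fixed-length records matter for parallel reading can be shown with a small sketch: once every record is exactly N bytes, any reader can seek straight to record n at byte offset n*N without scanning for newlines. The record length and padding below are assumptions for illustration, not DataStage settings.

```python
def to_fixed_length(lines, record_len):
    """Pad each variable-length record to exactly `record_len` bytes so
    readers can address record n at byte offset n * record_len."""
    out = []
    for line in lines:
        if len(line) > record_len:
            raise ValueError(f"record longer than {record_len}: {line!r}")
        out.append(line.ljust(record_len))  # space-pad to fixed width
    return "".join(out)                     # no delimiters: pure fixed blocks

rows = ["1001,ACME", "7,XY", "123456,Widgets Inc"]
data = to_fixed_length(rows, record_len=20)

# Any reader can now jump directly to the second record:
second = data[20:40].rstrip()
print(second)  # 7,XY
```

With variable-length (newline-delimited) records there is no way to compute an offset, which is why a single process has to read the file front to back.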
Are you trying to delete 250 million rows from the table? How many rows are in the table prior to starting?
Would sure hope this is a one-off job!
You may be better off identifying the records you want to keep, moving them to a temporary table, truncating the existing table, and moving the retained records back.
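A minimal sketch of the keep-and-swap idea, again using sqlite3 as a stand-in for Oracle (table and column names are made up for the demo). On Oracle you would typically do this with CTAS plus TRUNCATE or a DROP/RENAME, and you'd have to recreate indexes, constraints, and grants on the new table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER, flag TEXT)")
# 1,000 rows, of which only 100 should survive.
conn.executemany("INSERT INTO big VALUES (?, ?)",
                 [(i, "keep" if i % 10 == 0 else "drop") for i in range(1000)])
conn.commit()

# Instead of deleting 900 rows one by one, copy the 100 keepers out,
# drop the original table, and rename the copy into place.
conn.executescript("""
    CREATE TABLE big_keep AS SELECT * FROM big WHERE flag = 'keep';
    DROP TABLE big;
    ALTER TABLE big_keep RENAME TO big;
""")

kept = conn.execute("SELECT COUNT(*) FROM big").fetchone()[0]
print(kept)  # 100
```

When most of the table is being deleted, writing the small surviving set once is far cheaper than logging hundreds of millions of individual deletes.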