Page 1 of 1

ETL Job v/s Stored Procedure

Posted: Tue Aug 16, 2011 5:40 pm
by Raamc
Hi All,

I have a query regarding the tuning a datastage job.

My requirement is to read 10 millions of data from a table and do a lookup (Left outer join) on some other table which is on same database, then insert the data into some other table.

My ETL job for above requirement is taking more than 10 mins time to process it.


My Query is
As all the tables (Source, Lookup and Target) are on same database, If I write a stored procedure for doing the same join in SQL query and insert into a table through stored procedure, is it increase the performance?

As we are using the Database concepts for loading and join, is it benefit anything in performance?

Can any one clarify?

Re: ETL Job v/s Stored Procedure

Posted: Tue Aug 16, 2011 6:05 pm
by SURA
If you ask others, then you need to make more clear like whether all the tables are properly indexed? On which columns, did the table used any partitioning etc

If you want to compare etl vs sql, you can write a procedure and test it. I suggest, that would be the better idea.

DS User

Posted: Tue Aug 16, 2011 6:22 pm
by Raamc
Thanks Sura for the Suggestion.

I started writing the PROC, before that I just want to know whether any others come across the same situation and their ideas about this?

And regarding the Indexes, Yes the tables are having the indexes on the columns which I am using in joins.

Posted: Tue Aug 16, 2011 10:24 pm
by ray.wurlod
Then it's almost certain that performing the join in the database - whether using SQL from DataStage or using a stored procedure - will be faster than performing the join in DataStage itself. If the SQL from DataStage is exactly the same as the SELECT statement executed by the stored procedure, then the overheads of invoking the stored procedure are likely to impact total retrieval time negatively.