Page 1 of 1

performance comparison of custom SQL vs ADS stage's SQL

Posted: Tue Sep 04, 2007 9:48 am
by sh.bangash
Hi Gurus,

I have just started using ADS for ETL, before I used to do all the ETL scripting in SQL/PLSQL, Shell scripting etc.

I have a pipe separated ascii file of about 546 MB that needs to be ETLed to our data warehouse for recharges subject area.
There are two approaches to it
1. write a custom script; such as pl/sql scripts in stored procedures to load the data form file to staging table. call this script in ADS processing stage. then write another custom script using pure SQL in oracle stored procedures to apply the transformations and insert the load ready delta into final production table. this second script shall also be called by an ADS processing stage by looking at the return result of first processing stage.
2. second approach is that I purely rely on ADs stages to do all the ETL for me like:
a. using a transformation stage i load data from sequential file to staging ODBC/Oracle table. then using transformation stage again I do required transformations and insert/update the results in Load Ready delta table; from where I finally apply insert/update using another transformation stage.

Now, My question is that which approach is more practical one while using ADS and Why?

I hope to have comprehensive views from all ADs gurus over here!

Thanks and Regards,

Posted: Tue Sep 04, 2007 4:26 pm
by ray.wurlod
Either is apposite. The DataStage job is probably the easier to maintain long term, particularly if the customer changes database from Oracle to something else.

Posted: Tue Sep 04, 2007 10:45 pm
by sh.bangash
Hi,
Can Any one who have done ETL through scripting and latter on moved to ADS give his experiened response?

Regards,

Posted: Wed Sep 05, 2007 1:19 am
by hamzaqk
When you do ETL by using TD utilities you are using the power of teradata server, which ofcourse will run faster and more effeciently than a DS server.

Even if you use that SQL in datastage, you are just using DS as a wrapper and not utilizing its power. Thats what experience tells us for 2 different projects which were carried out in these 2 scenarios.

it more like you are comparing TD server and DS server which i think would be like comparing two entirely different things. TD as you would know is more suiltable for ELT where as DS is a ETL tool.


i would agree with ray to use DS when it comes to maitaining jobs in long run.

Posted: Wed Sep 05, 2007 1:23 am
by hamzaqk
p.s i think the same answer would be for any other db other than TD.

Posted: Wed Sep 05, 2007 1:46 am
by sh.bangash
Hi,
Thank you for the response. Actually, while asking this question I have in mind the delta size of a source table(s). For example one source one day call detail records are of approx. 35 GB per day and if i load and transform it using ADS stages, what ETL time performance impact it will have as compared to the DBMS native utilities.

Regards,