performance comparison of custom SQL vs ADS stage's SQL

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

performance comparison of custom SQL vs ADS stage's SQL

Post by sh.bangash »

Hi Gurus,

I have just started using ADS for ETL, before I used to do all the ETL scripting in SQL/PLSQL, Shell scripting etc.

I have a pipe separated ascii file of about 546 MB that needs to be ETLed to our data warehouse for recharges subject area.
There are two approaches to it
1. write a custom script; such as pl/sql scripts in stored procedures to load the data form file to staging table. call this script in ADS processing stage. then write another custom script using pure SQL in oracle stored procedures to apply the transformations and insert the load ready delta into final production table. this second script shall also be called by an ADS processing stage by looking at the return result of first processing stage.
2. second approach is that I purely rely on ADs stages to do all the ETL for me like:
a. using a transformation stage i load data from sequential file to staging ODBC/Oracle table. then using transformation stage again I do required transformations and insert/update the results in Load Ready delta table; from where I finally apply insert/update using another transformation stage.

Now, My question is that which approach is more practical one while using ADS and Why?

I hope to have comprehensive views from all ADs gurus over here!

Thanks and Regards,
Shahid.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Either is apposite. The DataStage job is probably the easier to maintain long term, particularly if the customer changes database from Oracle to something else.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

Post by sh.bangash »

Hi,
Can Any one who have done ETL through scripting and latter on moved to ADS give his experiened response?

Regards,
Shahid.
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

When you do ETL by using TD utilities you are using the power of teradata server, which ofcourse will run faster and more effeciently than a DS server.

Even if you use that SQL in datastage, you are just using DS as a wrapper and not utilizing its power. Thats what experience tells us for 2 different projects which were carried out in these 2 scenarios.

it more like you are comparing TD server and DS server which i think would be like comparing two entirely different things. TD as you would know is more suiltable for ELT where as DS is a ETL tool.


i would agree with ray to use DS when it comes to maitaining jobs in long run.
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

p.s i think the same answer would be for any other db other than TD.
sh.bangash
Participant
Posts: 15
Joined: Wed Aug 01, 2007 5:23 am
Location: Islamabad

Post by sh.bangash »

Hi,
Thank you for the response. Actually, while asking this question I have in mind the delta size of a source table(s). For example one source one day call detail records are of approx. 35 GB per day and if i load and transform it using ADS stages, what ETL time performance impact it will have as compared to the DBMS native utilities.

Regards,
Shahid.
Post Reply