PL/SQL Vs. Datastage jobs

Archive of postings to DataStageUsers@Oliver.com. This forum intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Locked
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

PL/SQL Vs. Datastage jobs

Post by admin »

This is a topic for an orphaned message.
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

The following is only my personal OPINION based on experience with projects that have done it both ways.


The answer depends on how complicated the current ETL process with the PL/SQL is. Does the PL/SQL make extensive use of reference tables to map the data or is a lot of the decision-making hard coded.

I would suggest that the "best" solution probably involves designing/redesigning mapping/reference tables and building as much of the ETL process as possible in DataStage. You may find that a small proportion of the logic is best done in PL/SQL (say 5%?).

You could retain all the PL/SQL but you gain none of the advantages of DataStage in the process.

If you can convince management to spend the money, it is worth investing a little extra now to rebuild as much as possible in DataStage. In the future, changes to the DataStage jobs will be orders of magnitude easier than changing the PL/SQL. Ive done both.

Consider an example that I used at a presentation a year or so ago. My task was to create a new dimension table, map the source data to populate it, add a new foreign key to the fact table, and change the job that populates the fact table. With DataStage, this took me 3 days to code, test and release (go easy on me all you DataStage gurus, it wasnt an easy one). I had some friends at that presentation who were doing all their ETL in PL/SQL. They just sat there with their jaws on the floor. Such a change to them was a major enhancement (measured in weeks, not days).

However, I do understand your position. It is probably a big job to completely replace all your PL/SQL. You may be best to do it gradually. Management will also have to decide why you are looking at DataStage. It is not cheap, but on any reasonable size project, it would easily pay for itself in terms of saved effort when compared with PL/SQL (OK, I admit, not really a fair comparison - bit like comparing earth moving equipment to picks and shovels).

The bottom line is this. I have no doubt that DataStage is a better way to do your ETL than PL/SQL. The problem is, if you dont use the product, you dont get the advantages. There is little point in using DataStage as a shell around your existing PL/SQL.

Do you have any new development to add to your system? If so, then a practical approach might be to use DataStage for all the new work. Once you have seen it in action, you would have a better idea of how to approach the replacement of the PL/SQL (and the motivation).

You asked about guidelines about what to do in DataStage and what to do in PL/SQL. Sorry, cant really help here. Every product has its own way of operating. DataStage is no different. Once you learn to think the same way as DataStage instead of working against it, you will find that it can do just about anything you need it to. We do have a couple of situations where the processing was so messy that we used PL/SQL instead.


Sorry for carrying on so much. I am an Oracle developer, and I happen to think that DataStage is a really good way to get data into an Oracle database.

David Barham
Information Technology Consultant
InformAtect Pty Ltd
Brisbane, Australia

e-mail: david@barham.hm

-----Original Message-----
From: Vivek Pandey. [mailto:vive@sonata-software.com]
Sent: Thursday, 3 May 2001 4:16 PM
To: informix-datastage@oliver.com
Subject: PL/SQL Vs. Datstage jobs

Hi all,

A brief background.....
Our client has an enterprise financial datawarehouse running on Oracle 7 and Express 6.2 (ROLAP arch), and it is not happy with the performance. Issues relate mainly to ETL and maintenance part. Currently ETL is handled thru PL/SQL procedures. We intend to perform this task using Ardent Datastage. Also, we also plan to migrate the DW from Oracle 7 to 8i to take adv of OLAP features of 8i.

The problem.....
Since im new to Datastage, my query is regarding that. The ETL logic of stored procedures can be built into Datastage stages/jobs. That way we eliminate the PL/SQL routines. The other option is to call the same PL/SQL routines from Datastage jobs (since those routines r already there). Or perhaps a third option can be to use an optimal mix of the prev two options (what is optimal?) Whats the best thing to do? Are there guidelines that help in deciding what to do within Datastage and what to do outside(PL/SQL)?

Regds,
Vivek Pandey
Consultant - BI
http://sonata-software.com
email: vive@sonata-software.com
Locked