Architecture: DB2/UDB API Stage or Stored Procedures?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
AlanP
Premium Member
Premium Member
Posts: 17
Joined: Tue Apr 19, 2005 5:16 am
Location: Melbourne, Australia

Architecture: DB2/UDB API Stage or Stored Procedures?

Post by AlanP »

The company I am assigned to is building a Transactions database from the ground up. It will be hosted on DB2. Our Datastage team has been tasked with a) extracting MI data from it and b) writing a few interfaces which update it in Batch mode.

Online (Real-Time) updates will take place via Web Services which call Stored Procedures to update the DB.

The DBAs are recommending we write some Stored Procedures to retrieve data from the database as well.

I can see the advantage of reusing their update SPs within our jobs, when we need to update the database. I don't really have any issue with that.

However, I can only see disadvantages with using SPs to select data:-

1) You have to write code in 2 places IBMs Dev Center and Datastage.
2) Looks like getting a list of SPs into the repository needs an ODBC connection?
3) You can't preview data by right-clicking the SP stage
4) All lookups would require moving data via a hash file (best practice anyway)

The supposed advantages would be:-

1) Re-use
But the DS code is likely to be very specific and unlikely to be reused
2) Isolation from changes
Any changed or new/removed fields in the tables are likely to need the jobs changing in Datastage anyway

As I am not familiar with either stage (or similar Oracle situations), are there any further technical reasons I should be wary of using SPs?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

If you're pulling in micro-batches all day long, you may get into some issues coordinating data across tables, meaning if inserts/updates are not packaged as a single transaction you may get into situations where some rows in a table are not synchronized with their children at th time you snapshot the tables independently.

It could be that your DBA's are recommending SP's because those can be aware of other SP's simultaneously executing and coordinate their pulls. Something to consider is that the SP's write to extraction tables that you can then fetch using DataStage. This is a very valid technique, especially when two or more tables have to be streamed as a coordinated extract. You can't do this with a single job, so have the SP's truncate and load a a set of tables.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
AlanP
Premium Member
Premium Member
Posts: 17
Joined: Tue Apr 19, 2005 5:16 am
Location: Melbourne, Australia

Post by AlanP »

Kenneth

Thanks for your response. Luckily we will only be retrieving data at end of day, so I doubt we will have to coordinate transactions.

Alan
Post Reply