select & update the same table in the same job

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
luca
Participant
Posts: 81
Joined: Wed May 14, 2003 11:34 pm

select & update the same table in the same job

Post by luca »

Hi !

The job we want to do is the following:
1/ ORAOCI8 to select all the rows from a table,
2/ transformer and calculation of columns of this table,
3/ ORAOCI8 to update the same table.

What do we have to take care about for this kind of job ?
Will the query be a "select for update" ?

Moretheless, I'd like to know :
What is the difference between Transaction size and Rows per transaction ?
What is Transaction isolation ((read committed/serializable) ?

Thanks for your answers.
hughsm
Participant
Posts: 12
Joined: Sun May 04, 2003 12:27 am

Post by hughsm »

Man, I am so temped to say write a PL/SQL procedure to do this. Oracle should outperform DataStage internally every time. But, then you are supporting 2 separate environments and the cost goes way up. You have to have support that is familiar with PL/SQL and DataStage.
If you get in a bind, I can write the PL/SQL for you. I haven't a clue how to do this in DataStage yet. I will watch for the solution.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

PL/SQL is definitely a possibility if the transformations required are not too complex; in fact you could possibly even use a single UPDATE statement based on a SELECT with command line expressions to generate required values.
DataStage does not require SELECT FOR UPDATE, although there is nothing specifically to stop you from doing so. It is sufficient, however, to set transaction isolation level to "read committed" or higher, and control the update perhaps as a single transaction (if it's very large you may need huge rollback segment, check with your Oracle DBA).
I will get back to your specific questions about transaction isolation level, rows per transaction and transaction size in the near future.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

On the subject of Transaction Size versus Rows per Transaction...

From what I recall (at home right now) Transaction size is the 'old' way (old as in up to 4.x, I believe) of controlling rows per transaction and was left in for compatibility reasons. It has no effect now as it was replaced by the Rows per Transaction setting. Both are a mechanism to control commit size, with zero = commit only at the end, 1 = commit every row, etc.

Transaction Isolation is a bit harder to put into easily digestable words. It, and the items above, are 'explained' in the online help, and the explanation for TI is a little obtuse. I'm hoping that Ray comes back with something that makes a wee bit more sense. [:)]

-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The help topic you want to read is "Defining ORAOCI8 Input Data". To paraphrase some of its sub-topics:

Transaction size exists for backwards compatibility but is ignored in the version 3.0 plug-in (ORAOCI8) stage. Transaction size is now handled by "rows per transaction" and the Transaction Handling tab.

Rows per transaction is the number of rows sent to Oracle before a COMMIT is sent to Oracle.

Array size is the number of rows (that is, sets of statement parameter marker replacement values) sent to Oracle in a single write. This is intended principally to make it possible to make more efficient use of network packets.

Transaction isolation level is how much of the database (and other transactions) you need to lock out when your insert/update operation is happening. Only two possibilities exist for the ORAOCI8 stage; read committed and serializable.
Read Committed (the default) takes exclusive locks on modified data and shareable locks on other data.
Serializable takes exclusive locks on modified data and exclusive locks on all other data in the tables being accessed (this is not what the help states - it uses identical descriptions for the two levels). One way to implement fully serializable transactions is never to update or delete a row; instead all such operations insert a new version of the row, and the database server looks after delivering the most up to date (committed) version of any row to a query.

As a possibly interesting aside, terminology for transaction isolation level was originally developed by the SQL CAE Access Group and enhanced by Microsoft Corporation.

For the job described by the orignal poster, I'd venture to suggest that "read committed" is an adequate transaction isolation level, on both the extraction and loading stages.
Post Reply