Inline/embedded SQL

Post questions here relating to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

From what I recall, it was only a partial feature: something like you could use job parameters in the WHERE clause of an UPDATE statement, but not in a SELECT statement. I assumed it had something to do with the SQL optimization technology (since an UPDATE statement probably didn't go through the SQL optimization process). It would be a nice feature if they extended it to SELECT statements as well.
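
For illustration only, the workable case looked something like this. Table, column, and parameter names below are made up, and the exact way a job parameter is referenced in SQL varies by DataStage edition and stage type, so treat this as a sketch:

Code:

    UPDATE customer_load
    SET    load_status = 'DONE'
    WHERE  batch_date = :BATCH_DATE    -- job parameter referenced in the WHERE clause (assumed syntax)

The same kind of parameter reference in a SELECT statement's WHERE clause was the part that wasn't supported.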

Mike
rupeshg
Premium Member
Posts: 60
Joined: Thu Dec 22, 2005 6:02 am
Location: United Kingdom

Post by rupeshg »

Mike, one more problem I am facing:

Is there a way in MVS to handle a row that gets rejected during a database insert due to a primary key constraint?

As of now, every time I insert a row into a DB table, I do a lookup before that stage to check whether the row already exists in the table. But what if I don't want to do the lookup first and want to insert directly? If the insert fails, how do I catch that condition?

Also, I don't want to update if the insert fails, so the option "Insert new or update the existing record" is of no use to me either.

Thanks and Regards
Rupesh
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Rupesh,

First, it's always a good idea to avoid database errors that you can anticipate.

I assume you're asking because the performance that you're seeing with your current implementation isn't satisfactory. I'll assume that you're doing a direct lookup to the target table using the relational stage since that is the simplest implementation and the one most likely to provide poor performance.

Current data volumes and anticipated growth rate (unspecified) will dictate which approach to use:

1) Low volume, low growth rate. In this case, it is best to unload the target table to a flat file in a prior job. The flat file can then be used for a hash lookup to determine whether a row already exists in the target. The hash lookup table is preloaded into memory at the start of the job (and there is no way to update it during the job), so you will have to handle duplicates in your input stream (i.e., the row doesn't exist in the target table, but you get two rows with the same key in the new input file). There's a sketch of the unload query after this list.

There are a couple of ways to deal with duplicates in your input stream if that needs to be handled.

2) High volume. In this case, use the join stage with a two-file match technique to identify rows that already exist. Both inputs must be sorted by the join keys: use the ORDER BY clause in the relational stage that reads your current target table, and use the pre-sort on your new input file if it's a flat file (I'm assuming it is). There's a sketch of the ordered read after this list as well.

Important point: Make sure that you use the pre-sort tab in the flat file stage for sorting. Never use the SORT stage in a mainframe job.
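
For approach 1, a minimal sketch of the unload query run in the prior job. The table and column names here are hypothetical; select only the key columns the lookup needs:

Code:

    SELECT cust_id, acct_no    -- key columns only
    FROM   target_table

Write that result set to a flat file and use the flat file to load the hash lookup.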
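
For approach 2, the read of the current target table in the relational stage, with ORDER BY matching the join keys (again, all names here are made up):

Code:

    SELECT cust_id, acct_no, load_status
    FROM   target_table
    ORDER  BY cust_id, acct_no    -- must match the join stage's key order

With the new input file pre-sorted on the same keys, the join stage can then do the two-file match.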

Mike