Load job as a reusable design

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
senthilkumarr
Participant
Posts: 6
Joined: Sat Jun 18, 2011 11:18 am
Location: Chennai,India

Load job as a reusable design

Post by senthilkumarr »

Hi Ray,

Suppose after capturing the insert/update records using CDC in One dataset in first job.

Can i use the dataset as RCP in second job to load into SQL server (ODBC COnnection).I like to use the load job as a reusable design for all 87 table load and also i am facing issue with BLOB data type loading in SQL server.
senthil kumarr
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

... and off you go to your own topic.

Split from this post if anyone is curious.
-craig

"You can never have too many knives" -- Logan Nine Fingers
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

The split in threads seems to actually be a logical one - 2 separate questions really.

To answer this question reusable load jobs are perfect for what you want to do. You will need to do a few things in the load job and you will actually probably have three (3) separate load jobs (makes more sense). One will delete, one will update and one will insert.

The insert job is a no brainer - no keys etc... involved so this can be coded quite quickly and requires no key exposure - simply stream the data.

The update and delete are a bit trickier, but not terribly so. The caveat with these is that you need a key and there are a couple of ways to do this. The easiest way is to supply the "WHERE" clause directly to the stage via a job parameter (very common). Another way to do this would be to pass a parameter with a schema file that defines the key for the particular table you are updating or deleting. In your update and delete job you will expose some generic column(s) as the key columns using the column export operator. This key could then be used in a dynamic SQL statement in the stage (user defined of course).

Below I will give you an example of what to do if you will be passing the WHERE clause and not exposing keys.

You will choose not to generate the SQL in the stage and you can then parameterize the SQL statement for your needs. You could have something similar to -

Code: Select all

DELETE FROM #TABLE_NAME_PARM# WHERE #DELETE_CLAUSE#
You will supply the parms and the DELETE_CLAUSE might look something like

Code: Select all

ACTUAL_TABLE_KEY=ORCHESTRATE.ACTUAL_TABLE_KEY
Any method you choose will have pros and cons. Fully RCP designs can be more difficult to debug and reviewing the data can be challenging (not too bad, but challenging). Exposing generic keys or writing specific WHERE clauses both have merit and you need to decide which is better for your situation.

This type of job will be multi instance and the number you can run is entirely dependent on system resources as Ray pointed out in the other thread.

There are ways to wrap database operators so that you can actually do update and deletes within the same job (without job parms etc...), but I prefer to separate them for ease of maintenance.

Have fun!

RCP is great for things like sourcing and loading, but don't go wild trying to make everything generic - too much of a hassle and very hard to maintain in my opinion.
highpoint
Premium Member
Premium Member
Posts: 123
Joined: Sat Jun 19, 2010 12:01 am
Location: Chicago

Post by highpoint »

mhester wrote:Another way to do this would be to pass a parameter with a schema file that defines the key for the particular table you are updating or deleting. In your update and delete job you will expose some generic column(s) as the key columns using the column export operator. This key could then be used in a dynamic SQL statement in the stage (user defined of course).
Hi,

I was trying to implement Generic Update Job, and was trying to understand the above procedure.

I am not able to understand how to implement generic update using column export operator. Can you please elaborate on this.

for UPDATE statement we would need to SET fields other than specifying key fields in where clause. For DELETE it seems to work if we could some way specify only the key fields. But for UPDATE, we need to have fields in SET clause as well.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Your new mantra: NOT EVERYTHING CAN BE DONE GENERICALLY
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply