running order of container within job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

haisen
Participant
Posts: 11
Joined: Sun May 11, 2008 1:42 am

Post by haisen »

Requirement: to delete existing records in the table before insert, based on a sequential file.

Version 7.2 used to have an option to delete existing records before insert, but I don't see it in version 8.

Problem: the parallel job batches up the deletes and inserts of records. E.g. the file has 500 records; it deletes 100 records based on the text file, inserts another 100 records, and then does delete & insert for the remaining batches. Therefore, even though 500 records are expected to be loaded, fewer than that are actually loaded.
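A minimal plain-Python sketch (not DataStage; the batch size and the schedule are hypothetical, since DataStage guarantees no particular order between unlinked flows) of how interleaving the delete flow and the insert flow over the same keys can lose rows:

```python
keys = list(range(500))
batch = 100
table = {k: "old" for k in keys}  # pre-existing rows, same primary keys

for n, i in enumerate(range(0, len(keys), batch)):
    chunk = keys[i:i + batch]
    if n % 2 == 0:
        # Lucky schedule: delete finishes before insert -> batch loads fine.
        for k in chunk:
            table.pop(k, None)
        for k in chunk:
            table[k] = "new"
    else:
        # Unlucky schedule: delete fires after insert -> new rows vanish.
        for k in chunk:
            table[k] = "new"
        for k in chunk:
            table.pop(k, None)

print(len(table))  # 300 -- fewer than the 500 expected rows
```

Only the batches where the delete happened to run first survive, which matches the symptom of fewer than 500 rows being loaded.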

Container:
As there are a few files (with different columns but the same primary key positioning), we have made the following into a shared container.
same_sequential_file --------> Transformer A--------> DB (delete rows where primary key match)

In actual job (1 job only):
Shared_Container
same_sequential_file --------> Transformer B --------> DB (upsert)

There is no link between the shared container and the sequential file.
We were expecting the container's portion to be executed first, followed by the upsert.

Is there any way to set the container to complete its execution before the upsert flow runs? Must there be a link from the container to the upsert flow?

I realised that the container can be joined to "Transformer B" and then loaded into the DB; however, only the first 3 columns of the files are the same, while the rest of the columns vary.

For the same design, the container happens to be executed first when the volume of records is small.
mahadev.v
Participant
Posts: 111
Joined: Tue May 06, 2008 5:29 am
Location: Bangalore

Post by mahadev.v »

Without an output link from the container to the upsert flow, they will be treated as two different flows and will be executed simultaneously. And a Sequential File stage cannot have both an input link and an output link (unless it is a reject link), so this approach won't work. Have the second flow (upsert) in a second job; I guess that will be the best solution.
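Outside DataStage, the two-job fix amounts to sequencing the phases so that every delete completes before any insert begins. A minimal sqlite3 sketch of that ordering (the table and column names here are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (pk INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?)",
                 [(k, "old") for k in range(500)])

incoming = [(k, "new") for k in range(500)]  # the sequential file's records

# Phase 1 ("job 1"): the whole delete flow finishes before any insert starts.
conn.executemany("DELETE FROM target WHERE pk = ?", [(k,) for k, _ in incoming])

# Phase 2 ("job 2"): the upsert flow then loads every incoming record.
conn.executemany("INSERT INTO target VALUES (?, ?)", incoming)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)  # 500 -- all expected rows are present
```

Because phase 2 only starts after phase 1 has committed, no delete can remove a freshly inserted row, which is exactly what a job sequence running the delete job before the upsert job achieves.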
"given enough eyeballs, all bugs are shallow" - Eric S. Raymond