Multiple parameterized instances in parallel

jerome_rajan · Post by **jerome_rajan** » Wed Apr 30, 2014 1:21 am

Hi,

We have a set of multi-instance jobs that form a master sequencer. These jobs currently process data for multiple business units (BUs) in one go. We are trying to separate the processing for each BU and run the BUs in parallel through the same existing logic.
The names and number of BUs will change with business dynamics, and we would like the DataStage flow to handle this.
The idea is to have one sequencer that reads the list of business units, passes each BU name to the multi-instance jobs as a parameter, and processes each instance in parallel. I created a very high-level design. Can someone please improve on it or tell me if there's something amiss with the idea?

Below is the design.

ray.wurlod · Post by **ray.wurlod** » Wed Apr 30, 2014 1:28 am

The actual_Sequence will only be executed once. You need to process the BU within the loop, passing the BU value via the activity variable StartLoop_BU_Processing.$Counter.

jerome_rajan · Post by **jerome_rajan** » Wed Apr 30, 2014 1:32 am

Then the processing would become sequential.

Here, the dummy job (which would basically do nothing) would run as part of the loop in every iteration. On successful completion, the dummy job would trigger the two subsequent links: the actual_sequence, with the start loop counter variable as a parameter, and the next iteration. This would ensure that multiple instances of actual_sequence run in parallel for the different BUs.
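To illustrate the fire-and-continue idea described above, here is a minimal Python sketch. It is not DataStage code, only a simulation under stated assumptions: `run_all` and `actual_sequence` are hypothetical names, and each "instance" is a thread standing in for one invocation of the sequence. The barrier releases only when every instance is running at the same time, so the function can return normally only if the instances really do overlap.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_all(bu_names):
    """Start one stand-in instance per BU without waiting between launches."""
    if not bu_names:
        return []

    # Every instance blocks here until all of them have started. If the
    # instances ran one after another, the wait would time out and raise.
    barrier = threading.Barrier(len(bu_names))

    def actual_sequence(bu):
        # Hypothetical stand-in for one invocation of the real sequence.
        barrier.wait(timeout=5)
        return f"{bu}: done"

    with ThreadPoolExecutor(max_workers=len(bu_names)) as pool:
        # Loop body: trigger the instance, then move straight on to the next BU.
        futures = [pool.submit(actual_sequence, bu) for bu in bu_names]
        # Results are collected only after every instance has been launched.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_all(["BU_A", "BU_B", "BU_C"]))
```

If the loop instead waited for each instance to finish before starting the next, the processing would be sequential, which is the concern raised above.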

ray.wurlod · Post by **ray.wurlod** » Wed Apr 30, 2014 2:43 am

That's not obvious in your master sequence design. At the very least, more information (Annotations) is required. Why do you want to "do nothing" a number of times? This too is not clear.

jerome_rajan · Post by **jerome_rajan** » Wed Apr 30, 2014 3:32 am

Thank you Ray for taking time to answer my query.

Agreed. The dummy job stage is really not required. Here is the updated design. The nested condition stage would trigger the two links mentioned in my earlier post. I think this would trigger the actual_sequence for the different BUs in parallel, albeit with a slight delay in start time for each BU. Is my understanding right?

ray.wurlod · Post by **ray.wurlod** » Wed Apr 30, 2014 4:32 pm

That's better, but the layout doesn't really make it jump out at me that actual_Sequence is in the loop (which I missed the first time too). Perhaps a layout like this would make it clearer.

```
ExtractBUNames  ---->  StartLoop  <---------------  EndLoop
                              |                         ^
                              |                         |
                              V                         |
                        uvBUName   -------->     NestedCondition
                                                        |
                                                        |
                                                        V
                                                 actualSequence
```
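As a side note on passing the BU name as a parameter, the same per-BU fan-out could be sketched from outside the Designer with the `dsjob` command line. This is only an illustration, not the design discussed in this thread. The project name, the parameter name `BU`, and the choice of using the BU name as the invocation ID are all assumptions made for the example. The sketch only builds the argument lists and does not run anything. No `-wait` option is included, on the understanding that `dsjob -run` then returns once the job has been started.

```python
def build_dsjob_commands(project, job, bu_names, param_name="BU"):
    """Build one 'dsjob -run' argument list per BU (nothing is executed)."""
    commands = []
    for bu in bu_names:
        commands.append([
            "dsjob", "-run",
            "-param", f"{param_name}={bu}",  # assumed parameter name
            project,                          # assumed project name
            f"{job}.{bu}",                    # job.invocation_id, BU used as the ID
        ])
    return commands


if __name__ == "__main__":
    # "MyProject" is a placeholder, not a name taken from the thread.
    for cmd in build_dsjob_commands("MyProject", "actual_Sequence", ["BU_A", "BU_B"]):
        print(" ".join(cmd))
```

Each argument list could then be handed to a process launcher of your choice; whether the BU name is a valid invocation ID in your environment would need checking.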