Order of execution

Bilwakunj · Post by **Bilwakunj** » Sun Apr 03, 2005 8:48 am

Hi,
I want to know the order of execution in PX. Say my PX job has processed one row and the second row is about to begin. Does PX first flush all the information from the previous row before it starts processing the new row, or does it overwrite the previous record's data column by column as it reaches each column derivation?
To put it another way, say I have the following columns:
col A
col B
After the first record has been processed and before the second one starts, are the values of A and B re-initialized (say to 0 and 0), or does the second record simply overwrite the first record's value when processing reaches A, and so on?

Thanks in advance!!!

ray.wurlod · Post by **ray.wurlod** » Sun Apr 03, 2005 6:13 pm

Almost certainly PX will be giving you "pipeline parallelism", in which a downstream stage can be processing row #1 while an upstream stage has already begun processing row #2 (even though row #1 is still somewhere in the job). You can code to prevent this, but that's defeating one of the things that make parallel jobs go fast.
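As a rough illustration of the idea (not of how the PX engine is actually implemented), here is a minimal Python sketch in which each stage runs in its own thread and hands rows to the next stage through a queue. The names `stage`, `run_pipeline` and `SENTINEL` are made up for this example. Each row travels through the pipeline as its own value, so an upstream stage can start on row #2 while a downstream stage is still busy with row #1.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-data marker for this sketch


def stage(func, inbox, outbox):
    """Apply func to every row from inbox and pass the result downstream."""
    while True:
        row = inbox.get()
        if row is SENTINEL:
            outbox.put(SENTINEL)  # tell the next stage there is no more data
            return
        outbox.put(func(row))


def run_pipeline(rows, funcs):
    """Run one thread per stage; stages overlap in time, row order is kept."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for row in rows:
        queues[0].put(row)
    queues[0].put(SENTINEL)

    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3], [lambda r: r + 1, lambda r: r * 10]))
```

Because every stage reads from a FIFO queue and runs in a single thread, rows leave the pipeline in the order they entered, even though several rows are in flight at once.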

For more information on pipeline and partition parallelism, read Chapter 2 of the Parallel Job Developer's Guide.

T42 · Post by **T42** » Sun Apr 03, 2005 7:46 pm

You need to sort your data in order to maintain control in this environment. Partition and sort on the key fields; the dependent fields should be sorted on but not partitioned on.

You can control the processing of data with stage variables in the Transformer. In fact, stage variables are executed in the order received, so progressive calculations, among other ideas, can be done.
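To make the "progressive calculation" idea concrete, here is a small Python sketch that mimics stage variables with ordinary local variables. It assumes the rows arrive sorted by key and that all rows for a key pass through the same loop (the equivalent of being in the same partition). The function name `running_totals` and the variable names are invented for the example. The point is that the variables keep their values from one row to the next and are updated in a fixed order, with the "previous key" variable updated last.

```python
def running_totals(rows):
    """rows: iterable of (key, amount) pairs, already sorted by key."""
    prev_key = None      # plays the role of a "previous key" stage variable
    running_total = 0    # plays the role of an accumulator stage variable
    out = []
    for key, amount in rows:
        # 1st variable: compare against the value remembered from the last row
        is_new_key = key != prev_key
        # 2nd variable: reset on a new key, otherwise keep accumulating
        running_total = amount if is_new_key else running_total + amount
        # 3rd variable: updated last so the next row sees this row's key
        prev_key = key
        out.append((key, amount, running_total))
    return out


if __name__ == "__main__":
    for row in running_totals([("a", 1), ("a", 2), ("b", 5), ("b", 1)]):
        print(row)
```

If `prev_key` were updated before the comparison, every row would look like a continuation of its own key, which is why the evaluation order matters.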

Do a search for "stage variables" to get some ideas on how to handle this concept.

Bilwakunj · Post by **Bilwakunj** » Sun Apr 03, 2005 7:50 pm

Thanks Ray.
My situation is described below. I had already posted part of this on the forum, but this is the full version of the requirement, so I'm posting it again.

Col A - char(8)
Col B - char(2)
Col A_Date - char(8)
Col D_Date - char(8)
Col C - char(3)

My job requires the following. If Col A, Col B and Col C are the same across records, mark those records as related. Within each group of related records, find the "earliest A_Date". If the A_Date of the next record falls between the earliest A_Date of the previous record and (D_Date + 1) of the previous record, then its earliest A_Date is the same as that of the previous record; otherwise the "earliest A_Date" for this record is its own A_Date. The comparison should continue in the same way, record by record, with the earliest A_Date determined by each match.
I tried this using stage variables, but as the life of a stage variable is limited to one record, I couldn't get the correct result. I also tried a lookup, but I can't both read and write the same lookup file to update the new earliest A_Date among the related records.
Please let me know how this can be done in PX.


ray.wurlod · Post by **ray.wurlod** » Mon Apr 04, 2005 1:16 am

Search the forum for how to use stage variables to remember values from the previous row. Once you can do this (which is very easy), the rest should follow.
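For what it's worth, here is one possible reading of the "earliest A_Date" requirement from earlier in the thread, written as a Python sketch of the logic rather than as Transformer derivations. Several things are assumptions on my part: that the char(8) dates are in YYYYMMDD format, that "related" means the same values of A, B and C, that the rows arrive sorted by A, B, C and then A_Date, and that the output column is called `Earliest_A_Date`. The three `prev_*` variables do the job that stage variables would do in the Transformer.

```python
from datetime import datetime, timedelta


def parse(value):
    """Parse a char(8) date; YYYYMMDD is an assumption, not from the thread."""
    return datetime.strptime(value, "%Y%m%d").date()


def earliest_dates(rows):
    """rows: dicts with A, B, C, A_Date, D_Date, sorted by (A, B, C, A_Date)."""
    prev_key = None
    prev_earliest = None
    prev_d_date = None
    for row in rows:
        key = (row["A"], row["B"], row["C"])
        a_date = parse(row["A_Date"])
        # Same group, and A_Date lies between the previous row's earliest
        # A_Date and the previous row's D_Date + 1 day: carry the value over.
        if key == prev_key and prev_earliest <= a_date <= prev_d_date + timedelta(days=1):
            earliest = prev_earliest
        else:
            earliest = a_date
        yield {**row, "Earliest_A_Date": earliest.strftime("%Y%m%d")}
        # Remember this row's values for the next row (updated last).
        prev_key, prev_earliest, prev_d_date = key, earliest, parse(row["D_Date"])


if __name__ == "__main__":
    sample = [
        {"A": "X", "B": "01", "C": "AAA", "A_Date": "20050101", "D_Date": "20050110"},
        {"A": "X", "B": "01", "C": "AAA", "A_Date": "20050111", "D_Date": "20050120"},
        {"A": "X", "B": "01", "C": "AAA", "A_Date": "20050201", "D_Date": "20050210"},
        {"A": "Y", "B": "01", "C": "AAA", "A_Date": "20050205", "D_Date": "20050206"},
    ]
    for out in earliest_dates(sample):
        print(out["A"], out["A_Date"], out["Earliest_A_Date"])
```

In a parallel job the same logic only holds if all rows of a related group land in the same partition and are sorted within it, which is the point made above about partitioning and sorting on the key fields.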