Implimenting For loop in stage variables?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
irajasekharhexa
Premium Member
Premium Member
Posts: 82
Joined: Fri Jun 03, 2005 5:23 am
Location: Bangalore
Contact:

Implimenting For loop in stage variables?

Post by irajasekharhexa »

Hi,

Is it possible to impleiment For loop by using stage variables?
I has a task some thing like.

I need to populate the fixed amount of records with a incrementing value as indicator based on the loading number of records.


Suppose I have 200 records in my source

so I need to populate each 10 records with batch number 1
and next 10 records with batch number 2
and next 10 records with batch number 3 and so on..... in the same data set.



Is it possible to achive this logic by using stage variables?

or any routine is requiered?

Can anybody focus some light on the above one pls...

Regds
Rajasekhar
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

There's no need for a loop as you can initialize your stage variables and on a detected condition increment the stage variable. If you wanted to count by 10 and assign a new number after each set, you need 3 stage variables. One to track the current group (initialize to some value), one to track the batch number (initialize to 0), and one to be the last group seen.

Code: Select all

CurrGroup = link.col values in a concatenated key
BatchNbr = If LastGroup <> CurrGroup Then BatchNbr + 1 Else BatchNbr
LastGroup = CurrGroup
Use BatchNbr as within your example in the derivation of a column. Your data must be sorted to keep all rows within groups together.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
leomauer
Premium Member
Premium Member
Posts: 100
Joined: Mon Nov 03, 2003 1:33 pm

Post by leomauer »

Rajasekhar
If that is as simple as you described, then a single stage variable will do.
Name is batchNum
Initial value will be 0
Derivation is
If MOD(@INROWNUM, 10) = 1 Then batchNum = batchNum + 1 Else batchNum
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Note that, using the suggested technique, the same sequence of batch numbers will be generated on each processing node in your parallel environment. Is this the desired outcome?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Okay, Ray points out that I didn't give an answer that doesn't require some further thought on the part of the original poster. Since the poster didn't say that there had to be an exact order to the sequencing of rows via the batch number, I stated that the data had to be processed in a sorted order to keep rows together. This implies a hash partitioning to keep like rows together. For simplicity serial processing will handle non duplicate batch numbers across nodes as there's only one node (and if you look the poster mentioned only 200 rows of data, which doesn't necessitate the complexity of solving the parallel solution). Parallel processing requires the poster to solve the unique batch numbering by taking into consideration the processing node in the batch numbering sequence by using a more complicated incrementing expression and initializing the batch number to the processing node +1. But I didn't want to give the entire solution, the poster obviously needs to experiment with stage variables. :wink:
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
irajasekharhexa
Premium Member
Premium Member
Posts: 82
Joined: Fri Jun 03, 2005 5:23 am
Location: Bangalore
Contact:

Post by irajasekharhexa »

kcbland wrote:Okay, Ray points out that I didn't give an answer that doesn't require some further thought on the part of the original poster. Since the poster didn't say that there had to be an exact order to the sequencing of rows via the batch number, I stated that the data had to be processed in a sorted order to keep rows together. This implies a hash partitioning to keep like rows together. For simplicity serial processing will handle non duplicate batch numbers across nodes as there's only one node (and if you look the poster mentioned only 200 rows of data, which doesn't necessitate the complexity of solving the parallel solution). Parallel processing requires the poster to solve the unique batch numbering by taking into consideration the processing node in the batch numbering sequence by using a more complicated incrementing expression and initializing the batch number to the processing node +1. But I didn't want to give the entire solution, the poster obviously needs to experiment with stage variables. :wink:



Hi Ray/Bland

As I have taken three stage variables and one Jobparameter.

The jobparameter is having the fixed value in my case it is 200

and stage variables
svl1 Intial value is 0
svGNo to increment the Batch Number.
svGc To capture the batch number changed records
svL1 to count the no of records

Here are the derivations Corresponding to the above columns

if (svL1 = pMAX_DOCUMENT_LINES) then 1 else svL1 + 1----->svL1
if svL1= 1 then svGNo+1 else svGNo---------->svGNo
if svGNo =1 Then 1 Else 0----------------->svGc

My actual requierement is I should populate each group of records header as one record

Ex: For First 10 Records having Gno=1 should have one corresponding Header with corresponding Gno.

For next 10 records one record with corresponding header record with GNo=2 and so on

What i am getting are corret except for the first set of records

Since the GNo is Incrementing immediately after equating the svL1 = pMAX_DOCUMENT_LINES

so I need to capture the record for the first set also.

I thought it is bit lengthy...... But feel more precised....
So any help in this
Rajasekhar
Post Reply