Page 1 of 2

Need to capture a running total at the end of a transform

Posted: Wed Jun 16, 2004 10:33 am
by rcasey
Hi all

I have 3 fields accumulating in a translate that I need to output to a sequential file at the end of the translate step. Any ideas? :?

Posted: Wed Jun 16, 2004 10:54 am
by denzilsyb
Can you give a little more info on the problem?

Can you not read the output at the end? What is the translate step - a transformer?

dnzl

Posted: Wed Jun 16, 2004 11:03 am
by rcasey
Let me explain. Yes it is a transformer step processing/formatting a large amount of data from a DB2 stage. I need the total amounts of three fields at the end for balancing purposes and would like to not have to aggregate the data due to the volume. I can pass a running total into the output link but still would have to deal with how to get that end total.
Thx

Posted: Wed Jun 16, 2004 11:07 am
by kduke
I would use 3 stage variables.

Tot1 = Tot1 + Col1

Posted: Wed Jun 16, 2004 11:09 am
by rcasey
I have the columns accumulating in stage variables, just do not know how to get at them for a one time write at end of the translate.
thx

Posted: Wed Jun 16, 2004 11:13 am
by kcbland
Why don't you want to use the aggregator? To simply stream a few columns to it and group by a static column value, like 'X', will accumulate the totals for you and when the stage closes output that one row of totals. This is the simplest and most elegant solution.

Stage variables don't work for what you are doing.

Posted: Wed Jun 16, 2004 12:40 pm
by mhester
I really think you would be better off to implement what Ken has given you, but my feeling is that you do not want to use the agg stage so here is another option - not as elegant, but will get the job done.

This solution would actually combine Kim Duke's post where he suggests that you use stage variables and a Hash table.

1) In your transform stage hard code the value "TOTAL" in the derivation for the key column of the Hash table.

2) Add the stage variables that you want to accumulate (per Kim's suggestion)

3) Create 3 additional columns in the out link named -

TOTAL1
TOTAL2
TOTAL3

or a column for each total you want to accumulate.

4) Since Hash file writes are destructive you will end up with one (1) record in the Hash file keyed as TOTAL with the 3 columns of summed data.

5) Have the Hash file be the source to a simple sequential file

This is simply overkill when you could use the agg stage, but if you choose not to then this too will work.

Regards,

Posted: Wed Jun 16, 2004 1:10 pm
by chucksmith
I suggest you combine the two approaches. Calculate your totals using stage variables, and feed the stage variables to an aggregator with the "Aggregate function" set to "Last". When the transform completes, the aggregator will output one row of your final totals.

Chuck

Posted: Wed Jun 16, 2004 1:29 pm
by rcasey
I can use the aggregator to make things easer but I think I will also try Chuck's suggestion and accumulate in the stage variables. If I read him right there is an aggregate last function or somthing like that, I have not seen that one before but on the other hand there is a mountain of "stuff"
I have not see yet.
Thanks for the replies!!

Posted: Wed Jun 16, 2004 1:42 pm
by chulett
'Last' is an aggregation method in the Aggregator Stage.

Posted: Wed Jun 16, 2004 1:46 pm
by rcasey
Craig, I saw that after I took the time to look. Can I go on the assumption that if I do a group by in the ag with the last function I will get the correct total from the stage variable?

Posted: Wed Jun 16, 2004 1:52 pm
by chulett
Yup!

Posted: Wed Jun 16, 2004 1:52 pm
by kcbland
You don't need the stage variables. Just pass an 'X' as the value for a column 'GROUPBYME' in the output link to the aggregator. Setup 3 other columns to track the numbers you want.

In the aggregator stage set column GROUPBYME to have group-by checked and listed as the derivation. For the other three columns, derive them with SUM(thecolumnname1 or 2 or 3).

Posted: Wed Jun 16, 2004 3:38 pm
by ketfos
Why not make use of Transform - RowProcRunningTotal (available in Category - sdk\RowProc)
It returns running total of input values.

ketfos

Posted: Wed Jun 16, 2004 3:44 pm
by chucksmith
Just because you use an aggregator, does not mean you have to use a group by. That is one of the nice things about "Last".

This is also a handy trick if you output based upon the difference between the previous row and the current row. Since the last row will never have something to trigger it (a next row), we can use the aggregator to be that trigger. Also, this may be memory efficient than using the aggregator with a "group by".

Chuck