Page 1 of 1

Aggregator (Counting grouped Data)

Posted: Mon May 25, 2009 2:21 pm
by dstager
Hi DS Experts,

I am trying to 'count' records grouped by a certain ID and sorted by Date.

Example:

Input Data:

RowID Sale_Type Date
ABC A1 Jan 1
ABC C2 May 21
BCE A1 Aug 3
DFE K1 Feb 3
ABC Z4 Feb 5
BCE A1 Mar 1
DFE G4 Jan 2

Desired Output:

RowID Sale_Type Date Counter
ABC A1 Jan 1 1
ABC Z4 Feb 5 2
ABC C2 May 21 3
BCE A1 Mar 1 1
BCE A1 Aug 3 2
DFE G4 Jan 2 1
DFE K1 Feb 3 2

In Summary, for every unique RowID (grouped), I would like initiate a counter (incremented by 1) and sorted by date.

Please help.

Thanks.

Posted: Mon May 25, 2009 2:41 pm
by chulett
Since there's no real 'aggregation' - a couple of thoughts are here, more can be found by searching for 'key change column'.

Posted: Mon May 25, 2009 3:32 pm
by dstager
I tried the solution below, but the sVValue keeps incrementing (it does not revert back to 1 when a new key grouping is identified).

How do I initialize a variable? Do I just set the 'initial value' to 1?

Thanks.




Simple solution will be to use stage variables. Try something like this....
Sort the data on your three columns.
Initialize sVValue as 1
Code:

sVCurrent = COLA:COLB:COLC
sVValue = If sVCurrent = sVPrevious then sVValue+1 else sVValue
sVPrevious = sVCurrent



And use sVValue in your column derivation of INT column.

Posted: Mon May 25, 2009 3:41 pm
by chulett
You need to set it back to 1 rather than increment it when the group changes. In other words, change that "else SVValue" to "else 1". That Initial Value is an 'at the moment the job starts' value.

Posted: Tue May 26, 2009 8:27 am
by dstager
Thank you for your help!

Posted: Tue May 26, 2009 1:10 pm
by vinothkumar
How can I achieve the same in Server Edition ?

Posted: Tue May 26, 2009 1:21 pm
by chulett
Same way.

Posted: Tue Nov 10, 2009 5:10 pm
by Perwezakh
chulett wrote:You need to set it back to 1 rather than increment it when the group changes. In other words, change that "else SVValue" to "else 1". That Initial Value is an 'at the moment the job starts' value.
Hi Criag,
Thanks for your reply here. One question.. I am using the same logic and in Parallel env, its not working.. Its creating Two Sets of identical rows (Key Column and Counter) because my process is running on two Nodes. Please advice how to fix this issue.
Thanks in advance..

Posted: Tue Nov 10, 2009 5:33 pm
by Kryt0n
Make sure you partition correctly

Posted: Tue Nov 10, 2009 6:51 pm
by chulett
Or run on one node. :wink: