Aggregator (Counting grouped Data)

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dstager
Participant
Posts: 47
Joined: Tue Jan 08, 2008 8:43 am

Aggregator (Counting grouped Data)

Post by dstager »

Hi DS Experts,

I am trying to 'count' records grouped by a certain ID and sorted by Date.

Example:

Input Data:

RowID Sale_Type Date
ABC A1 Jan 1
ABC C2 May 21
BCE A1 Aug 3
DFE K1 Feb 3
ABC Z4 Feb 5
BCE A1 Mar 1
DFE G4 Jan 2

Desired Output:

RowID Sale_Type Date Counter
ABC A1 Jan 1 1
ABC Z4 Feb 5 2
ABC C2 May 21 3
BCE A1 Mar 1 1
BCE A1 Aug 3 2
DFE G4 Jan 2 1
DFE K1 Feb 3 2

In Summary, for every unique RowID (grouped), I would like initiate a counter (incremented by 1) and sorted by date.

Please help.

Thanks.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Since there's no real 'aggregation' - a couple of thoughts are here, more can be found by searching for 'key change column'.
-craig

"You can never have too many knives" -- Logan Nine Fingers
dstager
Participant
Posts: 47
Joined: Tue Jan 08, 2008 8:43 am

Post by dstager »

I tried the solution below, but the sVValue keeps incrementing (it does not revert back to 1 when a new key grouping is identified).

How do I initialize a variable? Do I just set the 'initial value' to 1?

Thanks.




Simple solution will be to use stage variables. Try something like this....
Sort the data on your three columns.
Initialize sVValue as 1
Code:

sVCurrent = COLA:COLB:COLC
sVValue = If sVCurrent = sVPrevious then sVValue+1 else sVValue
sVPrevious = sVCurrent



And use sVValue in your column derivation of INT column.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You need to set it back to 1 rather than increment it when the group changes. In other words, change that "else SVValue" to "else 1". That Initial Value is an 'at the moment the job starts' value.
-craig

"You can never have too many knives" -- Logan Nine Fingers
dstager
Participant
Posts: 47
Joined: Tue Jan 08, 2008 8:43 am

Post by dstager »

Thank you for your help!
vinothkumar
Participant
Posts: 342
Joined: Tue Nov 04, 2008 10:38 am
Location: Chennai, India

Post by vinothkumar »

How can I achieve the same in Server Edition ?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Same way.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Perwezakh
Premium Member
Premium Member
Posts: 38
Joined: Mon Jun 06, 2005 9:13 am
Location: Chicago, IL

Post by Perwezakh »

chulett wrote:You need to set it back to 1 rather than increment it when the group changes. In other words, change that "else SVValue" to "else 1". That Initial Value is an 'at the moment the job starts' value.
Hi Criag,
Thanks for your reply here. One question.. I am using the same logic and in Parallel env, its not working.. Its creating Two Sets of identical rows (Key Column and Counter) because my process is running on two Nodes. Please advice how to fix this issue.
Thanks in advance..
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Make sure you partition correctly
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Or run on one node. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply