Aggregator help

Post by **dsdev_123** » Tue Oct 09, 2007 9:23 pm

I have 8 columns in the dataset.
col1,
col2,
col3,
col4,
col5,
col6,
col7,
col8

I want the output sequential file to have the columns below.

colA = col1
colB = col2
ColC = col3
ColD = sum(col8) group by col1,col2,col3,col4,col5,col6,col7 having col4 = 1
ColE = (sum(col8) group by col1,col2,col3,col4,col5,col6,col7 having col4 = 1) minus (sum(col8) group by col1,col2,col3,col4,col5,col6,col7 having col4 = 1 and col5 = 1)
ColF = sum(col8) group by col1,col2,col3,col4,col5,col6,col7 having col4 = 1 and col5 = 1 and col6 = 1
ColG = col4
ColH = col5
ColI = 1 if col5 = 1, 2 if col5 = 0
colJ = col6
colK = col7

I am trying to work out how to do this with the Aggregator stage. Can anybody please help?

Thanks
sarathh

Post by **Ramani** » Wed Oct 10, 2007 3:54 am

Use one Aggregator stage for each grouping that you have, then use a Join stage to join all of the Aggregator outputs.
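To make the "aggregate separately, then join" idea concrete, here is a rough sketch in plain Python rather than DataStage. The column names come from the question; everything else is an assumption: the function names `sum_col8` and `build_output` are hypothetical, each "having" condition is read as a row filter applied before summing, and a group with no matching rows gets a sum of 0.

```python
from collections import defaultdict

# Grouping keys taken from the question.
KEYS = ("col1", "col2", "col3", "col4", "col5", "col6", "col7")

def group_key(row):
    return tuple(row[k] for k in KEYS)

def sum_col8(rows, keep):
    """Stand-in for one Aggregator: sum col8 per group over rows where keep(row) is true."""
    totals = defaultdict(int)
    for row in rows:
        if keep(row):
            totals[group_key(row)] += row["col8"]
    return totals

def build_output(rows):
    rows = list(rows)
    # One aggregation per filter condition in the question.
    s4 = sum_col8(rows, lambda r: r["col4"] == 1)
    s45 = sum_col8(rows, lambda r: r["col4"] == 1 and r["col5"] == 1)
    s456 = sum_col8(
        rows, lambda r: r["col4"] == 1 and r["col5"] == 1 and r["col6"] == 1
    )

    out = []
    # Stand-in for the Join: one output row per distinct key, in first-seen order.
    for key in dict.fromkeys(group_key(r) for r in rows):
        c1, c2, c3, c4, c5, c6, c7 = key
        out.append({
            "colA": c1, "colB": c2, "ColC": c3,
            "ColD": s4.get(key, 0),                     # assumption: missing sum = 0
            "ColE": s4.get(key, 0) - s45.get(key, 0),
            "ColF": s456.get(key, 0),
            "ColG": c4, "ColH": c5,
            "ColI": {1: 1, 0: 2}.get(c5),               # None for any other col5 value
            "colJ": c6, "colK": c7,
        })
    return out
```

In a real job each `sum_col8` call would correspond to a filtered link feeding its own Aggregator stage, and the final loop to the Join stage keyed on col1 through col7.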

Post by **ray.wurlod** » Wed Oct 10, 2007 4:41 pm

You won't be able to do it in one Aggregator stage, since you need sum(col8) in one of your calculations. So you will need at least two Aggregator stages.

Tip: include an explicit Sort stage on the input to the second Aggregator stage to specify "don't sort (previously sorted)".
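As a loose analogy for why it matters that the input is already sorted, the Python sketch below sums a column over input that arrives sorted on the grouping keys. This is not DataStage code; `sum_sorted` is a hypothetical helper, and the point is only that a sorted stream can be aggregated one group at a time without re-sorting or holding every group in memory.

```python
from itertools import groupby
from operator import itemgetter

def sum_sorted(rows, key_cols, value_col):
    """Sum value_col per group, assuming rows are already sorted on key_cols.

    key_cols should name at least two columns so the group key is a tuple.
    """
    key = itemgetter(*key_cols)
    # groupby only merges adjacent rows with equal keys, so it relies on
    # the upstream sort order instead of sorting again.
    for group_key, group_rows in groupby(rows, key=key):
        yield group_key, sum(r[value_col] for r in group_rows)

rows = [
    {"col1": "a", "col2": 1, "col8": 10},
    {"col1": "a", "col2": 1, "col8": 5},
    {"col1": "a", "col2": 2, "col8": 7},
    {"col1": "b", "col2": 1, "col8": 3},
]
totals = list(sum_sorted(rows, ("col1", "col2"), "col8"))
```

If the rows were not actually sorted, the same key could show up in more than one output group, which is why the upstream sort order has to be trustworthy before telling a downstream stage not to sort.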