Implementation of aggregate function SUM

ksv2584 · Post by **ksv2584** » Mon Feb 28, 2011 4:20 am

Hi All,

I need to sum the amounts (100, 200, -300, etc.) based on a code key column (abc, def, gkt, etc.), but I want this done without using the Aggregator stage.

The reason is that with the Aggregator stage we cannot carry forward the remaining columns for further transformation. We could do a self join after the aggregation, but the key columns are different in this scenario.

Please suggest a solution for this.

Thanks
Vidya

ray.wurlod · Post by **ray.wurlod** » Mon Feb 28, 2011 5:13 am

Use the Aggregator stage, perhaps in a fork-join model as you indicate. It will work, and it's the easiest mechanism.

ksv2584 · Post by **ksv2584** » Mon Feb 28, 2011 10:24 pm

Hi Ray,

I just want to clarify one more thing. Initially I am doing a lookup between the source file and the lookup file on 5 key columns.

For the aggregation, however, I use just one of those 5 key columns, so the fork join would be based on just that one key. Will this affect my logic flow?

Can I perform this without losing any data? If yes, I can do a left outer join, making the Aggregator output link the right link.

Please clarify the above issue.

Lots of thanks
Vidya

ray.wurlod · Post by **ray.wurlod** » Mon Feb 28, 2011 11:41 pm

You won't lose any data. Only pass the columns to be grouped or summed to the Aggregator stage - pass all the others along the other link to the Join stage. All will be well, and you should not need an outer join - after all, the grouping keys came from the same data source!
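As a rough illustration outside DataStage, the fork-join model described above can be sketched in plain Python. The column names (`code`, `amount`, `region`), the sample rows, and the helper `fork_join_sum` are all hypothetical; this only mirrors the idea of summing on one link and joining the totals back to the untouched rows on the other.

```python
from collections import defaultdict

# Hypothetical sample rows: 'code' is the grouping key, 'amount' is summed,
# and 'region' stands in for the columns that must be carried forward.
rows = [
    {"code": "abc", "amount": 100, "region": "north"},
    {"code": "abc", "amount": 200, "region": "south"},
    {"code": "def", "amount": -300, "region": "east"},
]

def fork_join_sum(rows, key="code", value="amount"):
    # Link 1 (the "Aggregator" side): only the key and the value are needed,
    # producing exactly one total per distinct key.
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]

    # Link 2 (the pass-through side) is joined back on the key.
    # This is an inner join; every key is present in totals because
    # both links were fed from the same source rows.
    return [
        {**row, "total_" + value: totals[row[key]]}
        for row in rows
        if row[key] in totals
    ]

joined = fork_join_sum(rows)
for row in joined:
    print(row)
```

Every input row survives the join and keeps its other columns, with the group total attached as an extra column.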

ksv2584 · Post by **ksv2584** » Mon Feb 28, 2011 11:53 pm

Hi Ray,

Thanks for the quick reply.

One last clarification.

The key column I now have for the fork join is country code (ind, aus, nep), coming from the Aggregator output, which will have duplicate values. Can you please let me know which type of join I need to perform (inner or left)?

I am also placing a Filter after the Aggregator, based on the aggregated sum, before the join. Will this have any effect?

ray.wurlod · Post by **ray.wurlod** » Tue Mar 01, 2011 4:37 am

Inner will be fine. There won't be duplicate values out of the aggregator (because this was the grouping column) but there will be duplicates in the stream data. That's OK, that's what a join does.

As to your Filter question, that may cause an issue if you're doing an inner join - depends really on what the Filter is doing. You provided no information about that.
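To make the Filter concern concrete, here is a small Python sketch, not DataStage itself. The sample rows and the filter condition (keep only groups with a positive sum) are assumptions for illustration, since the actual Filter logic was not described.

```python
# Hypothetical stream rows keyed by country code.
rows = [
    {"country": "ind", "amount": 100},
    {"country": "ind", "amount": 200},
    {"country": "aus", "amount": -300},
    {"country": "nep", "amount": 50},
]

# Aggregator side: one total per country code, so no duplicate keys here.
totals = {}
for row in rows:
    totals[row["country"]] = totals.get(row["country"], 0) + row["amount"]

# Assumed Filter: keep only groups whose sum is positive.
filtered = {code: total for code, total in totals.items() if total > 0}

# Inner join: stream rows whose group was filtered out are dropped.
inner = [
    {**row, "total": filtered[row["country"]]}
    for row in rows
    if row["country"] in filtered
]

# Left outer join (stream as the left link): all rows are kept,
# with no total for groups that were filtered out.
left = [{**row, "total": filtered.get(row["country"])} for row in rows]

print(len(inner), len(left))
```

With this assumed filter the "aus" group is removed before the join, so the inner join silently drops the "aus" stream row while the left outer join keeps it without a total. Which behaviour is wanted depends on what the Filter is meant to achieve.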

ksv2584 · Post by **ksv2584** » Tue Mar 01, 2011 5:19 am

Thanks Ray.

Implemented, and it is working fine.