Implementation of aggregate function SUM

Posted: Mon Feb 28, 2011 4:20 am
by ksv2584
Hi All

I need to sum the amounts (100, 200, -300, etc.) grouped by a code key column (abc, def, gkt, etc.), but I want to do this without using the Aggregator stage.

With the Aggregator stage, we cannot carry forward the remaining columns for further transformation. We could do a self join after the aggregation, but the key columns are different in this scenario.

Please suggest a solution for this.

Thanks
Vidya

Posted: Mon Feb 28, 2011 5:13 am
by ray.wurlod
Use the Aggregator stage, perhaps in a fork-join model as you indicate. It will work, and it's the easiest mechanism.
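
For readers unfamiliar with the term, the fork-join idea can be sketched outside DataStage. The Python below is only an illustration under assumed data: the column names `code`, `amount` and `detail` and the sample rows are hypothetical, and this is not how the Aggregator stage is implemented. One branch sums per key, and the result is attached back to every original row so the other columns survive.

```python
from collections import defaultdict

# Hypothetical sample rows: 'code' is the grouping key, 'amount' is summed,
# and 'detail' stands in for the columns that must be carried forward.
rows = [
    {"code": "abc", "amount": 100, "detail": "r1"},
    {"code": "abc", "amount": 200, "detail": "r2"},
    {"code": "def", "amount": -300, "detail": "r3"},
]

def fork_join_sum(rows, key="code", value="amount"):
    # Fork, branch 1 (the "Aggregator" side): sum the value per key.
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    # Branch 2 plus join: every original row picks up its group's total.
    return [{**row, "total": totals[row[key]]} for row in rows]

result = fork_join_sum(rows)
```

Each row in `result` keeps its own `detail` value and gains a `total` column holding the sum for its `code`.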

Posted: Mon Feb 28, 2011 10:24 pm
by ksv2584
Hi Ray,

I just want to clarify one more thing. Initially I am doing a lookup between the source file and the lookup file on 5 key columns.

But for the aggregation I use just one of those 5 key columns. Will doing a fork-join on just that one key affect my logic flow?

Can I do this without losing any data? If so, I can do a left outer join with the Aggregator output link as the right input.

Please clarify the above.

Many thanks
Vidya

Posted: Mon Feb 28, 2011 11:41 pm
by ray.wurlod
You won't lose any data. Only pass the columns to be grouped or summed to the Aggregator stage - pass all the others along the other link to the Join stage. All will be well, and you should not need an outer join - after all, the grouping keys came from the same data source!
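
A rough sketch of that column split, again in Python rather than DataStage and with made-up names (`k2` and `note` stand in for the other key and non-key columns; `inner_join` is a hypothetical helper, not a DataStage function). Only the grouping key and the amount go down the aggregation link. Everything else travels on the second link, and an inner join on the grouping key returns exactly as many rows as went in.

```python
from collections import defaultdict

# Hypothetical rows: 'code' is the single grouping key; 'k2' and 'note'
# stand in for the other columns that bypass the aggregation.
stream = [
    {"code": "abc", "k2": 1, "note": "x", "amount": 100},
    {"code": "abc", "k2": 2, "note": "y", "amount": 200},
    {"code": "def", "k2": 1, "note": "z", "amount": -300},
]

# Link A: only the grouping key and the column to be summed.
sums = defaultdict(int)
for code, amount in [(r["code"], r["amount"]) for r in stream]:
    sums[code] += amount
agg_output = [{"code": c, "sum_amount": s} for c, s in sums.items()]

def inner_join(left, right, key):
    # Index the right input by key, then emit one row per matching pair.
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

# Link B carries all columns; join it to the aggregated link on 'code'.
joined = inner_join(stream, agg_output, "code")
```

Because every `code` on link B also exists on link A, no stream row goes unmatched, which is why an outer join is unnecessary here.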

Posted: Mon Feb 28, 2011 11:53 pm
by ksv2584
Hi Ray

Thanks for the early reply .. :)

One last clarification .. :?:

The key column I now have for the fork-join is country code (ind, aus, nep), coming from the Aggregator output, which will have duplicate values. Could you let me know which type of join I need to perform (inner or left)?

I am also placing a Filter after the Aggregator, based on the aggregated sum, before the join. Will this have any effect?

Posted: Tue Mar 01, 2011 4:37 am
by ray.wurlod
Inner will be fine. There won't be duplicate values out of the aggregator (because this was the grouping column) but there will be duplicates in the stream data. That's OK, that's what a join does.

As to your Filter question, that may cause an issue if you're doing an inner join - depends really on what the Filter is doing. You provided no information about that.
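
To make the risk concrete, here is a small Python sketch. The thread does not say what the Filter does, so the condition used (keep sums greater than zero) and the sample values are pure assumptions. If the Filter removes a group from the aggregated link, an inner join drops every stream row for that group, while a left outer join keeps them with no sum attached.

```python
# Hypothetical aggregated sums per country code (the Aggregator output).
agg_output = {"ind": 300, "aus": -50, "nep": 0}

# Hypothetical stream rows: (country code, stand-in for the other columns).
stream = [("ind", "r1"), ("ind", "r2"), ("aus", "r3"), ("nep", "r4")]

def apply_filter(sums, predicate):
    # Keep only the groups whose sum satisfies the predicate.
    return {code: total for code, total in sums.items() if predicate(total)}

# Assumed Filter condition: keep groups with a positive sum.
kept = apply_filter(agg_output, lambda total: total > 0)

# Inner join: stream rows whose group was filtered out disappear.
inner = [(code, data, kept[code]) for code, data in stream if code in kept]

# Left outer join (stream as left): every row survives, unmatched sums are None.
left_outer = [(code, data, kept.get(code)) for code, data in stream]
```

With these sample values, `inner` holds only the two `ind` rows, whereas `left_outer` holds all four rows with `None` in place of the sum for `aus` and `nep`. Which behaviour is wanted depends on the purpose of the Filter.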

Posted: Tue Mar 01, 2011 5:19 am
by ksv2584
Thanks Ray ..

Implemented and working fine .. 8)