Aggregating Data

Posted: Thu Mar 11, 2010 10:04 am
by ppp
I have 4 columns in my input, as below:
Grp_ID
Act_ID
Agntin
Agtxid

I want to count the number of rows that are duplicated across all 4 columns. So I am using an Aggregator stage with the count-rows option, grouping on all 4 input columns.
I am outputting the count into a new column, NO_OF_ROWS.

Is this the right way to do it?
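
For anyone who wants to sanity-check the logic outside the job, here is a rough sketch in Python of what grouping on all four columns and counting rows should produce. It is not DataStage code, and the sample rows are made up for illustration; only the column names and NO_OF_ROWS come from the post above.

```python
from collections import Counter

# Hypothetical sample rows: (Grp_ID, Act_ID, Agntin, Agtxid)
rows = [
    ("G1", "A1", "N1", "X1"),
    ("G1", "A1", "N1", "X1"),
    ("G1", "A2", "N1", "X1"),
    ("G2", "A1", "N2", "X2"),
    ("G1", "A1", "N1", "X1"),
]

# Group on all four columns and count the rows in each group.
counts = Counter(rows)

# One output row per distinct key, with the count in NO_OF_ROWS.
output = [
    {"Grp_ID": g, "Act_ID": a, "Agntin": n, "Agtxid": x, "NO_OF_ROWS": c}
    for (g, a, n, x), c in counts.items()
]

# Keys that occur more than once are the fully duplicated rows.
duplicates = [r for r in output if r["NO_OF_ROWS"] > 1]
```

Any group with NO_OF_ROWS greater than 1 is a row that appears more than once across all four columns, so a filter on that column after the aggregation would isolate the duplicates.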

Posted: Thu Mar 11, 2010 10:17 am
by chulett
Is it working for you? If so, it is certainly *a* right way.

Re: Aggregating Data

Posted: Fri Mar 12, 2010 12:45 am
by gssr
ppp wrote: I am using an aggregate stage and am using the count no of rows to calculate the no of rows by grouping on all the 4 input columns.


Is this the right way to do?

Because you are grouping on all four columns, the output count is the number of rows that are duplicated in full. If you want the count of duplicate values in a particular column, you have to group on that column alone.
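
To make the difference concrete, here is a small Python sketch (not DataStage code) that counts the same rows with two different grouping keys. The sample data and the helper name count_by are assumptions made up for this illustration.

```python
from collections import Counter

# Hypothetical sample rows: (Grp_ID, Act_ID, Agntin, Agtxid)
rows = [
    ("G1", "A1", "N1", "X1"),
    ("G1", "A2", "N1", "X1"),
    ("G2", "A1", "N2", "X2"),
]

def count_by(rows, key_indexes):
    """Count rows per group, where the group key is the selected columns."""
    return Counter(tuple(row[i] for i in key_indexes) for row in rows)

# Grouping on all four columns: every row here is distinct.
all_cols = count_by(rows, [0, 1, 2, 3])

# Grouping on Grp_ID only: rows sharing a Grp_ID fall into one group.
grp_only = count_by(rows, [0])
```

With this data no group in all_cols has more than one row, while grp_only shows that the value G1 occurs twice. The grouping key decides what counts as a duplicate.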

Posted: Fri Mar 12, 2010 2:51 am
by rohithmuthyala
Yes, I've tried a similar kind of aggregation and it did work.