Aggregating Data

ppp · Post by **ppp** » Thu Mar 11, 2010 10:04 am

I have 4 columns in my input, as below:
Grp_ID
Act_ID
Agntin
Agtxid

I want to count the number of rows that are duplicates across all 4 columns. So I am using an aggregate stage with the count-rows option, grouping on all 4 input columns, to calculate the number of rows in each group.
I am outputting the count into a new column, NO_OF_ROWS.

Is this the right way to do it?

chulett · Post by **chulett** » Thu Mar 11, 2010 10:17 am

Is it working for you? If so, it is certainly *a* right way.
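
One way to check whether it is working is to reproduce the same logic on a small sample outside the job and compare the counts. The Python sketch below is only an illustration of the grouping described in the question, not DataStage code. The sample rows are made up, and it assumes that a "duplicate" means a group whose count is greater than 1.

```python
from collections import Counter

# Hypothetical sample rows in the order (Grp_ID, Act_ID, Agntin, Agtxid).
rows = [
    ("G1", "A1", "N1", "X1"),
    ("G1", "A1", "N1", "X1"),
    ("G1", "A2", "N1", "X1"),
    ("G2", "A1", "N2", "X2"),
    ("G1", "A1", "N1", "X1"),
]

# Group on all four columns and count the rows in each group.
# Each count corresponds to the NO_OF_ROWS value for that group.
no_of_rows = Counter(rows)

# Assumption: a group is a duplicate only when its count exceeds 1.
duplicates = {key: n for key, n in no_of_rows.items() if n > 1}
print(duplicates)
```

Note that the count is emitted for every group, including groups with a single row, so a filter on NO_OF_ROWS > 1 after the aggregation would be needed to keep only the duplicates.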

gssr · Post by **gssr** » Fri Mar 12, 2010 12:45 am

ppp wrote: I am using an aggregate stage and am using the count no of rows to calculate the no of rows by grouping on all the 4 input columns.

Is this the right way to do?

Because you are grouping on all four columns, the output count is the number of whole duplicate rows. If you want the count of duplicate values in a particular column, group on that column only.
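
The difference between the two groupings can be sketched in Python. This is an illustration only: the helper `count_rows` and the sample data are hypothetical and are not part of DataStage.

```python
from collections import Counter

COLUMNS = ("Grp_ID", "Act_ID", "Agntin", "Agtxid")

def count_rows(rows, group_by):
    """Count rows per group, grouping only on the named columns."""
    idx = [COLUMNS.index(name) for name in group_by]
    return Counter(tuple(row[i] for i in idx) for row in rows)

# Hypothetical sample rows in COLUMNS order.
rows = [
    ("G1", "A1", "N1", "X1"),
    ("G1", "A1", "N1", "X1"),
    ("G1", "A2", "N1", "X1"),
    ("G2", "A1", "N2", "X2"),
    ("G1", "A1", "N1", "X1"),
]

# Grouping on all four columns counts whole duplicate rows.
by_all = count_rows(rows, COLUMNS)

# Grouping on Act_ID alone counts repeated values in that one column.
by_act = count_rows(rows, ["Act_ID"])

print(by_all)
print(by_act)
```

With this sample, the value A1 appears in four rows, but only three of those rows are identical across all four columns, so the two groupings give different counts.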

rohithmuthyala · Post by **rohithmuthyala** » Fri Mar 12, 2010 2:51 am

Yes, I've tried a similar kind of aggregation and it did work.

DSXchange
