Aggregating Data

ppp
Participant
Posts: 21
Joined: Mon Aug 31, 2009 11:53 am

Aggregating Data

Post by ppp »

My input has four columns:
Grp_ID
Act_ID
Agntin
Agtxid

I want to count the rows that are duplicated across all four columns. To do that I am using an Aggregator stage with the count-rows aggregation, grouping on all four input columns.
The count is written to a new output column, NO_OF_ROWS.
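
For readers who want to check the logic outside DataStage, here is a minimal Python sketch of the same idea: group on all four columns and count the rows in each group. The sample rows are made up for illustration, and `no_of_rows` only stands in for the NO_OF_ROWS output column; this is not how the Aggregator stage is implemented internally.

```python
from collections import Counter

# Hypothetical sample rows in the order (Grp_ID, Act_ID, Agntin, Agtxid).
rows = [
    ("G1", "A1", "N1", "X1"),
    ("G1", "A1", "N1", "X1"),
    ("G1", "A2", "N1", "X1"),
    ("G2", "A1", "N2", "X2"),
    ("G1", "A1", "N1", "X1"),
]

# Grouping on all four columns means the whole tuple is the group key.
# The value per key plays the role of NO_OF_ROWS.
no_of_rows = Counter(rows)

# Any group with a count above 1 is a fully duplicated row.
duplicates = {key: n for key, n in no_of_rows.items() if n > 1}

for key, n in no_of_rows.items():
    print(key, n)
```

A downstream filter on the count being greater than 1 would then isolate the duplicated combinations, which is what `duplicates` holds here.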

Is this the right way to do it?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Is it working for you? If so, it is certainly *a* right way.
-craig

"You can never have too many knives" -- Logan Nine Fingers
gssr
Participant
Posts: 243
Joined: Fri Jan 09, 2009 12:51 am
Location: India

Re: Aggregating Data

Post by gssr »

ppp wrote: I am using an aggregate stage and am using the count no of rows to calculate the no of rows by grouping on all the 4 input columns.


Is this the right way to do?

Because you are grouping on all four columns, the count you get is the number of occurrences of each whole row, so any count above 1 marks a fully duplicated row. If you want the count of duplicate values in one particular column instead, group on that column alone.
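
The difference between the two groupings can be sketched in Python as follows. The helper `count_by`, the `COLUMNS` tuple, and the sample rows are hypothetical names and data used only for illustration; they are not part of DataStage.

```python
from collections import Counter

# Column order assumed for the hypothetical sample rows below.
COLUMNS = ("Grp_ID", "Act_ID", "Agntin", "Agtxid")

rows = [
    ("G1", "A1", "N1", "X1"),
    ("G1", "A1", "N1", "X1"),
    ("G1", "A2", "N1", "X1"),
    ("G2", "A1", "N2", "X2"),
]

def count_by(rows, keys):
    """Count rows per distinct combination of the named key columns."""
    idx = [COLUMNS.index(k) for k in keys]
    return Counter(tuple(row[i] for i in idx) for row in rows)

# Grouping on every column counts whole-row duplicates.
whole_row = count_by(rows, COLUMNS)

# Grouping on one column counts how often each value of that column repeats.
by_grp = count_by(rows, ["Grp_ID"])

print(whole_row)
print(by_grp)
```

With this sample, the row `("G1", "A1", "N1", "X1")` appears twice as a whole row, while the value `G1` appears three times in Grp_ID, so the two groupings answer different questions.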
RAJ
rohithmuthyala
Participant
Posts: 57
Joined: Wed Oct 21, 2009 4:46 am
Location: India

Post by rohithmuthyala »

Yes, I've tried a similar kind of aggregation and it did work.
Rohith