logic for grouping

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore


Post by prasson_ibm »

Hi,
I have source data like this sample:
msisdn,date
750500337,6/4/2009
750500337,6/5/2009
750500337,6/6/2009
750500337,6/7/2009
750500467,6/4/2009
750500467,6/5/2009
750500467,6/6/2009
750500467,6/7/2009

I want the output to be:
750500337,6/4/2009
750500467,6/4/2009
i.e. the minimum date within each msisdn group.

How can I do this?
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

Use a Sort stage: sort on msisdn and date, ascending.
Then remove duplicates on msisdn and keep the first record.
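
For anyone who wants to check the logic outside DataStage, here is a minimal Python sketch of the same sort-then-keep-first idea. It is only an illustration, not DataStage code: the function name `first_per_key` is made up, and the dates are assumed to be month/day/year.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question, deliberately shuffled.
rows = [
    ("750500467", "6/6/2009"),
    ("750500337", "6/7/2009"),
    ("750500337", "6/4/2009"),
    ("750500467", "6/4/2009"),
    ("750500337", "6/5/2009"),
    ("750500467", "6/7/2009"),
]


def parse(raw):
    # Assumption: dates are month/day/year. Compare real dates, not
    # strings, otherwise "6/10/2009" would sort before "6/4/2009".
    return datetime.strptime(raw, "%m/%d/%Y")


def first_per_key(rows):
    # Step 1: sort on msisdn, then on date ascending.
    ordered = sorted(rows, key=lambda r: (r[0], parse(r[1])))
    # Step 2: remove duplicates on msisdn, keeping the first record.
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[0])]


print(first_per_key(rows))
```

The second step relies on the rows already being sorted by key and date, which is the same dependency the Remove Duplicates step has on the sort in front of it.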
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Or use an Aggregator stage. That is what it is there for.

Group on msisdn and take the minimum of date. Also, don't forget to set preserve data type to true.
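
The group-and-minimum logic can be sketched in plain Python as below. This is a rough illustration of the aggregation only, not of the Aggregator stage itself; `min_date_per_key` is a hypothetical name and the month/day/year date format is an assumption.

```python
from datetime import date, datetime


def min_date_per_key(rows):
    """Return {msisdn: earliest date} for an iterable of (msisdn, date_text)."""
    earliest = {}
    for msisdn, raw in rows:
        key = msisdn.strip()
        # Assumption: dates are month/day/year. Parsing keeps the value a
        # real date, so the minimum is chronological rather than textual.
        current = datetime.strptime(raw.strip(), "%m/%d/%Y").date()
        if key not in earliest or current < earliest[key]:
            earliest[key] = current
    return earliest


rows = [
    ("750500337", "6/4/2009"), ("750500337", "6/5/2009"),
    ("750500337", "6/6/2009"), ("750500337", "6/7/2009"),
    ("750500467", "6/4/2009"), ("750500467", "6/5/2009"),
    ("750500467", "6/6/2009"), ("750500467", "6/7/2009"),
]

result = min_date_per_key(rows)
for msisdn, earliest in sorted(result.items()):
    print(msisdn, earliest.isoformat())
```

Unlike the sort-and-dedupe route, this needs no sorted input, since every row is compared against the running minimum for its key.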
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

keshav0307 wrote:use a sort stage,
sort on MISDN and DATE ASC.
remove duplicate on MISDN and keep the first record.
Yes, I am doing the same thing, but the target data does not match when the job runs on multiple nodes. :oops:

When I run it on a single node, it works fine.
Is there any way to get the correct result when the job runs on multiple nodes?
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Hash partition on msisdn.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

In other words, you need to ensure that all records for a given grouping key value end up in the same partition.
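
A small Python simulation may make the effect easier to see. It is a sketch under assumptions, not DataStage behaviour: two nodes, a round-robin split standing in for any key-agnostic partitioning, CRC32 standing in for the hash function, and month/day/year dates. All function names are made up.

```python
import zlib
from datetime import datetime

ROWS = [
    ("750500337", "6/4/2009"), ("750500337", "6/5/2009"),
    ("750500337", "6/6/2009"), ("750500337", "6/7/2009"),
    ("750500467", "6/4/2009"), ("750500467", "6/5/2009"),
    ("750500467", "6/6/2009"), ("750500467", "6/7/2009"),
]


def earliest_per_key(rows):
    # What each node does on its own slice: keep the earliest row per key.
    best = {}
    for key, raw in rows:
        parsed = datetime.strptime(raw, "%m/%d/%Y")  # assumed m/d/Y
        if key not in best or parsed < best[key][0]:
            best[key] = (parsed, raw)
    return [(key, raw) for key, (_, raw) in best.items()]


def round_robin(rows, nodes):
    # Key-agnostic split: rows of one msisdn are spread over all nodes.
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def hash_partition(rows, nodes):
    # Key-based split: every row of one msisdn lands on the same node.
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[zlib.crc32(row[0].encode()) % nodes].append(row)
    return parts


def run(parts):
    # Each node works independently; the outputs are simply collected.
    out = []
    for part in parts:
        out.extend(earliest_per_key(part))
    return sorted(out)


print("round robin:", run(round_robin(ROWS, 2)))
print("hash on key:", run(hash_partition(ROWS, 2)))
```

With the round-robin split each node finds its own "first" record per msisdn, so the combined output has more than one row per key. With the key-based split each msisdn is seen by exactly one node, so the result matches the single-node run.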
-craig

"You can never have too many knives" -- Logan Nine Fingers