Retaining First/Last Values of a Key Group
Moderators: chulett, rschirm, roy
Retaining First/Last Values of a Key Group
In DataStage Server edition, it's very easy to process rows and retain only the first & last values of a column by grouped columns in an aggregator stage. I notice this isn't an option anymore in a parallel job aggregator stage.
How, in a parallel job, would you most efficiently go about handling this.
For example, if I have these records
Column1 Column2
A 5
A 10
A 15
A 20
B 50
B 45
B 40
B 35
I want this output with the first/last values of Column2 grouping Column1
Column1, First, Last
A,5,20
B,50,35
Thanks! I'm new to parallel so this is probably very simple for somebody here to advise me.
How, in a parallel job, would you most efficiently go about handling this.
For example, if I have these records
Column1 Column2
A 5
A 10
A 15
A 20
B 50
B 45
B 40
B 35
I want this output with the first/last values of Column2 grouping Column1
Column1, First, Last
A,5,20
B,50,35
Thanks! I'm new to parallel so this is probably very simple for somebody here to advise me.
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
Just try minimum and maximum (for strings) !! If it is a decimal then minimum/maximum value will be selected and not the first and last record.
But if you are sure that the first/last record will have the min/max decimal value, then you can still use minimum/maximum option in aggr stage.
But if you are sure that the first/last record will have the min/max decimal value, then you can still use minimum/maximum option in aggr stage.
Kandy
_________________
Try and Try again…You will succeed atlast!!
_________________
Try and Try again…You will succeed atlast!!
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Re: Retaining First/Last Values of a Key Group
Please don't, Mr Moderator.
gsbrown wrote:In DataStage Server edition, it's very easy to process rows and retain only the first & last values of a column by grouped columns in an aggregator stage. I notice this isn't an option anymore in a parallel job aggregator stage.
How, in a parallel job, would you most efficiently go about handling this.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 132
- Joined: Tue Sep 04, 2007 11:38 am
- Location: NOIDA
I wonder why people are not coming up with a solution ......
Here is one without any aggregator stage : -
SRC ------ > Copy stage ----------------------------------->Sort Stage_1
........................|.............................................................|
........................|..............................................Remove Duplicate stage1
........................|..............................................................|
..................Sort Stage2-->Remove Duplicate Stage2-->Join Stage-->o/p
Here in the Sort stage_1 and Sort stage_2 sort on the Key column Col1 and col2 in ascending order and then use Remove duplicate stages,but make sure that in the Remove Duplicate stage1 keep duplicate to retain as first
and in Remove Duplicate stage2 keep duplicate to retain as last.
Just join the output on the basis of column1 to get he desired output
Here is one without any aggregator stage : -
SRC ------ > Copy stage ----------------------------------->Sort Stage_1
........................|.............................................................|
........................|..............................................Remove Duplicate stage1
........................|..............................................................|
..................Sort Stage2-->Remove Duplicate Stage2-->Join Stage-->o/p
Here in the Sort stage_1 and Sort stage_2 sort on the Key column Col1 and col2 in ascending order and then use Remove duplicate stages,but make sure that in the Remove Duplicate stage1 keep duplicate to retain as first
and in Remove Duplicate stage2 keep duplicate to retain as last.
Just join the output on the basis of column1 to get he desired output
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore