Sequence on the output from an aggregator stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
saraswati
Premium Member
Premium Member
Posts: 149
Joined: Thu Feb 28, 2008 4:25 pm

Sequence on the output from an aggregator stage

Post by saraswati »

I want to do sum on the values of a column and then populate it into another column.Then, use remove duplicate to remove the duplicate records so that there will be only one records with summed value based on the key columns.

Now,I want to set a sequence or rather a value '1' into another column based on the highest value on a key column.

For Example:
1000 4 1 --------> As 4 is the highest
1000 2 2 ---------> As 2 is the next highest.
1000 1 3 ---------> As 1 is the next highest.
..............

1 , 2 ,3 should be in sequence based on the highest values on the second column.

Please let me know how to implement it.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use a fork-join design with an Aggregator reporting the max value. Downstream check whether the max value is the same as the current value and set your flag accordingly.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srinivas.nettalam
Participant
Posts: 134
Joined: Tue Jun 15, 2010 2:10 am
Location: Bangalore

Re: Sequence on the output from an aggregator stage

Post by srinivas.nettalam »

Use the first and second columns in sort as sorting keys in descending order and generate the sequence numbers until the value in the key col1(1000) is changed.That can be achieved using stage variables in the transformer
N.Srinivas
India.
zulfi123786
Premium Member
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

So far as i understand you are consecutively summing up values of a column and want to flag them so that you can use the flag in remove duplicates stage to identify the record which has got the actual sum for all records bearing the same key.

Well you can do that as suggested or what i feel is you may not need to flag them at all. you could perform a consecutive sum and then pass on the records to remove duplicates stage and sort it on the summed column as desc and fetch the first row, by doing so you would be selecting the correct record which has the actual sum of all rows for that column
Post Reply