Aggregator, Restricting count for individual group

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It does give you the correct result. It is incapable of doing otherwise. Justify your accusation. As you do, specify how your rows are partitioned and which columns are key columns.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I think you need to filter your results then pass it through the aggregator. If you are aggregating before doing the filtering, it will give you one record for each group.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Munish
Participant
Posts: 89
Joined: Sun Nov 19, 2006 10:34 pm

Post by Munish »

Hi There,
It seems I am missing something here.
I have reached a place where I have recordset with Group and aggregated amount in descending order. It looks like

Ac_N,Serv_N,Period_K,Called_Num,(Amt,KeyChange_Key or ClusterKeyChange_Key)
4,1041,11,"870602299"," 001716.00","1"
4,1041,11,"0884001887"," 000480.00","0"
4,1041,11,"0883270001"," 000426.00","0"
4,1041,11,"253612299"," 000381.00","0"
4,1041,11,"WEATHER"," 000369.00"," "0"
4,1041,11,"870612299"," 000228.00","0"
4,1041,11,"128412299"," 000132.00",""0"
4,1041,11,"0418949922"," 000111.00","0"
4,1041,11,"0427123420"," 000108.00","0"
4,1041,11,"34"," 000102.00","0"
4,1041,11,"177"," 000087.00","0"
4,1041,11,"393212299"," 000084.00","0"

4,1091,11,"+61407867882MNE"," 000048.00","1"
4,1091,11,"0438524642MNET"," 000030.00","0"
4,1091,11,"0393004865"," 000027.00","0"
4,1091,11,"0393004865C"," 000027.00","0"
4,1091,11,"0393004885"," 000027.00","0"
4,1091,11,"0393004855"," 000027.00","0"
4,1091,11,"0428508301MNET"," 000027.00","0"
4,1091,11,"0393004865V"," 000027.00","0"
4,1091,11,"0393004875C"," 000027.00","0"
4,1091,11,"0393004875V"," 000024.00","0"
4,1091,11,"0393004875"," 000024.00","0"
4,1091,11,"0393004885C"," 000024.00","0"

What I want is that last column should work as group counter starting from 1 and then 2,3 and so on.
So that I could use filter stage to restrict to Top 5 or Top 10

I have tried using both
KeyChange and Cluster key Change, (In my case key is
Ac_N,Serv_N,Period_K: Composite Key)

Please advice what am I missing or should I be using some other stage to create this counter, currently I am using Sort stage for that.

Thanks and regards,
Munish
MK
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

What are the key fields?
It should be Ac_N,Serv_N,Period_K and not Called_Num included in it.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Munish
Participant
Posts: 89
Joined: Sun Nov 19, 2006 10:34 pm

Post by Munish »

Key fields are as mentioned
Ac_N,Serv_N,Period_K

But I want to see the record with Called_Num and amt.

Is there any thing wrong or should I be doing it in a different way.

Thanks,
Munish
MK
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Iam not referring to the key that been mentioned in the Aggregator stage before the Sort stage used for Key Change. Though I dont understane the use of first sort stage and Aggregator stage.
The Key group should be in the Sort stage. Perhaps you may need to re-partition.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Munish
Participant
Posts: 89
Joined: Sun Nov 19, 2006 10:34 pm

Post by Munish »

1. My sort stage where I am using Key_Change property has 3 keys only.
Ac_N,Serv_N,Period_K

2. Use of first sort stage is to make the aggregator work easier. (We are anticipating 280 Mills rows in one batch.)

3. KeyChange is doing right be putting 1 whenever key combination (Ac_N,Serv_N,Period_K ) changes but makes it 0 later on.
What I want is to work it like a counter so
for every individual group (Ac_N,Serv_N,Period_K )
it should start with 1 and move like a counter 2,3 and so on.

And when a new combination comes it should do the same,
1,2----and so on.

Any input will be highly appreciated.
MK
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Code: Select all

Extract ----->  Sort  ----->  Aggregator  ----->  Sort  ----->  Tail  ----->  Load
The first sort allows the Aggregator to use Sort mode; it sorts by the grouping keys. The second sort sorts by the grouping keys (sort mode = "don't sort (already sorted)" then by the count/sum in descending order. The Tail stage selects the top 100. It executes in sequential mode, so that data are collected into a single stream, using a Sort Merge collector.
Last edited by ray.wurlod on Tue Jan 23, 2007 2:33 am, edited 2 times in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Munish
Participant
Posts: 89
Joined: Sun Nov 19, 2006 10:34 pm

Post by Munish »

Thanks to all of you for giving helpful advices.
It is done.
A few facts to share:
1. Tall/Head stage: It gets the top or bottom values from each of the node. Since our dev env has 2 nodes. It was getting the records one from each node.

Got it rolling:
Once I did the aggregation
>
used sort stage with KeyChange column which gives value 1 for every key_Change and 0 for all subesequent values for same key
>
Used transformer stage to create artifical counter with in groups.
>
Used filter to filter the top 10 records

Thanks to all of you for giving helpful advices.
If anyone has better approach, please let me know.
Right now closing the thread.

Thanks and regards,
Munish
MK
Post Reply