Reverse the aggregation Process

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Shamanth_vk
Premium Member
Premium Member
Posts: 9
Joined: Sat Jul 14, 2007 10:47 am

Reverse the aggregation Process

Post by Shamanth_vk »

We have requirement like below, the records are in aggregated format and we need to reverse the aggregation process.

For Ex. Input
a,2000,2
b,1500,3

output
-- a,1000
-- a,1000
-- b,500
-- b,500
-- b,500

Please let me know how to acheive this in datastage. Thanks a lot in advance.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What is the largest number that will appear in column #3?
What should happen if column #2 is not precisely divisible by column #3?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Shamanth_vk
Premium Member
Premium Member
Posts: 9
Joined: Sat Jul 14, 2007 10:47 am

Post by Shamanth_vk »

ray.wurlod wrote:What is the largest number that will appear in column #3?
What should happen if column #2 is not precisely divisible by column #3? ...

- Largest number that will appear would be 4.
- If column #2 is not precisely divisible by column #3. This scenario will not be occurring in our case.

Please let me know if more information is required.

Thanks & Regards,
Shamanth VK
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

With just max 4 splits, you will be able to code using 4 links from a transformer and funnel them together.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's what I would suggest too. Constrain the outputs based on the value in column #3.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Create a text file with these rows in it:
1,1
2,2
2,2
3,3
3,3
3,3
4,4
4,4
4,4
4,4

In a lookup stage join to this file on the aggregated count column and switch on "Multiple Rows Returned from this Link" so the lookup will then create 1 to 4 output rows depending on the number used. A lookup stage can duplicate input rows if multiple lookup rows are found. Send the output to a Transformer where you define the final columns and divide the aggregated number by the aggregated count to come up with the new output amount.

There are two advantages - you only have to define your mappings and rules once and not four times in the transformer and you will find this Transformer easier to code and maintain. It's extendable without changing or recompiling the job - put 5, 6, 7s into that lookup file and it will automatically handle them if they ever come through.
Post Reply