Reverse the aggregation Process

Shamanth_vk · Post by **Shamanth_vk** » Wed Dec 23, 2009 12:17 am

We have requirement like below, the records are in aggregated format and we need to reverse the aggregation process.

For Ex. Input
a,2000,2
b,1500,3

output
-- a,1000
-- a,1000
-- b,500
-- b,500
-- b,500

Please let me know how to acheive this in datastage. Thanks a lot in advance.

ray.wurlod · Post by **ray.wurlod** » Wed Dec 23, 2009 1:10 am

What is the largest number that will appear in column #3?
What should happen if column #2 is not precisely divisible by column #3?

Shamanth_vk · Post by **Shamanth_vk** » Wed Dec 23, 2009 3:42 am

ray.wurlod wrote:What is the largest number that will appear in column #3?
What should happen if column #2 is not precisely divisible by column #3? ...

- Largest number that will appear would be 4.
- If column #2 is not precisely divisible by column #3. This scenario will not be occurring in our case.

Please let me know if more information is required.

Thanks & Regards,
Shamanth VK

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Wed Dec 23, 2009 4:30 am

With just max 4 splits, you will be able to code using 4 links from a transformer and funnel them together.

ray.wurlod · Post by **ray.wurlod** » Wed Dec 23, 2009 4:10 pm

That's what I would suggest too. Constrain the outputs based on the value in column #3.

vmcburney · Post by **vmcburney** » Wed Dec 23, 2009 9:50 pm

Create a text file with these rows in it:
1,1
2,2
2,2
3,3
3,3
3,3
4,4
4,4
4,4
4,4

In a lookup stage join to this file on the aggregated count column and switch on "Multiple Rows Returned from this Link" so the lookup will then create 1 to 4 output rows depending on the number used. A lookup stage can duplicate input rows if multiple lookup rows are found. Send the output to a Transformer where you define the final columns and divide the aggregated number by the aggregated count to come up with the new output amount.

There are two advantages - you only have to define your mappings and rules once and not four times in the transformer and you will find this Transformer easier to code and maintain. It's extendable without changing or recompiling the job - put 5, 6, 7s into that lookup file and it will automatically handle them if they ever come through.