Generating records based on column value

bgs_vb · Post by **bgs_vb** » Tue Oct 30, 2012 7:17 am

Hi ,

I have a equirement to generate number of duplicate records based on a column value .Example is shown below as :

Input :
id,name,address,col1
1,abc,xyz,5
2,cde,yz,2

Output:
As col1 is having value as 5 ,so in output we require 5 nstances of the same record .Similarly for second record ,we need to generate 2 records

id,name,address,col1
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
2,cde,yz,2
2,cde,yz,2

Please advise how can i achieve this result .

ArndW · Post by **ArndW** » Tue Oct 30, 2012 7:19 am

This looks like the perfect application of transform stage loops available in Version 8.7 and above of DataStage. Do you have this version?

BI-RMA · Post by **BI-RMA** » Tue Oct 30, 2012 7:33 am

Minor correction: Transformer-Loop is available since version 8.5.

bgs_vb · Post by **bgs_vb** » Tue Oct 30, 2012 7:34 am

Hi

I am working on Datastage 8.1 .Also ,is there any other way of doing it rather than using transform lopp variable ? As we are expecting huge data from source and using transform loop variables will slow down the process .

chulett · Post by **chulett** » Tue Oct 30, 2012 7:50 am

bgs_vb wrote:id,name,address,col1
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
2,cde,yz,2
2,cde,yz,2

So... you need to turn the '5' in the original record to a '10' for that first id?

BI-RMA · Post by **BI-RMA** » Tue Oct 30, 2012 8:15 am

bgs_vb wrote: As we are expecting huge data from source and using transform loop variables will slow down the process .

Who told You that using loop variables was relevant in terms of performance? It is not. By the way, I thought using transform loop variables is not an option because You are on version 8.1?

ArndW · Post by **ArndW** » Tue Oct 30, 2012 11:44 am

sorry, as corrected above, the loops are available as of 8.5.

Since you have 8.1 and loops within a transform stage are not an option you have a simple, but limited option if the maximum number of iterations is known; i.e. if you know you won't have more than 25 in column1 then you could make a transform stage with 25 outputs that have conditions along the lines of "In.Column1 < 2", "In.Column3 < 1", etc. This is a simple and bulletproof approach but won't work if can have values of 100 or higher.

ray.wurlod · Post by **ray.wurlod** » Tue Oct 30, 2012 7:02 pm

There's a new addition to the Sequential File stage (and to the Big Data stage) in version 9.1 that does exactly this task.

DSXchange

Generating records based on column value

Generating records based on column value

Re: Generating records based on column value