Page 1 of 1
Generating records based on column value
Posted: Tue Oct 30, 2012 7:17 am
by bgs_vb
Hi ,
I have a equirement to generate number of duplicate records based on a column value .Example is shown below as :
Input :
id,name,address,col1
1,abc,xyz,5
2,cde,yz,2
Output:
As col1 is having value as 5 ,so in output we require 5 nstances of the same record .Similarly for second record ,we need to generate 2 records
id,name,address,col1
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
2,cde,yz,2
2,cde,yz,2
Please advise how can i achieve this result .
Posted: Tue Oct 30, 2012 7:19 am
by ArndW
This looks like the perfect application of transform stage loops available in Version 8.7 and above of DataStage. Do you have this version?
Posted: Tue Oct 30, 2012 7:33 am
by BI-RMA
Minor correction: Transformer-Loop is available since version 8.5.
Posted: Tue Oct 30, 2012 7:34 am
by bgs_vb
Hi
I am working on Datastage 8.1 .Also ,is there any other way of doing it rather than using transform lopp variable ? As we are expecting huge data from source and using transform loop variables will slow down the process .
Re: Generating records based on column value
Posted: Tue Oct 30, 2012 7:50 am
by chulett
bgs_vb wrote:id,name,address,col1
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
1,abc,xyz,10
2,cde,yz,2
2,cde,yz,2
So... you need to turn the '5' in the original record to a '10' for that first id?
![Confused :?](./images/smilies/icon_confused.gif)
Posted: Tue Oct 30, 2012 8:15 am
by BI-RMA
bgs_vb wrote: As we are expecting huge data from source and using transform loop variables will slow down the process .
Who told You that using loop variables was relevant in terms of performance? It is not. By the way, I thought using transform loop variables is not an option because You are on version 8.1?
Posted: Tue Oct 30, 2012 11:44 am
by ArndW
sorry, as corrected above, the loops are available as of 8.5.
Since you have 8.1 and loops within a transform stage are not an option you have a simple, but limited option if the maximum number of iterations is known; i.e. if you know you won't have more than 25 in column1 then you could make a transform stage with 25 outputs that have conditions along the lines of "In.Column1 < 2", "In.Column3 < 1", etc. This is a simple and bulletproof approach but won't work if can have values of 100 or higher.
Posted: Tue Oct 30, 2012 7:02 pm
by ray.wurlod
There's a new addition to the Sequential File stage (and to the Big Data stage) in version 9.1 that does exactly this task.