Creating duplicate rows from input dataset

ajith · Post by **ajith** » Thu Aug 31, 2006 2:17 am

How to create duplicate rows from input dataset with respect to a condition

Hi All,

I have an order start date (ORDER_ST_DT DATE) and an order completion date (ORDER_END_DT DATE). If the difference between the start and completion dates is greater than 5, the record has to be duplicated that many times.

For example, if the difference is 10, the same row has to be duplicated 10 times! Is there any method in DataStage parallel jobs to implement this?

Thanks in Advance
Ajith

Kirtikumar · Post by **Kirtikumar** » Thu Aug 31, 2006 3:01 am

I am not sure if any stage can work like this.

But this can be done using a BuildOp.
Because the total number of rows to be created is dynamic, I think you might have to build a BuildOp in which you create a loop and output rows according to the difference between the two columns.

Create a column that stores the difference between the two dates and pass it to the BuildOp. In the BuildOp, add a loop that runs up to this difference column and outputs the row on each iteration.
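
To make the loop idea concrete, here is a minimal sketch of the logic in Python rather than BuildOp code. The row layout, the `expand` helper and the sample values are hypothetical. It assumes the difference is counted in days and that rows at or below the threshold pass through once, neither of which is stated in the question.

```python
from datetime import date
from typing import Iterator, Tuple

THRESHOLD = 5  # from the question: duplicate only when the gap exceeds 5

# Hypothetical layout: (order id, ORDER_ST_DT, ORDER_END_DT)
Row = Tuple[str, date, date]


def expand(row: Row, threshold: int = THRESHOLD) -> Iterator[Row]:
    """Yield the row `diff` times when diff > threshold, otherwise once."""
    _, start, end = row
    diff = (end - start).days  # assumption: difference measured in days
    copies = diff if diff > threshold else 1  # assumption: other rows pass once
    for _ in range(copies):
        yield row


rows = [
    ("A1", date(2006, 8, 1), date(2006, 8, 11)),  # diff = 10
    ("B2", date(2006, 8, 1), date(2006, 8, 4)),   # diff = 3
]
out = [copy for row in rows for copy in expand(row)]
print(len(out))  # 10 copies of A1 plus 1 copy of B2
```

The same shape carries over to any per-row loop: compute the difference once, then emit the unchanged row inside the loop.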

kumar_s · Post by **kumar_s** » Thu Aug 31, 2006 6:24 am

I am not working with DataStage at the moment, so please check whether the PadChar function can replicate a string instead of just a character. If so, you can replicate the field concatenated with a newline character.

Kirtikumar · Post by **Kirtikumar** » Fri Sep 01, 2006 12:03 am

Wow! That's new to me.

But can this happen on the link? In PX the data might still be treated as a single row (because of the virtual dataset), mightn't it?
Writing it to a sequential file might produce new lines and therefore multiple rows, but storing it in a dataset may not help.
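
A rough Python analogy of this concern (not DataStage code, and the sample row is made up): a value that embeds newlines is still one record while it is held in memory, and it only becomes several rows once it is written out as text and read back line by line.

```python
import io

row = "A1,2006-08-01,2006-08-11"  # hypothetical record
diff = 3

# One value containing the row repeated `diff` times, joined by newlines.
packed = "\n".join([row] * diff)

# In memory this is still a single record.
records_in_memory = [packed]
print(len(records_in_memory))

# Writing it out as text and reading it back line by line splits it.
buffer = io.StringIO()
for rec in records_in_memory:
    buffer.write(rec + "\n")
buffer.seek(0)
lines_read_back = buffer.read().splitlines()
print(len(lines_read_back))
```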

ray.wurlod · Post by **ray.wurlod** » Fri Sep 01, 2006 2:47 am

You might most easily be able to accomplish this in a shell script invoked from an External Filter stage. There does not have to be a one-to-one mapping between input and output row counts from this stage type.
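
As an illustration of such a filter, here is a sketch written in Python instead of shell. It assumes the stage hands rows to the command on standard input and reads the result from standard output, and that the data is comma-delimited with the start date in the second field and the end date in the third, both as YYYY-MM-DD. The `duplicate_rows` name, the column positions and the date format are all assumptions for the example.

```python
import csv
import io
from datetime import datetime


def duplicate_rows(infile, outfile, threshold=5, fmt="%Y-%m-%d"):
    """Copy rows from infile to outfile, repeating rows whose date gap exceeds threshold."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    for rec in reader:
        start = datetime.strptime(rec[1], fmt)  # assumed position of ORDER_ST_DT
        end = datetime.strptime(rec[2], fmt)    # assumed position of ORDER_END_DT
        diff = (end - start).days
        # Assumption: rows at or below the threshold are passed through once.
        copies = diff if diff > threshold else 1
        for _ in range(copies):
            writer.writerow(rec)


# Demonstration with in-memory streams; as a real filter the call would be
# duplicate_rows(sys.stdin, sys.stdout) after importing sys.
sample = io.StringIO("A1,2006-08-01,2006-08-08\nB2,2006-08-01,2006-08-03\n")
result = io.StringIO()
duplicate_rows(sample, result)
print(result.getvalue().count("\n"))  # 7 copies of A1 plus 1 copy of B2
```

Because the function writes as many output lines as it likes per input line, it relies on exactly the property mentioned above: input and output row counts do not have to match.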