
Replicating Input Records based on a column value

Posted: Thu May 21, 2009 6:54 am
by in_finity307
Hi,

I have an input source that has a field A. My requirement is that if field A is longer than 132 characters, the line should be split into as many records as needed so that no record's length exceeds 132.

Can you tell me how this can be implemented in DataStage? I am not using DataStage in a Unix environment, so a Unix shell script cannot be used.

One other way, I guess, would be to use a BuildOp stage, but I am not sure how that stage is used.

I am working on DataStage 8.0.

Posted: Thu May 21, 2009 7:46 am
by chulett
Just an aside - the EE installs the MKS Toolkit on your Windows server so you do in fact have the ability to run a 'UNIX' script. FYI.

Posted: Thu May 21, 2009 8:29 am
by Sainath.Srinivasan
What is the size of the input column? Or is it unbounded?

Is it from a seq file or some other source?

Who provides it to you?

If you know the max length, you can split the rows in a transformer and gather with a funnel.

A seq file can be handled with an awk script at the source.

Other sources will need a BuildOp.
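
For illustration only, here is the fixed-width splitting that such a script would do, sketched in Python rather than awk or BuildOp code. The function names `split_fixed_width` and `split_records` are made up for this example, and 132 is the limit from the original question. This is not DataStage code, just the logic.

```python
def split_fixed_width(text, width=132):
    """Return consecutive pieces of text, each at most `width` characters."""
    if not text:
        return [text]  # keep an empty value as a single record
    return [text[i:i + width] for i in range(0, len(text), width)]


def split_records(lines, width=132):
    """Yield one output record per piece, for every input line."""
    for line in lines:
        # Drop the trailing newline so it does not count toward the width.
        for piece in split_fixed_width(line.rstrip("\n"), width):
            yield piece
```

A 300-character value would come out as three records of 132, 132 and 36 characters.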

Posted: Fri May 22, 2009 1:56 am
by in_finity307
Can someone give the BuildOp code for this logic?

The length of the input column is not known. It's a free text column.

Posted: Fri May 22, 2009 2:37 am
by Sainath.Srinivasan
Have you tried an awk script?

Posted: Fri May 22, 2009 3:09 am
by in_finity307
No, the data is coming from a database.

Posted: Fri May 22, 2009 3:15 am
by Sainath.Srinivasan
Then surely it must have a max length.

Posted: Fri May 22, 2009 3:27 am
by in_finity307
The maximum length in the database is defined as 2000.

Posted: Fri May 22, 2009 4:35 am
by Sainath.Srinivasan
Then join with a row generator to get a Cartesian product.

The generated row number can then be used in a substring later.
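
A rough sketch of that idea in Python, purely to show the logic and not as DataStage job code. It assumes the row generator produces the numbers 1 to 16 (2000 / 132 rounded up), and that empty pieces past the end of the value are filtered out afterwards. The name `replicate_and_substring` is hypothetical.

```python
import math

MAX_LEN = 2000  # maximum column length stated in the thread
CHUNK = 132     # maximum length allowed per output record


def replicate_and_substring(rows, max_len=MAX_LEN, chunk=CHUNK):
    """Cross join each value with generated numbers, then take substrings."""
    n_pieces = math.ceil(max_len / chunk)  # 16 for 2000 / 132
    generated = range(1, n_pieces + 1)     # stands in for the row generator
    out = []
    for value in rows:
        for n in generated:                # the Cartesian product
            start = (n - 1) * chunk
            piece = value[start:start + chunk]
            # Keep non-empty pieces; keep piece 1 so empty values survive.
            if piece or n == 1:
                out.append((n, piece))
    return out
```

Each input row is replicated 16 times by the join, and the filter on empty substrings brings the output back down to only the records that actually carry data.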