Replicating Input Records based on a column value

in_finity307
Participant
Posts: 20
Joined: Sat Aug 09, 2008 1:53 pm

Post by in_finity307 »

Hi,

I have an input source that has a field A. My requirement is that if field A is longer than 132 characters, the record must be split into as many records as needed so that field A is never longer than 132 characters in any of them.

Can you tell me how this can be implemented in DataStage? I am not using DataStage in a Unix environment, so a Unix shell script cannot be used.

Another way, I guess, would be to use a BuildOp stage, but I am not sure how that stage is used.

I am working on DataStage 8.0.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Just an aside - the EE installs the MKS Toolkit on your Windows server so you do in fact have the ability to run a 'UNIX' script. FYI.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

What is the size of the input column? Or is it unbounded?

Is it from a sequential file or some other source?

Who provides it to you?

If you know the max length, you can split the rows in a Transformer and gather them with a Funnel.

A sequential file can be handled with an awk script at the source.

Other sources will need a BuildOp.
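
To illustrate the Transformer-plus-Funnel idea, here is a minimal Python sketch of the logic only; it is not DataStage code. It assumes a known maximum length and models each Transformer output link as one fixed slice of the field, with a constraint that drops empty slices. The function name `split_fixed_links` and the example maximum of 400 are hypothetical.

```python
import math


def split_fixed_links(value, max_len, chunk=132):
    """Model one Transformer with a fixed number of output links.

    Each link takes one fixed slice of the field; a link constraint
    drops the row when its slice is empty. The returned list stands in
    for the Funnel that gathers all links back into one stream.
    """
    # Number of links needed to cover the longest possible value.
    n_links = math.ceil(max_len / chunk)
    funnel = []
    for link in range(n_links):
        piece = value[link * chunk:(link + 1) * chunk]
        if piece:  # link constraint: only pass non-empty slices
            funnel.append(piece)
    return funnel


# Example with an assumed maximum length of 400 (4 links).
pieces = split_fixed_links("a" * 300, max_len=400)
print([len(p) for p in pieces])
```

The number of links has to be fixed at design time, which is why the maximum length must be known. Note that with this constraint an empty field produces no output row at all.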
in_finity307
Participant
Posts: 20
Joined: Sat Aug 09, 2008 1:53 pm

Post by in_finity307 »

Can someone give the BuildOp code for this logic?

The length of the input column is not known. It's a free-text column.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Have you tried an awk script?
in_finity307
Participant
Posts: 20
Joined: Sat Aug 09, 2008 1:53 pm

Post by in_finity307 »

No, the data is coming from a database.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Then surely it must have a max length.
in_finity307
Participant
Posts: 20
Joined: Sat Aug 09, 2008 1:53 pm

Post by in_finity307 »

The maximum length in the database is defined as 2000.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Then join with a Row Generator to get a Cartesian product.

The generated row number can then be used in a substring later to pick out each piece.
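
As a rough Python sketch of that suggestion (the logic only, not DataStage code): with a maximum length of 2000 and pieces of 132, the generator needs ceil(2000 / 132) = 16 rows. Every input row is paired with every generated number, the number is turned into a substring offset, and pairs that start past the end of the text are dropped. The `key` column and the function name `split_by_cross_join` are assumptions for illustration.

```python
import math
from itertools import product

CHUNK = 132     # maximum piece length from the requirement
MAX_LEN = 2000  # column length defined in the database


def split_by_cross_join(rows, chunk=CHUNK, max_len=MAX_LEN):
    """rows: iterable of (key, text) pairs; 'key' is a hypothetical id column."""
    # Stand-in for the Row Generator: sequence numbers 1..16.
    n_rows = math.ceil(max_len / chunk)
    generator = range(1, n_rows + 1)

    out = []
    # Cartesian product: each input row meets each sequence number.
    for (key, text), seq in product(rows, generator):
        start = (seq - 1) * chunk
        # Keep only pairs whose substring actually contains data.
        if start < len(text):
            out.append((key, seq, text[start:start + chunk]))
    return out


result = split_by_cross_join([(1, "x" * 300), (2, "y" * 10)])
for key, seq, piece in result:
    print(key, seq, len(piece))
```

The filter step matters: without it, every input row would produce 16 output rows, most of them empty.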