One input record to multiple output records

jlockwo · Post by **jlockwo** » Wed Sep 15, 2004 11:56 am

Hi, I'm hoping someone can help me. We have a server job that has a performance problem and I am hoping to correct it.

A transformer reads in a file and for each record in that file we write out to 10 different files, then we merge them back into a single file using the link collector. We don't really want to save the 10 different files, but the link collector forces us to, since it doesn't accept active-to-active inputs.

So, for example, say our file contains the alphabet, one letter per row:

Letter
a
b
c
d
e
:
:

We want to end up with a file that reads:
Letter &nbsp Number
a           1
a           2
a           3
a           4
a           5
a           6
a           7
a           8
a           9
a           10
b           1
b           2
b           3
b           4
b           5
:
:

Does anyone have a suggestion on how to do this more effciently? Thank you!

ogmios · Post by **ogmios** » Wed Sep 15, 2004 12:13 pm

If it's just the each line of a file with a number appended to it: generate 10 lines in the transformer (including the newlines) per inputline and output that.

Ogmios

jlockwo · Post by **jlockwo** » Wed Sep 15, 2004 12:19 pm

To generate the 10 lines though, don't you need to have 10 output files?
One with:
a 1
b 1
c 1
d 1
:
:

and another with:
a 2
b 2
c 2
d 2
:
:

Up to 10. That's what we have, then we merge the 10 files back together to 1 file using the link collecter.

The problem is we end up creating and saving 10 files that we don't really want and we end up with a performance problem.

So, is there a different way to generate 10 rows using the transformer that will all be within the same ouput file?

jlockwo · Post by **jlockwo** » Wed Sep 15, 2004 12:23 pm

Oops, I just understood what you meant when you mentioned including the new line.

The input file actually has 276 columns and we want to write out 277 (our number column), so it would be rather difficult to do in the method you describe.

ogmios · Post by **ogmios** » Wed Sep 15, 2004 12:33 pm

Besides using a full BASIC job I don't have an easy faster solution... if you were on UNIX and the performance was really bad I would execute a script using awk or perl to do the actions.

Ogmios

KeithM · Post by **KeithM** » Wed Sep 15, 2004 1:13 pm

Could you just bring the same file in ten different times and send those to the link collector? I'm not sure if that would improve performance but it would remove the process of creating ten different temporary files.

shawn_ramsey · Post by **shawn_ramsey** » Wed Sep 15, 2004 1:59 pm

I am assuming that you really want to represent a single line 10 times and that the link collector is the only way that you figured out how to do it. What I would do is use a transformer to take the and replicate the source column into 10 columns. I would then feed that transformer into a pivot stage to take the columns and make them rows.

So using your example.

The feed would look like this into the transformer:

Code: Select all

Letter
a
b

Out of the Transformer:
Letter1 Number1 Letter2 Number2 ... Letter10 Number10
a       1       a       2           a        10
b       1       b       2           b        10

Out of the Pivot:
Letter   Number
a           1
a           2
a           3
a           4
a           5
a           6
a           7
a           8
a           9
a           10
b           1
b           2
b           3
b           4
b           5
:
:

jlockwo · Post by **jlockwo** » Wed Sep 15, 2004 2:58 pm

A Pivot Stage? That sounds like it would work, but I've never heard of such a thing. Could you explain where you find a pivot stage in datastage? I mean under what grouping on the palette?

crouse · Post by **crouse** » Wed Sep 15, 2004 2:59 pm

Read the file with a table def of just one column (the whole row sits in the column) like varchar 4000, into column name "A".

In the transformer, concatenate the row 10 times with the incrementing line number and the row terminator for the target OS (looks like you're on NT).

Example for the derivation on the output in the transformer, assuming your resultant file is comma seperated:
A : ",1" : char(013) : char(010) : A : ",2" : char(013) : char(010) :A : ",3" : char(013) : char(010) : ... : A : ",10" : char(013) : char(010)

Output to a seq file with no line terminator (000), and no column delimeters (000).

View the file in notepad or whatever and you'll have a file that has 10 rows for every one row written from the transformer.

Send me a private email if you don't understand this ( I had to write it in a hurry).

shawn_ramsey · Post by **shawn_ramsey** » Wed Sep 15, 2004 3:46 pm

jlockwo wrote:A Pivot Stage? That sounds like it would work, but I've never heard of such a thing. Could you explain where you find a pivot stage in datastage? I mean under what grouping on the palette?

It is under the Processing Palette and looks like a box with an arrow pointing up and another pointing left. The derivation column is a little weird and I had to go to the .pdf several times to figure it out. It basically takes a comma delimited list of source columns.

jlockwo · Post by **jlockwo** » Wed Sep 22, 2004 9:29 am

I just wanted to say thanks! The Pivot stage worked great. We didn't have that plug-in installed, so I had to have it done, but it was worth it. Thanks!

DSXchange

One input record to multiple output records

One input record to multiple output records

Re: One input record to multiple output records

Re: One input record to multiple output records