I have an input record with the format:
(a, b, c, d, tax1, value1, anothervalue1, tax2, value2, anothervalue2)
tax1 and tax2 have datatype tax, value1 and value2 have datatype value, and anothervalue1 and anothervalue2 have datatype anothervalue.
My output record looks like:
(tax, value, anothervalue), so the output record is a subset of the input record.
How can I use the Transformer to create two output records:
tax1, value1, anothervalue1
tax2, value2, anothervalue2
The problem is that I cannot map both (tax1, value1, anothervalue1) and (tax2, value2, anothervalue2) from the input file to the single set of output columns (tax, value, anothervalue).
So basically, from a one-record input file I need two records in the output file, and all the fields in both output records come from that single input record.
Any suggestions would be helpful.
Regards.
Splitting one input record into multiple output records
Moderators: chulett, rschirm, roy
Strange as it may seem, a Sequential File stage can do this for you.
stuff -----> Transformer -----> SeqFile -----> stuff
On the output link from the Transformer stage, define one column containing:
tax1 : "," : value1 : "," : anothervalue1 : LF : tax2 : "," : value2 : "," : anothervalue2
LF is a stage variable containing the appropriate linefeed character, for example Char(10) on UNIX. [Edited after Roy's post.]
The format for the input link to the Sequential File stage (the link coming from the Transformer) specifies no column delimiter character (000), so the whole string is written as a single column.
On the output link from the Sequential File stage, specify comma-delimited format with a regular (UNIX-style) end of line. You will find that the one row you wrote to the sequential file has "magically" become two lines because of the intermediate LF character.
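The same idea can be sketched outside DataStage. The Python below is only an illustration of the trick, not DataStage code: the column names come from the question, while the sample values and the helper names `build_single_column` and `read_back` are made up for the example.

```python
import csv
import io

LF = chr(10)  # same character as Char(10) in the Transformer derivation


def build_single_column(rec):
    """Mimic the derivation: two comma-separated groups joined by an LF."""
    first = ",".join([rec["tax1"], rec["value1"], rec["anothervalue1"]])
    second = ",".join([rec["tax2"], rec["value2"], rec["anothervalue2"]])
    return first + LF + second


def read_back(text):
    """Mimic the output link: comma-delimited rows with LF line endings."""
    return list(csv.reader(io.StringIO(text)))


# Hypothetical sample record; a, b, c, d exist on input but are not written out.
record = {
    "a": "A", "b": "B", "c": "C", "d": "D",
    "tax1": "10", "value1": "100", "anothervalue1": "x",
    "tax2": "20", "value2": "200", "anothervalue2": "y",
}

# One "row" is written, followed by the normal row terminator.
written = build_single_column(record) + LF
rows = read_back(written)
print(rows)  # [['10', '100', 'x'], ['20', '200', 'y']]
```

One row goes in and two rows come out, because the reader cannot tell the embedded LF apart from a real end of line.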
Last edited by ray.wurlod on Wed Aug 13, 2003 2:45 am, edited 1 time in total.
Hi,
Ray is right, but a small correction:
On UNIX systems the line terminator is LF (Line Feed), which is Char(10).
On Windows the line terminator is CR (Carriage Return) + LF, which is Char(13) : Char(10).
Also, as Ray mentioned, you can use UNIX-style LF termination even on Windows files.
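As a small illustration of the difference between the two terminators (plain Python, not DataStage; the sample values are invented):

```python
LF = chr(10)              # Char(10): UNIX line terminator
CRLF = chr(13) + chr(10)  # Char(13) : Char(10): Windows line terminator

unix_text = "10,100,x" + LF + "20,200,y" + LF
windows_text = "10,100,x" + CRLF + "20,200,y" + CRLF

# str.splitlines() recognises both terminators, so the rows come out the same.
unix_rows = unix_text.splitlines()
windows_rows = windows_text.splitlines()

# Splitting Windows-style text on LF alone leaves a stray CR on each row,
# which is what a reader expecting UNIX line endings would see.
naive_rows = windows_text.split(LF)
```

So whichever terminator is embedded in the Transformer derivation, the end-of-line setting on the reading side needs to match it.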
If I'm not mistaken, there is also the Pivot stage option.
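Conceptually, a pivot of this kind turns each numbered group of columns into its own row. The sketch below only illustrates that result in Python; it says nothing about how the Pivot stage itself is configured, and `pivot_record` and the sample values are hypothetical.

```python
def pivot_record(rec, groups=2):
    """Return one (tax, value, anothervalue) row per numbered column group."""
    rows = []
    for i in range(1, groups + 1):
        rows.append({
            "tax": rec[f"tax{i}"],
            "value": rec[f"value{i}"],
            "anothervalue": rec[f"anothervalue{i}"],
        })
    return rows


# Hypothetical sample record using the column names from the question.
record = {
    "a": "A", "b": "B", "c": "C", "d": "D",
    "tax1": "10", "value1": "100", "anothervalue1": "x",
    "tax2": "20", "value2": "200", "anothervalue2": "y",
}

pivoted = pivot_record(record)
for row in pivoted:
    print(row["tax"], row["value"], row["anothervalue"])
```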
P.S.
Ray, the third time I'll ask for an ice cream (J/K)
Good luck,
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org