Reading tab delimited file, encountering problem

smohd1338
Premium Member
Posts: 28
Joined: Fri Aug 03, 2012 1:09 pm

Post by smohd1338 »

Hi,

I am reading a file with multiple records. Each column within a record is TAB delimited.

The input file looks like this (the gaps between columns are TABs):
FH column1 column2 column3
BH Column1 Column2 Column3
DS Column 1 Column2 Column3
DL...etc.

To read each column I used the following derivations in a Transformer, with the respective constraints on recordType.

Field(Extract_lnk.LogRecord,Char(9),1)
and likewise Field(Extract_lnk.LogRecord,Char(9),2), and so on.

The logic works fine as long as each column is a single unbroken string. But if a column contains a space, DataStage seems to treat it as two columns. For example, in the DS record above, column 1 is "Column 1" with a space, and it is read as two separate columns, "Column" and "1", even though the separator is a space and not a TAB. How can this be resolved? Any ideas?
sameer
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

How did you define your sequential file - as one column or as multiple columns? The definition is where your problem lies.
smohd1338
Premium Member
Posts: 28
Joined: Fri Aug 03, 2012 1:09 pm

Post by smohd1338 »

I am reading three such files, so I read them with a file pattern, and each record arrives as one long log record. So yes, as one column.
sameer
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

In that case, "Field(In.BIGColumn,Char(9),1)" parses the string up to the first TAB character, ignoring any spaces along the way, and returns everything to the left of that TAB as the function's return value.

You can confirm this by writing a test job that writes to a Peek stage: output your source "In.BIGColumn" record plus another column with the derivation "INDEX(In.BIGColumn,Char(9),1)" to show the position of the first TAB. Perhaps what you assumed to be a space was actually a TAB character.
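
For anyone who wants to try the same check outside DataStage, here is a minimal Python sketch. The helpers `field` and `index` are hypothetical stand-ins that only approximate the Transformer's Field() and Index() functions (1-based, single-character delimiter assumed), and both sample records are made up, with the TABs written explicitly.

```python
TAB = "\t"  # same character as Char(9) in the Transformer


def field(text, delim, occurrence):
    """Approximate Field(): return the Nth delimited substring (1-based)."""
    parts = text.split(delim)
    if 1 <= occurrence <= len(parts):
        return parts[occurrence - 1]
    return ""  # assumption: an out-of-range occurrence yields an empty string


def index(text, substring, occurrence):
    """Approximate Index(): 1-based position of the Nth occurrence, 0 if absent."""
    pos = -1
    for _ in range(occurrence):
        pos = text.find(substring, pos + 1)
        if pos == -1:
            return 0
    return pos + 1


# Made-up records: the first has a real space inside column 1,
# the second has a TAB where a space was expected.
with_space = "DS" + TAB + "Column 1" + TAB + "Column2" + TAB + "Column3"
with_tab = "DS" + TAB + "Column" + TAB + "1" + TAB + "Column2" + TAB + "Column3"

for record in (with_space, with_tab):
    print(repr(record))  # repr() shows every TAB as \t
    print("  field 2:", repr(field(record, TAB, 2)))
    print("  first tab at:", index(record, TAB, 1))
    print("  second tab at:", index(record, TAB, 2))
```

With a real space, field 2 comes back as "Column 1" in one piece. It only splits into "Column" and "1" when the character between them is actually a TAB, and the position of the second TAB gives that away.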
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is there any reason you're not using TAB as the field delimiter character in the Sequential File stage?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
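
As a rough illustration of the idea behind that question, here is a Python sketch (not DataStage) in which the reader itself splits each line on TAB, so no per-column Field() calls are needed afterwards. The sample data and the grouping by record type are assumptions made for illustration only.

```python
import csv
import io

# Made-up sample; \t marks the TAB delimiters, and "Column 1" contains a real space.
sample = (
    "FH\tcolumn1\tcolumn2\tcolumn3\n"
    "BH\tColumn1\tColumn2\tColumn3\n"
    "DS\tColumn 1\tColumn2\tColumn3\n"
)


def read_tab_delimited(stream):
    """Split each line on TAB only; spaces stay inside their column."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]  # skip blank lines


rows = read_tab_delimited(io.StringIO(sample))

# Group the remaining columns by record type (the first field of each row).
by_type = {}
for record_type, *columns in rows:
    by_type.setdefault(record_type, []).append(columns)

print(by_type["DS"])
```

Because the delimiter is declared once at read time, a space inside a value never acts as a column boundary.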