Sequential File read problem

mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Hi,

While reading a .CSV file with a Sequential File stage, with delimiter set to 'comma' and quote set to 'none', I am facing the following problem:

For records like :

XYZ,GHT,UJY - the stage reads the input data properly ( three fields with values XYZ,GHT,UJY respectively)

However, some of the records appear in the file like this:

"""XYZ,GHT,UJY""","""IJY""" - in this case the data should be read as two fields, 'XYZ,GHT,UJY' and 'IJY'.

But the stage reads it as four fields (XYZ, GHT, UJY and IJY), splitting at every comma.

Please advise.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The data are being parsed correctly according to your metadata. You specify quote=none, so the quote characters are treated as ordinary characters. You now have four comma-delimited fields. That's what you've described to DataStage.
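
To illustrate the point outside DataStage, here is a minimal Python sketch of what quote=none amounts to: every comma is a field boundary and the quote characters stay in the data. The function name is hypothetical and this only models the behaviour described above; it is not DataStage code.

```python
def split_without_quotes(line):
    """Model of quote=none: split on every comma, quotes are ordinary data."""
    return line.split(",")

plain = 'XYZ,GHT,UJY'
quoted = '"""XYZ,GHT,UJY""","""IJY"""'

print(len(split_without_quotes(plain)))   # 3
print(len(split_without_quotes(quoted)))  # 4
print(split_without_quotes(quoted))
```

The second record comes out as four pieces, with the quote characters still attached to the first, third and fourth.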

If your data are arriving in an inconsistent format, harass your data provider. Or pre-process the data so that there are no delimiter characters in the data. Otherwise read the entire line as a single string and parse it yourself. You will probably need a routine (or maybe a Column Import stage) at some point.
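
One possible shape for the pre-processing step, sketched in Python and run before the job reads the file. The function name and the choice of '|' as the replacement delimiter are assumptions for illustration; pick any character that cannot occur in your data. It relies on Python's csv module to honour the quotes, then strips the leftover literal quotes and re-joins the fields.

```python
import csv

def to_pipe_delimited(line, new_delim="|"):
    """Rewrite one comma-delimited line so fields are separated by new_delim.

    Quoted fields may contain commas. Leftover literal double quotes at the
    edges of each field are stripped.
    """
    fields = next(csv.reader([line], delimiter=",", quotechar='"'))
    cleaned = [field.strip('"') for field in fields]
    for field in cleaned:
        if new_delim in field:
            # The replacement delimiter must not appear in the data itself.
            raise ValueError(f"field contains {new_delim!r}: {field!r}")
    return new_delim.join(cleaned)

print(to_pipe_delimited('XYZ,GHT,UJY'))
print(to_pipe_delimited('"""XYZ,GHT,UJY""","""IJY"""'))
```

The rewritten file could then be read with delimiter '|' and quote 'none', since the embedded commas are no longer delimiters.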
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

But what should the delimiter and quote values be in order to read two fields, 'XYZ,GHT,UJY' and 'IJY', from the input """XYZ,GHT,UJY""","""IJY""" when the same file also contains records like XYZ,GHT,UJY?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

If you import the data rows as single strings (VarChar) the delimiter and quote characters are irrelevant. Each should be set to none.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

If your data is quoted, then provide the quote character. Since it is a CSV file, this will pick up the data as you want. With the quote character set to none, each comma will be treated as a delimiter.
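
As a rough check of this idea outside DataStage, Python's csv module can read both record shapes from the same file once the quote character is set to a double quote. This is only a sketch: the sample data is copied from the question, and whether the Sequential File stage treats the tripled quotes the same way is an assumption that would need testing in the job itself.

```python
import csv
import io

# Both record shapes from the question, in one file-like object.
sample = 'XYZ,GHT,UJY\n"""XYZ,GHT,UJY""","""IJY"""\n'

rows = list(csv.reader(io.StringIO(sample), delimiter=",", quotechar='"'))

for row in rows:
    print(len(row), row)
```

Here the unquoted record yields three fields and the quoted one yields two. Because the quotes are tripled, Python keeps one literal double quote at each end of the quoted fields, so a trim would still be needed afterwards.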
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.