data issues

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

data issues

Post by samsuf2002 »

Hi all, I am running a parallel job whose source is a sequential file with pipe (|) as the delimiter, but the incoming data is bad: some columns contain carriage returns, and some columns contain data that itself includes a pipe, so DS treats it as the start of a new column and the records are rejected. Can we get rid of these issues using DS jobs, and if so, how?

Any help will be appreciated.
Thanks.
hi sam here
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are the text strings that contain delimiter and terminator characters quoted? If so, DataStage can manage them. If not, it cannot, so you will need "them" to create a file in a legal format, or pre-process it yourself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

Post by samsuf2002 »

Thanks Ray. OK, I can create that, but how will DataStage manage it? Can you please explain?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

If you specify the quote character as a field level property, it will find the opening quote and scan characters into the field until the closing quote is found. Then it will resume its search for delimiter characters.
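The quote-scanning behaviour described above can be illustrated outside DataStage. The following is only a sketch using Python's standard csv module, not DataStage configuration; the sample data and the function name parse_pipe_delimited are made up for illustration. It shows a quoted field that keeps an embedded pipe and an embedded line break without starting a new column or a new record.

```python
import csv
import io

# Hypothetical sample: the second field contains a pipe, the third a line break.
sample = 'id|name|comment\n1|"Smith|Jones"|"line one\nline two"\n2|plain|ok\n'

def parse_pipe_delimited(text):
    """Parse pipe-delimited text, treating double quotes as the quote character."""
    # newline='' leaves line endings untouched so the csv reader can
    # handle line breaks that occur inside quoted fields.
    reader = csv.reader(io.StringIO(text, newline=''), delimiter='|', quotechar='"')
    return [row for row in reader]

rows = parse_pipe_delimited(sample)
for row in rows:
    print(row)
```

Without the quotes around those two fields, the same parser would split "Smith|Jones" into two columns and end the record at the line break, which is the rejection scenario described in the first post.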
rparimi
Participant
Posts: 20
Joined: Tue Oct 12, 2004 2:01 pm

Could be a low value (non-printable characters like Enter)

Post by rparimi »

I had some issues similar to this. This could be a LOW VALUE, that is, a character with an ASCII value less than 32. You can write a condition to check whether the ASCII value of an input character is below 32, but either way that is bad data. By the way, are you getting this from the source table?
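As a rough illustration of that check, here is a small Python sketch that could be run against the file before it reaches the job. It is an assumption about how one might pre-screen the data, not a DataStage function; the name find_control_chars, the allowed tab exception, and the sample record are all hypothetical.

```python
def find_control_chars(line, allowed=("\t",)):
    """Return (position, code) for each character whose code is below 32."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(line)
        if ord(ch) < 32 and ch not in allowed
    ]

# Hypothetical record with a stray carriage return and a NUL byte.
sample_record = "1|Smith\r|ok\x00"
hits = find_control_chars(sample_record)
print(hits)
```

Each hit gives the offset and the character code, which makes it easier to report the exact bad bytes back to whoever supplies the file.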
samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

Post by samsuf2002 »

My source is a flat file delimited with pipes.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You need to know what every byte in the stream is. The advice from rparimi is sound, and you need to be able to handle whatever is in the file. If that means forcing "them" to supply validly quoted character strings, then that's what it will take.
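One way to get a first picture of what is in the file is to profile it before the job runs. The sketch below is a hypothetical Python pre-check, not part of DataStage: it counts every byte value and tallies how many fields each line appears to have, assuming an unquoted pipe-delimited file with newline-terminated records. The sample bytes and the name field_count_profile are invented for illustration.

```python
from collections import Counter

def field_count_profile(data: bytes, delimiter: bytes = b"|") -> Counter:
    """Count how many non-empty lines have each number of delimited fields."""
    return Counter(
        line.count(delimiter) + 1
        for line in data.split(b"\n")
        if line
    )

# Hypothetical sample: most lines have 3 fields, two lines do not.
sample = b"1|a|b\n2|c|d\n3|e|f|g\n4|h\n"

profile = field_count_profile(sample)  # field count -> number of lines
byte_counts = Counter(sample)          # byte value -> occurrences

print(profile)
print(byte_counts[ord("|")])
```

Lines whose field count differs from the expected column count are the likely candidates for embedded delimiters or line breaks, and the byte tally shows whether unexpected control characters are present at all.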