sequential file stage - comma delimiter failed

ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Is the input sequential file defined with format field defaults of double-quotes as the quote character?
suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

Post by suresh_dsx »

Yes, I have set Quote to double.


Field Defaults:
Delimiter=,
Quote=double
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I cut-and-pasted your 3 lines to a text file and then defined a sequential file stage with the default Quote=double setting and was successfully able to view the data. I suspect that you have another issue - try doing what I did to see if you can successfully read a quoted string with embedded delimiter characters in a small test program.
suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

Post by suresh_dsx »

As suggested, I took a few sample records and tested the job. The records below are now read successfully.

"122","Chaina Asia","1234"
"124"," london, united kingdom ","2222"


But I have identified one problem record. Because this is an address column, it contains many kinds of characters.
In this record the second column has double quotes within the value. That is the reason the job fails.

In the record below, the whole second field is one column:
"125","WALT STEET "FIRST", LONDON UNITED KINGDOM ","3333"
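
To look for other records like this one in the full file, a rough sketch along the following lines may help. It assumes every field is quoted, fields are separated by exactly "," and no field contains that three-character sequence. The function name find_interior_quotes is hypothetical, not part of DataStage.

```shell
#!/bin/sh
# Sketch only: print "line number: record" for each record that still has a
# double quote inside a field once the outer quotes and the "," separators
# are set aside. Assumes all fields are quoted and separated by exactly ",".
find_interior_quotes() {
  awk '{
    rec = $0
    sub(/^"/, "", rec)                # drop the opening quote of the record
    sub(/"$/, "", rec)                # drop the closing quote of the record
    n = split(rec, f, "\",\"")        # split on the three-character separator
    for (i = 1; i <= n; i++)
      if (index(f[i], "\"") > 0) { print NR ": " $0; next }
  }'
}

# Demonstration with two of the sample records from this thread.
printf '%s\n' \
  '"122","Chaina Asia","1234"' \
  '"125","WALT STEET "FIRST", LONDON UNITED KINGDOM ","3333"' |
  find_interior_quotes
```

Against a real file it would be called as find_interior_quotes < your_file. It also reports records whose interior quotes are already doubled, so treat the output as a list of candidates to check rather than a list of errors.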
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

"125"," WALT STEET "FIRST", LONDON UNITED KINGDOM ","3333"
is, unfortunately, an invalid format which DataStage cannot read. It should read:

Code: Select all

"125"," WALT STEET ""FIRST"", LONDON UNITED KINGDOM ","3333" 
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Basically, you will need to have the source create the file using different rules. Either:

1) Select a quote character that isn't contained in the data
2) Double the quotes (Arnd's example) for interior quotes
3) Strip out interior quotes prior to writing the file

This isn't a DataStage problem - any product would have problems reading it since the quote characters are contained in the data.
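
If the source system cannot be changed, options 2 and 3 can be approximated in a pre-processing step. The following is a minimal sketch, not a general CSV repair tool: it assumes every field is quoted, fields are separated by exactly "," and no field contains that sequence. The function name fix_interior_quotes and its mode argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: rewrite quoted, comma-delimited records read from stdin.
#   mode "double": double every interior quote (option 2)
#   mode "strip":  remove every interior quote (option 3)
# Assumes all fields are quoted and separated by exactly ",".
fix_interior_quotes() {
  awk -v mode="$1" '{
    rec = $0
    if (rec == "") { print; next }    # pass empty lines through unchanged
    sub(/^"/, "", rec)                # drop the opening quote of the record
    sub(/"$/, "", rec)                # drop the closing quote of the record
    n = split(rec, f, "\",\"")        # split on the three-character separator
    out = ""
    for (i = 1; i <= n; i++) {
      gsub(/"/, (mode == "strip" ? "" : "\"\""), f[i])
      out = out (i > 1 ? "\",\"" : "") f[i]
    }
    print "\"" out "\""               # restore the outer quotes
  }'
}

# Demonstration on the problem record from this thread.
printf '%s\n' '"125","WALT STEET "FIRST", LONDON UNITED KINGDOM ","3333"' |
  fix_interior_quotes double
```

Run it only on files whose interior quotes are not already doubled; in "double" mode, quotes that were already escaped would be doubled a second time.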
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
creatingfusion
Participant
Posts: 46
Joined: Tue Jul 20, 2010 1:26 pm
Location: USA

Post by creatingfusion »

I would recommend converting the delimiter in the source file from "," (comma) to another delimiter such as "|" (pipe) before the file is read by the sequential file stage. If the character "|" also appears in the source data, use some other character such as ~ (tilde) or $ (dollar), which generally do not appear in files.
To do this in a shell script, replace the three-character sequence "," (including the double quotes) with "|". Matching the quotes together with the comma means that only the delimiters are replaced and not any other , (comma) used inside a field.

You can write the following shell script and use it as a before-job subroutine:

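#!/bin/sh
# Usage: sh this_script input_file output_file
# Both paths can be passed in as job parameters if required.
# sed replaces the three-character sequence "," with "|". tr maps single
# characters, so it would turn every comma into a pipe, even inside a field.
sed 's/","/"|"/g' "$1" > "$2"
exit 0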


This script changes the delimiter from comma to pipe, so you can run your DataStage job with pipe "|" as the delimiter, and this will probably resolve your issue.

Thanks
Abhijit.
Abhijit
IBM Certified Solution Developer Infosphere DataStage
austin_316
Participant
Posts: 80
Joined: Fri Aug 21, 2009 7:49 am
Location: India

Post by austin_316 »

Try this function: TRIM(input.col, '"', "B").