Regarding fixed length file
Hi All,
We have a requirement to load a fixed-length file in our ETL environment and have hit an issue. I don't know whether it is a file format issue, a DataStage issue, or an issue introduced while FTPing the data into our landing area. Would appreciate your inputs in resolving it.
We have file with metadata like below
COL1 1 - 2
COL2 3 - 4
COL3 5 - 6
Consider the records in the fixed length flat file are like below
AA CC
XXYY
DDFF
SSRRGG
When I view data after defining this metadata in the Sequential File stage, some records are missed; only the following records are displayed or loaded:
AA CC
SSRRGG
The other two records, which do not have a value for the last column, are missed. Looking into the file, records 2 and 3 have no trailing spaces after the last character; the line simply ends and the cursor moves to the next line. Is this a problem with how DataStage reads the record? Is there an option to set the last column to NULL when the value is not available?
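For reference, the uneven record widths can be confirmed outside DataStage with a quick length check on the landing area (a sketch only; the file path is a placeholder):

```shell
# Print the length of each record. A well-formed fixed-width file
# should show the same length (here 6) on every line; the short
# records are the ones DataStage drops.
awk '{print NR": "length($0)}' landing/fixed_file.dat
```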
Has anyone come across such a situation? Please let me know how we could resolve this issue.
Thanks in advance for your input.
Arvind
Thanks & Regards
arvind sampath
Software Engineer
Hi Arvind,
It is a requirement for a fixed-width file that every line has the same number of characters (and for a delimited file, the correct number of delimiters); otherwise the file parsing will fail.
We also faced this issue earlier and the records which had lesser length were dropped when the parsing happened.
We had to get a corrected file from the source system. If the last few columns do not have any data, they need to be sent as spaces rather than being omitted.
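If getting a corrected file from the source is not possible, padding the short records on our side is another option. A minimal sketch, assuming the 6-character record width from the example above (file names are placeholders):

```shell
# Right-pad every line with spaces to the fixed width of 6,
# so short records are no longer dropped by the parser.
awk '{printf "%-6s\n", $0}' input.dat > padded.dat
```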
Thanks,
Madhumitha
I've had the same problem before, in that some mainframe FTP processes remove trailing spaces.
In my case I used a filter in the source fixed-width Sequential File stage, i.e. one that strips trailing whitespace, removes 0x1A characters, and then pads each line to a length of 1494:
Code: Select all
sed 's/[ \t]*$//' | tr -d '\x1A' | awk '{printf "%-1494s\n",$0}' -
This might help. Also note that with a binary FTP the source data will most likely arrive in EBCDIC, and you would need to know the correct character set in order to map it in DataStage.
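On the EBCDIC point: if the file does arrive in EBCDIC after a binary FTP, a command-line conversion can be tried before parsing. This is a sketch assuming dd's standard EBCDIC-to-ASCII table; whether that matches your source's actual code page would need to be confirmed, and the file names are placeholders:

```shell
# Convert an EBCDIC file to ASCII using dd's built-in
# POSIX conversion table (conv=ascii), then inspect the result.
dd if=ebcdic_input.dat of=ascii_output.dat conv=ascii
```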
No, this goes in the Filter option of the source Sequential File stage:
Code: Select all
sed 's/[ \t]*$//' | tr -d '\x1A' | awk '{printf "%-1494s\n",$0}' -
Thanks Andrew for your swift reply. Let me try it; I guess I need to add this in the Filter option, along with all the columns defined in the Columns tab.
I have also tried one more option; I don't know if it is the best way to do it, or whether there are any limitations to it.
Read the entire string as one record in the source and split it in a Transformer with substrings like rec[1,2], rec[3,2]; that way, if the last column has no value it will be populated as NULL in the target. However, I do get warnings in the DataStage log saying "Exporting nullable field without null handling properties". Does anyone know whether this is a problem, or whether we can add it to a message handler?
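Outside DataStage, the same split can be sketched on the command line to sanity-check the substring positions (boundaries taken from the metadata above; a short record simply yields an empty last field rather than being dropped):

```shell
# Split each record into three 2-character fields at positions
# 1-2, 3-4 and 5-6; records missing COL3 produce an empty field.
awk '{printf "%s|%s|%s\n", substr($0,1,2), substr($0,3,2), substr($0,5,2)}' input.dat
```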
Thanks & Regards
arvind sampath
Software Engineer