Regarding fixed length file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
samarvind
Participant
Posts: 29
Joined: Wed Jan 18, 2006 6:13 am
Location: Sutton, Surrey

Regarding fixed length file

Post by samarvind »

Hi All,

We have requirements to load fixed length file format in our ETL environment and have found an issue which I don't know it is a file format issue or Data Stage issue or issue while FTPing the data into our landing area. Would require your inputs resolving it.

We have file with metadata like below
COL1 1 - 2
COL2 3 - 4
COL3 5- 6
Consider the records in the fixed length flat file are like below
AA CC
XXYY
DDFF
SSRRGG

When I view data after defining the metadata in sequential file stage some records get missed out like only following records are displayed or loaded
AA CC
SSRRGG

Other two records which doesnot have a value for the last column is missed out. When looked into the file for record 2 and 3 there are no spaces after the last character that instead cursor moves on to the next line. Is this a problem reading the record in DataStage? Is there an option to set NULL for the last column if the value is not available?

Have anyone come across such situation and let me know how we could resolve this issue?

Thanks in advance for your input.

Arvind
Thanks & Regards
arvind sampath
Software Engineer
Madhumitha_Raghunathan
Premium Member
Premium Member
Posts: 59
Joined: Fri Apr 22, 2011 8:02 am

Post by Madhumitha_Raghunathan »

Hi Arvindh,

It is a requirement in a fixed width file that every line must have the same number of characters and if it is a delimited file there should be correct numebr of delimiters or else the file parsing will fail.
We also faced this issue earlier and the records which had lesser length were dropped when the parsing happened.
We had to get the corrected file from the source system. If the last few columns do not have any data then we need to get it with spaces instead of characters.

Thanks,
Madhumitha
Last edited by Madhumitha_Raghunathan on Tue Sep 18, 2012 9:41 am, edited 1 time in total.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sounds like your FTP process is stripping any trailing spaces from each record. See if a 'binary' transfer helps preserve them.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I've had the same problem before in that some mainframe FTP removes trailing spaces.

In my case I used a filter in the source fixed width file stage, i.e.

Code: Select all

sed 's/[ \t]*$//'  | tr -d '\x1A' | awk '{printf "%-1494s\n",$0}' -
(remove 0x1A characters and then pads lines to 1494 length)

This might help, since with binary ftp the source data will most likely arrive in EBCDIC and you would need to know the correct character set in order to map in DataStage.
samarvind
Participant
Posts: 29
Joined: Wed Jan 18, 2006 6:13 am
Location: Sutton, Surrey

Post by samarvind »

ArndW, Thanks for the reply but I could not understand. Did you do it in scripting while sourcing the files in UNIX or in Data Stage?
Thanks & Regards
arvind sampath
Software Engineer
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

No, this is in the filter part of source sequential file stage:

Code: Select all

sed 's/[ \t]*$//'  | tr -d '\x1A' | awk '{printf "%-1494s\n",$0}' -
samarvind
Participant
Posts: 29
Joined: Wed Jan 18, 2006 6:13 am
Location: Sutton, Surrey

Post by samarvind »

Thanks Andrw for your swift reply. Let me try and I guess I need to add this in the filter options and with all the last columns in the columns tabs.

I have also tried one more option, I don't know if it is the best way to do it and are there any limitations in it?

Read entire string as one record in source and split it in transformer with sub string like rec[1,2],rec[2,3], so for the last column if no value then will be populated as NULL in the target. However, I do get warnings in Data Stage log as "Exporting nullable field without null handling properties" . Does anyone know if is this a problem or we can add this to a message handler?
Thanks & Regards
arvind sampath
Software Engineer
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

An exact search here for your message will reveal many conversations on the topic, i.e. what it means and how to handle it.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply