I have a file with 10 fields. When all the fields exist, the record is 100 bytes. Sometimes the last field is missing, sometimes the last 7 fields are missing, and everything in between.
I don't care if these are missing but it's causing problems for the Sequential file stage.
What can I do to read this file and process data if it exists?
How to handle reading files with variable-length records?
Re: How to handle reading files with variable-length records
If it is not a fixed-width file, then what is the delimiter between the fields in your file? The Sequential File stage separates fields according to the delimiter you specify. If fields are missing but their delimiters are still present in the file, the stage picks up NULLs for those missing fields.
Vivek Gadwal
Experience is what you get when you didn't get what you wanted
Re: How to handle reading files with variable-length records
Not sure exactly what you are trying to do, but there are two broad ways of tackling this:
1> Create a custom stage: find out how many delimiters are present and pass the data out accordingly. You can also do a lot more with this approach.
2> Read each record into a single column (the whole record), then count the delimiters to figure out what to do; all of this logic goes inside a Transformer.
Approach 1 gives you the liberty to do literally whatever you want with the data.
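Approach 2 above can be sketched outside DataStage as well. Here is a minimal Python illustration of the "count the delimiters and pad" logic (the 10-field count comes from the question; the pipe delimiter and the function name `pad_record` are assumptions for illustration only):

```python
def pad_record(line, expected_fields=10, delim="|"):
    """Split one record on its delimiter and pad any missing trailing
    fields with empty strings, so every row exposes the full layout."""
    fields = line.rstrip("\r\n").split(delim)
    # Short records end early: append empties for whatever is absent.
    fields += [""] * (expected_fields - len(fields))
    return fields
```

For example, `pad_record("a|b|c", expected_fields=5)` yields `["a", "b", "c", "", ""]`; in a Transformer you would express the same padding with the stage's string functions.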
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
Are the fields AND their intermediate delimiters missing, or are all the delimiters there and the fields between them empty? You handle the two cases differently.
In the first case read each line (record) in the file as a single VarChar field and parse it subsequently.
In the second case adjust the individual field properties so that "" is recognized as representing NULL. Perhaps also provide a default value with which this can be replaced.
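The second case (all delimiters present, fields between them empty) can be sketched as follows. This is a hedged Python illustration, not the stage's actual implementation; the pipe delimiter and the name `parse_with_defaults` are assumptions:

```python
def parse_with_defaults(line, defaults, delim="|"):
    """All delimiters are present, so splitting yields the full field
    count; an empty string means NULL, for which we substitute the
    caller-supplied default value for that position."""
    fields = line.rstrip("\r\n").split(delim)
    return [dflt if raw == "" else raw for raw, dflt in zip(fields, defaults)]
```

For instance, `parse_with_defaults("x||z", ["d1", "d2", "d3"])` gives `["x", "d2", "z"]`, which mirrors recognizing "" as NULL and replacing it with a default in the field properties.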
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.