How to handle reading files with variable-length records?

MMarietta
Participant
Posts: 6
Joined: Tue Sep 04, 2007 7:39 am

How to handle reading files with variable-length records?

Post by MMarietta »

I have a file with 10 fields. When all the fields are present, the record is 100 bytes. Sometimes the last field is missing, sometimes the last seven fields are missing, and everything in between.

I don't care that these fields are missing, but it's causing problems for the Sequential File stage.

What can I do to read this file and process the data that does exist?
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Re: How to handle reading files with variable-length records?

Post by vivekgadwal »

MMarietta wrote: I have a file with 10 fields. When all the fields are present, the record is 100 bytes. Sometimes the last field is missing, sometimes the last seven fields are missing, and everything in between.

I don't care that these fields are missing, but it's causing problems for the Sequential File stage.

What can I do to read this file and process the data that does exist?
If it is not a fixed-width file, what is the delimiter between the fields? The Sequential File stage separates fields according to the delimiter you specify, so if the trailing fields are empty but their delimiters are still present in the file, the stage reads NULLs for those missing fields.
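For illustration, here is a hypothetical comma-delimited sample (the values are made up) showing the distinction that matters:

    A,B,C,D,E,F,G,H,I,J   <- all 10 fields present
    A,B,C,,,,,,,          <- fields 4-10 empty, but all 9 delimiters present
    A,B,C                 <- fields 4-10 and their delimiters missing entirely

The second line still parses as 10 fields (seven of them NULL); the third line is what breaks a fixed 10-field import.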
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Re: How to handle reading files with variable-length records?

Post by sud »

Not sure exactly what you are trying to do, but there are two broad ways of tackling this:

1> Create a custom stage: find out how many delimiters are present in each record and pass the data out accordingly. Plus, you can do a lot more.

2> Read each record into a single column (the whole record), then count the delimiters to work out what to do; all of this logic goes inside a Transformer (see the sketch below).

Approach 1 gives you the liberty to do literally whatever you want with the data.
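A minimal sketch of approach 2, assuming the whole record is read into a single VarChar column that arrives in the Transformer as InLink.RecLine (the link name, column names and pipe delimiter are all made up); Count(), Field() and SetNull() are standard parallel Transformer functions:

    NumFields = Count(InLink.RecLine, "|") + 1
    Field01   = Field(InLink.RecLine, "|", 1)
    Field08   = If Count(InLink.RecLine, "|") >= 7 Then Field(InLink.RecLine, "|", 8) Else SetNull()

A derivation like Field08 only attempts the extraction when enough delimiters exist, so short records no longer break the job; the target column must be defined as nullable for SetNull() to work.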
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are the fields AND their intermediate delimiters missing, or are all the delimiters there and the fields between them empty? You handle the two cases differently.

In the first case, read each line (record) of the file as a single VarChar field and parse it subsequently, for example in a Transformer.

In the second case, adjust the individual field properties so that "" is recognized as representing NULL. Perhaps also provide a default value with which the NULL can be replaced.
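As a sketch of that second case, the null_field property on a nullable field tells the import to treat an empty string as NULL. A hypothetical schema with three columns (your names, types and delimiter will differ):

    record {delim=',', final_delim=end}
    (
        Field01: string[max=30];
        Field02: nullable string[max=30] {null_field=''};
        Field03: nullable string[max=30] {null_field=''};
    )

The equivalent in the Sequential File stage GUI is each column's "Null field value" property, set to an empty string.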
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.