I am trying to create some EE parallel jobs after working with the Server edition for some time. I am looking for a null-handling strategy for our data warehouse, which mostly processes sequential files. I think the approach below could work, but I want to confirm.
----
Create a separate parallel job (it doesn't have to be a separate job, but I decided to go that way for other reasons) with a Sequential File input stage, and use a Modify stage to produce the output sequential file or dataset, with the following specification in the Modify stage:
outputcolumn = handle_null(inputcolumn, value)
Here value is 01-01-2999 for date fields, -9999999999 for integer fields, and '' for character fields. The assumption is that real data never contains these values.
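As a concrete illustration, the Modify stage specifications might look like the sketch below. Column names and types are hypothetical, and the exact specification syntax should be checked against the Modify stage documentation for your version:

```
order_date:date = handle_null(in_order_date, "2999-01-01")
order_qty:int64 = handle_null(in_order_qty, -9999999999)
cust_name:string = handle_null(in_cust_name, "")
```

One specification per output column, so every nullable input gets an explicit sentinel before downstream jobs see it.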
With this approach, in the subsequent jobs I can define all the columns as 'Not Null', and then I don't need to set 'Null Field Value' or wonder whether a given field is null or not (our data suppliers never follow their interface protocol). Does this mean I don't need to worry about null issues in the subsequent jobs at all?
But I still have doubts about new fields created by stages such as the Aggregator. Do I still need to handle nulls for an output field that comes out of an Aggregator stage when all of its input fields are Not Null?
I appreciate your comments/suggestions.
Strategy for Null handling for sequential file processing
Hi,
Your null handling should consider the business rules of your organization.
In most cases there are default values you can use, some in the future and some in the past for dates;
in some cases a zero will do, and in others an empty string.
In short, it varies with each column's individual circumstances, and what works for one field won't necessarily work for another.
Bear in mind that when reading from or writing to sequential files you're actually handling strings, which may not be the case when you process those values later on in your jobs (conversions may be needed).
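For instance, turning the string read off a sequential file back into a real date is typically a Modify stage conversion. The sketch below assumes hypothetical column names, and the exact conversion name and format-string syntax should be verified against your version's documentation:

```
order_date:date = date_from_string[%yyyy-%mm-%dd](order_date_str)
```

If you substituted a sentinel like 2999-01-01 upstream, the conversion will happily produce a valid (far-future) date, so downstream logic still needs to recognize the sentinel as "no value".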
IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
![Image](http://www.worldcommunitygrid.org/images/logo.gif)