Null handling strategy for sequential file processing
Posted: Wed Apr 12, 2006 4:23 pm
I am trying to create some EE parallel jobs after working in Server Edition for some time. I am looking for a null handling strategy for our data warehouse, which basically processes sequential files. I think the following approach could work well, but I want to confirm.
----
Create a separate parallel job (it doesn't need to be a separate job, but I decided to go that way for other reasons) with a Sequential File input stage, and use a Modify stage to create the output sequential file or dataset, with the following specification in the Modify stage:
outputcolumn = handle_null ( inputcolumn, value)
value is 01-01-2999 for date fields, -9999999999 for integer fields, and '' for character fields. I am hoping the real data never contains these values.
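To make the idea concrete outside DataStage, here is a minimal Python sketch of the same substitution, plus a check for the case I am worried about (real data that already contains a default value). This is not DataStage code: the Python `handle_null` only mimics the Modify stage function, and `SENTINELS`, `sentinel_collisions`, and the type labels are hypothetical names made up for illustration.

```python
from datetime import date

# Default values taken from the post; the type labels are illustrative only.
SENTINELS = {
    "date": date(2999, 1, 1),
    "integer": -9999999999,
    "char": "",
}

# Lower bound of a 32-bit signed integer, for a range check on the default.
INT32_MIN = -2**31


def handle_null(value, col_type):
    """Return the default for col_type when value is None, else value unchanged."""
    return SENTINELS[col_type] if value is None else value


def sentinel_collisions(rows, schema):
    """Count, per column, the non-null source values that equal the default.

    Any non-zero count means a real value would become indistinguishable
    from a substituted null after the replacement step.
    """
    counts = {col: 0 for col in schema}
    for row in rows:
        for col, col_type in schema.items():
            if row[col] is not None and row[col] == SENTINELS[col_type]:
                counts[col] += 1
    return counts


schema = {"order_date": "date", "qty": "integer", "name": "char"}
rows = [
    {"order_date": None, "qty": 5, "name": ""},
    {"order_date": date(2006, 4, 12), "qty": None, "name": "abc"},
]

# Run the check on the raw rows first, then substitute the defaults.
collisions = sentinel_collisions(rows, schema)
cleaned = [
    {col: handle_null(row[col], schema[col]) for col in schema}
    for row in rows
]
```

In this sample the empty string in `name` is a real value that collides with the character default, which is exactly the risk mentioned above. The sketch also makes one range issue easy to see: -9999999999 is below the 32-bit signed minimum (-2147483648), so that default presumably only fits in a 64-bit integer column.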
If I use this approach, then in the subsequent jobs I can define all the columns as 'Not Null' columns. I then don't need to supply 'Null Field Values' or rack my brain over whether a given field can be null (our data suppliers never follow their interface protocol). In other words, I shouldn't need to bother about null issues in the subsequent jobs.
But I still have doubts about the new fields I create in stages like the Aggregator. Do I still need to handle nulls for an output field that comes out of an Aggregator stage when its input fields are all Not Null?
I would appreciate your comments and suggestions.