Sequential File Stage and Nulls

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

SettValleyConsulting
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Sequential File Stage and Nulls

Post by SettValleyConsulting »

Given a pipe-delimited input sequential file something like this


Row 1 Data1|Data2|Data3
Row 2 Data1||Data3


All fields are Varchar(6), say. What would you expect to happen to Row 2 if Field 2 was defined as (a) nullable and (b) not nullable?


The answer seems to be that (a) DataStage inserts a Null, and (b) DataStage inserts an empty string.

Apart from being inconsistent, this is not the answer I expected or wanted: we actually want rows to be rejected when a mandatory (not nullable) column is supplied empty, preferably without having to code null tests explicitly for each column.

Is this possible just using the Seq file stage?

thanks

Phil Clarke.
vasubabu
Participant
Posts: 153
Joined: Wed Jan 25, 2006 2:38 am

Re: Sequential File Stage and Nulls

Post by vasubabu »

In that case the Sequential File stage gives an empty string; it does not produce a NULL by default.
VASU..
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The behaviour with nullable and not-nullable Varchar fields read from sequential files is both consistent and logical (as well as being documented). The difference in behaviour arises because sequential files cannot themselves contain NULLs.

If you edit the attributes of your Varchar() column in the sequential file definition, you will see that you can specify a "null field value" when the column is specified as nullable; that is the value used when the field is read in as empty. If the field is not nullable, an empty string (i.e. a Varchar of length 0) is used instead.
SettValleyConsulting
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

Thanks Arnd - So the short answer is 'No'.

I should have mentioned that Null Field Value was set to '' - this is our default. By "inconsistent" I meant that an empty input column is treated differently depending on the nullability of the target column. The (online) documentation defines the Null Field Value property as

On import, the value given to a field containing a null.

but this does not seem to be right:

1) Null Field Value is set to ''

If target column is nullable then Empty => Null,
If target column not nullable then Empty => Empty.

2) Null Field Value not set

If target column is nullable then Empty => Empty,
If target column not nullable then Empty => Empty.

3) Null Field Value set to 'Aardvark'

If target column is nullable then Empty => Empty,
If target column not nullable then Empty => Empty.


Also, when I set Null Field Value = 'XXXXX', which is an actual value in the file, fields containing that value get set to Null. So it seems Null Field Value actually specifies the value that represents an in-band Null in the data. This explains the behaviour in 1) and makes some sense, as it mirrors what the property does on export, but it is not what the help text says.
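The behaviour I observed can be summed up in a short sketch. This is plain Python, purely illustrative (not DataStage code), and the function name import_field is my own:

```python
def import_field(raw, nullable, null_field_value=None):
    """Model of how the Sequential File stage appears to import a Varchar field.

    Null Field Value acts as an in-band NULL marker: a field whose text
    exactly matches it is imported as NULL (nullable columns only).
    An empty field is otherwise kept as an empty string.
    """
    if nullable and null_field_value is not None and raw == null_field_value:
        return None  # in-band marker matched -> NULL
    return raw       # everything else, including '', stays a string

# Case 1: Null Field Value = '' -> empty maps to NULL only if nullable
assert import_field("", nullable=True, null_field_value="") is None
assert import_field("", nullable=False, null_field_value="") == ""

# Case 2: Null Field Value not set -> empty stays empty either way
assert import_field("", nullable=True) == ""
assert import_field("", nullable=False) == ""

# Case 3: Null Field Value = 'Aardvark' -> empty stays empty...
assert import_field("", nullable=True, null_field_value="Aardvark") == ""
# ...but an actual 'Aardvark' in the data becomes NULL
assert import_field("Aardvark", nullable=True, null_field_value="Aardvark") is None
```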

Where the input value is <empty string>, it would be more consistent and logical either always to insert an empty string, or always to attempt to insert a Null (and reject the row if the target is non-nullable). The latter would match DataStage behaviour elsewhere, where attempting to pass a Null into a non-nullable column nearly always raises an exception.

My ideal behaviour would be a new stage property that, when set, caused a row to be rejected whenever a mandatory column is empty - surely a common requirement. One for the wish list?
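Until then the check has to be coded explicitly downstream (e.g. in a Transformer). A minimal sketch of the idea in plain Python, not DataStage - the column names and which ones are mandatory are hypothetical:

```python
import csv
import io

COLUMNS = ["field1", "field2", "field3"]
MANDATORY = {"field1", "field3"}  # hypothetical: columns that must not be empty

def split_rows(text):
    """Route pipe-delimited rows: rows with an empty mandatory column
    go to the reject list, everything else passes through."""
    good, rejected = [], []
    for row in csv.reader(io.StringIO(text), delimiter="|"):
        rec = dict(zip(COLUMNS, row))
        if any(rec[c] == "" for c in MANDATORY):
            rejected.append(rec)
        else:
            good.append(rec)
    return good, rejected

good, rejected = split_rows("Data1|Data2|Data3\nData1||Data3\nData1|Data2|\n")
# Row 3 is rejected (mandatory field3 is empty); row 2 passes because
# its empty field2 is not mandatory in this example.
```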

Phil.

PS This should be required reading ...viewtopic.php?t=117629