Page 1 of 1

Handling nulls in seq files in PX

Posted: Sat Apr 24, 2004 6:12 pm
by fridge
Hi, I am trying to read a file from a seq stage in PX , the file contains numeric (int,decimal) values that have spaces to signify nulls.

Is there an easy way to 'map' spaces to nulls and avoid a record going down the reject link without going via a tranformer first.

Sorry for such a basic question, but we will encounter this a lot and I want to avoid my developers having to transform --> if isNull(field) do this else field shennaigans.

Have looked at properties in seq file format stage but cant get a reasonable (i.e low development) way of doing it.

Thanks in advance

Ed

Posted: Mon Apr 26, 2004 10:42 pm
by Gazelle
If I understand you right, this is done by specifying "null_field" value for each column, in the initial Sequential File stage.

eg. If ColumnA is a fixed-length Integer(10) field, then you specify

Code: Select all

null_field='          '
When reading the file, if ColumnA contains a string of ten spaces, then Datastage will convert the field into its internal representation of null.

You can then pass the field to a transformer and perform whatever null-handling you require.

Posted: Tue Apr 27, 2004 3:55 am
by richdhan
Hi,

1. What do u mean by "null_field" value. There is no property like "null_field" in the sequential file stage. Do u mean the value of Column A in the sequential file should be 10 spaces. Pls explain

2. If a field contains a value of 10 spaces does Datastage convert it to NULL?

Can u brief me more on these 2 queries

Thanks
Rich

--Pride comes before a fall

Posted: Tue Apr 27, 2004 5:58 pm
by Gazelle
Sorry for the confusion.

"null field" is how it appears on the internal record definition, but in the Stages it's called "Null field value".
1. Here are the steps to get to the "Null field value" property:
  1. In the "Sequential File" stage, on the "Columns" tab, if you right-click on the columns grid and select "Properties" you'll see a "Grid Properties" dialog box.
  2. Here you can select additional properties to appear on the columns grid. Select the "Field Properties" option (and opt to "Save settings for future display").
  3. Now there should be a column titled "Field properties" on the columns grid. Double-click on this cell, and a dialog box appears, with all sorts of column properties.
  4. If the column is made "nullable", then there will be a properties category called "Nullable", under which you can select the "Null field value".
  5. For the "Null field value", enter the value to be interpreted as NULL (eg. for ColumnA, enter ten spaces).
2. After defining the "Null field value" as ten spaces, then if Datastage finds that ColumnA has a value of 10 spaces, Datastage converts the value to NULL.

Note that if the Column property is consistent for all columns (eg. if "Null field value" is ten spaces for allof the fields in the record), then you can define the "Null field value" on the "Format" tab, under the "Field Defaults" category.

Note also that the reverse can be defined when outputting the column. ie. If a "Null field value" is defined when ouputting to a sequential file, then Datastage will convert NULL to the defined "Null field value".

Posted: Wed Apr 28, 2004 4:41 am
by richdhan
Hi,

Iam writing my database rejects to a .csv file. I have a field EFFECTIVE_OUT_DTS which is a timestamp and is a null field.
I got this warning initially
At field "EFFECTIVE_OUT_DTS": Exporting nullable field without null handling properties

I used the techique you had mentioned in your post and set the "null_value" to be 4 spaces

I got this warning
At field "EFFECTIVE_OUT_DTS": "null_field" length (4) must match field's fixed width (19)

I used 19 spaces instead and the job run successfully.

Thanks a ton for your post.

But I have a doubt - I had mentioned the length of the timestamp field to be 38. But the warning says that I have to use the field's fixed width of 19.
Why is that. Can anybody clarify me on this.

Thanks
--Rich

Posted: Wed Apr 28, 2004 4:00 pm
by ray.wurlod
38 is (probably) the precision of the underlying numeric data type. 19 is the number of characters in a TimeStamp (without sub-seconds). There are, for example, 26 characters in a TimeStamp(6) - that is, TimeStamp with microseconds. However, DataStage does not seem to recognize precision for timestamps.

Posted: Thu Apr 29, 2004 5:26 pm
by Gazelle
That issue of null_values for fixed-length fields (eg. a timestamp field) was a real problem for us.

We had a pipe-delimited file with a timestamp field, where || signified a null.

Ascential provided a patch that basically provides a new setting (a project environment variable: "APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL").
I do not know if this allows four spaces to be defined as the null_value, but you could ask Ascential support.