Page 1 of 1

Reapearing Nulls

Posted: Fri Dec 06, 2002 2:14 pm
by msigal
I'm having a little problem with a string containing nulls. The text file has a sting such as "medical device x <null><null><null><null>". I put the data through a custom routine and then I try to trim it. What comes out in the resulting text file is the same thing. The problem is when I attempt to debug, the transform tells me the nulls are spaces. I tried IsNull on a substring of the data, Seq(), etc. Everything tells me the nulls are spaces, except when it comes to write the data out, they are back! Source stage is CFF, target is sequential. I transform the data in a stage variable before assigning the stage variable to the target column.

Going crazy in Ann Arbor, Myles

Myles Sigal
Technical Analyst
The MEDSTAT Group
777 E. Eisenhower - 435B
Ann Arbor, MI 48108

myles.sigal@medstat.com
734-913-3466

Posted: Fri Dec 06, 2002 3:04 pm
by chulett
How are you defining the various variables you are storing this string in? Technically, you don't really have four nulls at the end of the string, you have a string of a certain size, like "medical device x". If you store it in a Varchar, only the string as quoted will be stored, the 'extra' spaces that may be at the end will not be. If you store it in a Char, it will automatically padded with spaces to the full size of the field as defined. I'm guessing you are mixing and matching definitions and that is what is confusing you. Same goes when writing out to a sequential file using Char or Varchar fields... if you want to keep the spaces, use Char - and be aware of the 'default padding' that the Seq Stage does!

-craig

Posted: Fri Dec 06, 2002 4:18 pm
by ray.wurlod
There are lots of rules you can set on the Format tab of your output Sequential file stage to specify - among other things - how NULL is to be handled.
The same is true on input. By default, the Sequential File stage treats "" in input as NULL. You must explicitly change this (on the Format tab) if you want the behaviour to be different.

Posted: Fri Dec 06, 2002 4:22 pm
by chulett
But those same options are not available on the CFF stage, are they?

Posted: Sat Dec 07, 2002 4:23 pm
by ray.wurlod
Correct, but I understood that Myles sought a method for controlling the appearance of NULL on output.
We probably should have talked about testing and substituting for NULL within the job's processing too, but didn't. Oh well.

Posted: Mon Dec 09, 2002 7:50 am
by msigal
When I test for the null when the job is processing the file, DataStage tells me the character is a space, I used the Seq() function to do this and get a value of 32, space. I've trimmed the string and used the Len() function to get the count the number of bytes and I get the expected trimmed length. The problem is when the modified (trimmed, etc.) value writes to the sequential file. The nulls reappear. I've modified how the nulls are written in the sequential file (format tab, null value) and it didn't replace the null.

Myles Sigal
Technical Analyst
The MEDSTAT Group
777 E. Eisenhower - 435B
Ann Arbor, MI 48108

myles.sigal@medstat.com
734-913-3466

Posted: Mon Dec 09, 2002 1:44 pm
by msigal
I also did try modifying the data type from varchar to char in the source and target column definitions within the transformer stage. This didn't seem to help.

Posted: Mon Dec 09, 2002 3:17 pm
by ray.wurlod
What happens when you explicitly output NULL (which can be generated in a Transformer stage using the system variable @NULL)?

Posted: Mon Dec 09, 2002 4:22 pm
by msigal
Today I was succesfully able to "trap" the null with the following logic in a routine. I still need to review exactly which characters we will and will not allow to pass and set the values in the if statement accordingly. (Hopefully my little trick with the underscores will keep the formatting nice...)

Myles

* Strip illegal characters and leading and trailing spaces
tempCharacter = Trim(InCharacter,"*","A")
tempCharacter = Trim(tempCharacter,"?","A")
tempCharacter = Trim(tempCharacter,"%","A")
tempCharacter = Trim(tempCharacter,"_","A")
tempCharacter = Trim(tempCharacter,"'","A")
tempCharacter = Trim(tempCharacter,'"',"A")
tempCharacter = Trim(tempCharacter)

BadByte = @FALSE

For i = 1 to len(tempCharacter)
___TheByte = tempCharacter[i,1]
___If Seq(TheByte) > 31 AND Seq(TheByte) <127 THEN
* do nothing
___End Else
______BadByte = @TRUE
______Exit
___End
Next i


If IsNull(tempCharacter) or tempCharacter = "" Or BadByte then
___Ans = MissingVal
end else
___Ans = tempCharacter
end

Edited by - msigal on 12/09/2002 16:25:03