Usage of TrimLeadingTrailing
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 35
- Joined: Mon Nov 01, 2010 2:48 am
- Location: New Delhi
Usage of TrimLeadingTrailing
Hi Experts !
I have a parallel job that reads from a flat file and loads into a Netezza target. It has the following layout:
File --> Transformer --> Netezza Target
The transformer has a stage variable which checks whether a given column is Numeric or not. The derivation for this stage variable is as follows:
If TrimLeadingTrailing(Lk1.Quantity)='' or IsValid('decimal[18,3]',TrimLeadingTrailing(Lk1.Quantity))=1 Then '' Else 'Quantity is not a valid decimal'
I tried the following test cases:
Quantity: abc
Var: Quantity is not a valid decimal
Quantity: 123abc
Var: Quantity is not a valid decimal
Quantity: 0123 abc
Var: ''
This last test case is something that is troubling me. Even though the field is not a Numeric, it still passes the check. I have a feeling that this is because of the Trim function (which sees the space and discards everything after that space). Please advise.
I have a parallel job that reads from a flat file and loads into a Netezza target. It has the following layout:
File --> Transformer --> Netezza Target
The transformer has a stage variable which checks whether a given column is Numeric or not. The derivation for this stage variable is as follows:
If TrimLeadingTrailing(Lk1.Quantity)='' or IsValid('decimal[18,3]',TrimLeadingTrailing(Lk1.Quantity))=1 Then '' Else 'Quantity is not a valid decimal'
I tried the following test cases:
Quantity: abc
Var: Quantity is not a valid decimal
Quantity: 123abc
Var: Quantity is not a valid decimal
Quantity: 0123 abc
Var: ''
This last test case is something that is troubling me. Even though the field is not a Numeric, it still passes the check. I have a feeling that this is because of the Trim function (which sees the space and discards everything after that space). Please advise.
Arpit Chopra
Re: Usage of TrimLeadingTrailing
Did you try outputting the return value of the trim function to see what was actually output?arpitchopra wrote:I have a feeling that this is because of the Trim function (which sees the space and discards everything after that space).
-
- Participant
- Posts: 35
- Joined: Mon Nov 01, 2010 2:48 am
- Location: New Delhi
Re: Usage of TrimLeadingTrailing
I did. And surprisingly, the output for the 3rd case is 123.000 !
Is this how TrimLeadingTrailing usually works ? Isn't it just supposed to remove the white spaces from the beginning and the end of the entire string ?
Is this how TrimLeadingTrailing usually works ? Isn't it just supposed to remove the white spaces from the beginning and the end of the entire string ?
ShaneMuir wrote:Did you try outputting the return value of the trim function to see what was actually output?
Arpit Chopra
Its actually the IsValid test that is not working. If you were to pass your input un trimmed into the IsValid test you would probably find that it returns TRUE.
TrimleadingTrailing is used to remove space from the beginning and end of fields and leave multiple spaces in the middle of a string if they are there, unlike Trim which would remove the multiple spaces and leave just the one.
I wouldn't be surprised if the IsValid test only looks until the space and drops the rest of the line item.
As to why that would be the case - I do not know.
TrimleadingTrailing is used to remove space from the beginning and end of fields and leave multiple spaces in the middle of a string if they are there, unlike Trim which would remove the multiple spaces and leave just the one.
I wouldn't be surprised if the IsValid test only looks until the space and drops the rest of the line item.
As to why that would be the case - I do not know.
-
- Participant
- Posts: 35
- Joined: Mon Nov 01, 2010 2:48 am
- Location: New Delhi
I don't want to do that as my file may contain those junk spaces.ShaneMuir wrote:Its actually the IsValid test that is not working. If you were to pass your input un trimmed into the IsValid test you would probably find that it returns TRUE.
Consider a scenario where the input is " 12" (or "<space><space><space>12"). I want this to pass the test and be stored as 12.000
Won't IsValid without a Trim fail this case ?
You are right. I have applied TrimLeadingTrailing in the derivation for a few Varchar columns and it behaves as expected. It doesn't discard everything after the 1st space.ShaneMuir wrote:I wouldn't be surprised if the IsValid test only looks until the space and drops the rest of the line item.
As to why that would be the case - I do not know.
Do you have an alternative to test for this then ?
Arpit Chopra
-
- Participant
- Posts: 35
- Joined: Mon Nov 01, 2010 2:48 am
- Location: New Delhi
But this won't help me check for the precision. I need to discard everything that has precision greater than 3.ShaneMuir wrote:I would suggest using the Num() or Alpha() functions to find those values which contain only numbers or alpha characters.
I mean we can ideally write a code that will check for the number of characters after "." but then it will be a huge effort to explain the multiple substrings and indices in the formula to someone else. I was hoping for something more subtle and simple. Like the IsValid. If only that would work.
Arpit Chopra
Well you could try the NUM() function in conjunction with IsValid(). I am pretty sure NUM() will return true for 0123 and 0123.01 or 123.456, but false for 0123 abc.
Also I am pretty sure that the isvalid test won't actually test whether the input string has the relevant precision rather that it can be converted to the relevant precision. So 0123 and 0123.01 and 123.456 would all return true for the IsValid(Decimal) check.
If you want to check whether something has only 3 decimals from a string text I would use something like len(trim(input.string['.',2,1]))=3 or similar.
Also I am pretty sure that the isvalid test won't actually test whether the input string has the relevant precision rather that it can be converted to the relevant precision. So 0123 and 0123.01 and 123.456 would all return true for the IsValid(Decimal) check.
If you want to check whether something has only 3 decimals from a string text I would use something like len(trim(input.string['.',2,1]))=3 or similar.
-
- Participant
- Posts: 35
- Joined: Mon Nov 01, 2010 2:48 am
- Location: New Delhi
You are right. This was a mistake on my part.ShaneMuir wrote:Also I am pretty sure that the isvalid test won't actually test whether the input string has the relevant precision rather that it can be converted to the relevant precision. So 0123 and 0123.01 and 123.456 would all return true for the IsValid(Decimal) check
However, when I write the following code as a derivation to the same Quantity column, I get 1234.000 as the output.
Code: Select all
StringToDecimal(TrimLeadingTrailing(Lk1.Quantityt),"round_inf")
Arpit Chopra
I suggested that IsValid had an issue, mainly because of your original post which suggested that it was the TrimLeadingTrailing giving the incorrect result, when actually your incorrect result was because the IsValid determined that the string '0123 abc' was a valid decimal and thus created an output you were not expecting.
Also, If you where to pass the string '0123 abc' to a column that is set as decimal with a specific precision, with no functions, conversions etc, i am guessing that it would convert the first part and drop the second part. Ie your output would be eg 123.000. As to my understanding this is because the IsValid test effectively tries to do a StringtoDecimal conversion, if it can, then its valid, if not then invalid. Passing the string to a decimal output would have the same result but you would get the errors written to the log.
Also, If you where to pass the string '0123 abc' to a column that is set as decimal with a specific precision, with no functions, conversions etc, i am guessing that it would convert the first part and drop the second part. Ie your output would be eg 123.000. As to my understanding this is because the IsValid test effectively tries to do a StringtoDecimal conversion, if it can, then its valid, if not then invalid. Passing the string to a decimal output would have the same result but you would get the errors written to the log.
-
- Participant
- Posts: 35
- Joined: Mon Nov 01, 2010 2:48 am
- Location: New Delhi
It is actually an issue with 2 things.
1. The first is that StringToDecimal() will automatically convert any string to a decimal (regardless of the format specified) if it CAN be converted. So, "1234 abc" will be converted to 1234
2. The second issue (and a bigger one) is with IsValid(). This function will also work in the same way. It will return true if the string CAN be converted to a decimal. It expects you to handle that conversion in the derivation of the target column.
Please read the following link for further information:
http://www-01.ibm.com/support/docview.w ... wg1JR39651
1. The first is that StringToDecimal() will automatically convert any string to a decimal (regardless of the format specified) if it CAN be converted. So, "1234 abc" will be converted to 1234
2. The second issue (and a bigger one) is with IsValid(). This function will also work in the same way. It will return true if the string CAN be converted to a decimal. It expects you to handle that conversion in the derivation of the target column.
Please read the following link for further information:
http://www-01.ibm.com/support/docview.w ... wg1JR39651
Arpit Chopra