
Non-ascii character problem

Posted: Wed Dec 08, 2010 7:57 pm
by abc123
I have to read a file. When I run:
od -xc
on it, the characters come out as follows:

1 2 3 4 272 5 6 7 272 \r \n

272 is the field delimiter. How do I tell DataStage to ignore this character?

What I thought:
1) Read each column one character wider than it really is (for example, the first column above as char(5)) and strip the extra character with a Left function downstream.

2) On the Sequential File stage, use the Filter option with a Unix command. I don't know how to do this. Help appreciated.

Any other ideas?
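
For idea 2, the preprocessing amounts to swapping the 272 (octal) byte for a delimiter that is easier to type. Below is a minimal Python sketch of that step. It assumes '|' never occurs in the data; the function name and the choice of '|' are illustrative only and do not come from DataStage.

```python
DELIM_IN = b"\xba"   # octal 272, the byte shown in the od dump
DELIM_OUT = b"|"     # hypothetical replacement; assumes '|' is not in the data


def replace_delimiter(data: bytes) -> bytes:
    """Return data with every 0xBA byte replaced by the easier delimiter."""
    return data.replace(DELIM_IN, DELIM_OUT)


# The sample record from the dump: 1 2 3 4 272 5 6 7 272 \r \n
sample = b"1234\xba567\xba\r\n"
print(replace_delimiter(sample))
```

If the Filter option is used for this, the same replacement would need to be wrapped in a script that reads the file on standard input and writes the result to standard output. Check that this is how the Filter command is fed before relying on it.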

Posted: Wed Dec 08, 2010 8:17 pm
by Mike
Fixed-width file or delimited file?

Since you say that character is a delimiter, I would assume it is a variable delimited file.

Just define that character as your delimiter character.

If it is a fixed-width file, just define char(1) fillers to consume the characters that you want to ignore.

Mike

Posted: Wed Dec 08, 2010 8:21 pm
by abc123
It is not a fixed width file. I thought of defining that character as a delimiter but I don't know what that character is. It is not one of the ASCII or extended ASCII characters. It looks like a temperature degree sign but how do I put that as a delimiter?

Posted: Wed Dec 08, 2010 8:30 pm
by Mike
Just try a regular copy&paste. It might just end up looking like a box in DataStage, but that would only mean DataStage can't display the pasted value. I like to use ^A (0x01) as a delimiter in delimited files, and the only way I know to enter it is via copy&paste.

Mike

Posted: Wed Dec 08, 2010 10:04 pm
by ray.wurlod
When you import the table definition specify &h0110 as the delimiter character.

Posted: Wed Dec 08, 2010 10:40 pm
by abc123
Ray, where in the 'Import Orchestrate Schema' window would you specify that?

Posted: Wed Dec 08, 2010 10:59 pm
by ray.wurlod
This isn't an Orchestrate schema, so you should not be using that option. You should be using Import > Table Definition > Sequential File.

Posted: Wed Dec 08, 2010 11:29 pm
by abc123
When I enter &h0110 in the Other Delimiter box, I get the error message:

Error: invalid character specification '&0h110'. You can enter any single character, or you can enter the ASCII code for a character either as a decimal number, or as atwo hexadecimal digits prefixed by &H. To specify no quote character, use 000 or &H00.

Posted: Wed Dec 08, 2010 11:31 pm
by stuartjvnorton
abc123 wrote: When I enter &h0110 in the Other Delimiter box, I get the error message:

Error: invalid character specification '&0h110'. You can enter any single character, or you can enter the ASCII code for a character either as a decimal number, or as atwo hexadecimal digits prefixed by &H. To specify no quote character, use 000 or &H00.

Your '0' and 'h' are swapped around.

Posted: Wed Dec 08, 2010 11:40 pm
by abc123
I mistyped it in the error message.

Use Hex value for delimiter

Posted: Wed Sep 14, 2011 2:55 pm
by bashbal
The file dump od -xc shows the delimiter in octal format. The value 272 translates to Hex BA or Decimal 186.
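
The conversion can be double-checked with a few lines of Python. This is only a quick sanity check, and the variable names are arbitrary.

```python
octal_text = "272"            # the value as printed by od
value = int(octal_text, 8)    # interpret the digits as base 8

print(value)                  # decimal form
print(format(value, "02X"))   # two-digit uppercase hex form
```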

Specifying this delimiter in a Server sequential stage is straightforward: enter the decimal 186 or &hBA in the delimiter field. This is documented in the Help.

It is also possible to do this in the properties of a parallel Sequential File stage. Go to Format->Field Defaults->Delimiter and enter a backslash followed by an 'x', then the hex value. In this case it would be "\xBA". NOTE: I did this with "Delimiter", not "Delimiter String".

I did not find this in the Help. I found it by trial-and-error.
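
The "\xBA" notation is the same hex escape that Python uses in byte strings, so the effect of the delimiter can be checked outside DataStage. This sketch splits the sample record from the original dump. It only illustrates the byte value and does not model how DataStage itself parses the file.

```python
# Sample record from the od dump: 1 2 3 4 272 5 6 7 272 \r \n
record = b"1234\xba567\xba\r\n"

# Drop the CR/LF line ending, then split on the 0xBA delimiter byte.
fields = record.rstrip(b"\r\n").split(b"\xba")
print(fields)
```

Because every field in the sample, including the last, is followed by the delimiter, the split produces a trailing empty field.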

Posted: Tue Nov 01, 2011 2:26 pm
by dalvigirish
I was able to use \x8F (the delimiter is hex 8F) in Delimiter String and can view the data, but I can't use this value in Delimiter.