Binary Zero check
-
- Participant
- Posts: 90
- Joined: Mon Dec 08, 2003 4:48 am
- Location: Chennai
My input sequential file has binary zeroes in a column. Is there any function to check for binary values (or special characters) in the input file or column? I don't know which column contains the binary zeroes.
Think Ahead,
Raj.D
-
- Premium Member
- Posts: 252
- Joined: Mon Sep 19, 2005 10:28 pm
- Location: Melbourne, Australia
I'm not sure, but since no-one else has posted I'll give it a bash.
In the Format tab of your SEQ FILE stage, try setting the Default NULL string to 000 (hexadecimal zero). This might work, but it might kick up a fuss when a string is made up of many of them.
Alternatively, you could use a filter (click "Stage Uses Filter Commands" in your SEQ stage) and write a small C or Perl program to replace \000 with space or "0". The C/Perl program must read from standard input and write to standard output. Then add the name of the program to the "Filter Command" on the Output tab of the SEQ stage.
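To make the filter idea concrete, here is a minimal sketch of such a program. It is written in Python rather than the C or Perl suggested above, purely for illustration; the function names `replace_nuls` and `filter_stream` are made up for this example, and it assumes a Python 3 interpreter is available on the server running the job.

```python
#!/usr/bin/env python3
"""Filter sketch: copy stdin to stdout, replacing binary zeroes with spaces."""
import sys


def replace_nuls(data, replacement=b" "):
    """Return data with every NUL (0x00) byte swapped for replacement."""
    return data.replace(b"\x00", replacement)


def filter_stream(src, dst, chunk_size=65536):
    """Copy src to dst in chunks, replacing NUL bytes on the way through."""
    # A NUL is a single byte, so it can never straddle a chunk boundary.
    for chunk in iter(lambda: src.read(chunk_size), b""):
        dst.write(replace_nuls(chunk))
    dst.flush()


if __name__ == "__main__":
    # Use the raw byte streams so no text decoding touches the data.
    filter_stream(sys.stdin.buffer, sys.stdout.buffer)
```

Saved under a hypothetical name such as nulfilter.py and made executable, it could be tried outside DataStage first with `./nulfilter.py < input.txt > cleaned.txt` before being named as the filter command. Pass `b"0"` as the replacement if a "0" is preferred to a space.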
Ross.
Another method would be to use a Transformer and test INDEX(In.Column,CHAR(000),1): it returns zero when the column contains no binary zero and the position of the first one otherwise. ICONV(String,'MCP') will replace nonprintable characters with '.'.
Since you don't know which column contains these values, you could try reading the whole record as just one column, stripping out the offending values with CONVERT(CHAR(000),'',In.String) and writing it back out.
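If you would rather find out which column is affected before stripping anything, the same kind of check can be run outside DataStage. The following is only a rough Python sketch: `nul_fields` and `scan` are names invented for this example, and it assumes a comma-delimited file with one record per line and no delimiters embedded in quoted fields.

```python
import sys
from collections import Counter


def nul_fields(record, delimiter=b","):
    """Return the 1-based positions of the fields that contain a NUL byte."""
    return [position
            for position, field in enumerate(record.split(delimiter), start=1)
            if b"\x00" in field]


def scan(lines, delimiter=b","):
    """Count, per field position, how many records have a NUL in that field."""
    counts = Counter()
    for line in lines:
        # Drop the line terminator so it is not counted as part of the last field.
        counts.update(nul_fields(line.rstrip(b"\r\n"), delimiter))
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Open in binary mode so the NUL bytes reach the check untouched.
    with open(sys.argv[1], "rb") as handle:
        for position, hits in sorted(scan(handle).items()):
            print(f"field {position}: {hits} record(s) with binary zero")
```

Run it with the file name as the only argument. The field positions it reports tell you which columns to apply the INDEX or CONVERT expressions to in the Transformer.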