ASCII format showing some low characters


Moderators: chulett, rschirm, roy

us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi All

I have an issue here.

I have an ASCII file with some data. When I look at the data in UltraEdit, I see some low characters where there should be real spaces.

Ex:

D27......R...23456

There should be spaces instead of dots there. Any suggestions?

Thanks
Sam
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

How big is the file, and what do you want to do about the low ASCII values? If the file is small and you want the low values swapped to a space, a simple filter using sed could swap out the values. Anything larger or more complicated could require some work.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi Ken,

I need the characters to be swapped with spaces. Also, Ken, could you tell me why this is happening? Is there anything I am doing wrong while coding, or is there some other reason?

Thanks
Sam
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Are you creating the file or is it coming to you? If you're creating it, where are you getting the data? You'll have to put some derivation logic on the columns of interest.

If the file is coming to you this way, you'll have to address the file prior to reading it with PX.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The hex dump of UltraEdit always shows dots where non-display characters occur.
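
To see which bytes are really behind those dots, a hex dump on the UNIX side works too. This is only a minimal sketch: `sample_record` is a made-up stand-in for one line of the real file, and `hex_dump` is just a hypothetical wrapper around the standard `od` utility.

```shell
#!/bin/sh
# Hypothetical sample record: "D27", two NUL bytes, then "R".
# Substitute a line from the real file to inspect actual data.
sample_record() {
  printf 'D27\000\000R\n'
}

# Dump stdin as one two-digit hex value per byte, without offsets.
# A NUL byte appears as 00, while a real space appears as 20.
hex_dump() {
  od -An -tx1
}

sample_record | hex_dump
```

For a real file, something like `hex_dump < yourfile | head` shows the first few records; any `00` where you expected `20` is a NUL rather than a space.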
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi Ken,

I am creating this file. Can I do anything before loading the data into the sequential file? Or else, can I do something in my sequence job, where I am concatenating the detail and trailer records into one file, such as using some UNIX commands?


Hi Ray,

Will it be an issue if we get the file like that, or can we ignore it?

Thanks
Sam
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Sam,

Since you are creating this file you can strip out or replace the offending non-displayable characters yourself. You can even write a simple DataStage job to read a file and do this. Do you know the ASCII code(s) of the characters that are getting inserted into the file? You've stated "low characters" but that could include anything lower than ASCII 32 (space).

You can use sed in UNIX to modify your file, but in order to get a suggestion on how to do that from here you might want to specify exactly what you need done to what characters.
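
As a rough illustration of the UNIX route, here is a sketch that uses `tr` rather than sed, since `tr` handles NUL bytes portably. It assumes "low characters" means every control byte below ASCII 32 except tab and newline; the function name `replace_low_chars` and the sample record are made up for the example.

```shell
#!/bin/sh
# Replace control bytes 0x00-0x08 and 0x0B-0x1F with a space, reading stdin.
# Tab (0x09) and newline (0x0A) are left alone so the record layout survives.
# Note that carriage return (0x0D) falls in the replaced range.
replace_low_chars() {
  tr '\000-\010\013-\037' '[ *]'
}

# Hypothetical record with a NUL and an SOH byte where spaces should be.
printf 'D27\000\001R\n' | replace_low_chars
```

Against a real file this would be run as `replace_low_chars < infile > outfile`. If only one specific byte needs replacing, narrow the first argument of `tr` to that byte.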
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Sam - Check the options available in the editor, and try turning off the "Show space and tabs" option. If enabled, it shows spaces, newline characters, tabs...
It is an additional feature available in the editor.
So this has nothing to do with your job. View the same file in UNIX with vi. It should look good.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

ArndW wrote: You've stated "low characters" but that could include anything lower than ASCII 32 (space).

You can use sed in UNIX to modify your file, but in order to get a suggestion on how to do that from here you might want to specify exactly what you need done to what characters.
Hi Arnd,

I am getting dots instead of spaces. If possible, I just want to replace any low character with a space.

Thanks
Sam
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

You stated that you are creating this file. How (a DataStage EE Job, a script)?

You can use some of the Transform stage functions in either EE or Server to manipulate characters. I think some of us are confused because you have said that you create the file but you want to remove "low characters" without really having said which low characters and also where you want to replace them with spaces. There are so many ways that this can be done, from a DataStage job to a simple sed script; but the best would be not to get those invalid characters there in the first place [which is why you've been asked where the file gets created].
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

ArndW wrote: You stated that you are creating this file. How (a DataStage EE Job, a script)?

You can use some of the Transform stage functions in either EE or Server to manipulate characters. I think some of us are confused because you have said that you create the file but you want to remove "low characters" without really having said which low characters and also where you want to replace them with spaces. There are so many ways that this can be done, from a DataStage job to a simple sed script; but the best would be not to get those invalid characters there in the first place [which is why you've been asked where the file gets created].
Hi,

I am creating this file in a DataStage EE job. I don't understand why I am getting those and, as I said, I can only see the dots when I check the hex dump in UltraEdit. Maybe this is confusing, but I don't know how to proceed further. As I understand it, I need to replace hex '00' with '20' (the NUL ASCII character with a space). Can someone help me do this using Ereplace or any other transform function?
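
One way to confirm that the bytes really are hex '00' is to count the NUL bytes in the file, before and after any fix. A minimal sketch, assuming a POSIX shell; `count_nul_bytes` is a hypothetical helper name and the temporary file stands in for the job's output file.

```shell
#!/bin/sh
# Print the number of NUL (0x00) bytes in the file named by $1.
# tr -cd deletes every byte except NUL, and wc -c counts what is left.
count_nul_bytes() {
  tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Hypothetical stand-in for the job's output file: three NUL bytes.
tmp=$(mktemp)
printf 'D27\000\000R\000\n' > "$tmp"
count_nul_bytes "$tmp"   # prints 3
rm -f "$tmp"
```

A count of zero after the change means the NULs are gone. A non-zero count means some column is still being written with them.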

Thanks
Sam
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

If you are generating the file in an EE job, then that is the best place to get rid of them. You can do a CONVERT(CHAR(000),' ',In.ColumnName) in an EE transform stage (the syntax is the same as that for a server job).
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

ArndW wrote: If you are generating the file in an EE job, then that is the best place to get rid of them. You can do a CONVERT(CHAR(000),' ',In.ColumnName) in an EE transform stage (the syntax is the same as that for a server job).
Thanks Arnd.

Sam
DSkkk
Charter Member
Posts: 70
Joined: Fri Nov 05, 2004 1:10 pm

Post by DSkkk »

Hi Sam..

There is one possible reason for the low values. When you put a VarChar field into a Char(N) field, the EE stage by default pads the remaining length with low values.
The solution is to trim the source field and explicitly pad the rest of the field length with spaces.

ex: TrimF(TrimB(SOURCE_COLUMN)) : Str(' ', OUT_COLUMN_LENGTH - Len(TrimF(TrimB(SOURCE_COLUMN))))
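
The same trim-then-pad idea can be sketched outside DataStage, which may help when checking what the output column should look like. This is an illustration in plain shell, not Transformer syntax; `pad_field` and its arguments are hypothetical names, and a value longer than the width is not truncated.

```shell
#!/bin/sh
# Trim leading and trailing whitespace from $1, then right-pad the result
# with spaces to a total width of $2 characters.
pad_field() {
  trimmed=$(printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
  printf "%-${2}s" "$trimmed"
}

# Brackets make the padding visible: prints [AB   ]
printf '[%s]\n' "$(pad_field '  AB ' 5)"
```

The padding here is made of real spaces (hex 20), which is what the fixed-width Char(N) output field should contain instead of NULs.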

let me know if this helps.

Regards.
Kiran.
g.kiran
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

ArndW wrote: CONVERT(CHAR(000),' ',In.ColumnName)
It is not working; I am still getting the same problem.

Thanks
Sam