Removing trailing special character when source is DB2 data

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Chris Zampogna
Participant
Posts: 2
Joined: Tue Mar 16, 2010 9:40 am

Removing trailing special character when source is DB2 data

Post by Chris Zampogna »

My source field is a varchar(4) in a DB2 UDB on Linux table. When I view the data through an ODBC Enterprise stage I can see a trailing square. What is this character and how do I remove it using a transformer?

Thanks in advance for your help,
Chris
Chris
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You'd have to identify the character first. Any way to get the hex/decimal equivalent of this "square"? Oracle has a DUMP() function but no clue what DB2 has. That or write it to a flat file and then check it with a hex editor / viewer.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard.

What kind of data source is it? In particular is it text file? (That is, is the DSN a reference to a driver for text files?)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Chris Zampogna
Participant
Posts: 2
Joined: Tue Mar 16, 2010 9:40 am

Post by Chris Zampogna »

ray.wurlod wrote:Welcome aboard.

What kind of data source is it? In particular is it text file? (That is, is the DSN a reference to a driver for text files?)
Thanks, Ray. The original data source is a text file. I see it in DataStage when I view the data after it has been loaded into the DB2 table. These trailing characters prevent proper joining on this field in lookup stages, which is why I VERY am interested in learning to manage them correctly. :)

Thanks for your help. It's much appreciated!
Chris
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My guess is that the trailing character is either a UNIX line feed (Char(10)) or the first character of a DOS line terminator (Char(13)). Either way, these should have been cleaned off before being loaded into the DB2 table - this is the correct way to handle this problem.

Once you've found out what the character is, you can append it to the input that's missing it so you can have a lookup that works, and then remove the character downstream, but this approach is really a bandaid - the best approach is to prevent the character getting into DB2 in the first place.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sairam
Participant
Posts: 42
Joined: Tue Nov 11, 2003 1:09 pm

Post by Sairam »

Hi
As Ray mentioned identify what is the trailing character, most of the time it would be char(10). (Install a freeware Notepad++ which has the ability to display special characters)

Use this perl one liner to substitite that trailing character with blank
Ex :I have used \000 ( null ) with blank

perl -p -i -e "s/\000/ /g" < Absolute Path of the text file >

\000 is the hexdecimal for Null

You can call this perl one liner either as
1) Before Job Routine

Code: Select all

Exec SH   perl -p -i -e "s/\000/ /g" < Absolute Path of the text  file >
2) In a Job sequence Call the Execute Command Activity

Code: Select all

Command : perl -p -i -e "s/\000/ /g" 
Parameters:
< Absolute Path of the text  file >
Thanks
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Is your job the one loading the text file into DB2? If so, then the problem might be easily fixable. Go to format specifications for the sequential file and change the record delimiter / delimiter string. If it is set to delimiter UNIX, delete that and add delimiter string DOS. Or do the reverse.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
Post Reply