Effect of Unicode on data?

Daddy Doma
Premium Member
Posts: 62
Joined: Tue Jun 14, 2005 7:17 pm
Location: Australia

Post by Daddy Doma »

G'day Folks,

I've got a parallel job that brings together two sources in a change data capture stage. I keep updates and inserts, and split them into different loads.

Problem: The change data capture stage, and a later lookup stage, do not recognise two records with ID '000085188' as the same. Because of this, an update is treated as an insert.

The office has looked at a hexadecimal conversion of these "identical" IDs, and it appears that one of the values is being padded with trailing bytes. In hex they show up as zeros, possibly some sort of null handling?
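A rough illustration of what such a hex comparison would show, written as a Python sketch rather than DataStage code. The three bytes of padding are an assumption made for the example; the thread does not say how many there were.

```python
# Two IDs that display identically but differ in trailing NUL (0x00) bytes.
clean_id = b"000085188"
padded_id = b"000085188" + b"\x00" * 3  # assumed padding length: 3 bytes

# Hex dumps make the invisible padding visible.
print(clean_id.hex(" "))
print(padded_id.hex(" "))

# A byte-for-byte comparison treats them as different keys,
# which is why an update would be classified as an insert.
print(clean_id == padded_id)
```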

One theory is that the UNICODE setting in DataStage is changing the data. Does anyone have any experience with this problem?

Cheers,

Zac.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Most likely, at some point the column with the trailing 0x00 data was defined as a CHAR or a PIC X type field and was padded automatically with the low-value or null (empty) character. You can use TRIM(CONVERT(CHAR(000),CHAR(032),YourColumn)) to remove these extraneous values: the CONVERT replaces each CHAR(000) with a space, CHAR(032), and the TRIM then strips the spaces.

Addendum
Oops, I forgot to add that the Unicode setting is most likely irrelevant in this case.
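For readers who want to see the same idea outside DataStage, here is a minimal Python sketch of the convert-then-trim approach. The function name strip_nul_padding and the sample value are hypothetical, and Python's strip() only approximates TRIM (it removes surrounding whitespace and leaves internal spaces alone).

```python
def strip_nul_padding(value: str) -> str:
    """Replace NUL characters with spaces, then trim surrounding whitespace."""
    # Step 1 mirrors CONVERT(CHAR(000), CHAR(032), value).
    spaced = value.replace("\x00", " ")
    # Step 2 approximates TRIM(...) for a value with no internal spaces.
    return spaced.strip()


padded_id = "000085188\x00\x00\x00"  # assumed sample value with NUL padding
cleaned_id = strip_nul_padding(padded_id)

# After cleaning, the padded ID compares equal to the unpadded one.
print(cleaned_id == "000085188")
```

Applying the cleanup to the key column on both inputs before the comparison keeps the two sides consistent.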
Daddy Doma
Premium Member
Posts: 62
Joined: Tue Jun 14, 2005 7:17 pm
Location: Australia

Post by Daddy Doma »

Thanks ArndW,

We've searched through the job and confirmed that this is the case. Will use the code example you gave if we cannot remove the stage that changes the data types.

Regards,

Zac.