G'day Folks,
I've got a parallel job that brings together two sources in a change data capture stage. I keep updates and inserts, and split them into different loads.
Problem: The change data capture stage, and a later lookup stage, do not recognise two records with ID '000085188' as the same. Because of this, an update is treated as an insert.
The office has looked at a hexadecimal conversion of these "identical" IDs, and it appears that one of the values is being padded with trailing bytes. In hex they are represented as zeros, possibly some sort of null handling?
One theory is that the UNICODE setting in DataStage is changing the data. Does anyone have any experience with this problem?
Cheers,
Zac.
Effect of Unicode on data?
-
- Premium Member
- Posts: 62
- Joined: Tue Jun 14, 2005 7:17 pm
- Location: Australia
- Contact:
Most likely, at some point the column carrying the trailing 0x00 bytes was defined as a CHAR or PIC X type field, and it was automatically padded with low-values (nulls). You can use TRIM(CONVERT(CHAR(000),CHAR(032),YourColumn)) to convert those nulls to spaces and trim them off.
Addendum
Oops, I forgot to add that the Unicode setting is most likely irrelevant in this case.
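To illustrate why the two IDs fail to match, here is a rough Python sketch (not DataStage code) of the same idea: a value padded with NUL bytes looks identical when displayed but compares unequal until the padding is converted and trimmed. The padding length of three bytes and the helper name strip_nul_padding are assumptions made up for the example, and Python's strip() is only an approximate stand-in for TRIM.

```python
def strip_nul_padding(value: str) -> str:
    """Replace NUL characters with spaces, then trim both ends.

    Roughly mirrors TRIM(CONVERT(CHAR(000), CHAR(032), YourColumn)).
    Hypothetical helper for illustration only.
    """
    return value.replace("\x00", " ").strip()


clean = "000085188"
padded = "000085188" + "\x00" * 3  # assumed padding length, for illustration

# A hex dump exposes the trailing zero bytes that are invisible on screen.
print(clean.encode("ascii").hex())   # 303030303835313838
print(padded.encode("ascii").hex())  # 303030303835313838000000

# The raw values differ, so a change capture or lookup sees two distinct keys.
print(clean == padded)                     # False

# After cleaning, the keys match again.
print(clean == strip_nul_padding(padded))  # True
```

If this matches what you see in your hex output, cleaning the key column on both inputs before the change data capture stage should let the update be recognised as an update.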