Migration to 9.1 and CRC32 function

danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

We are currently migrating from 7.5.2 to 9.1. In 7.5.2 we use the CRC32 function to detect changed data in a given record. When we run the same data through 9.1 and 7.5.2, we get different values from the CRC32 function, so we are getting a good deal of false updates in our data warehouse. This wouldn't be a big deal except that we keep history records in the DW, so the false updates would waste space unnecessarily.

Should there be a difference in how CRC32 values are calculated?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It really seems like there shouldn't be a difference between the two, in order to maintain backwards compatibility and avoid exactly what you are seeing. However, I could see that there could be a difference depending on the coding / seeding of the function.

In your shoes I'd suggest contacting your official support provider and asking them. Perhaps it is a known issue with a fix or there is an option to revert to the old behaviour? Hey, one can dream! :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

NLS is disabled in 7.5.2 but not in 9.1. Could this be affecting what is passed to the function?

Setting up some jobs to capture the values passed to the function.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

danddmrs wrote: NLS is disabled in 7.5.2 but not in 9.1.
Ah... that could very well be the issue. Not sure exactly why but it wouldn't surprise me if that's the root cause.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It would certainly explain the difference if the CRC32 function operates on bytes rather than on characters. The non-NLS and NLS byte streams are very likely to be different even for "extended ASCII".
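
To illustrate the bytes-versus-characters point, here is a minimal Python sketch. It is not DataStage code: `zlib.crc32` is the standard CRC-32 and may not use the same polynomial or seed as the DataStage CRC32 function, and Latin-1 and UTF-8 are only assumed stand-ins for the non-NLS and NLS byte representations. The sample value "café" is made up.

```python
import zlib

# One made-up value containing a single non-ASCII character.
text = "caf\u00e9"  # "café"

# Assumption: a single-byte code page stands in for the non-NLS byte stream,
# and UTF-8 stands in for the NLS byte stream.
single_byte = text.encode("latin-1")  # 4 bytes
multi_byte = text.encode("utf-8")     # 5 bytes

crc_single = zlib.crc32(single_byte)
crc_multi = zlib.crc32(multi_byte)

print(single_byte, hex(crc_single))
print(multi_byte, hex(crc_multi))
print("same checksum:", crc_single == crc_multi)

# Pure ASCII data encodes to identical bytes either way, so only values
# containing characters outside that range would be affected.
ascii_same = zlib.crc32("cafe".encode("latin-1")) == zlib.crc32("cafe".encode("utf-8"))
print("ASCII checksums match:", ascii_same)
```

If the function really does work on bytes, the same logical string would give a different checksum as soon as it contains a character that the two representations encode differently.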
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

I've come across several cases of CRC32 and similar functions, in any form and from any provider, returning the same output for different inputs!
Therefore I never use them for detecting changes or as surrogate keys.
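
A rough Python sketch of why this happens: a 32-bit checksum has only about 4.3 billion possible values, so by the birthday effect two different inputs typically share a value after something on the order of 100,000 random inputs. The sketch uses `zlib.crc32` (standard CRC-32, not necessarily identical to the DataStage function), and the random 8-byte inputs and the name `find_crc32_collision` are invented for illustration.

```python
import random
import zlib


def find_crc32_collision(seed: int = 0):
    """Return two different 8-byte inputs that share one CRC-32 value."""
    rng = random.Random(seed)  # fixed seed so the search is repeatable
    seen = {}  # checksum -> first input that produced it
    while True:
        data = rng.getrandbits(64).to_bytes(8, "big")
        checksum = zlib.crc32(data)
        earlier = seen.get(checksum)
        # Ignore the unlikely case of drawing the very same input twice.
        if earlier is not None and earlier != data:
            return earlier, data, checksum
        seen[checksum] = data


first, second, shared = find_crc32_collision()
print(first.hex(), second.hex(), hex(shared))
print("inputs tried before a collision: about", "tens of thousands")
```

For change detection this means a changed row can occasionally produce the same checksum as the old row and be missed, which is the risk being pointed out here.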
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

... and that is a completely different discussion. :wink:
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Just thought it's important to note, in case whoever is using it is not aware of this. :wink:
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

NLS was indirectly to blame. There is a data edit routine that was modified because of NLS being enabled. One of the functions of the routine was to reduce fields consisting of only multiple spaces to a single space. This was omitted in the 9.1 version so since ' ' is not the same as ' ' the CDC calculations were not the same. Correcting the data edit routine has resolved the issue.
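
A small Python sketch of the fix described above, under stated assumptions: the original routine is DS Basic and is not shown in this thread, so the helper names `collapse_blank` and `row_crc`, the `|` field delimiter, and the sample rows are all hypothetical, and `zlib.crc32` merely stands in for the DataStage CRC32 function.

```python
import zlib


def collapse_blank(field: str) -> str:
    """Reduce a field that consists only of spaces to a single space."""
    if field and field.strip(" ") == "":
        return " "
    return field


def row_crc(fields, normalise=True):
    """Checksum a row, optionally collapsing all-space fields first."""
    if normalise:
        fields = [collapse_blank(f) for f in fields]
    # Hypothetical choice: join with '|' and encode as UTF-8 before hashing.
    return zlib.crc32("|".join(fields).encode("utf-8"))


# Same logical row, but the middle field holds one space versus four.
row_old = ["1001", " ", "Smith"]
row_new = ["1001", "    ", "Smith"]

print("without the edit step:", row_crc(row_old, False) == row_crc(row_new, False))
print("with the edit step:   ", row_crc(row_old) == row_crc(row_new))
```

With the collapsing step in place both rows hash the same input, so no false update is raised; without it the extra spaces alone are enough to change the checksum.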

Thanks all for your input.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Good catch! :D

And in case anyone was wondering why the two examples of spaces in a string look the same, that's because the forum software automagically removes any 'extra' spaces, so the quotes with four spaces between them end up looking the same as the quotes with one space between them. Code tags solve that issue, even though this isn't really code:

This was omitted in the 9.1 version so since ' ' is not the same as '    ' the CDC calculations were not the same.