Check sum stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


deva
Participant
Posts: 104
Joined: Fri Dec 29, 2006 1:54 pm

Post by deva »

Hi, I need some quick information regarding the Checksum stage.

I want to encrypt one of the key fields. To produce a distinct, unique value from the given values, I am taking two columns and passing them through the Checksum stage. The result I get is 32 digits long.

But I need only 15 to 16 digits. I don't want to take a substring; the checksum output itself needs to be 16 digits.

We are currently on version 7.5 and are going to upgrade to 8.5, so I want to know how this will work there.

Please help me with the right information.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

A "checksum" is not an encryption mechanism since there is no decryption side to it and it will not generate anything "distinct and unique"... and you don't have any choice on the length of the output. Sounds like you need to look into other methodologies like MD5 and others.
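
To illustrate the fixed-length point, here is a minimal Python sketch. It assumes an MD5-style 128-bit digest, which is consistent with the 32 characters reported above but is not confirmed here as what the Checksum stage does internally. The function name `masked_key`, the `|` separator, and the sample values are hypothetical.

```python
import hashlib


def masked_key(source_system: str, natural_key: str) -> str:
    """Hash system name + key into a fixed-length hex digest (one-way)."""
    # A separator keeps ("AB", "C") and ("A", "BC") from hashing the same string.
    combined = f"{source_system}|{natural_key}"
    return hashlib.md5(combined.encode("utf-8")).hexdigest()


# Hypothetical inputs of very different lengths.
short_digest = masked_key("A", "1")
long_digest = masked_key("SYSTEM_WITH_A_LONG_NAME", "1234567890" * 10)

# Both digests have the same length regardless of input size.
print(len(short_digest), len(long_digest))
```

The digest length is a property of the hash algorithm, not of the input, so the only way to get a shorter value from it is to discard part of the output.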
-craig

"You can never have too many knives" -- Logan Nine Fingers
deva
Participant
Posts: 104
Joined: Fri Dec 29, 2006 1:54 pm

Post by deva »

Thanks for your reply. I don't need to decrypt the value again. My requirement is this: I have 5 systems, and the key column from each should be encrypted, meaning the end user should not be able to tell that an ID came from a particular system. We are loading the information from all 5 systems into one data warehouse.

I am using this process to de-identify the key column.

That is why I want to combine system name + key column and pass the result through the Checksum stage, which should give a unique number.

In this case I am getting a 32-digit number. If I want only 16 digits, how can I do that?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Checksum will not necessarily give you a unique number. There is a small possibility that two values will generate the same checksum value - small enough that most people accept the risk when using the checksum as a comparison technique, its intended purpose.
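
As a rough illustration of how that risk grows when the output is shortened, the sketch below uses the standard birthday-bound approximation. The key count of 100 million is a hypothetical figure, and the bit widths assume 32 hex characters = 128 bits, 16 hex characters = 64 bits, and 16 decimal digits = roughly 53 bits.

```python
import math


def collision_probability(num_keys: int, bits: int) -> float:
    """Approximate chance that at least two of num_keys values share a hash.

    Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)),
    assuming hash outputs are uniformly distributed.
    """
    exponent = -num_keys * (num_keys - 1) / 2 ** (bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


num_keys = 10**8  # hypothetical number of distinct system + key pairs
for bits in (128, 64, 53):
    print(f"{bits:>3} bits: {collision_probability(num_keys, bits):.3e}")
```

Under these assumptions the probability is negligible at 128 bits and rises sharply as the output is cut down, which is why a shortened checksum should not be relied on as a unique key.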
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Perhaps you need to think about leveraging a surrogate key for this? :?
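
A minimal sketch of the surrogate key idea in Python, not DataStage code: each distinct (source system, key) pair is assigned the next integer from a counter, so the result is unique by construction, reveals nothing about the source system, and can be kept well under 16 digits. The class name `SurrogateKeyGenerator` and the sample system names are hypothetical, and a real job would persist the mapping (for example in a lookup table) rather than hold it in memory.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Assign a stable, meaningless integer to each (system, key) pair."""

    def __init__(self, start: int = 1) -> None:
        self._next_value = count(start)
        self._assigned: dict[tuple[str, str], int] = {}

    def key_for(self, source_system: str, natural_key: str) -> int:
        pair = (source_system, natural_key)
        # Only draw a new number the first time a pair is seen.
        if pair not in self._assigned:
            self._assigned[pair] = next(self._next_value)
        return self._assigned[pair]


generator = SurrogateKeyGenerator()
first = generator.key_for("SYS_A", "1001")
second = generator.key_for("SYS_B", "1001")  # same key, different system
repeat = generator.key_for("SYS_A", "1001")  # seen before, same surrogate
print(first, second, repeat)
```

Unlike a hash, this approach needs the mapping to be stored and looked up on every load, which is the trade-off for guaranteed uniqueness and a controllable length.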
-craig

"You can never have too many knives" -- Logan Nine Fingers