Message Digest C code

just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

I am looking for a message digest function in C to use in a custom stage. One of my jobs checks for duplicates in a million-record file, where each record is a varchar(1000). When I used the Remove Duplicates stage, the scratch space filled up and the job aborted. I then used CRC32 C code in a custom stage. It worked fine until I came across two distinct records with the same CRC32 value.

I searched for similar posts on message digests, and this is the only one that comes close to my issue: viewtopic.php?t=110300&highlight=message+digest

The solution Craig Hulett describes in that post involves a database table and a generated surrogate key. I was wondering whether something could be done on the fly within DataStage alone.

Any help will be appreciated. Thanks for your time.
Attitude is everything....
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well... nothing about that technique itself requires a database table, just some form of persistent storage. I chose a table because of my need to include my ids in other SQL operations. And the scope of the persistence would be whatever was appropriate for your needs - just the life of the process, 'forever', etc.
-craig

"You can never have too many knives" -- Logan Nine Fingers
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

chulett wrote: Well... nothing about that technique itself requires a database table, just some form of persistent storage. I chose a table because of my need to include my ids in other SQL operations. And the scope of the persistence would be whatever was appropriate for your needs - just the life of the process, 'forever', etc.
Thanks for the reply, Craig. However, I was able to find working MD5 C code, and it did the job.
Attitude is everything....
cgarciadesc
Premium Member
Posts: 2
Joined: Tue Feb 24, 2004 11:50 am

Post by cgarciadesc »

just4geeks wrote: Thanks for the reply, Craig. However, I was able to find working MD5 C code, and it did the job.
Could you please share your code?

Thanks in advance!