I am looking for a message digest function in C to use in a custom stage. One of my jobs checks for duplicates in a file of a million records, each of datatype varchar(1000). When I used the Remove Duplicates stage, the scratch space filled up and the job aborted. I then used CRC32 C code in a custom stage. It worked fine until I came across two distinct records with the same CRC32 code.
I searched for similar posts on message digests, and this is the only one that comes close to my issue: viewtopic.php?t=110300&highlight=message+digest
The solution mentioned in the post by Craig Hulett involves using a database table and a generated surrogate key. I was wondering if something could be done on the fly in DataStage only.
Any help will be appreciated. Thanks for your time.
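For context on why the CRC32 approach eventually failed, here is a rough birthday-style estimate in C. It is only a sketch: it assumes the digest spreads records uniformly over its output range, which CRC32 only approximates, and the function names are made up for illustration. Under that assumption, a million records hashed to 32 bits give roughly 116 expected colliding pairs, while a 128-bit digest such as MD5 makes an accidental collision vanishingly unlikely.

```c
#include <stdio.h>

/* Expected number of colliding pairs among n records hashed uniformly
   into a digest of `bits` bits: n*(n-1)/2 pairs, each colliding with
   probability 1/2^bits. (Hypothetical helper, for illustration only.) */
double expected_colliding_pairs(double n, int bits)
{
    double pairs = n * (n - 1.0) / 2.0;
    double space = 1.0;
    for (int i = 0; i < bits; i++)
        space *= 2.0;              /* builds 2^bits; powers of two are exact in a double */
    return pairs / space;
}

/* Print the estimate for a few common digest widths. */
void print_collision_table(double n)
{
    const int widths[] = { 32, 64, 128 };
    for (int i = 0; i < 3; i++)
        printf("%3d-bit digest: %.3g expected colliding pairs\n",
               widths[i], expected_colliding_pairs(n, widths[i]));
}
```

Calling `print_collision_table(1e6)` prints the estimate for 32-, 64-, and 128-bit digests. A wider digest lowers the odds but never removes them, so comparing the full records whenever two digests match is the only way to rule out a false duplicate.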
Message Digest C code
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 644
- Joined: Sat Aug 26, 2006 3:59 pm
- Location: Mclean, VA
Attitude is everything....
Well... nothing about that technique itself requires a database table, just some form of persistent storage. I chose a table because of my need to include my ids in other sql operations. And the scope of the persistence would be whatever was appropriate for your needs - just the life of the process, 'forever', etc.
-craig
"You can never have too many knives" -- Logan Nine Fingers
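As an illustration of the "life of the process" option, here is a minimal C sketch of a surrogate-key lookup held entirely in memory. Everything in it is an assumption for illustration: the `keymap` names are hypothetical, it is not a DataStage API, and wiring it into a custom stage is not shown. The hash (64-bit FNV-1a) only selects a bucket; a record counts as a duplicate only when `strcmp` confirms it matches a stored record, so a hash collision cannot merge two distinct records.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One stored record and its surrogate key; records that land in the
   same bucket are chained together. All names here are hypothetical. */
typedef struct entry {
    struct entry *next;
    long          id;
    char         *record;
} entry;

typedef struct {
    entry **buckets;
    size_t  nbuckets;
    long    next_id;
} keymap;

/* 64-bit FNV-1a, used only to choose a bucket. */
static uint64_t fnv1a64(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/* Create an empty map; returns NULL on bad size or allocation failure. */
keymap *keymap_new(size_t nbuckets)
{
    if (nbuckets == 0) return NULL;
    keymap *m = malloc(sizeof *m);
    if (!m) return NULL;
    m->buckets = calloc(nbuckets, sizeof *m->buckets);
    if (!m->buckets) { free(m); return NULL; }
    m->nbuckets = nbuckets;
    m->next_id = 1;
    return m;
}

/* Return the surrogate key for `record`. Sets *is_new to 1 the first
   time a record is seen and to 0 for a duplicate. Returns -1 if memory
   runs out. */
long keymap_get_or_add(keymap *m, const char *record, int *is_new)
{
    size_t b = (size_t)(fnv1a64(record) % m->nbuckets);
    for (entry *e = m->buckets[b]; e; e = e->next) {
        if (strcmp(e->record, record) == 0) {   /* full comparison, not just the hash */
            *is_new = 0;
            return e->id;
        }
    }
    size_t len = strlen(record) + 1;
    entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->record = malloc(len);
    if (!e->record) { free(e); return -1; }
    memcpy(e->record, record, len);
    e->id = m->next_id++;
    e->next = m->buckets[b];
    m->buckets[b] = e;
    *is_new = 1;
    return e->id;
}

/* Release every stored record and the map itself. */
void keymap_free(keymap *m)
{
    if (!m) return;
    for (size_t i = 0; i < m->nbuckets; i++) {
        entry *e = m->buckets[i];
        while (e) {
            entry *next = e->next;
            free(e->record);
            free(e);
            e = next;
        }
    }
    free(m->buckets);
    free(m);
}
```

A caller would pass each incoming record to `keymap_get_or_add` and drop the row whenever `is_new` comes back as 0. The trade-off is memory: a million records of up to 1000 bytes each can approach 1 GB, so this swaps scratch disk for RAM.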
chulett wrote: Well... nothing about that technique itself requires a database table, just some form of persistent storage. I chose a table because of my need to include my ids in other sql operations. And the scope of the persistence would be whatever was appropriate for your needs - just the life of the process, 'forever', etc.

Thanks for the reply, Craig. However, I was able to get working MD5 C code, and it did the job.
Attitude is everything....