CRC Info
Hello
Could you please tell me the main use of the CRC32 function?
Thanks in advance.
You will get a unique number (checksum) for each input.
This is used for several purposes, for example, duplicate check, deduplication, surrogate key generation, lookup....
You can easily get more information by doing a search on the same keyword.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Hester recently posted about why not to use CRC32 for "Surrogate key generation". He wrote this routine with another Ascential developer. He has lots of posts about how to use this routine. It works best in SCD type 2 calculations.
Calculate CRC32 on all fields in an existing dimension record, and calculate CRC32 on all fields of the incoming record for that dimension. If the two values are the same, then most likely all fields in both records are the same. This is much more powerful and faster than comparing each field one at a time. I know one customer here in Dallas who put every column in a hashed file, made all fields part of the key, and then did a lookup on every field. This was the design of an employee of IBM, or Ascential as it was at the time. Not good. If the record has changed, then do your SCD type 2 insert and update.
I am sure Michael will explain this better soon. Do a search for his posts.
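The comparison described above can be sketched roughly as follows. This is an illustration only, not the DataStage routine itself: it assumes Python's `zlib.crc32` as a stand-in, and the helper names `row_crc32` and `classify`, the field separator and the sample rows are all hypothetical.

```python
import zlib

def row_crc32(fields, sep="\x1f"):
    """CRC32 over all field values of one row, joined with a separator.

    The separator (an assumption of this sketch) keeps ("ab", "c")
    from producing the same input string as ("a", "bc").
    """
    joined = sep.join("" if f is None else str(f) for f in fields)
    return zlib.crc32(joined.encode("utf-8")) & 0xFFFFFFFF

def classify(incoming, stored_crc_by_key, key_index=0):
    """Decide what an SCD type 2 load should do with an incoming row.

    stored_crc_by_key maps the business key to the CRC32 saved when the
    current dimension record was written.
    """
    key = incoming[key_index]
    if key not in stored_crc_by_key:
        return "insert"      # brand new business key
    if stored_crc_by_key[key] == row_crc32(incoming):
        return "unchanged"   # most likely identical, not guaranteed
    return "changed"         # expire the old version, insert a new one

# Hypothetical sample data: one stored customer row.
stored = {"C001": row_crc32(["C001", "Alice", "75201"])}
print(classify(["C001", "Alice", "75201"], stored))
print(classify(["C001", "Alice", "75202"], stored))
print(classify(["C002", "Bob", "98101"], stored))
```

A different CRC proves that something changed. An equal CRC only means the rows are most likely the same, which is why the value suits change detection on one known key rather than identifying rows.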
If you calculate CRC32 on all fields in a dimension record and calculate CRC32 on all new fields on a dimension. If the value is the same then most likely all fields in both records are the same. This is much more powerful and faster than comparing each field one at a time. I know one customer here in Dallas which put every column in a hashed file and made all fields part of the key. Then did a lookup on every field. This was the design of an IBM employee or Ascential at the time. Not good. If the record has changed then do your SCD type 2 insert and update.
I am sure Michael will explain this better soon. Do a search for his posts.
Mamu Kim
kumar_s wrote: You will get a unique number (checksum) for each input.
This is used for several purposes, for example, duplicate check, deduplication, surrogate key generation, lookup....
You can easily get more information by doing a search on the same keyword.
A unique number for each input? Nope. Useful for surrogate key generation? Nope. Duplicate check? Maybe.
As noted by Kim and recently brought back to raging life by Michael, use it for Change Data Detection, or CDD. When the CRC32 value for the 'same data' is different, something in it has changed. This can be used to ease the work of SCD jobs.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Craig and Kim,
Thanks! I thought I was going to have to open another can of CRC whoop butt!
As stated before, two totally different rows of data can/will/might generate the same CRC value and that's ok - perfectly normal.
I spent an entire evening with a consultant a while back trying to help them fix their warehouse after they had used CRC32 to generate their "unique" surrogate keys argggggh
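To put a rough number on why CRC32 values cannot serve as unique keys, here is a small sketch using the standard birthday approximation. It assumes CRC32 values behave like uniformly random 32-bit numbers, which is only an approximation, and the function name `collision_probability` is made up for this example.

```python
import math

CRC32_VALUES = 2 ** 32  # number of distinct 32-bit CRC values

def collision_probability(n_rows: int) -> float:
    """Approximate chance that at least two of n_rows distinct rows share a CRC32.

    Birthday approximation: 1 - exp(-n(n-1) / (2 * 2**32)).
    """
    pairs = n_rows * (n_rows - 1) / 2
    return -math.expm1(-pairs / CRC32_VALUES)

for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9} rows: {collision_probability(n):.4f}")
```

Under that assumption the odds of at least one duplicate value reach roughly 50% at about 77,000 rows, so a dimension of ordinary size is already likely to contain two different rows with the same CRC32.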
kumar_s wrote: You will get a unique number (checksum) for each input.
This is used for several purposes, for example, duplicate check, deduplication, surrogate key generation, lookup....
You can easily get more information by doing a search on the same keyword.
Again - statements like these truly show the ignorance people have about CRC and checksums. First - CRC is NOT a checksum; a checksum implies addition, and CRC is not additive. CRC is based on division and remainders, etc.
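For readers who want to see the "division and remainders" point concretely, below is a minimal bit-at-a-time sketch. It assumes the common CRC-32 variant implemented by Python's `zlib.crc32` (reflected polynomial 0xEDB88320); the thread does not say which variant the DataStage routine uses, and `crc32_bitwise` is a name invented for this example.

```python
import zlib

POLY = 0xEDB88320  # reflected form of the common CRC-32 generator polynomial

def crc32_bitwise(data: bytes) -> int:
    """CRC-32 computed one bit at a time.

    The loop is long division of the message by the generator polynomial
    over GF(2): "subtraction" is XOR, and what is left in the register at
    the end is the remainder. No addition of byte values is involved.
    """
    crc = 0xFFFFFFFF                      # conventional initial value
    for byte in data:
        crc ^= byte                       # bring the next 8 message bits in
        for _ in range(8):
            if crc & 1:                   # low bit set: the divisor "goes in"
                crc = (crc >> 1) ^ POLY   # shift and subtract (XOR) the divisor
            else:
                crc >>= 1                 # otherwise just shift
    return crc ^ 0xFFFFFFFF               # conventional final inversion

# "123456789" is the usual check string for CRC implementations.
print(hex(crc32_bitwise(b"123456789")))
print(crc32_bitwise(b"123456789") == zlib.crc32(b"123456789"))
```

Production implementations normally use a lookup table for speed; the bitwise form is shown only because it makes the division visible.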
Quote: raging life by Michael
raging? - Craig - I like to think of it as more frustration than anything. I have simply had an uphill battle educating developers and customers about the virtue of CRC in SCD processing. Every time a consultant uses the routine in a way it was not designed for, it helps to solidify the image that it doesn't work.
Regards,
Mike Hester
mhester@petra-ps.com
Craig,
No offense taken and I knew what you meant - I just don't want everyone to think I am a "raging" lunatic! which I am, but only a select group of people know that :D
Regards
Mike Hester
mhester@petra-ps.com
kumar_s wrote: You will get a unique number (checksum) for each input.
This is used for several purposes, for example, duplicate check, deduplication, surrogate key generation, lookup....
You can easily get more information by doing a search on the same keyword.
There is a good Wikipedia entry for CRC. If you read it you will see how wrong you are. http://en.wikipedia.org/wiki/Cyclic_redundancy_check
Shawn Ramsey
"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
kduke wrote: No big deal Kumar. Keep up the good work. It was just a minor point. This is a powerful and complex routine which is very useful when used correctly.
...and very bad when used incorrectly.
Shawn Ramsey
"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
All of us who answer enough questions have been corrected on this forum. Kumar has given a lot of great answers. I want that to continue and not slow down because something was learned on this post. I always want to encourage people to post whether or not the question seems important. The important thing is to interact and grow into better developers and communicators. If you challenge yourself to grow and be a voice on this web site instead of just a listener, then all of us should benefit. If your answer is missing something, then Ray will fill in the gaps. If you want to get to the next level in your job, then maybe you need to be able to give an answer to a complicated issue. Maybe you have a good example of how to do something. Share it. Never be embarrassed or ashamed for trying to participate. Nobody knows it all.
Mamu Kim
I agree totally with Kim. We have our top posters, very experienced in this field, to fill in the blanks or even correct us. That's one reason I always come back and recheck my answers. And I have been corrected so many times, only to my benefit and that of the dsxchange users. We should welcome that and keep an open attitude.
Regards,
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.