Page 1 of 1

Convertion of 60 digit Alpha Numeric to 40 digit?

Posted: Sun May 25, 2014 11:46 pm
by pkll
Hi,

How to convert 60 digit Alpha Numeric value to 40 digit?

My Source is 60 digit Alpha Numeric value but my target has 40 digit alpha numeric. How to convert 60 to 40 digit Alpha numeric value??

Posted: Mon May 26, 2014 12:00 am
by ray.wurlod
Technically you are in danger of losing precision or even value. Basically you can't. In the same way that you can't fit a 60cm waist into a 40cm waistband. Or 60 litres of fluid into a 40 litre container.

So, what you have to do is to get guidance from the business on is which 40 of the 60 characters they want to retain.
  • If the leftmost, use the Left() function.

    If the rightmost, use the Right() function.

    If something else, code to that requirement.

Re: Convertion of 60 digit Alpha Numaric to 40 digit?

Posted: Mon May 26, 2014 12:23 am
by stuartjvnorton
This isn't a DataStage issue: it's a data quality issue.
You should be asking "why?", how "how?".

The field may be 60 characters, but are there any values longer than 40 characters? ie: is it a risk, or an actual issue?
Is it possible to encounter a value that is more than 40 characters in length? ie: is it a risk at all?

Posted: Mon May 26, 2014 1:01 am
by ray.wurlod
If you don't mind getting really technical, and can demonstrate that there are no more than 2^5 distinct values in your source data, then you could store them using five bits per character and thus achieve the requirement to store 60 characters in a 40 byte field. Of course, you will need to write a routine in C++ to accomplish this, because you will need to specify an offset for the lowest valued character and/or a suitable mapping.

Posted: Mon May 26, 2014 1:22 am
by pkll
INPUT
COLUMN1
ABC1234ASDFEDGCFEDSADHEEEHAEERA112345671234556
aS34566789908725474907ATAGHEK4DDJDMMDEKKXDI234
AJJANNEJ678239JDHNE4889E09MSJKJKSKSSS123UJX345
AS82378483999MSKKMSAAN778939NMNMM1234567889999
ABC1234ASDFEDGCFEDSADHEEEHAEERA112345671234556

OUTPUT

COLUMN1 OUTPUT KEY VALUE
ABC1234ASDFEDGCFEDSADHEEEHAEERA112345671234556 12345678910112134561
aS34566789908725474907ATAGHEK4DDJDMMDEKKXDI234 12345678910112134562
AJJANNEJ678239JDHNE4889E09MSJKJKSKSSS123UJX345 12345678910112134562
AS82378483999MSKKMSAAN778939NMNMM1234567889999 12345678910112134562
ABC1234ASDFEDGCFEDSADHEEEHAEERA112345671234556 12345678910112134561

I have input with one column and output with two columns.
Please copy the above data in the notepad so that it would be clear. For every 60 bite input column I need 40 bite key value along with input column.

Posted: Mon May 26, 2014 3:20 am
by ray.wurlod
Your input sample does not have 60 characters. They seem to be constructed entirely of your alleged "40 character key" values.

How are the extra columns to be generated?

Posted: Mon May 26, 2014 3:28 am
by pkll
The above data is just example if I have input1 column with 40 to 60 characters data and output should generate unique sequence number with 40 digits.

Posted: Mon May 26, 2014 4:11 am
by askvishal
Try using DBMS utility function with parameter as the input column. Utility function would be dependent on ur DB..

Posted: Mon May 26, 2014 3:02 pm
by ray.wurlod
Nor are the sequence numbers in your example unique.

How about you specify precisely what is required?

It seems to me that all you need to do is to sort your incoming data and generate a key each time a change in the sorted value is detected - which detection is most easily done in the Sort stage.

To generate a 40 digit numeric value use an initial value supplied via a job parameter and increment it by 1 as required. DataStage can only generate BigInt surrogate keys, and these don't have 40 digits.

Re: Convertion of 60 digit Alpha Numeric to 40 digit?

Posted: Tue May 27, 2014 7:15 am
by chulett
So, which one of these two contrary statements in your original post is correct?
pkll wrote:How to convert 60 digit Alpha Numeric value to 40 digit?
pkll also wrote:How to convert 60 to 40 digit Alpha numeric value??
For the first one, I'm not aware of any way to somehow "convert" a sixty character string into a forty digit number. You could hash it into a 32 or 40 character hexadecimal value. For the second, that's already been discussed - which 40 of the 60 do you want to keep? Conversely, which (up to) 20 characters can you afford to lose?

Also, as noted, you have yet to post any rules / requirements you were given as to how any "40 digit number / unique sequence number" is to be generated. Sounds to me like all you'd need for that is pick a starting point and increment it by 1 each time.