Page 1 of 1

Distinct of a query not retrieving correct result

Posted: Tue May 05, 2009 5:37 am
by Abhyankar
Hi,
I did a distinct on a single table on User id. I am still getting duplicate user_ids. Not sure why. (I am selecting only User id )
Even when i trim the spaces off and do a distinct, it is giving me duplicate rows.

Posted: Tue May 05, 2009 6:07 am
by shalini11
Can you post the query you are using.

Posted: Tue May 05, 2009 10:58 pm
by Abhyankar
Hi i am just doing a
select distinct <column_name> from <Table> in DB2.

I am getting few column names more than once.

Posted: Tue May 05, 2009 11:32 pm
by ray.wurlod
Assuming that the data type is string, check that the "same" values don't have the same number of leading or trailing space characters.

Posted: Wed May 06, 2009 12:06 am
by adi_2_chaos
Check if there arent any junk chars in the column. In case there are any, then TRIM wont help

Posted: Wed May 06, 2009 12:10 am
by Abhyankar
Hi,
Yes, it is a char 15 column.
I did a ltrim as well as rtrim on that column.
Also, i checked the length of it..
Basically i did a
"select length(ltrim(rtrim(<column_name>))) from table
where <column_name> like '%000051%'"

Surprisingly it gave me two values 15 and 6. Now the value is 6 so it should give 6, but why is 15 coming? Is it garbage value? How can i remove that?

Posted: Wed May 06, 2009 12:11 am
by Abhyankar
How can i check for junk char values?

Posted: Wed May 06, 2009 12:28 am
by adi_2_chaos
A very simple way is to copy paste a sample ouput to some text editor.
Lets say
'ABC ' is ur output.
In case the trailing characters are white spaces, they'll be inevitably removed by TRIM function. If not, then they're junk chars for sure.

I'm not sure how u can remove junk chars with any fucntion. but i got a vague idea.
Lets say u have two values ....
'ABCDEF' and
'ABCDEF ' (padded with 15 chars).
And you're sure that the former one is correct, then simply update the rest of them usign substr function. The residual condition should be all the rows where the length of that particular column is not equal to 6.
Lets hope this works.

Posted: Wed May 06, 2009 12:39 am
by Abhyankar
Thanks, but just wanted to know why this happened?
Is it because the column length is character 15 and my column value is just 6?
Actually, i populate this table through Datastage. I am doign a substring in my transformer and take this column value.
I run this Job every tuesday. It never happened till now, but somehow happened yesterday!

Posted: Wed May 06, 2009 12:44 am
by adi_2_chaos
Even I'm relatively new to Datastage. Cant suggest a lot of things right now. If you can dig down into things, start with the source data .. maybe there's something wrong with it.

Thanks.

Posted: Wed May 06, 2009 1:01 am
by Kryt0n
Don't think copy and paste won't do much good if they are non-printable characters.

Try select <column>, hex(<column>) from <table> where <column> like %00051%...

What results do you get? At a guess one has been padded with nulls

Posted: Wed May 06, 2009 3:27 am
by Abhyankar
Hi,
The value for '000051'
came as '303030303531000000000000000000' and '303030303531202020202020202020'

Seems the second is padded with space. Thanks. But then why am i not getting the correct result when i am doing a ltrim and rtrim?

Posted: Wed May 06, 2009 3:48 am
by Abhyankar
Hi guys,
The problem has been resolved.
It was padding with space issue itself. Dont know why ltrim and rtrim didnt work, but when i did 'REPLACE(Some_Field, ' ', '')' it gave me proper distinct records.
Thanks a lot to all!

Posted: Thu May 07, 2009 10:51 pm
by datastage_learner
Abhyankar wrote:Hi guys,
The problem has been resolved.
It was padding with space issue itself. Dont know why ltrim and rtrim didnt work, but when i did 'REPLACE(Some_Field, ' ', '')' it gave me proper distinct records.
Thanks a lot to all!


ecjjactly...mate...when you have distinct issues you have improper sql formation issues too. take care next time :roll: :wink:

Also...mate when u enter the username values in the table before retrieving it, can't you trim the data first before inserting in the table?

Secondly, char datatype tends to do such thing...use varchar2 henceforth! :roll:

Posted: Thu May 07, 2009 11:45 pm
by ray.wurlod
Please strive for a professional standard of written English on DSXchange. Put yourself in the position of someone whose first language is not English wasting time looking up "ecjjactly" in a translation dictionary.

And please refrain from frequent use of ...mate... - it does not contribute anything.

Finally, as noted elsewhere, the second person personal pronoun in English is spelled "you", not "u".