Problem in lookup when the key is a long string

santoo_happy · Post by **santoo_happy** » Wed Aug 22, 2007 4:55 pm

Hi,

I have a source field "DATA_VALUE" that can hold any kind of data: character, number, or date.

I am doing a lookup against a hashed file to get the non-matching records from the source. But the lookup does NOT match (even when both records are the same) for a few records that have a long string; the string length is approximately 2500 characters.

Please advise.

Thanks,
Santosh

ArndW · Post by **ArndW** » Wed Aug 22, 2007 5:12 pm

DataStage hashed files have a configurable maximum length; I believe the default value is 768; you can check this value by looking into the uvconfig file or executing the command 'smat -t' from the command line. This would mean that the hashed file keys are truncated to this length, which explains why your match didn't work.

I'd have to check my docs to see what the impact of increasing the MAXKEYSIZE parameter in the uvconfig is - offhand I would guess that the overall impact shouldn't be too great.

ray.wurlod · Post by **ray.wurlod** » Wed Aug 22, 2007 5:57 pm

For a start 2500 would imply GROUP.SIZE 2.

The impact can be huge - can even preclude use of dynamic hashed files!

Be very, very careful.

Abu@0403 · Post by **Abu@0403** » Wed Aug 22, 2007 11:32 pm

Since the field can contain any kind of data (char, number, or date), I assume the data type of that field is defined as char.

So one solution is to split this single field into multiple fields and mark all of the split fields as keys.

When splitting, bear in mind that some values for that field may be very short, so some of the split fields may be null. For those, assign a default value so that an exact match occurs when they are compared with the input field. Please check whether this solves your problem.

ArndW · Post by **ArndW** » Wed Aug 22, 2007 11:35 pm

Abu - that is a good thought, but unfortunately it doesn't work that way. Multiple key definitions in DataStage actually go into just one key. Hashed files have one and only one key and it must be unique.

Abu@0403 · Post by **Abu@0403** » Wed Aug 22, 2007 11:43 pm

Is it the case that even if we define 10 keys in a hashed file, it internally combines them and stores them as a single field? So it will be the same whether the field is split or not. Is this the way a hashed file works?

ArndW · Post by **ArndW** » Wed Aug 22, 2007 11:48 pm

Yes, exactly. There is only one physical unique key; if you specify multiple keys in a job it combines them to one string internally.
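To illustrate why splitting the field into several key columns does not shorten the key, here is a minimal Python sketch (not DataStage code). The separator character and the helper names `composite_key` and `split_fixed` are assumptions made for the illustration; the only point is that the pieces end up in one string again.

```python
# Hypothetical illustration only; this is not how DataStage is programmed.
TEXT_MARK = "\xfb"  # assumed separator (character 251) placed between key parts


def composite_key(*parts):
    """Join the key columns into the single string used as the record key."""
    return TEXT_MARK.join(parts)


def split_fixed(value, width):
    """Cut a string into consecutive pieces of at most `width` characters."""
    return [value[i:i + width] for i in range(0, len(value), width)]


long_value = "x" * 2500                  # a key as long as the one in the question
pieces = split_fixed(long_value, 500)    # five "key columns" of 500 characters
key = composite_key(*pieces)

print(len(pieces), len(key))             # 5 2504
```

Under these assumptions the combined key is even slightly longer than the original value, because of the separators, so any maximum key length applies to it just the same.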

Abu@0403 · Post by **Abu@0403** » Wed Aug 22, 2007 11:52 pm

Thanks a lot, Arnd. Now it's clear to me.

Abu@0403 · Post by **Abu@0403** » Sun Aug 26, 2007 10:54 pm

In this case, split that single field across multiple hashed files instead, say 5 hashed files with 500 characters stored in each.

Then have a condition such that the result is True only if the lookup matches in all of the hashed files; if even one of the hashed files returns NOTFOUND, the input column does not match the looked-up data. Please check whether this can be done. Arnd, can you advise if this is feasible?
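The idea can be sketched roughly as follows in Python, with plain sets standing in for the five hashed files. The names `chunks`, `load`, `found` and the pad value `~` are hypothetical, and values longer than 2500 characters are simply cut off in this sketch.

```python
CHUNK = 500    # characters per hashed file key, as proposed above
N_FILES = 5    # number of hashed files, as proposed above
PAD = "~"      # hypothetical default value for empty pieces


def chunks(value, width=CHUNK, count=N_FILES):
    """Split value into `count` fixed-width pieces; empty pieces become PAD."""
    parts = [value[i * width:(i + 1) * width] for i in range(count)]
    return [p if p else PAD for p in parts]


def load(reference_values):
    """Build one key set per piece position (stand-ins for the hashed files)."""
    files = [set() for _ in range(N_FILES)]
    for value in reference_values:
        for f, part in zip(files, chunks(value)):
            f.add(part)
    return files


def found(value, files):
    """True only if every piece is found in its corresponding file."""
    return all(part in f for f, part in zip(files, chunks(value)))


a = "A" * 500 + "B" * 500
b = "C" * 500 + "D" * 500
files = load([a, b])

print(found(a, files))                      # True
print(found("A" * 500 + "X" * 500, files))  # False
print(found("A" * 500 + "D" * 500, files))  # True, although never loaded
```

The last line shows a possible weakness of independent lookups: pieces taken from different reference records can all be found even though the full string was never loaded. Whether that matters depends on the data.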
