
Duplicate Lookup Values

Posted: Tue Jun 13, 2006 5:09 am
by parag.s.27
Hey all,

I have a lookup file that is fetched from an ERP system at run time, fed directly into a hashed file, and used as a lookup.

The problem is that the lookup file sometimes contains duplicate keys for a single value to be looked up.

For example, in my master file "CC1464-SERV" is a value for which I have to find a lookup. In the lookup file I have:

1.) CC1464-SERV, WBSE.001.00012534
2.) CC1464-SERV, CCE.001.001001

Since the lookup keys are duplicated, DataStage picks a value unpredictably: sometimes 1 and other times 2.

I cannot search for and remove duplicate values from the lookup file beforehand, because it arrives at run time and contains thousands of values.

Can someone suggest logic to always pick the first value, or to remove the duplicates at run time itself?

thanks in advance

Re: Duplicate Lookup Values

Posted: Tue Jun 13, 2006 5:23 am
by balajisr
parag.s.27 wrote: [...] can someone suggest any logic to always pick the first value or remove duplicates at the run time itself..
If you want to remove duplicates, which one should be retained: the first or the last?

If the last value is to be retained, simply write the file to the hashed file. When the hashed file is loaded from the lookup data, duplicate rows are removed: writing a row with an existing key overwrites it, so the last value for each key from the lookup file is what remains in the hashed file. The first value cannot be retrieved this way.

Or preprocess the file using the UNIX uniq command, either in a before-job subroutine or in the filter option of the Sequential File stage.

If possible change your key.

Posted: Tue Jun 13, 2006 5:28 am
by parag.s.27
Hi,

Actually, I mentioned in my post that I want to retain the first value...

Posted: Tue Jun 13, 2006 5:35 am
by rwierdsm
parag.s.27 wrote: Actually, I mentioned in my post that I want to retain the first value...
In that case, pass the lookup file through a Sort stage and then a Transformer. In the Transformer, compare the current row's key value to the previous row's key value; if they are equal, do not pass the record on.

Look up RowProcCompareWithPreviousValue() in the help. Note that you can only use this function once in a job.
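Outside DataStage, the same sort-then-compare-with-previous-key logic can be sketched with standard UNIX tools (a sketch only; the file name lookup.csv and the comma delimiter are assumptions):

```shell
# Keep only the first row seen for each key (field 1, comma-separated).
# A stable sort (-s, a GNU/BSD extension) preserves the original order
# of rows that share a key; awk then drops any row whose key matches
# the previous row's key, mirroring the Transformer logic above.
sort -t, -k1,1 -s lookup.csv | awk -F, 'prev != $1 { print; prev = $1 }'
```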

Rob W

Posted: Tue Jun 13, 2006 5:45 am
by parag.s.27
Yes, I think I should make a separate job and include it in the sequence...
there seems to be no other option.

Posted: Tue Jun 13, 2006 6:40 am
by DSguru2B
Better yet, reverse sort the file and pass it through the hashed file; since the last row written wins, your "first" value is the one retained. You don't need to make your design more complicated just to get the right record into your hashed file.
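The hashed file's last-write-wins behaviour can be simulated with awk to show why reverse sorting works (a sketch; the file name lookup.csv and the comma delimiter are assumptions, and "first" here means the lexicographically smallest value for each key):

```shell
# Reverse-sort, then let later rows overwrite earlier ones per key,
# as a hashed file would. After a descending sort, the smallest value
# for each key is written last, so it is the one that survives.
sort -r lookup.csv | awk -F, '{ last[$1] = $0 } END { for (k in last) print last[k] }'
```

Note that the output order of the awk array traversal is unspecified, which does not matter for a keyed lookup.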

Posted: Tue Jun 13, 2006 9:34 am
by kumar_s
parag.s.27 wrote: Actually, I mentioned in my post that I want to retain the first value...
Balaji also gave a solution that went unnoticed:
preprocess the file using the UNIX uniq command in a before-job subroutine or in the filter option of the Sequential File stage.
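One caveat: uniq only collapses adjacent identical lines, so this approach removes full-line duplicates after a sort; duplicate keys carrying different values (as in the example above) would survive it. A minimal sketch, assuming a file named lookup.csv:

```shell
# Sort so identical lines become adjacent, then collapse them.
# Rows with the same key but different values are NOT removed.
sort lookup.csv | uniq > deduped.csv
```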

Posted: Tue Jun 13, 2006 3:32 pm
by ray.wurlod
Make sure the hashed file is loaded with the first value (however that is defined). Usually, as others have noted, this can be accomplished by sorting the stream of data being loaded into the hashed file in reverse order; hashed file records are destructively overwritten, so it is the last-written version of any record that remains in the hashed file. Sorting in reverse order guarantees that this is your "first".