Hey all,
I have a lookup file that is fetched from an ERP system at run time, fed directly into a hashed file, and used as a lookup.
The problem is that the lookup file sometimes contains duplicate keys for a single value being searched.
For example, in my master file "CC1464-SERV" is a value for which I have to find a lookup. In the lookup file I have:
1.) CC1464-SERV, WBSE.001.00012534
2.) CC1464-SERV, CCE.001.001001
Since there are duplicate lookup keys, DataStage picks a value unpredictably: sometimes 1 and other times 2.
I cannot search for and remove the duplicate values from the lookup file by hand, because it arrives at run time and has thousands of values.
Can someone suggest logic to always pick the first value, or to remove the duplicates at run time itself?
Thanks in advance.
Duplicate Lookup Values
- Participant
- Posts: 221
- Joined: Fri Feb 17, 2006 3:38 am
- Location: India
Thanks & Regards
Parag Saundattikar
Certified for Infosphere DataStage v8.0
Re: Duplicate Lookup Values
If you want to remove the duplicates, which one should be retained: the first or the last?
If the last value is to be retained, simply write the file to the hashed file.
When the hashed file is loaded from the lookup table, duplicate rows are removed: writing a row with an existing key overwrites it, so the last value for each key in the lookup file is what remains in the hashed file. The first value cannot be retrieved this way.
Alternatively, preprocess the file using the UNIX uniq command in a before-job subroutine or in the Filter option of the Sequential File stage.
If possible, change your key.
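A small sketch of that preprocessing idea in shell (the file name lookup.txt is made up for illustration). Note that uniq only collapses adjacent duplicates and compares whole lines, so an awk one-liner that keeps the first row seen per key is often the safer filter:

```shell
# Build a sample lookup file with duplicate keys (values from the post).
cat > lookup.txt <<'EOF'
CC1464-SERV, WBSE.001.00012534
CC1464-SERV, CCE.001.001001
EOF

# Keep only the FIRST row seen for each key (field 1, comma-delimited).
# seen[$1]++ evaluates to 0 (false) the first time a key appears,
# so !seen[$1]++ prints a row only on its first occurrence.
awk -F',' '!seen[$1]++' lookup.txt > lookup.dedup.txt

cat lookup.dedup.txt
# CC1464-SERV, WBSE.001.00012534
```

On pre-sorted input, plain uniq with a fixed-width key option can do a similar job, but the awk filter keeps the first occurrence regardless of row order.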
- Premium Member
- Posts: 209
- Joined: Fri Jan 09, 2004 1:14 pm
- Location: Toronto, Canada
parag.s.27 wrote: Actually, I mentioned in my post that I want to retain the first value...

In that case, pass the lookup file through a sort and then a Transformer. In the Transformer, compare the current row's key value to the previous row's key value; if the current row's key equals the previous row's key, don't pass the record on.
Look up RowProcCompareWithPreviousValue() in Help. Note that you can only use this function once in a job.
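Outside DataStage, the same sort-then-compare-with-previous logic can be sketched in shell (file names are hypothetical; `sort -s` for a stable sort is a GNU extension):

```shell
# Sample lookup file: two rows for CC1464-SERV plus one other key.
cat > lookup.txt <<'EOF'
CC1464-SERV, WBSE.001.00012534
CC1464-SERV, CCE.001.001001
AB0001-MATL, CCE.002.000077
EOF

# Stable sort on the key field only (-s preserves the original order of
# rows that share a key), then drop any row whose key matches the
# previous row's key -- the shell analogue of the Transformer logic.
sort -t',' -k1,1 -s lookup.txt |
awk -F',' '$1 != prev { print; prev = $1 }' > lookup.first.txt

cat lookup.first.txt
```

Because the sort is stable, the row that survives for each key is the one that appeared first in the original file.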
Rob W
Rob Wierdsma
Toronto, Canada
bartonbishop.com
parag.s.27 wrote: Actually, I mentioned in my post that I want to retain the first value...

Balaji also gave a solution that went unnoticed:
preprocess the file using the UNIX uniq command in a before-job subroutine or in the Filter option of the Sequential File stage.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Make sure the hashed file is loaded with the first value (however that is defined). Usually, as others have noted, this can be accomplished by sorting the stream of data being loaded into the hashed file in reverse order; hashed file records are destructively overwritten, so it is the last-written version of any record that remains in the hashed file. Sorting in reverse order guarantees that this is your "first".
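A shell sketch of that reverse-order trick, using an awk array to stand in for the hashed file's last-write-wins behaviour (lookup.txt is a made-up file name; `tac`, which reverses line order, is a GNU coreutils command):

```shell
# Sample lookup file: the first physical row is the one we want to keep.
cat > lookup.txt <<'EOF'
CC1464-SERV, WBSE.001.00012534
CC1464-SERV, CCE.001.001001
EOF

# Reverse the file so the original FIRST row for each key arrives LAST,
# then load into an awk array where each write overwrites the previous
# value for that key -- mimicking the hashed file's destructive writes.
tac lookup.txt |
awk -F',' '{ hash[$1] = $0 } END { for (k in hash) print hash[k] }' > lookup.kept.txt

cat lookup.kept.txt
# CC1464-SERV, WBSE.001.00012534
```

In a real job the reversal would be done by the sort feeding the hashed file load, but the overwrite semantics are the same.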
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.