No hit on lookup with Ignoring duplicate error

ggarze · Post by **ggarze** » Fri Jul 29, 2011 12:35 pm

This has me really confused, and hopefully someone can explain what is going on, as I'm not sure if it's a partition issue or something else. I have a Dataset as input with two fields (code, timestamp(key)). I have another Dataset going into a Lookup stage with (code(key), start_timestamp, end_timestamp). I'm doing a range lookup: if the input timestamp is >= start_timestamp and <= end_timestamp on the lookup dataset, there should be a match. Looking at the data, everything should match, but of course I get no matches. I should mention that the lookup dataset's key field, code, holds pretty much the same value on every row; only the timestamp ranges differ.

When I run the job and get no match, I get the "Ignoring duplicate entry; no further warnings will be issued for this table" warning message. Thinking it might be a partition issue in the Lookup stage, I changed the partitioning to Entire for the dataset. Same issue. Only when I go into the Lookup stage and change the "Multiple rows returned from link" option from blank to the reference link do all the lookups match and I get output.

Why is this? Is it a partitioning issue? If DS finds a duplicate, does it treat it as a not-found condition?

Thanks

ray.wurlod · Post by **ray.wurlod** » Fri Jul 29, 2011 4:17 pm

No, that's simply how it works.
The Lookup stage by default returns one match. If you want more than one (which is the less common case), you have to tell it so.
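
To illustrate why the default can produce misses here, the following is a rough conceptual model in Python, not DataStage code and not a description of the engine's internals. It assumes the reference table keeps only the first row per key and discards later duplicates (which is what the "Ignoring duplicate entry" warning suggests), so the range test is applied to a single surviving row. The sample data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical reference rows: (code, start_timestamp, end_timestamp).
# Integers stand in for timestamps to keep the sketch short.
REFERENCE = [
    ("A", 1, 10),
    ("A", 11, 20),
    ("A", 21, 30),
]


def build_single(rows):
    """Model of the default: keep the first row per key, ignore duplicates."""
    table = {}
    for code, start, end in rows:
        table.setdefault(code, (start, end))  # later rows for the same code are dropped
    return table


def build_multi(rows):
    """Model of 'Multiple rows returned from link': keep every row per key."""
    table = defaultdict(list)
    for code, start, end in rows:
        table[code].append((start, end))
    return table


def lookup_single(table, code, ts):
    """Return the one stored range if ts falls inside it, else None."""
    entry = table.get(code)
    if entry is not None and entry[0] <= ts <= entry[1]:
        return entry
    return None


def lookup_multi(table, code, ts):
    """Return every stored range for the code that contains ts."""
    return [(s, e) for s, e in table.get(code, []) if s <= ts <= e]
```

Under these assumptions, a timestamp of 15 misses in the single-row model because only the range (1, 10) survived, while the multi-row model finds (11, 20).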

ggarze · Post by **ggarze** » Fri Jul 29, 2011 8:36 pm

So Ray, in my example I have a key with multiple date ranges, and I want to add the start and end date to each incoming record whose date falls within one of those ranges. How would I go about this? Because of the dupes, would you suggest breaking this out into two separate lookups? For example: the first lookup is on the code only and passes the looked-up code on to the next lookup (if the code lookup had no match, the code being passed on would be null). The second lookup does only the range check and uses a condition to test the code from the first lookup: if the code is null, don't do the range lookup; if it is not null, do the range lookup. Does that make sense?
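
The two-step idea described above can be sketched in Python as follows. This is only an illustration of the logic, not DataStage code: the function name `enrich`, the record layout, and the sample data are all hypothetical, and integers stand in for timestamps.

```python
def enrich(record, known_codes, ranges_by_code):
    """Two-step lookup: check the code first, then search its ranges.

    record         -- (code, timestamp) from the input link
    known_codes    -- set of distinct codes (first lookup, no duplicates)
    ranges_by_code -- dict of code -> list of (start, end) (second lookup)
    Returns (code, timestamp, start, end); start/end are None on no match.
    """
    code, ts = record

    # Step 1: lookup on the code only; None models the null passed downstream.
    matched_code = code if code in known_codes else None
    if matched_code is None:
        return (code, ts, None, None)  # skip the range lookup entirely

    # Step 2: range lookup, performed only when the code was found.
    for start, end in ranges_by_code.get(matched_code, []):
        if start <= ts <= end:
            return (code, ts, start, end)
    return (code, ts, None, None)


# Hypothetical sample data.
known_codes = {"A"}
ranges_by_code = {"A": [(1, 10), (11, 20), (21, 30)]}
```

Separating the steps also makes it possible to tell "unknown code" apart from "known code, but no range covers the timestamp", which a single combined lookup would report the same way.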

ray.wurlod · Post by **ray.wurlod** » Sat Jul 30, 2011 5:13 pm

You can do it with one, but I believe your two-Lookup solution is easier to understand and therefore to maintain.