Fatal Error: Updatable lookup requires sorted keys.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dougcl
Premium Member
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Fatal Error: Updatable lookup requires sorted keys.

Post by dougcl »

Hi folks, there are many range lookup problems posted here, so I realize I am adding to a pile, but I haven't seen this particular issue resolved so I am posting it before submitting a ticket.

I have a Varchar Key and a date on the stream:

StreamKey
StreamDate

I have the same key, and a pair of dates, and a desired data column on the reference:
RefKey
RefDate1
RefDate2
RefData

I want the stream to get the RefData value when StreamKey = RefKey and StreamDate is between RefDate1 and RefDate2.

FYI, RefKey and RefDate1 uniquely define a row in the ref data, and none of the date ranges overlap. So for each StreamKey/StreamDate pair, only one match is possible in the reference. I have the reference link partitioned "entire."

I have tried the following:
1) Match the StreamKey to the RefKey on the reference lookup link, and do a range lookup on StreamDate on the stream link. I got the error.

2) I tried a range lookup on both the StreamKey and the StreamDate on the stream link using the strategy suggested by vskr72 here viewtopic.php?t=133588
(by duplicating RefKey in the reference link and treating both lookups as ranges on the stream link.)

I got the error. I got the error either way.

Any help is appreciated. At this point, I suppose the most general question is, does range lookup work at all when the range is specified on the stream link? Is there a known bug on Solaris?

Thanks,
Doug
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Re: Fatal Error: Updatable lookup requires sorted keys.

Post by kris007 »

You can join just the StreamKey and RefKey and bring all the columns downstream into a Transformer or Filter Stage and then apply the logic
StreamDate is between RefDate1 and RefDate2
That should work fine.
Kris

Where's the "Any" key?-Homer Simpson
dougcl
Premium Member
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Re: Fatal Error: Updatable lookup requires sorted keys.

Post by dougcl »

kris007 wrote:You can join just the StreamKey and RefKey and bring all the columns downstream into a Transformer or Filter Stage and then apply the logic
StreamDate is between RefDate1 and RefDate2
That should work fine.
Hi thanks. Joining the StreamKey and the RefKey requires sorting and repartitioning the stream. Also, it would create an explosion in row count. I would like to use a lookup. Is it correct to infer from your response that range lookups on the stream link don't work?
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Re: Fatal Error: Updatable lookup requires sorted keys.

Post by kris007 »

Sorry for misleading you. What I meant was to use LookUp Stage to join just the StreamKey and RefKey and then continue. Range lookups in Parallel jobs are done differently than server jobs and I couldn't figure out a way so far to do the range lookups on the stream link. I don't think it is possible.
Kris

Where's the "Any" key?-Homer Simpson
dougcl
Premium Member
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Thanks for the clarification. I'm submitting a ticket to see what IBM has to say. In the meantime I think I'll try your idea of using the lookup to match the keys. Up until now, I've always had unique rows in the lookup, so I guess you're saying multiple rows from the reference link just lead to multiple rows on the output like a join would. Speaking of which, something else I tried was to go into the "multiple rows returned from link" area and place a condition in there on the date range. Of course that didn't work either as reference link items are not available in the expression there. Seems like that would have been a nice, intuitive answer. As it is, I can't think of a case where the expression there would be useful. Only being able to apply a constraint to the stream while multiple rows are being returned on the reference makes no sense to me.
dougcl
Premium Member
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Hi guys, I am trying this idea of doing a lookup on the key, and then using a filter to apply the date logic, but I don't think the lookup is working. If the reference returns more than one row, does it operate like a join, or does it ignore the duplicates? My rowcount going in is the same as the rowcount coming out. It should be much larger.

Thanks,
Doug
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

dougcl wrote:If the reference returns more than one row, does it operate like a join, or does it ignore the duplicates? Doug
If you select the option "multiple rows returned from link" it will work like a join stage. If you did not select that option the lookup stage will drop duplicate records based on key which in your case RefKey
My rowcount going in is the same as the rowcount coming out. It should be much larger.
Which link are you talking about here. The reference vs Output or Stream vs Output?
Kris

Where's the "Any" key?-Homer Simpson
dougcl
Premium Member
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

kris007 wrote: If you select the option "multiple rows returned from link" it will work like a join stage. If you did not select that option the lookup stage will drop duplicate records based on key which in your case RefKey
Oh duh. I see that. I had the drop down blank. It seems to be working now. thanks,
Doug
dougcl
Premium Member
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Hi folks, I think I have this solved. Apparently the error message is correct. If you are doing a range lookup, the reference link must be sorted on the lookup keys. Partition entire first, then sort.

Doug
Post Reply