
Warnings in join stage

Posted: Wed Mar 30, 2005 1:51 am
by vigneshra
Hi

We are getting a warning in our job when trying to validate. The warning is as below:

JOIN_Records: When checking operator: User inserted sort "{natural="SORT_Guest:InTo_Join_Left.v", synthetic="buffer(0)"}" does not fulfill the sort requirements of the downstream operator "APT_JoinSubOperator in JOIN_Records"

What does this warning mean? Will it have an impact when running with a large number of records (on the order of a few million)?

Please reply as early as possible!

Posted: Wed Mar 30, 2005 1:55 am
by roy
Hi,
Did you sort according to the stage's requirements?
Look in the help for it.

Posted: Wed Mar 30, 2005 4:08 pm
by gh_amitava
Hi,

What partitioning logic have you used, and how many nodes are you running on?

Regards
~Amitava

Posted: Wed Mar 30, 2005 11:03 pm
by T42
This message shows up when a stage further downstream relies on the data being sorted and the sort you inserted does not match what that stage expects.

Your sort key(s) must fit that stage's key(s). If it's an aggregator, you need the same keys. If it's a remove duplicate, you need one extra sort field beyond the keys. The list goes on.
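
To make the "keys must fit" idea concrete, here is a minimal Python sketch. It is not DataStage's actual check; it assumes the requirement is simply that the downstream stage's keys appear, in order, as the leading keys of the upstream sort. The function name and the column names (`guest_id`, `visit_date`) are made up for illustration.

```python
def sort_fulfills(sort_keys, required_keys):
    """Return True if data sorted on sort_keys is also sorted on required_keys.

    Assumption (not taken from DataStage): the required keys must be a
    leading prefix of the sort keys, in the same order.
    """
    return list(sort_keys[:len(required_keys)]) == list(required_keys)


# Hypothetical column names, for illustration only.
join_keys = ["guest_id"]

ok_sort = ["guest_id", "visit_date"]   # join key leads the sort
bad_sort = ["visit_date", "guest_id"]  # join key is only the secondary sort key

print(sort_fulfills(ok_sort, join_keys))
print(sort_fulfills(bad_sort, join_keys))
```

Under that assumption, a sort that puts some other column ahead of the join key would be flagged in the same spirit as the warning in the first post.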

Hope this helps you figure out the problem.

Posted: Thu Mar 31, 2005 1:44 am
by richdhan
Hi T42,
T42 wrote: If it's a remove duplicate, you need one extra sort field beyond the keys.
Can you explain why this is so in the case of the Remove Duplicates stage and not in the case of the Aggregator?

TIA
Rich

Posted: Thu Mar 31, 2005 8:09 am
by T42
I believe it comes down to how the stages work, though I'm basing this on limited experience using a separate Sort stage rather than the sort embedded in the stage that needs the sorted data. Typically, for Remove Duplicates, I would use the embedded sort.

However, I think Remove Duplicates expects the data to be sorted on something beyond just the key. If you sort only on the keys, you have no guarantee of which record in each group of duplicates you will get, so the stage is just being extra cautious in requiring this.
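
The point about not knowing which record you get can be illustrated outside DataStage. The Python sketch below is only an analogy: `remove_duplicates` is a made-up helper that keeps the first row of each run of equal keys, and the column names and dates are invented sample data.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(rows, key):
    """Keep the first row of each run of equal key values.

    The input must already be sorted (or at least grouped) on `key`.
    """
    return [next(group) for _, group in groupby(rows, key=itemgetter(key))]


# Invented sample data: two records for guest 1, arriving newest first.
rows = [
    {"guest_id": 1, "updated": "2005-03-02"},
    {"guest_id": 1, "updated": "2005-03-01"},
    {"guest_id": 2, "updated": "2005-03-05"},
]

# Sorted on the key only: which duplicate comes first depends on arrival order.
# (Python's sort is stable, so here it is simply the input order.)
by_key_only = sorted(rows, key=itemgetter("guest_id"))

# Sorted on the key plus one extra field: the first duplicate is well defined.
by_key_and_date = sorted(rows, key=itemgetter("guest_id", "updated"))

first_key_only = remove_duplicates(by_key_only, "guest_id")
first_key_and_date = remove_duplicates(by_key_and_date, "guest_id")

print(first_key_only[0]["updated"])
print(first_key_and_date[0]["updated"])
```

With the key-only sort, the record kept for guest 1 is whichever happened to arrive first; with the extra sort field it is always the earliest `updated` value. On a parallel engine the arrival order is generally not something to rely on, which would be one reason to want the extra field.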

I may be wrong, and will have to test this again when I'm done with crunch-mode at this client. I would suggest that experimentation be done on those sorts -- if it doesn't work, try adding an extra field, or taking out the extra fields.