When should I use "Entire" partitioning in a lookup

Posted: Tue Feb 03, 2009 2:06 pm
by pattemk
Hi,

Is it mandatory to specify Entire as the partitioning method when using a normal lookup? Is there any chance of losing data if we just leave it as Auto partitioning?

Kindly advise

Thanks

**Note: Subject made more descriptive - Content Editor **

Re: lookup

Posted: Tue Feb 03, 2009 2:08 pm
by betterthanever
no it is not mandatory

Re: lookup

Posted: Tue Feb 03, 2009 2:45 pm
by pattemk
[quote="betterthanever"]no it is not mandatory[/quote]

Thanks for your prompt reply. A quick question:

What would be the cases or scenarios where we must specify the Entire partitioning method when doing a normal lookup?

Kindly advise

Thanks

Re: lookup

Posted: Tue Feb 03, 2009 2:56 pm
by betterthanever
I don't think there are any.

Posted: Tue Feb 03, 2009 4:16 pm
by Mike
Specify entire partitioning whenever you are unable or unwilling to partition the reference input exactly the same as the stream input. One example of unable that I can think of: multiple reference links into a single lookup stage where the stream input can only be partitioned to match one of the reference inputs. An example of unwilling: a very small reference table where you don't want the overhead of repartitioning the stream input.

Mike
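
To make the point above concrete, here is a minimal Python sketch (not DataStage code) of a lookup that runs independently on each node. The function names, the 4-node configuration, and the sample rows are all assumptions made up for illustration. It compares three ways of distributing the reference input: the same key-based partitioning as the stream, Entire, and a mismatched round robin.

```python
# Toy model of a partitioned lookup: each node only sees its own
# partition of the stream input and of the reference input.
# All names and data here are hypothetical, for illustration only.

def hash_partition(rows, key, nodes):
    """Key-based partitioning: a row's node depends only on its key value."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[hash(row[key]) % nodes].append(row)
    return parts

def round_robin_partition(rows, nodes):
    """Rows are dealt out in arrival order, regardless of key."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def entire_partition(rows, nodes):
    """Entire: every node receives a full copy of the rows."""
    return [list(rows) for _ in range(nodes)]

def partitioned_lookup(stream_parts, ref_parts, key):
    """Count stream rows that find a match on their own node."""
    matched = 0
    for stream_rows, ref_rows in zip(stream_parts, ref_parts):
        ref_index = {r[key]: r for r in ref_rows}
        matched += sum(1 for s in stream_rows if s[key] in ref_index)
    return matched

NODES = 4
stream = [{"id": k} for k in range(8)]
# Reference rows arrive in a different order than the stream keys.
reference = [{"id": k, "name": f"item-{k}"} for k in reversed(range(8))]

stream_parts = hash_partition(stream, "id", NODES)

same_key = partitioned_lookup(stream_parts, hash_partition(reference, "id", NODES), "id")
entire = partitioned_lookup(stream_parts, entire_partition(reference, NODES), "id")
mismatched = partitioned_lookup(stream_parts, round_robin_partition(reference, NODES), "id")

print("same key partitioning:", same_key, "of", len(stream))
print("entire:", entire, "of", len(stream))
print("mismatched round robin:", mismatched, "of", len(stream))
```

With this sample data, the first two cases match all 8 stream rows, while the mismatched case matches none, because no stream row lands on the node that holds its reference row. That is the situation Entire protects against when the two inputs cannot be partitioned identically.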

Posted: Tue Feb 03, 2009 5:17 pm
by ray.wurlod
On a single machine ("SMP" environment) you may as well use Entire, because it comes at no cost, via shared memory.

In a multiple machine environment ("MPP", cluster, grid) there can be a substantial cost moving records to all nodes, so you tend to avoid Entire (other than for small Data Sets) and use the same key-based partitioning as is used for the stream input.
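
As a rough back-of-the-envelope illustration of that cost, the sketch below counts how many reference row copies have to be delivered to nodes under each method. The function name and the figures are hypothetical, and the model deliberately ignores the shared-memory case on a single machine, where the copies are not physically moved.

```python
# Rough cost model (an assumption for illustration, not a DataStage API):
# count reference row copies delivered across all nodes.

def reference_rows_delivered(ref_rows, nodes, method):
    """Return the number of reference row copies sent to nodes."""
    if method == "entire":
        # Every node gets the full reference input.
        return ref_rows * nodes
    if method == "hash":
        # Each row goes to exactly one node, chosen by its key.
        return ref_rows
    raise ValueError(f"unknown method: {method}")

for rows in (1_000, 10_000_000):
    for method in ("entire", "hash"):
        print(rows, method, reference_rows_delivered(rows, 8, method))
```

For a small reference input the extra copies are negligible, which is why Entire is still reasonable there; for a large one on a multi-machine configuration the volume grows with the node count.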

Posted: Wed Feb 04, 2009 8:30 am
by pattemk
[quote="Mike"]Specify entire partitioning whenever you are unable or unwilling to partition the reference input exactly the same as the stream input. One example of unable that I can think of: multiple reference links into a single lookup stage where the stream input can only be partitioned to match one of the reference inputs. An example of unwilling: a very small reference table where you don't want the overhead of repartitioning the stream input.

Mike[/quote]

Thanks for your prompt reply.
My reference data is very small. I believe specifying Entire will not result in any loss of data or performance, so I am assuming that specifying Entire is the better practice, and most likely mandatory, when doing a normal lookup with small reference data.