Lookup fileset warning: partitioning other than Entire.....

Minhajuddin · Post by **Minhajuddin** » Tue Sep 25, 2007 8:37 am

Hi all,

I am getting this warning whenever I use a Lookup fileset.

Lookup_10: Input Dataset 0 has partitioning other than Entire specified; disabling memory sharing.

I have searched the forum, but it looks like there is no option other than suppressing it.

Has anybody found a fix for this?

Thanks for the help...

battaliou · Post by **battaliou** » Tue Sep 25, 2007 10:27 am

OK, but is your file set partitioned on your lookup key? Also, are you running with RCP (runtime column propagation)?

Minhajuddin · Post by **Minhajuddin** » Tue Sep 25, 2007 10:38 am

I have two jobs

Job1

Database----> LookupFileSet(Entire Partition)

Job2
                   LookupFileSet
                          |
                          |
                          |
                          V
Dataset1---->LookupStage------>Dataset2

In the second job, I have set the partitioning of the input link to the Lookup stage to Entire.

And I don't have RCP enabled.

battaliou · Post by **battaliou** » Wed Sep 26, 2007 1:41 am

That diagram is excellent. If you look at Chapter 7 of parjdev.pdf, it explains that the lookup file set creates one file for each partition. It follows that if you specify "Entire" partitioning, you will get duplication. You must specify the key that the file will be looked up on. I would suggest that you Hash partition on this key too, i.e. Job1 should be partitioned on Lookupkey.
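
The duplication point can be sketched with a toy model. This is plain Python, not DataStage code: the lists stand in for the per-partition files of a lookup file set, and the function names, the 4-partition layout, and the 8-row reference table are all assumptions made up for illustration.

```python
# Toy model only: plain Python lists stand in for the per-partition files
# of a lookup file set. None of these names are DataStage APIs.

def entire_partition(rows, num_partitions):
    """Entire: every partition receives a full copy of the rows."""
    return [list(rows) for _ in range(num_partitions)]

def hash_partition(rows, num_partitions, key):
    """Hash: each row goes to exactly one partition, chosen by its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Integer keys keep the hash deterministic in this sketch.
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

# Hypothetical reference data: 8 rows keyed by an integer lookup_key.
reference = [{"lookup_key": k, "value": f"v{k}"} for k in range(8)]

entire = entire_partition(reference, 4)
hashed = hash_partition(reference, 4, "lookup_key")

entire_total = sum(len(p) for p in entire)  # 4 full copies of 8 rows
hashed_total = sum(len(p) for p in hashed)  # each row stored exactly once

print("rows stored with Entire:", entire_total)
print("rows stored with Hash:", hashed_total)
```

In this model, Entire stores the reference rows once per partition, while Hash stores each row once in total. It says nothing about how the engine actually lays out or shares the files.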

Minhajuddin · Post by **Minhajuddin** » Wed Sep 26, 2007 11:29 am

battaliou wrote: That diagram is excellent. If you look at Chapter 7 of parjdev.pdf, it explains that the lookup file set creates one file for each partition. It follows that if you specify "Entire" partitioning, you will get duplication. You must specify the key that the file will be looked up on. I would suggest that you Hash partition on this key too, i.e. Job1 should be partitioned on Lookupkey.

I tried hash partitioning too, but I am getting the same warning.

Does this warning pop up for everybody, or is it just my case?

HSBCdev · Post by **HSBCdev** » Tue Oct 02, 2007 5:04 am

If you hash-partition the reference link, then the input link must also be hash-partitioned, which may be expensive. From the Developer's Guide:

There are some special partitioning considerations for lookup stages.
You need to ensure that the data being looked up in the lookup table is
in the same partition as the input data referencing it. One way of
doing this is to partition the lookup tables using the Entire method.
Another way is to partition it in the same way as the input data
(although this implies sorting of the data).

For this reason we recommend Entire partitioning for the reference data, which should, by definition, be low volume. This removes the need to sort or partition the main data stream and guarantees that every key value is present in every partition.
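
The "same partition" requirement quoted above can be illustrated with a small simulation. It is a toy Python model under stated assumptions, not DataStage code: keys are plain integers, hashing is `key % n`, and the helper names and sample data are hypothetical.

```python
# Toy model: counts how many input rows find their key in the reference
# data held by the SAME partition. Names are illustrative, not DataStage APIs.

NUM_PARTITIONS = 4

def round_robin(keys, n):
    """Deal rows out to partitions in turn, ignoring the key."""
    parts = [[] for _ in range(n)]
    for i, key in enumerate(keys):
        parts[i % n].append(key)
    return parts

def hash_by_key(keys, n):
    """Send each key to the partition chosen by key % n."""
    parts = [[] for _ in range(n)]
    for key in keys:
        parts[key % n].append(key)
    return parts

def entire(keys, n):
    """Give every partition a full copy of the keys."""
    return [list(keys) for _ in range(n)]

def count_hits(input_parts, reference_parts):
    """A lookup only sees the reference rows in its own partition."""
    hits = 0
    for inp, ref in zip(input_parts, reference_parts):
        ref_keys = set(ref)
        hits += sum(1 for key in inp if key in ref_keys)
    return hits

reference_keys = list(range(8))
input_keys = [1, 2, 3, 4, 5, 6, 7, 0]  # every key exists in the reference

# Reference hashed, input NOT hashed: keys can land in different partitions.
hits_mismatched = count_hits(round_robin(input_keys, NUM_PARTITIONS),
                             hash_by_key(reference_keys, NUM_PARTITIONS))

# Reference Entire: input partitioning no longer matters.
hits_entire = count_hits(round_robin(input_keys, NUM_PARTITIONS),
                         entire(reference_keys, NUM_PARTITIONS))

# Both sides hashed on the same key: matching rows are co-located.
hits_both_hashed = count_hits(hash_by_key(input_keys, NUM_PARTITIONS),
                              hash_by_key(reference_keys, NUM_PARTITIONS))

print(hits_mismatched, hits_entire, hits_both_hashed)
```

In this model, only the mismatched case loses lookups. Entire on the reference side finds every key without touching the input partitioning, which is the trade-off described above.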

But to answer the question: we have been unable to get rid of the warning (it also crops up on some Transformers), so we add it to the project-level message handler and demote it to info.

regards

Phil Clarke.

Doc Phil · Post by **Doc Phil** » Tue Oct 02, 2007 6:10 am

Completely unrelated to the posting, but I couldn't help latching onto the reply "...From the Developer's guide ..."

I am new to E/E and am slowly finding my feet. Where can I get this Developer's Guide?

throbinson · Post by **throbinson** » Tue Oct 02, 2007 6:13 am

I didn't realize one could write using Entire. I can see this being useful for an MPP system in which one is writing the entire fileset to each separate node. If you have an SMP system, then this is most likely a needless duplication of data.
The default partitioning of the Lookup stage reference links is Entire. To eliminate this warning without resorting to the dubious benefits of the message handler, set the partitioning method of the Lookup stage's input link to Auto and its reference links to Entire. I believe the warning is telling you that you are not using Entire, so EE is disabling the memory sharing that a single copy of the lookup fileset would allow, and that you need to make sure you have partitioned correctly so that all lookup rows are in the correct partitions.

chulett · Post by **chulett** » Tue Oct 02, 2007 7:03 am

Doc Phil wrote: I am new to E/E and am slowly finding my feet. Where can I get this Developer's Guide?

Wherever you installed your client software, in a 'Docs' sub-folder, you'll find PDF versions of all the manuals, including the Parallel Job Developer's Guide. The main 'bookshelf' PDF is in your Start Menu under Ascential DataStage / Online Manuals.

Doc Phil · Post by **Doc Phil** » Tue Oct 02, 2007 8:11 am

Thanks.