Lookup fileset warning: partitioning other than Entire.....

Posted: Tue Sep 25, 2007 8:37 am
by Minhajuddin
Hi all,

I am getting this warning whenever I use a Lookup fileset.

Lookup_10: Input Dataset 0 has partitioning other than Entire specified; disabling memory sharing.

I have searched the forum, but it looks like we don't have any option other than suppressing it.

Has anybody found a fix for this?

Thanks for the help...

Posted: Tue Sep 25, 2007 10:27 am
by battaliou
OK, but is your file set partitioned on your lookup key? Also, are you running with RCP?

Posted: Tue Sep 25, 2007 10:38 am
by Minhajuddin
I have two jobs

Job1

Database----> LookupFileSet(Entire Partition)

Job2
                   LookupFileSet
                          |
                          |
                          |
                          V
Dataset1---->LookupStage------>Dataset2


In the second job I have set the partitioning of input link to Lookup stage as Entire....

And I don't have RCP enabled.

Posted: Wed Sep 26, 2007 1:41 am
by battaliou
That diagram is excellent. If you look at Chapter 7 of parjdev.pdf, it explains that the lookup file set creates one file for each partition. It follows that if you specify "Entire" partitioning, you will get duplication. You must specify the key that the file will be looked up on. I would suggest that you hash partition on this key too, i.e. Job1 should be partitioned on Lookupkey.
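
The duplication described above can be pictured with a small simulation. This is only a sketch in plain Python, not DataStage code: the function names, the `lookup_key` column, the four-partition layout and the CRC32-based hash are all assumptions made for illustration, and a real lookup file set will use its own hashing and file layout.

```python
import zlib

def hash_partition(key, num_partitions):
    # Illustrative stand-in for a hash partitioner (not DataStage's own hash).
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions

def write_entire(rows, num_partitions):
    # Entire: every partition receives a full copy of the reference rows.
    return [list(rows) for _ in range(num_partitions)]

def write_hashed(rows, key, num_partitions):
    # Hash: each row lands in exactly one partition, chosen from its key.
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash_partition(row[key], num_partitions)].append(row)
    return parts

# Hypothetical reference data: six rows keyed on "lookup_key".
reference_rows = [{"lookup_key": k, "value": f"v{k}"} for k in range(6)]

entire_parts = write_entire(reference_rows, 4)
hashed_parts = write_hashed(reference_rows, "lookup_key", 4)

entire_total = sum(len(p) for p in entire_parts)
hashed_total = sum(len(p) for p in hashed_parts)

print(entire_total)  # 24: six rows stored once per partition
print(hashed_total)  # 6: each row stored exactly once
```

With Entire the stored volume grows with the number of partitions, while hashing keeps one copy of each row but ties every key to a single partition.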

Posted: Wed Sep 26, 2007 11:29 am
by Minhajuddin
battaliou wrote: That diagram is excellent. If you look at Chapter 7 of parjdev.pdf, it explains that the lookup file set creates one file for each partition. It follows that if you specify "Entire" partitioning, you will get duplication. You must specify the key that the file will be looked up on. I would suggest that you hash partition on this key too, i.e. Job1 should be partitioned on Lookupkey.
I tried hashed partitioning too......
But, I am getting the same warning..

Does this warning pop up for everybody.... Or is it just my case :(

Posted: Tue Oct 02, 2007 5:04 am
by HSBCdev
If you hash-partition the reference link, then the input link must also be hash-partitioned, which may be expensive. From the Developer's Guide:

There are some special partitioning considerations for lookup stages.
You need to ensure that the data being looked up in the lookup table is
in the same partition as the input data referencing it. One way of
doing this is to partition the lookup tables using the Entire method.
Another way is to partition it in the same way as the input data
(although this implies sorting of the data).


For this reason we recommend Entire partitioning for the reference data, which should, by definition, be low volume. This removes the need to sort or partition the main data stream and guarantees that every key value is present in every partition.
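
To make the co-location point concrete, here is a minimal sketch in plain Python (not DataStage code). The three-partition layout, the helper names and the toy `key_partition` function are assumptions chosen so the outcome is easy to follow; the toy function merely stands in for a real hash partitioner.

```python
NUM_PARTITIONS = 3

def key_partition(key):
    # Toy stand-in for a hash partitioner, chosen only for illustration.
    return (key + 1) % NUM_PARTITIONS

def count_matches(input_parts, reference_parts):
    # Each partition can only look up keys in its own local reference data.
    matched = 0
    for part_no, keys in enumerate(input_parts):
        local_reference = reference_parts[part_no]
        matched += sum(1 for k in keys if k in local_reference)
    return matched

keys = list(range(12))
reference = {k: f"value_{k}" for k in keys}

# Main data stream: once dealt out round robin, once partitioned on the key.
round_robin_input = [keys[i::NUM_PARTITIONS] for i in range(NUM_PARTITIONS)]
keyed_input = [[k for k in keys if key_partition(k) == p]
               for p in range(NUM_PARTITIONS)]

# Reference data: once copied to every partition (Entire), once key-partitioned.
entire_reference = [dict(reference) for _ in range(NUM_PARTITIONS)]
keyed_reference = [{k: v for k, v in reference.items() if key_partition(k) == p}
                   for p in range(NUM_PARTITIONS)]

entire_matches = count_matches(round_robin_input, entire_reference)
keyed_matches = count_matches(keyed_input, keyed_reference)
mismatched_matches = count_matches(round_robin_input, keyed_reference)

print(entire_matches)      # 12: Entire works however the input is partitioned
print(keyed_matches)       # 12: works because both sides use the same scheme
print(mismatched_matches)  # 0: in this toy setup the two schemes never agree
```

In a real job a mismatch would normally lose only some of the lookups rather than all of them, but the principle is the same: either the reference data is on every partition, or both links must be partitioned identically on the key.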

But, to answer the question: we have been unable to get rid of the warning (it crops up on some Transformers also), so we add it to the project-level message handler and demote it to info.

regards

Phil Clarke.

Posted: Tue Oct 02, 2007 6:10 am
by Doc Phil
Completely unrelated to the posting, but I couldn't help latching onto the reply "...From the Developer's guide ..."

I am new to E/E and am slowly finding my feet. Where can I get this Developer's guide????

Posted: Tue Oct 02, 2007 6:13 am
by throbinson
I didn't realize one could write using Entire. I can see this being useful for an MPP system in which one is writing the entire fileset to each separate node. If you have an SMP system, then this is most likely a needless duplication of data.
The default partitioning of the Lookup stage reference links is Entire. To eliminate this warning without resorting to the dubious benefits of the message handler, set the partitioning method of the input link of the Lookup to Auto and your reference links to Entire. I believe this warning message is trying to tell you that you are not using Entire, so EE is disabling the memory sharing it would otherwise get from using a single copy of the lookup fileset, and that you need to make sure you've partitioned correctly so that all lookup rows are in the correct partitions.

Posted: Tue Oct 02, 2007 7:03 am
by chulett
Doc Phil wrote: I am new to E/E and am slowly finding my feet. Where can I get this Developer's guide????
Wherever you installed your client software, in a 'Docs' sub-folder, you'll find pdf versions of all manuals including the Parallel Job Developer's Guide. The main 'bookshelf' pdf is in your Start Menu under Ascential DataStage / Online Manuals.

Posted: Tue Oct 02, 2007 8:11 am
by Doc Phil
thanx