Lookup fileset warning: partitioning other than Entire.....

Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Hi all,

I am getting this warning whenever I use a Lookup fileset.

Lookup_10: Input Dataset 0 has partitioning other than Entire specified; disabling memory sharing.

I have searched the forum, but it looks like there is no option other than suppressing it.

Has anybody found a fix for this?

Thanks for the help...
Minhajuddin

battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London
Contact:

Post by battaliou »

OK, but is your file set partitioned on your lookup key? Also, are you running with RCP?
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

I have two jobs

Job1

Database----> LookupFileSet(Entire Partition)

Job2
                   LookupFileSet
                          |
                          |
                          |
                          V
Dataset1---->LookupStage------>Dataset2


In the second job, I have set the partitioning of the input link to the Lookup stage to Entire.

And I don't have RCP enabled.
Minhajuddin

battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London
Contact:

Post by battaliou »

That diagram is excellent. If you look at Chapter 7 of parjdev.pdf, it explains that the lookup file set creates one file for each partition. It follows that if you specify "Entire" partitioning, you will get duplication. You must specify the key that the file will be looked up on. I would suggest that you Hash partition on this key too, i.e. Job1 should be partitioned on Lookupkey.
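
As a rough illustration of the duplication point (plain Python, not DataStage code): the sketch below compares how many reference rows end up stored across partitions under Entire versus Hash partitioning. The function names, the 4-partition count, and the use of `crc32` as a stand-in hash are all assumptions made for the example; the real engine's hash function and file layout are not modeled.

```python
import zlib

def entire_partition(rows, num_partitions):
    """Entire: every partition receives a full copy of the rows."""
    return [list(rows) for _ in range(num_partitions)]

def hash_partition(rows, num_partitions, key):
    """Hash: each row goes to exactly one partition, chosen from its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is only a stable stand-in; it is not the engine's hash function.
        idx = zlib.crc32(str(row[key]).encode()) % num_partitions
        partitions[idx].append(row)
    return partitions

# Hypothetical reference data: 8 rows keyed on "lookup_key".
reference = [{"lookup_key": k, "value": f"v{k}"} for k in range(8)]

entire = entire_partition(reference, 4)
hashed = hash_partition(reference, 4, "lookup_key")

entire_total = sum(len(p) for p in entire)  # 8 rows copied to each of 4 partitions
hashed_total = sum(len(p) for p in hashed)  # each row stored exactly once

print("rows stored with Entire:", entire_total)
print("rows stored with Hash:  ", hashed_total)
```

With Entire the stored row count grows with the number of partitions, while with Hash it stays equal to the size of the reference data.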
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

battaliou wrote: That diagram is excellent. If you look at Chapter 7 of parjdev.pdf, it explains that the lookup file set creates one file for each partition. It follows that if you specify "Entire" partitioning, you will get duplication. You must specify the key that the file will be looked up on. I would suggest that you Hash partition on this key too, i.e. Job1 should be partitioned on Lookupkey.
I tried hash partitioning too, but I am getting the same warning.

Does this warning pop up for everybody, or is it just my case? :(
Minhajuddin

HSBCdev
Premium Member
Posts: 141
Joined: Tue Mar 16, 2004 8:22 am
Location: HSBC - UK and India
Contact:

Post by HSBCdev »

If you hash-partition the reference link, then the input link must also be hash-partitioned, which may be expensive. From the Developer's Guide:

There are some special partitioning considerations for lookup stages.
You need to ensure that the data being looked up in the lookup table is
in the same partition as the input data referencing it. One way of
doing this is to partition the lookup tables using the Entire method.
Another way is to partition it in the same way as the input data
(although this implies sorting of the data).


For this reason we recommend Entire partitioning for the reference data, which should, by definition, be low volume. This removes the need to sort or partition the main data stream and guarantees that every key value is present in every partition.
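
To make the guide's two options concrete, here is a minimal Python simulation (not DataStage code) of a per-partition lookup. It counts how many input rows find their key in the reference rows held by the same partition. Everything named here (`NUM_PARTITIONS`, the helper functions, the toy data, `crc32` as the hash) is an assumption for illustration only.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical node count

def stable_hash(value):
    # Stand-in hash; the engine's real hash function is not modeled.
    return zlib.crc32(str(value).encode())

def partition_entire(rows):
    """Every partition gets a full copy of the rows."""
    return [list(rows) for _ in range(NUM_PARTITIONS)]

def partition_hash(rows, key):
    """Each row goes to the partition selected by hashing its key."""
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for row in rows:
        parts[stable_hash(row[key]) % NUM_PARTITIONS].append(row)
    return parts

def partition_round_robin(rows):
    """Rows are dealt out in turn, ignoring the key."""
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for i, row in enumerate(rows):
        parts[i % NUM_PARTITIONS].append(row)
    return parts

def count_lookup_hits(input_parts, reference_parts, key):
    """Each partition can only see the reference rows it holds."""
    hits = 0
    for inp, ref in zip(input_parts, reference_parts):
        ref_keys = {r[key] for r in ref}
        hits += sum(1 for row in inp if row[key] in ref_keys)
    return hits

reference = [{"k": k, "desc": f"d{k}"} for k in range(20)]
stream = [{"k": i % 20} for i in range(100)]

# Option 1: reference is Entire, input partitioned any way at all.
hits_entire = count_lookup_hits(
    partition_round_robin(stream), partition_entire(reference), "k")

# Option 2: reference and input hash-partitioned on the same key.
hits_same_hash = count_lookup_hits(
    partition_hash(stream, "k"), partition_hash(reference, "k"), "k")

# Mismatch: reference hashed, input not; lookups may miss.
hits_mismatch = count_lookup_hits(
    partition_round_robin(stream), partition_hash(reference, "k"), "k")

print(hits_entire, hits_same_hash, hits_mismatch)
```

In this toy model the first two arrangements resolve all 100 input rows, while the mismatched arrangement can only resolve a row when its key happens to land in the same partition by chance.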

But, to answer the question: we have been unable to get rid of the warning (it crops up on some Transformers too), so we add it to the project-level message handler and demote it to info.

regards

Phil Clarke.
Doc Phil
Participant
Posts: 15
Joined: Wed Sep 12, 2007 12:45 am
Location: Cape Town

Post by Doc Phil »

Completely unrelated to the posting, but I couldn't help latching onto the reply "...From the Developer's guide ..."

I am new to E/E and am slowly finding my feet. Where can I get this Developer's Guide?
Anticipation of failure is worse than failure itself
throbinson
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

I didn't realize one could write using Entire. I can see this being useful for an MPP system in which one is writing the entire fileset to each separate node. If you have an SMP system, then this is most likely a needless duplication of data.
The default partitioning of the Lookup stage reference links is Entire. To eliminate this warning without resorting to the dubious benefits of the message handler, set the partitioning method of the input link of the Lookup to Auto and your reference links to Entire. I believe this warning message is trying to tell you that you are not using Entire, so EE is disabling the memory sharing that would result from using a single copy of the lookup fileset. You therefore need to make sure you've partitioned correctly so that all lookup rows are in the correct partitions.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Doc Phil wrote: I am new to E/E and am slowly finding my feet. Where can I get this Developer's Guide?
Wherever you installed your client software, in a 'Docs' sub-folder, you'll find PDF versions of all manuals, including the Parallel Job Developer's Guide. The main 'bookshelf' PDF is in your Start Menu under Ascential DataStage / Online Manuals.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Doc Phil
Participant
Posts: 15
Joined: Wed Sep 12, 2007 12:45 am
Location: Cape Town

Post by Doc Phil »

Thanks.
Anticipation of failure is worse than failure itself