Lookup Files created with Partitioning = Entire
Posted: Thu Dec 21, 2006 11:00 am
All,
I've got two jobs.
The first job, called BuildLists, reads a couple of sequential files and populates two lookup lists, one hosted in a FileSet and the other in a Dataset. Previously the BuildLists job pointed to the default config file, which defined just a single node. Now it points to a new config file with two nodes defined. Both lists were previously created with partitioning set to 'Entire'.
The second job reads both lookups. Partitioning is set to 'Auto' throughout and it uses the 2-node config file.
I re-ran the BuildLists job today to recreate the lookups, as some new rows had been added. This is probably the first run since the new config file was set. Now my second job finds double entries in the lookups. My understanding was that setting the partitioning to 'Entire' would make all the rows in the lookup file available to all nodes in the second job, but it seems to make them all available twice. When I set the partitioning for one of the lookups (the fileset) to 'Auto', the second job no longer sees duplicate entries for that lookup.
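To make the symptom concrete, here is a toy Python model of it. This is only a sketch and not DataStage code: it assumes that 'Entire' writes one full copy of the rows into each node's partition, and that a reader using 'Auto' simply collects the rows from every partition. The function names and the round-robin split standing in for 'Auto' are made up for illustration.

```python
def partition_entire(rows, nodes):
    # Assumption: 'Entire' gives every node its own full copy of the rows.
    return [list(rows) for _ in range(nodes)]


def partition_split(rows, nodes):
    # Hypothetical stand-in for 'Auto': each row lands on exactly one node.
    return [rows[i::nodes] for i in range(nodes)]


def read_all(partitions):
    # Assumption: the reading job sees the rows of every stored partition.
    return [row for part in partitions for row in part]


rows = ["A", "B", "C"]

entire_1 = read_all(partition_entire(rows, 1))  # old single-node config
entire_2 = read_all(partition_entire(rows, 2))  # new two-node config
split_2 = read_all(partition_split(rows, 2))    # two nodes, rows divided

print(len(entire_1), len(entire_2), len(split_2))
```

Under those assumptions the single-node run stores each row once, the two-node 'Entire' run stores each row twice, and the split run stores each row once again, which matches what I am seeing.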
Please educate me. This is not the behaviour I was expecting.
Rob W.