Unable to generate a node map
Hi,
I have a simple job that reads wildcard-pattern-based files from a folder and loads them into a target database. I have the File Name Column property set to store the file name in the target database. This works fine in a Windows DataStage Server environment, of course with help from the gurus here: viewtopic.php?t=117069&highlight=APT_FileImportOperator
I migrated this to a Linux GRID environment and am running into the following error on the Sequential File stage.
main_program: For createFilesetFromPattern(), could not find any available nodes in node pool "".
SF_Input_File: At least one filename or data source must be set in APT_FileImportOperator before use.
This happens when I set $APT_IMPORT_PATTERN_USES_FILESET to TRUE.
If I set $APT_IMPORT_PATTERN_USES_FILESET to FALSE, the job runs fine, but the file names are not fully expanded; they are stored as /pathname/*.txt.
I tried providing a prefix to the pattern, like "Feed*.txt", and it doesn't make any difference either; it just loads it as /pathname/Feed*.txt.
If I do an "ls" using the folder name and pattern, it lists the 2 files I have copied into the source location for testing.
I searched these forums and found that it might help to supply the folder name as a job parameter and then specify the pattern separately. I tried that and it did not help either.
At this time, the only way to get this to work is to set $APT_IMPORT_PATTERN_USES_FILESET to FALSE.
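For anyone reproducing this, the expansion the job should be performing can be checked from the shell. A minimal sketch, using a throwaway directory in place of the real source folder (the directory and file names here are illustrative, not from the actual environment):

```shell
# Stand-in for the real source folder; replace with your actual path.
srcdir=$(mktemp -d)
touch "$srcdir/Feed01.txt" "$srcdir/Feed02.txt"

# This is the expansion the File Name Column should capture when the
# fileset route works; with APT_IMPORT_PATTERN_USES_FILESET=FALSE the
# column instead stores the literal pattern, e.g. /pathname/Feed*.txt.
expanded=$(ls "$srcdir"/Feed*.txt)
echo "$expanded"

rm -rf "$srcdir"
```

If the shell expands the pattern to the two test files but the job stores the literal pattern, the problem is in how the engine resolves the pattern, not in the pattern itself.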
Please let me know if any input from me would help you help me further.
Your time and help is greatly appreciated.
Thanks,
-V
Hi Ray,
Thanks for the followup.
I checked the job log and found that the config file being used has the following entries.
{
node "node1"
{
fastname "ctpcqabdsh01p"
pools ""
resource disk "/nfsgrid/nfsbin/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk "/nfsgrid/nfsbin/IBM/InformationServer/Server/Scratch" {pools ""}
}
node "node2"
{
fastname "ctpcqabdsh01p"
pools ""
resource disk "/nfsgrid/nfsbin/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk "/nfsgrid/nfsbin/IBM/InformationServer/Server/Scratch" {pools ""}
}
}
Both nodes have pools "".
Is this what you wanted verified, Ray?
Let me know if there is any other entry that I should be looking at.
Thanks,
-V
Hi lstsaur,
Thanks for reviewing my query. The following 4 Grid params are the ones that were suggested to be added to all our PX jobs in the GRID. I have created a parameter set by the name APT_GRID_PARAMS in my project for this purpose.
Here are the values for these entries in the log file.
APT_GRID_PARAMS.$APT_GRID_ENABLE = YES (Compiled-in default)
APT_GRID_PARAMS.$APT_GRID_COMPUTENODES = 1 (Compiled-in default)
APT_GRID_PARAMS.$APT_GRID_PARTITIONS = 1 (Compiled-in default)
APT_GRID_PARAMS.$APT_GRID_SEQFILE_HOST = (Compiled-in default)
Another thing I noticed is that the default config file, as printed in the Director log's initial Environment variable settings entry (APT_CONFIG_FILE=/nfsgrid/nfsbin/IBM/InformationServer/Server/Configurations/default.apt), points to "default.apt", which is what I had posted earlier.
A few lines below that, I see the following entry in the Director log for this job.
<Dynamic_gird.sh> SEQFILE Host(s): ctpcqabdsc02p: ctpcqabdsc02p:
{
node "Conductor"
{
fastname "ctpcqabdsh01p"
pools "conductor"
resource disk "/nfsdata/data1/datasets" {pools ""}
resource scratchdisk "/scratch" {pools ""}
}
node "node1_1"
{
fastname "ctpcqabdsc02p"
pools ""
resource disk "/nfsdata/data1/datasets" {pools ""}
resource scratchdisk "/scratch" {pools ""}
}
}
I am not fully conversant with grid internals and would appreciate your inputs/directions on how I could decipher this entry.
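One quick way to decipher a generated config like the one above is to pull out just the node, fastname, and pools lines. A minimal sketch, using an inline copy of the dynamic config from this thread (point "cfg" at your real file, e.g. the path from APT_CONFIG_FILE; the grep pattern is an assumption about the config layout):

```shell
# Inline copy of the dynamic grid config shown above, for illustration.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
    node "Conductor"
    {
        fastname "ctpcqabdsh01p"
        pools "conductor"
    }
    node "node1_1"
    {
        fastname "ctpcqabdsc02p"
        pools ""
    }
}
EOF

# Keep only the lines that say which node runs where and in which pool.
# A node whose pools line is exactly: pools ""  is in the default pool,
# which is where the error says no nodes could be found.
summary=$(grep -E '(node|fastname|pools) "' "$cfg")
echo "$summary"

rm -f "$cfg"
```

In this config, "Conductor" is only in the "conductor" pool while "node1_1" is in the default pool, so a stage restricted to the default pool would have exactly one node to work with.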
I have several other PX jobs that work fine with the GRID parameters that i have provided earlier.
Let me know if you need any additional details in this regard.
Thanks for your time,
-V
Apologies for the delayed response. Got pulled into a few other unnecessary distractions...
I added the GRID variable for host files and it still wouldn't give the desired results. However, when I added that and enabled $APT_IMPORT_PATTERN_USES_FILESET = True, I got another error, as follows.
SF_Input: Unable to generate a node map from fileset /tmp/import_tmp_20671db190272.fs.
main_program: Could not check all operators because of previous error(s)
On a separate note, we were asked to use the $APT_GRID_SEQFILE_HOST only for output files and not when reading seq files. Is that not the case?
Thanks,
-V
Thanks lstsaur.
One additional question:
If the source file is on the head node, will the use of the variable $APT_GRID_SEQFILE_HOST interfere with the source file location if it did not get the head node as the host name during the execution?
Does the error message I received point to such a symptom?
Thanks,
-V
I'm also getting the same problem. If I run my job without setting 'APT_IMPORT_PATTERN_USES_FILESET' to 'True', file names come through as 'TestFilePattern????????.dat'.
But if I enable this env variable, the job aborts with following error message:
SQ_SrcFile: Unable to generate a node map from fileset /var/tmp/import_tmp_838635bc0dfb.fs.
Our DataStage server is on a grid environment and I've included all the required grid environment variables in the job, i.e.:
$APT_GRID_ENABLE
$APT_GRID_QUEUE
$APT_GRID_SEQFILE_HOST
$APT_GRID_FROM_PARTITIONS
$APT_GRID_FROM_NODES
$APT_GRID_COMPUTENODES
$APT_GRID_PARTITIONS
-Nripendra Chand