Unable to map file Error
Hi All,
I have a job which writes data to a Data Set. I'm getting the following error while running the job:
"TempFile1,3: Unable to map file Datastage/Datasets/node4/test.ds.username.host.0000.0003.0000.71f6.c1c843f5.0003.551d93b2: Invalid argument
The error occurred on Orchestrate node node4 (hostname xyz)"
Here, Datastage/Datasets/node4/test.ds is one of the data files created for the data set.
I'd appreciate it if anybody can help me with this.
Regs
John
Does that file exist? If not, you may need to recreate the dataset it is trying to access.
-BP
Hi BP,
Yes, that file exists. We created the data set again and ran the job, but every time it gives the same error.
repartition(0),0: Unable to map file
Datastage/Datasets/node4/test.ds.username.host.0000.0003.0000.71f6.c1c843f5.0003.551d93b2: Invalid argument
The error occurred on Orchestrate node node4 (hostname xyz)"
Is it happening because of the partitioning? We are joining two data sets, and the Preserve Partitioning property is set to "Clear"; without that, the job gives an "Irreconcilable Constraints" error.
Regs
John
Try running with a one-node config. If that works, then the problem is with the partitioning.
Are the partition methods for both data sets set to 'clear'?
- BP
Hi BP,
It's not working with one node either.
Yes, the partition methods for both data sets are also set to Clear.
Can you give me any more pointers to solve this problem? I can tell you that this only happens with huge amounts of data.
The file
Datastage/Datasets/node4/test.ds.username.host.0000.0003.0000.71f6.c1c843f5.0003.551d93b2
is of size 4.9GB.
Appreciate your prompt response.
Regards
John
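Not from the thread itself, but a common diagnostic when a single dataset data file reaches several GB (as here) is to check whether a per-process file-size limit could be interfering, and to confirm the file's on-disk size; the path below is copied from the error message and the check is only a sketch:

```shell
# Per-process file size limit; "unlimited" is what you want when a
# dataset data file grows to several GB (reported units vary by shell):
ulimit -f

# Confirm the on-disk size of the failing data file (path copied from
# the error message; adjust the prefix to your install):
ls -l Datastage/Datasets/node4/test.ds.username.host.0000.0003.0000.71f6.c1c843f5.0003.551d93b2
```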
Answer Ray's question. Also, please post a copy of the configuration file you are using to run this. It would also be helpful if you could share your mountpoint configuration scheme (do you have 4 separate mountpoints? Are you running jobs on the same mountpoint?).
To find out about disk space usage, do something like: df -k
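That suggestion can be sketched against the mount paths that appear in the configuration file posted in this thread (adjust the paths to your environment):

```shell
# Free space in KB on the dataset and scratch mounts:
df -k /apps/Ascential/Projects/Datasets
df -k /apps/Ascential/Projects/tmp/Scratch

# Space consumed by each per-node dataset directory, largest first:
du -sk /apps/Ascential/Projects/Datasets/node* 2>/dev/null | sort -rn
```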
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
Hi,
All four nodes are on the same mount.
Scratch space is on a different mount, i.e. four nodes on one mount and four scratch nodes on the other.
Configuration File is as below:
{
node "node1"
{
fastname "host"
pools ""
resource disk "/apps/Ascential/Projects/Datasets/node1" {pools ""}
resource scratchdisk "/apps/Ascential/Projects/tmp/Scratch/node1" {pools ""}
}
node "node2"
{
fastname "host"
pools ""
resource disk "/apps/Ascential/Projects/Datasets/node2" {pools ""}
resource scratchdisk "/apps/Ascential/Projects/tmp/Scratch/node2" {pools ""}
}
node "node3"
{
fastname "host"
pools ""
resource disk "/apps/Ascential/Projects/Datasets/node3" {pools ""}
resource scratchdisk "/apps/Ascential/Projects/tmp/Scratch/node3" {pools ""}
}
node "node4"
{
fastname "host"
pools ""
resource disk "/apps/Ascential/Projects/Datasets/node4" {pools ""}
resource scratchdisk "/apps/Ascential/Projects/tmp/Scratch/node4" {pools ""}
}
}
The dataset mount is 75% full of 450 GB, and the scratch mount 20% full of 300 GB. Can there be any space issue?
Or is there any limit on the size of a data file of a data set?
I'm able to create the data sets, but this error message comes while joining two data sets.
Any pointers???
Thanks & Regards
John
johnman wrote: All four nodes are on the same mount.
This would hurt PX processing capability by limiting it to a single I/O system for all four nodes -- it's like going 4 ways on a single CPU, especially with moderately large datasets.
I do not know if you are aware, but you can assign multiple mountpoints to a single node:
node "node1"
{
fastname "host"
pools ""
resource disk "/MountA/Ascential/Projects/Datasets/node1" {pools ""}
resource disk "/MountB/Ascential/Projects/Datasets/node1" {pools ""}
resource scratchdisk "/MountA/Ascential/Projects/Scratch/node1" {pools ""}
resource scratchdisk "/MountB/Ascential/Projects/Scratch/node1" {pools ""}
}
johnman wrote: The mount is 75% full of 450 GB, and the scratch mount 20% of 300 GB. Can there be any space issue?
It depends on WHEN you made this observation. DataStage PX is _VERY_ good at cleaning up after itself. If the job aborted, the disk space is usually cleaned up immediately. So for the job you are getting the error on, observe the disk space WHILE the job is running.
johnman wrote: Or is there any limit on the size of a data file of a data set?
Nope. DataStage will automatically break the dataset up into multiple files within the mountpoint if necessary.
johnman wrote: I'm able to create the data sets, but this error message comes while joining two data sets.
This really suggests that it is a space issue. You have two datasets, and creating a bigger one from them requires more space. DataStage does its calculation, thinks you don't have enough space, and kicks you off.
I do hope you are putting your dataset control file in a location outside those spaces (maybe in a temporary space, or a designated write space)?
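To observe the disk space while the job is running, a minimal watch loop might look like this (the 30-second interval is an arbitrary choice; stop it with Ctrl-C):

```shell
# Sample free space on both mounts (paths from the posted config file)
# while the job runs:
while :
do
    date
    df -k /apps/Ascential/Projects/Datasets /apps/Ascential/Projects/tmp/Scratch
    sleep 30
done
```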
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
Thanks for your time.
The Space calculation I have given is when the job was running. I continuously monitored the disk space during the job run.
T.J. wrote: DataStage does its calculation, thinks you don't have enough space, and kicks you off.
Does this mean DataStage does its space calculation beforehand? Then how can I estimate the amount of space required by the job? And will the same error come when I run the job with a FileSet output instead of a DataSet?
Regards
John
johnman wrote: Does this mean DataStage does its space calculation beforehand? Then how can I estimate the amount of space required by the job? And will the same error come when I run the job with a FileSet output instead of a DataSet?
Well, no. It would be time-consuming for DataStage to read all the data in and figure out how much space is needed. I do not know how Orchestrate does it behind the scenes, but for datasets it is probably the same concept as a data file: it attempts to allocate a block and write to it, and if there is no more space, DataStage returns an error.
Hmm. Does this error happen when you run 2 nodes on the same set of data?
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
How did you create the dataset? Is it created by an upstream PX job?
There is a utility called orchadmin that can be run from the UNIX command line. You can use it to check the status of a .ds file; it has been described in other posts. If orchadmin has trouble with the .ds, then the problem isn't with the job that reads the .ds -- the problem is with the job that creates the .ds.
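As a hedged sketch of that check (command names per the Orchestrate documentation, but exact flags vary by version, and the paths below are hypothetical, not taken from this thread):

```shell
# orchadmin reads the same configuration file the jobs use:
export APT_CONFIG_FILE=/apps/Ascential/Projects/default.apt   # hypothetical path

# Describe the dataset: schema, partitioning, and the segment files
# backing it on each node:
orchadmin describe /apps/Ascential/Projects/test.ds

# If describe itself fails, suspect the job that wrote the dataset.
# Remove a damaged dataset with orchadmin (not plain rm), so the
# per-node segment files are deleted along with the control file:
orchadmin rm /apps/Ascential/Projects/test.ds
```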
- BP