How and where the scratch disk is defined
How and where is the scratch disk defined?
How are the parameters defined?
The scratch filesystems used by your parallel jobs are defined within your parallel configuration files, as documented in the Parallel Job Developer's Guide section on Configuration Files.
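For reference, a minimal parallel configuration file has this shape (the hostname and paths below are placeholders, not taken from this thread):

```
{
    node "node1"
    {
        fastname "myhost"
        pools ""
        resource disk "/data/node1/datasets" {pools ""}
        resource scratchdisk "/scratch/node1" {pools ""}
    }
}
```

The `resource scratchdisk` entries are where scratch space lives; the environment variable APT_CONFIG_FILE points each job at the configuration file to use.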
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
peep wrote: I am having the same issue; scratch disk space is full.
Here is the config:
```
{
    node "node1"
    {
        fastname "edrnpr17"
        pools ""
        resource disk "/IIS/data/node1/datasets" {pools ""}
        resource scratchdisk "/IIS/data/node1/sort" {pools "sort"}
        resource scratchdisk "/IIS/data/node1/buffer" {pools "buffer"}
    }
    node "node2"
    {
        fastname "edrnpr17"
        pools ""
        resource disk "/IIS/data/node2/datasets" {pools ""}
        resource scratchdisk "/IIS/data/node2/sort" {pools "sort"}
        resource scratchdisk "/IIS/data/node2/buffer" {pools "buffer"}
    }
    node "node3"
    {
        fastname "edrnpr17"
        pools ""
        resource disk "/IIS/data/node3/datasets" {pools ""}
        resource scratchdisk "/IIS/data/node3/sort" {pools "sort"}
        resource scratchdisk "/IIS/data/node3/buffer" {pools "buffer"}
    }
}
```
- craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Note that there is no default disk pool for scratchdisk in the configuration file that Craig posted. Perhaps there should be. It depends on what DataStage is being asked to do.
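For example, adding a scratchdisk entry with the default (empty) pool alongside the named pools would give operators a default-pool fallback. The `/IIS/data/node1/scratch` path here is illustrative only:

```
node "node1"
{
    fastname "edrnpr17"
    pools ""
    resource disk "/IIS/data/node1/datasets" {pools ""}
    resource scratchdisk "/IIS/data/node1/scratch" {pools ""}
    resource scratchdisk "/IIS/data/node1/sort" {pools "sort"}
    resource scratchdisk "/IIS/data/node1/buffer" {pools "buffer"}
}
```

The second `resource scratchdisk` line, with `pools ""`, is the default pool; operators that do not request a named pool would use it.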
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
"we are working on xml files" is extremely vague as to what your job is doing. You can work on xml files with a text editor ![Smile :)](./images/smilies/icon_smile.gif)
Take time to analyze and list out the following:
1) What operations does your job perform that will cause it to use your defined scratch disks?
2) What is the volume of data you are processing (not only rows of data, but amount/number of bytes)?
3) Are the three resources defined in each node simply different directories on the same disks or are they separate disks?
4) How many other jobs are running in your environment at the same time as your job?
Are you and/or the system administrators watching disk usage while your job is running? If not, you should be, since that is when the scratch space is actually consumed. Perhaps only one of the file systems (node1, node2 or node3) is running out of space.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
1) The job has a Sort stage, a Transformer, an Aggregator, and an XML stage as well.
2) The three nodes are defined under the same parent directory (/IIS/data/node1, /IIS/data/node2, /IIS/data/node3) and are on the same disk.
3) There is no specific space allocated to the scratch disk. The data can expand up to 100 GB. (So do I need to change any settings in the DataStage clients that will let DataStage jobs use the full disk space without any restrictions?)
Ahh... now some potentially useful information. Do the scratch and buffer directories reside on the same physical disks as the disk resources (/IIS/data/node1/datasets, /IIS/data/node2/datasets and /IIS/data/node3/datasets)? Is that 100GB shared by all of these file systems? If so, then you MUST consider the size of your output datasets as part of the problem: they will affect how much storage is available for scratch usage. The same goes for any other datasets that already exist from other jobs or job runs on the same disk.
Where are your source XML files stored? On the same disk as datasets, sort and buffer scratch storage? If so, they also factor into how much space is available for scratch usage.
Standard/best practice recommends that scratch space be allocated its own disk storage when possible.
Sorts and buffers will use what space is available in the file systems listed within each logical node in your configuration file. Each logical node (i.e. job partition) will use the file systems allocated to it and not the file systems allocated to other nodes (if they are not the same names).
Sort will use (as documented): 1) sort disk pools, then 2) default disk pools (you have none defined), then 3) $TMPDIR, then 4) /tmp. Buffers will use (as documented) either 1) the default disk pools OR 2) buffer disk pools (only if defined, as in your case).
Do your error messages say anything more than "scratch space full"? Such as which scratch space (which file system)?
You're using three partitions. How evenly distributed is your data among the three partitions?
You need to work with your sys admins to monitor disk usage (at the directory level) while your job is running in order to determine what is consuming the majority of your disk space. You may simply find that you just don't have enough storage to hold everything (scratch files, datasets, etc.) at the same time on the same disk and need to add more storage.
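A minimal sketch of that kind of monitoring, using standard `du`/`df`. The directory list is a placeholder (`/tmp` here, only so the sketch runs anywhere); you would point it at your actual resource and scratchdisk paths such as /IIS/data/node1/sort:

```shell
# Sample current usage of each scratch/dataset directory, then the
# capacity of the underlying filesystem. Replace DIRS with your real
# resource disk and scratchdisk paths.
DIRS="/tmp"

for d in $DIRS; do
    du -sk "$d"          # KB currently used under each directory
done
df -kP /tmp | tail -1    # capacity and free space of the filesystem
```

Run it in a loop (e.g. `while true; do ...; sleep 10; done`) while the job executes to see which directory is growing fastest.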
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.
Hi all, thanks for your responses.
Now I know where and how the scratch disk is defined: in the configuration file pointed to by APT_CONFIG_FILE.
We ran out of disk space because our buffer disk did not have enough space during run time. We increased the space and that took care of it.
To add more information: we have now moved our scratch disk to NFS and added variables to support NFS.