Write to dataset failed: File too large

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Write to dataset failed: File too large

Post by lakshya »

Hi-

One of our jobs aborts when the dataset size grows past 2 GB, throwing the following error:

CpyRecs,0: Write to dataset failed: File too large
The error occurred on Orchestrate node node2 (hostname XXX)

We have had the limits raised to the maximum for the user ID through which we run our jobs.

Current settings:
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 4194304
memory(kbytes) 999999999999
coredump(blocks) unlimited
nofiles(descriptors) 2000

The jobs are still aborting with the same error.

Has anyone faced the same problem? Can you please suggest a fix?

Please help me with this, as we have several jobs designed in the same way that will produce datasets over 2 GB in size.

Thanks in advance
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I think you need to stop and restart the DataStage server after making the ulimit change. Have you done that?
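If it helps, the restart on the server engine is usually something like the following, run from the engine account (a sketch; $DSHOME is assumed to point at the engine directory, and your install may differ):

cd $DSHOME
. ./dsenv                  # pick up the engine environment
bin/uv -admin -stop        # stop the DataStage server engine
bin/uv -admin -start       # start it again so new processes inherit the raised limits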
lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Post by lakshya »

Hi Arndw-

Yes! That has been done after the limits were changed.

Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You also need to remove the dataset and re-create it.
lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Post by lakshya »

The dataset gets deleted from the nodes as soon as the job aborts. If the job completes, it writes to the processing folder.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

OK, put a "ulimit -a" external command into your job to make sure that the background process is getting the same limits; perhaps one of your initialization scripts resets them.
gbusson
Participant
Posts: 98
Joined: Fri Oct 07, 2005 2:50 am
Location: France
Contact:

Post by gbusson »

You also have to kill all the processes owned by the user who runs the jobs.
Otherwise the new ulimit values won't be picked up!
lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Post by lakshya »

There are no processes hanging around for the user ID in question.
lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Post by lakshya »

Arndw-

Can you please tell me where/how to add the "ulimit -a" external command to my job to make sure that the background process is getting the same limits?

Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

In the job properties you can specify a before-job subroutine and I believe one of the options is ExecSh or something similar to execute a UNIX shell command.
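In other words, something along these lines in the job properties (field names from memory, so check them in your Designer):

Before-job subroutine:  ExecSH
Input value:            ulimit -a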
lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Post by lakshya »

Hi-

I ran the job after adding the "ulimit -a" command in a before-job subroutine, and am getting the following:

XXX..BeforeJob (ExecSH): Executed command: ulimit -a
*** Output from command was: ***
time(seconds) unlimited
file(blocks) 4194303
data(kbytes) 131072
stack(kbytes) 32768
memory(kbytes) 65536
coredump(blocks) 2097151
nofiles(descriptors) 2000

So the job is not picking up the changed limits, and that's why it is aborting: file(blocks) 4194303 at 512 bytes per block works out to just under 2 GB, exactly where the writes fail.

How can I fix this?

Thanks
Lakshman
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Check the dsenv script in your project directory and also the DataStage startup script in /bin for ulimit settings.
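If one of those scripts is clamping the values, the usual fix is to raise them in dsenv so every DataStage process inherits them. A sketch (whether a non-root shell is allowed to raise them this far depends on the hard limits your SA has set):

# in $DSHOME/dsenv - raise the per-process file size and data segment limits
ulimit -f unlimited
ulimit -d unlimited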
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You changed your ulimit, but not the one for the ID under which DataStage processes run. That's why Arnd had you check via a before-job subroutine. The dsenv script is executed by all DataStage processes. However, on some UNIXes, only superuser can increase ulimit - you may need to ask your System Administrator to assist.
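A quick way to see what that ID actually gets is to check its limits directly, for example (dsadm below stands in for whatever ID your DataStage daemon runs under):

su - dsadm -c "ulimit -a"    # run as root; shows the limits a fresh login shell for that ID receives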
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
gbusson
Participant
Posts: 98
Joined: Fri Oct 07, 2005 2:50 am
Location: France
Contact:

Post by gbusson »

Maybe you've not set impersonation mode!

Check it!
Otherwise, set the ulimit for dsadm.
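How you raise dsadm's limits is OS-specific; on AIX, for instance, it would be something like this as root (only a sketch, since I don't know your platform):

chuser fsize=-1 data=-1 dsadm    # -1 means unlimited in /etc/security/limits
# then bounce the DataStage engine so new processes pick up the change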
lakshya
Participant
Posts: 19
Joined: Fri Jan 21, 2005 2:39 pm

Post by lakshya »

Hi All-

Thank you very much for your inputs on this issue.

At last the jobs are able to create datasets larger than 2 GB.

Earlier we had changed the ulimit settings to the maximum for the ID through which we run our jobs, but the jobs kept aborting with the same error.
The jobs were still inheriting the default limits from the admin ID.

Now we have the administrator ID's ulimit values set to the maximum, bounced the server, and restarted the job, and it worked. The jobs finished successfully with datasets of more than 3 million rows.
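For anyone who hits the same wall, one way to sanity-check the result is to look at the data files the dataset writes on each node's resource disk; the path below is only a placeholder for whatever your configuration file names as the resource disk:

ls -l /placeholder/resource/node2    # individual data files can now grow past the old 2 GB ceiling
du -k /placeholder/resource/node2    # total size in kilobytes of that node's segment files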

Thanks again