Dataset Write Failure

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Dataset Write Failure

Post by just4geeks »

I am getting the following fatal errors (in order) while loading data into a dataset.

1. SRC_oe_order_header_all,0: Write to dataset on [fd 24] failed (Success) on node node1, hostname lxdscon.beckman.com
2. SRC_oe_order_header_all,0: Orchestrate was unable to write to any of the following files:
3. SRC_oe_order_header_all,0: /dstage1/Server/Datasets/Data_frm_OAGCRD.txt.dsadm.lxdscon.beckman.com.0000.0000.0000.5ba6.c9920787.0000.246beea0
4. SRC_oe_order_header_all,0: Block write failure. Partition: 0
5. SRC_oe_order_header_all,0: Failure during execution of operator logic.
6. SRC_oe_order_header_all,0: Fatal Error: File data set, file "/dstage1/store/Data_frm_OAGCRD.txt".; output of "SRC_oe_order_header_all": DM getOutputRecord error.
7. node_node1: Player 1 terminated unexpectedly.
8. main_program: APT_PMsectionLeader(1, node1), player 1 - Unexpected exit status 1.
9. main_program: Step execution finished with status = FAILED.

I read related previous posts in the forum and did the following research.

1. I was running the job as isadmin.

2. Checked permissions on the folders where datasets are saved.
We have set the following directories for the datasets and the scratch disk:

resource disk "/dstage1/Server/Datasets"
resource scratchdisk "/dstage1/Server/Scratch"


We have full permissions on 'Server', where all the datasets are stored. 'store' is another folder, where we save output files; it also has full permissions.

drwxrwxrwx 4 root root 4096 Feb 12 14:56 Server
drwxrwxrwx 6 root root 4096 Feb 13 10:18 Projects
drwxrwxrwx 4 root root 4096 Feb 29 19:11 store

drwxrwxrwx 2 root root 4096 Feb 22 15:16 Scratch
drwxrwxrwx 2 root root 4096 Feb 29 19:11 Datasets
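
For what it's worth, the data files in the error message are created as dsadm, so a direct write test as that user is a quick sanity check (assuming su rights; the .writetest name is just illustrative):

su - dsadm -c "touch /dstage1/Server/Datasets/.writetest && rm /dstage1/Server/Datasets/.writetest"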


3. Checked available space using the df command. We have plenty of space left in '/dstage1', where the data is stored.

[isadmin@lxdscon dstage1]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VG00-LogVol00 3.1G 175M 2.8G 6% /
/dev/cciss/c0d0p1 190M 13M 169M 7% /boot
none 3.8G 0 3.8G 0% /dev/shm
/dev/mapper/VG01-LogVol00 29G 13G 15G 47% /dstage1
/dev/mapper/VG00-LogVol01 6.0G 4.1G 1.7G 72% /home
/dev/mapper/VG00-LogVol05 10G 7.6G 1.9G 81% /opt
/dev/mapper/VG00-LogVol02 3.1G 54M 2.9G 2% /tmp
/dev/mapper/VG00-LogVol03 10G 2.5G 7.0G 26% /usr
/dev/mapper/VG00-LogVol04 3.1G 99M 2.9G 4% /var
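
A quick way to confirm that the resource disk really lives on this filesystem (assuming APT_CONFIG_FILE points at the active configuration file) would be:

cat $APT_CONFIG_FILE
df -h /dstage1/Server/Datasets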


4. Checked file size limits using the ulimit -a command:

[isadmin@lxdscon dstage1]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 1024
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 131071
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
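
One caveat: these are the limits for my interactive isadmin shell; the PX player processes may inherit different limits from whatever environment started the engine. Checking them as the dsadm user (again assuming su rights) would look like:

su - dsadm -c "ulimit -a"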


Any thoughts?

Thanks in advance!
Attitude is everything....
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Are you trying to write a dataset? If so, change the extension from .txt to .ds.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

kumar_s wrote:Are you trying to write a dataset? If so, change the extension from .txt to .ds.
Thanks! I did try the .ds extension earlier, but the results were the same. I just tried .txt out of curiosity.
Attitude is everything....
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Try writing an empty dataset file with a .ds extension using the current configuration.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

kumar_s wrote:Try writing an empty dataset file with a .ds extension using the current configuration.
Did that again, with the same results:
1. SRC_oe_order_header_all,0: Write to dataset on [fd 23] failed (Success) on node node1, hostname lxdscon.beckman.com
2. SRC_oe_order_header_all,0: Orchestrate was unable to write to any of the following files:
3. SRC_oe_order_header_all,0: /dstage1/Server/Datasets/data_frm_oagcrd1.ds.dsadm.lxdscon.beckman.com.0000.0000.0000.1071.c9925ba0.0000.4bb0671b
4. SRC_oe_order_header_all,0: Block write failure. Partition: 0
5. SRC_oe_order_header_all,0: Failure during execution of operator logic.
6. SRC_oe_order_header_all,0: Fatal Error: File data set, file "/dstage1/store/data_frm_oagcrd1.ds".; output of "SRC_oe_order_header_all": DM getOutputRecord error.
7. node_node1: Player 1 terminated unexpectedly.
8. main_program: APT_PMsectionLeader(1, node1), player 1 - Unexpected exit status 1.
9. main_program: Step execution finished with status = FAILED.
Attitude is everything....
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Does your user ID have enough privileges? Try this from the command prompt:

cd $DSHOME/bin
UV
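
As a sketch, assuming a default engine install where the dsenv file sits in $DSHOME (sourcing it first sets up the library paths the engine shell needs):

cd $DSHOME
. ./dsenv
bin/uv

If the engine shell starts cleanly, your ID can at least attach to the engine.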

Else try with the dsadm user ID.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

kumar_s wrote:Does your user ID have enough privileges? Try this from the command prompt:

cd $DSHOME/bin
UV

Else try with the dsadm user ID.
As you can see from my first post, I have all the privileges. I tried with both the dsadm and isadmin user IDs. Please let me know if anything else can be looked at; I can provide other details as well.
Where do I run the $DSHOME/bin command? I execute that to invoke dssh for viewing and clearing locks, etc. What do I do after running cd $DSHOME/bin and UV?
Attitude is everything....
Ananda
Participant
Posts: 29
Joined: Mon Sep 20, 2004 12:05 am

Post by Ananda »

I faced the same issue. The resolution was to delete old datasets and free up some space. The job then ran fine.

tCopy,0: Write to dataset on [fd 8] failed (Success) on node node1, hostname mphewddes001
tCopy,0: Orchestrate was unable to write to any of the following files:
tCopy,0: /node1/res/DS_C_MP_MSTR_PFL.ds.dsadm.dstaged1.hew.us.ml.com.0000.0000.0000.ffe.cd913cf2.0000.e4187cb3
tCopy,0: Block write failure. Partition: 0
tCopy,0: Failure during execution of operator logic.
tCopy,0: Fatal Error: File data set, file "/cedp_data/cedpor/mstr_pfl/datasets/DS_C_MP_MSTR_PFL.ds".; output of "APT_TransformOperatorImplV0S9_ext_C_MP_MSTR_PFL_tCopy in tCopy": DM getOutputRecord error.
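
One note if you go this route: remove datasets with orchadmin rather than with a plain rm on the .ds descriptor, so the per-node data files (where the real space is) are deleted too. For example, with the descriptor from my log:

orchadmin rm /cedp_data/cedpor/mstr_pfl/datasets/DS_C_MP_MSTR_PFL.ds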
If you don't fail now and again, it's a sign you're playing it safe.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Space: the final frontier.

:lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago

Post by sureshreddy2009 »

I have one possible cause: are you using the same dataset name in two different jobs?

Or have you already loaded data into a sequential file whose name is exactly what you gave the dataset (name.ds), and are now reusing that same name for the dataset?
Thanks
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Run the job to reproduce the error. Immediately go to UNIX and do an "ls -al" on your dataset descriptor file ("/dstage1/store/Data_frm_OAGCRD.txt"), then run 'orchadmin ll /dstage1/store/Data_frm_OAGCRD.txt' as well. Does the data file named in the error message (the one ending in the long hex identifier) actually exist in the Datasets directory? How big is it?
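
In command form, using the paths from the first post (and assuming APT_CONFIG_FILE is set so orchadmin can locate the configuration):

ls -al /dstage1/store/Data_frm_OAGCRD.txt
orchadmin ll /dstage1/store/Data_frm_OAGCRD.txt
ls -al /dstage1/Server/Datasets | grep Data_frm_OAGCRD

The existence and sizes of those files should tell you whether the segment file was ever created.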