Dataset read - Unknown Error Reading Data
Hi,
I created a job that reads from Oracle and writes to a dataset, and on the dataset I have set the option to sort the data. The job finished successfully and the dataset was created.
Problem: There are six partitions in the dataset, and some of them have 0 rows. When I view the dataset with all records, I can see the records. But if I try to view data from a particular partition (whether it has 0 rows or contains rows), I receive the error "Unknown error reading data".
Has anyone come across this, or can someone shed some light on it, please?
Thanks.
I don't know what is causing your error, but did you use a hash partitioning algorithm (which would explain the empty partitions)? Perhaps the view data command just doesn't like empty partitions (just like nature abhors a vacuum). Also, have you tried using the command-line "orchadmin" command to display the dataset?
Last edited by ArndW on Fri Oct 10, 2008 12:07 pm, edited 1 time in total.
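To illustrate why hash partitioning can leave partitions empty, here is a minimal Python sketch. It is not DataStage's actual hash function: CRC32 is used only as a stand-in, and the key values and row count are hypothetical. With only three distinct key values spread over six partitions, at least three partitions must end up with 0 rows.

```python
import zlib
from collections import Counter

def hash_partition(key, num_partitions):
    """Map a key to a partition number using a stable hash.

    CRC32 is an illustrative stand-in, not the hash DataStage uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partition_counts(keys, num_partitions):
    """Return the row count per partition, including empty partitions."""
    counts = Counter(hash_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]

# Hypothetical data: 1000 rows, but only three distinct key values.
keys = ["EAST", "WEST", "NORTH"] * 333 + ["EAST"]
counts = partition_counts(keys, 6)
print(counts)  # at most three of the six entries can be non-zero
```

All rows sharing a key value land in the same partition, so a low-cardinality hash key is enough to produce empty partitions.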
ArndW, to your question "did you use a hash partitioning algorithm": yes, I did use hash partitioning.
You wrote "Perhaps the view data command just doesn't like empty partitions". Do you think that would explain why viewing non-empty partitions doesn't work either?
How can I view the data then? Is there a way?
Thanks for your reply.
I would recommend trying it from the command line with "orchadmin". If that also produces an error, then it needs to be reported to IBM as a problem.
ArndW,
I am looking into reading the dataset using orchadmin. I have never used it, so I had to set some environment variables and libraries before the command started working. I still got an error. I had given the dataset file a name with a .txt extension, which may be incorrect. After looking at the dataset file name definition, I have changed the name to use a .ds extension and re-run the job. I hope it will work. I will post an update once the job finishes.
Thanks.
ArndW,
I am still getting an error from orchadmin when reading a single partition; a full read works fine.
Error:
$ orchadmin dump -n 99 -part 0 /data/dstage1/project1/dataset.ds
##I TFCN 000001 14:57:25(000) <main_program>
Ascential DataStage(tm) Enterprise Edition 7.5.1A
Copyright (c) 2004, 1997-2004 Ascential Software Corporation.
All Rights Reserved
##I TFSC 000001 14:57:25(001) <main_program> APT configuration file: /tmp/aptoa37672107f704
##E TFPM 000040 14:57:27(000) <APT_PeekOperator,1> Operator terminated abnormally: received signal SIGSEGV
##E TFPM 000338 14:57:28(000) <main_program> Unexpected exit status 1
##E TFSR 000011 14:57:38(000) <main_program> Step execution finished with status = FAILED.
##I TCOA 000049 14:57:38(001) <main_program> The dump FAILED for /data/dstage1/project1/dataset.ds.
Any ideas?
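To check whether every partition fails or only some, the same command can be repeated for each partition number. The following is a rough Python sketch, not a tested procedure: the `orchadmin dump -n 99 -part N` form is copied from the command above, the partition count of six comes from the first post, and it assumes (unverified) that orchadmin returns a non-zero exit status when the dump fails. The helper names are made up for illustration.

```python
import shutil
import subprocess

DATASET = "/data/dstage1/project1/dataset.ds"  # path taken from the post above
NUM_PARTITIONS = 6                             # six partitions, per the first post

def build_dump_command(dataset, partition, max_rows=99):
    """Build the same orchadmin invocation shown in the post for one partition."""
    return ["orchadmin", "dump", "-n", str(max_rows),
            "-part", str(partition), dataset]

def dump_each_partition(dataset, num_partitions):
    """Run the dump once per partition and collect the exit statuses.

    Assumption: a non-zero exit status indicates a failed dump.
    """
    results = {}
    for part in range(num_partitions):
        proc = subprocess.run(build_dump_command(dataset, part),
                              capture_output=True, text=True)
        results[part] = proc.returncode
    return results

if shutil.which("orchadmin") is None:
    # The required environment (PATH, libraries, APT settings) is not set up.
    print("orchadmin not found on PATH; nothing was run")
else:
    for part, code in dump_each_partition(DATASET, NUM_PARTITIONS).items():
        print(f"partition {part}: exit status {code}")
```

A per-partition list of results like this would make a compact test case to hand to the support provider.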
If you change the partitioning to "round robin" to make sure all partitions have data, does the error still persist?
Round robin "cheats", since it guarantees that all partitions have data (assuming the number of records >= the number of partitions), but it doesn't solve the cause of the problem. I think it would be a good idea to involve your support provider at this point, since you have a simple, reproducible test case on your system.
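The round robin guarantee can be sketched in a few lines of Python. This is only a model of the distribution rule (row i goes to partition i modulo the partition count), not DataStage code, and the function name is made up for illustration.

```python
def round_robin_counts(num_rows, num_partitions):
    """Count rows per partition when rows are dealt out in turn."""
    counts = [0] * num_partitions
    for row_index in range(num_rows):
        counts[row_index % num_partitions] += 1
    return counts

# With at least as many rows as partitions, no partition is empty.
print(round_robin_counts(10, 6))  # → [2, 2, 2, 2, 1, 1]

# With fewer rows than partitions, some partitions stay empty.
print(round_robin_counts(4, 6))   # → [1, 1, 1, 1, 0, 0]
```

This only changes how rows are distributed; it would hide the empty-partition case rather than explain why reading a single partition fails.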