Error in join stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Error in join stage

Post by somu_june »

Hi,

I have a job that reads data from three source tables using the DB2 API stage, and a Join stage that inner-joins the three tables on the common column RCA and writes the join output to a dataset. The job runs fine when the join output is 974197 records, but when the join output is 58597217 records it loads only 18680751 records and the job aborts with the following error:

Join_Src_Tables,3: Write to dataset failed: File too large
The error occurred on Orchestrate node node4 (hostname d3crs40)
Join_Src_Tables,3: Orchestrate was unable to write to any of the following files:
Join_Src_Tables,3: /DataStage/751A/Ascential/DataStage/Datasets/Src_File_Cpr.txt.Raya1.d3crs40.0000.0003.0000.9f68.c790f261.0003.fe80ef89
Join_Src_Tables,3: Block write failure. Partition: 3
Join_Src_Tables,3: Fatal Error: File data set, file "{0}".; output of "APT_JoinSubOperatorNC(1) in Join_Src_Tables": DM getOutputRecord error.
buffer(3),3: Error in writeBlock - could not write 16


Can anyone tell me why I am getting this error? Is it due to a lack of buffer for the Join stage? If that is the case, can someone tell me how to increase the buffer?

Thanks,
SomaRaju.
somaraju
kura
Participant
Posts: 21
Joined: Sat Mar 20, 2004 3:43 pm

Re: Error in join stage

Post by kura »

This error occurs when there is a limit on the size of file that can be created. Can you check with your Unix admin what the file size limit is for the user that runs this job?
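
A quick way to check (a minimal sketch, assuming a ksh-style shell; run it as, or su to, the user that actually executes the job):

ulimit -a    # show all per-process limits for the current user
ulimit -f    # just the maximum size of file this user's processes may create ("unlimited" means no cap)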

somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Re: Error in join stage

Post by somu_june »

Hi Kura,

I typed the ulimit -a command and got the output below.

time<seconds> unlimited
file<blocks> unlimited
data<kbytes> 131072
stack<kbytes> 32768
memory<kbytes> 32768
coredump<blocks> unlimited
nofiles<descriptors> 2000

file<blocks> is unlimited. I already posted in the forum about a "file full" warning, but I think the problem might be that the Join stage is not able to handle a larger number of records. I checked disk space and everything else. I think this problem is due to the Join stage and not the file limits.

Thanks,
SomaRaju.
somaraju
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Have you checked the disk space utilization of /DataStage/751A/Ascential/DataStage/Datasets/ and of the temp directory that is used for intermediate results?
Also note that your data limit is 131072 KB.
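
Something along these lines could be run on the server to check (a sketch; the Datasets and Scratch paths are the ones from this thread, while /tmp and the TMPDIR variable are assumptions about where temporary files are written):

df -k /DataStage/751A/Ascential/DataStage/Datasets   # free space where the dataset data files land
df -k /DataStage/751A/Ascential/DataStage/Scratch    # free space for sort/buffer spill files
df -k ${TMPDIR:-/tmp}                                # temp area, if a separate one is used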
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Perhaps you can try increasing all three factors (data, stack and memory) to unlimited, if your Unix admin permits.
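
For example (a sketch, assuming a ksh-style shell; a soft limit can only be raised up to the hard limit, so on many systems the admin has to raise the hard limits first, e.g. in /etc/security/limits on AIX):

ulimit -d unlimited   # data segment size
ulimit -s unlimited   # stack size
ulimit -m unlimited   # resident memory size
ulimit -a             # confirm the new values

These would typically go into the profile of the user that runs the jobs, so that the limits are inherited by the job processes.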
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

Hi kumar_s,

I checked the disk space utilization of /DataStage/751A/Ascential/DataStage/Datasets and got:

Filesystem 512-blocks Free %used Iused %Iused Mountedon
/dev/Datastage1 60030976 23932992 61% 226555 4% /DataStage

I don't know which temp directory you are referring to. Is it the Scratch space? For the Scratch disk:

Filesystem 512-blocks Free %used Iused %Iused Mountedon
/dev/Datastage1 60030976 23932568 61% 226555 4% /DataStage

If the temp directory is different from the Scratch space, can you tell me where I can find that temp directory?


Thanks,
SomaRaju.
somaraju
somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

Hi ,

I am getting the above error when I use a Join stage, but when I use a Lookup stage the job works fine. I am getting this error even after increasing the scratch disk space. Are there any limitations in the Join stage, or do I need to increase a buffer size in Administrator for the Join stage to handle large data? I am using an inner join.

Thanks,
SomaRaju.
somaraju
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Do you have write permission to /DataStage/751A/Ascential/DataStage/Datasets?
You might also check /DataStage/751A/Ascential/DataStage/Scratch.
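
For example (a sketch, run as the user that executes the job; perm_test.tmp is just a hypothetical throw-away file name used to confirm write access):

ls -ld /DataStage/751A/Ascential/DataStage/Datasets
ls -ld /DataStage/751A/Ascential/DataStage/Scratch
touch /DataStage/751A/Ascential/DataStage/Datasets/perm_test.tmp && echo "write OK"
rm /DataStage/751A/Ascential/DataStage/Datasets/perm_test.tmp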
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

Hi Ray,

I went to the Datasets directory /DataStage/751A/Ascential/DataStage/Datasets and these are the permissions I found:

-rwxrwx--- 1 Raju0212 dstage 9788336 Mar 8 2006 lookuptable.20060308.ijemh5c*
-rwxrwx--- 1 Raju0212 dstage 9788336 Mar 8 2006 lookuptable.20060308.lsd1o2d*

I went to the Scratch directory /DataStage/751A/Ascential/DataStage/Scratch and found read and write permission for my id:

-rw------- 1 Raju0212 dstage 0 Feb 21 11:03 c34has34.00000000000000da
-rw------- 1 Raju0212 dstage 2222104 Feb 28 11:48 c34has34.00000000000003


Do I need execute permission on the scratch disk for my userid and group?


Thanks,
SomaRaju
somaraju
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

In UNIX, execute permission on a directory allows you to use that directory name in a pathname, so yes, you need execute permission. The permissions on Datasets are OK. The permissions on Scratch are not.

However, the error message in your original post was unable to write to Datasets, so permissions may or may not be the issue. Was the job executed by Raju0212 or a member of the dstage group? Can you verify that the user has read and execute permissions to all directories in the pathname to Datasets?
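
One way to check every directory level in the pathname (a sketch, assuming a POSIX shell, run as the job user Raju0212):

for d in / /DataStage /DataStage/751A /DataStage/751A/Ascential \
         /DataStage/751A/Ascential/DataStage \
         /DataStage/751A/Ascential/DataStage/Datasets
do
    ls -ld "$d"                                                # mode, owner and group of each level
    [ -r "$d" ] && [ -x "$d" ] || echo "missing read or execute permission on $d"
done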
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Mike3000
Participant
Posts: 24
Joined: Mon Mar 26, 2007 9:16 am

Very professional

Post by Mike3000 »

Thank you guys for a very professional and informative discussion of this topic. I've learned lots of new stuff.
somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

HI Ray,

The jobs are executed by Raju0212, and these are the permissions on the two directories:

drwxrwxr-x 2 dsadm dstage 326144 May 20 17:10 Datasets/

drwxrwxr-x 2 dsadm dstage 46080 May 19 17:09 Scratch/


One more thing: I am able to join records and the job runs fine up to 974597 records; when there are more than that, the job aborts with the error. If the Scratch space does not have execute permission, how are those 974597 records successful? I am not able to join a large amount of data.

Thanks,
SomaRaju
somaraju
sanjay
Premium Member
Posts: 203
Joined: Fri Apr 23, 2004 2:22 am

Post by sanjay »

Somu

Try to split it into 2 jobs, i.e. instead of 3 joins keep 2 joins, and in the next job use the result dataset to join the 3rd.

Sanjay

somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

Hi Sanjay,

I split the job into two jobs, taking the first job's output as one of the inputs to the second and performing the join there, but I am getting an error like this:

Errors :

Join_Src_Tables,0: Write to dataset failed: No space left on device
The error occurred on Orchestrate node node1 (hostname d3crs40)

Join_Src_Tables,0: Orchestrate was unable to write to any of the following files:

Join_Src_Tables,0: /DataStage/751A/Ascential/DataStage/Datasets/Src_File_Cable.txt.Raju0212.d25was39.0000.0000.0000.e1a0.c81b41a4.0000.d43d544e

Join_Src_Tables,0: Block write failure. Partition: 0

Join_Src_Tables,0: Failure during execution of operator logic.

Join_Src_Tables,0: Fatal Error: File data set, file "{0}".; output of "APT_JoinSubOperatorNC in Join_Src_Tables": DM getOutputRecord error.

buffer(1),0: Error in writeBlock - could not write 21872

buffer(1),0: Failure during execution of operator logic.

buffer(1),0: Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.

node_node1: Player 3 terminated unexpectedly.

buffer(0),0: Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.


Thanks,
SomaRaju.
somaraju
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What do you understand "no space left on device" to mean?

Time to do some housekeeping in the file systems, methinks.
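
A sketch of that sort of housekeeping check (df/du show where the space has gone; the orchadmin step is an assumption about the usual way to remove parallel datasets cleanly, so verify the syntax on your installation rather than deleting the data files by hand):

df -k /DataStage                                        # how full is the file system?
du -sk /DataStage/751A/Ascential/DataStage/Datasets     # space taken by dataset data files
du -sk /DataStage/751A/Ascential/DataStage/Scratch      # space taken by scratch files
# then remove datasets that are no longer needed via their .ds descriptors, e.g.
# orchadmin rm /path/to/unused_dataset.ds               # assumed syntax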
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.