Join_48,1: Caught exception from runLocally(): APT_BadAlloc

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


partha.natta
Premium Member
Posts: 32
Joined: Tue Mar 09, 2010 5:56 am
Location: Bangalore


Post by partha.natta »

Hi,
We are getting this error in a DataStage parallel job on the Windows platform.
We read an 8 GB file with a Sequential File stage and split the records into two flows.
In one flow we assign inrownum to each record in a Transformer stage.
That stage has to run in sequential mode because we need a distinct inrownum for every record.
After that we join the two flows, but after some time the job aborts with the error below.

Error Message:
=============

Join_48,1: Caught exception from runLocally(): APT_BadAlloc: Heap allocation failed..
Join_48,1: The runLocally() of the operator failed.
Thanks & Regards,
Partha
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Welcome to DSXChange. When signing up, the suggestion is to try the search engine first to see whether the same problem has already been posted and solved.

I just did an exact search for "APT_BadAlloc: Heap allocation failed" and got 63 hits. Did you try different keywords?

You have run out of some sort of space on your system. Most likely your data is not sorted on the join key (or, if it is, DataStage doesn't know about it), and the two separate concurrent sorts of 8 GB of data are blowing up the disk(s) backing your scratch area.

When reading a sequential file, you can have the Sequential File stage assign a row number (unsigned int64 is the data type, I believe). You can then put in a Sort stage that sorts on this column but specifies that the data is already sorted. Now DataStage "knows" about the sort order, and later on, if the join uses the same column, no sort is inserted: your job will not only not abort, it will also run much, much faster.
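DataStage jobs are built in the Designer rather than written as code, but the reason a join on pre-sorted keys needs no scratch space can be sketched in a few lines of Python (purely illustrative, not a DataStage API): a sort-merge join over two inputs that are already ordered on the key streams through both sides once, holding only the current record from each.

```python
def merge_join(left, right, key=lambda r: r[0]):
    """Inner join of two iterators ALREADY sorted on unique join keys.
    Only one record per side is held in memory at a time, which is why
    telling the engine the data is pre-sorted avoids the big on-disk
    sorts a join would otherwise have to insert."""
    left, right = iter(left), iter(right)
    l, r = next(left, None), next(right, None)
    while l is not None and r is not None:
        if key(l) < key(r):
            l = next(left, None)       # left key too small, advance left
        elif key(l) > key(r):
            r = next(right, None)      # right key too small, advance right
        else:
            yield l, r                 # keys match, emit the joined pair
            l, r = next(left, None), next(right, None)

# Both inputs sorted on a synthetic row-number key:
a = [(1, "a"), (2, "b"), (3, "c")]
b = [(2, "x"), (3, "y"), (4, "z")]
print(list(merge_join(a, b)))  # [((2, 'b'), (2, 'x')), ((3, 'c'), (3, 'y'))]
```

If the inputs were not sorted, the only options would be sorting them first (disk-heavy, as in this job) or building an in-memory lookup on one side (heap-heavy, which is roughly what an APT_BadAlloc points at).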
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Plus, you don't need to run sequentially as long as you partition suitably (at least I would hope the row number is based on the source row and not the partition row...)
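The point about suitable partitioning can be sketched in Python too (again illustrative, not DataStage's implementation): if both inputs are hash-partitioned on the join key, equal keys always land in the same partition, so each partition pair can be joined independently and in parallel, and the union of the results matches the unpartitioned join.

```python
def hash_partition(rows, nparts, key=lambda r: r[0]):
    """Distribute rows into nparts buckets by hashing the join key.
    Equal keys always hash to the same bucket."""
    parts = [[] for _ in range(nparts)]
    for row in rows:
        parts[hash(key(row)) % nparts].append(row)
    return parts

def inner_join(left, right, key=lambda r: r[0]):
    """Simple hash join of one partition pair."""
    index = {}
    for r in right:
        index.setdefault(key(r), []).append(r)
    return [(l, r) for l in left for r in index.get(key(l), [])]

left = [(i, "L%d" % i) for i in range(10)]
right = [(i, "R%d" % i) for i in range(5, 15)]

nparts = 4
lp = hash_partition(left, nparts)
rp = hash_partition(right, nparts)
partitioned = [pair for i in range(nparts)
               for pair in inner_join(lp[i], rp[i])]

# Per-partition joins produce exactly the same matches:
assert sorted(partitioned) == sorted(inner_join(left, right))
```

The caveat in the post still applies: this only works if the row number is assigned from the source row, so the key means the same thing on both sides regardless of how the flows were partitioned.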