Heap Size Allocation

kalimuddin
Participant
Posts: 28
Joined: Tue Aug 08, 2006 8:07 am

Heap Size Allocation

Post by kalimuddin »

Hello,
I am using two Teradata stages joined with an inner join, followed by a left outer join with a dataset and then a lookup against a dataset (the lookup failure condition is Continue). After that I do some transformation work in a Transformer, then some aggregation, and finally write the result to a sequential file.

At the source, one of the Teradata tables has 67 million records and the other has 8 million. While reading from the 67-million-record table, the job stops after 66 million records have flowed and gives me "heap size allocation failed".

I have tried all the steps described in the forums. I have set everything to unlimited and I have also tried using Sort stages, but nothing works: it stops at exactly the 66 millionth record with the same error. Is there any other way I can resolve this issue? Waiting for a reply that can resolve it.

Thanks & Regards,
kalim.
salil
Participant
Posts: 46
Joined: Thu Oct 13, 2005 5:41 am

Re: Heap Size Allocation

Post by salil »

Can you tell us the average size of a single record when it reaches the Aggregator?
Any idea about the system memory?
Are the inputs partitioned in each stage?
How many nodes are specified in the parameter file?
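
To show why the record size and node count matter, here is a rough back-of-the-envelope sketch in Python. The 67 million record count comes from the original post; the 200-byte average record size and the 4-node count are assumed placeholders, and the even split across nodes is an assumption that real partitioning may not satisfy.

```python
def per_node_volume(record_count, avg_record_bytes, nodes):
    """Return (total_bytes, bytes_per_node), assuming an even split across nodes."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    total = record_count * avg_record_bytes
    return total, total / nodes


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / (1024 ** 3)


if __name__ == "__main__":
    # 200 bytes per record and 4 nodes are placeholders, not values from the thread.
    total, per_node = per_node_volume(67_000_000, 200, 4)
    print(f"total: {to_gib(total):.2f} GiB, per node: {to_gib(per_node):.2f} GiB")
```

If the per-node figure comes anywhere near the memory available to a single process, any stage that buffers its input is a plausible place for an allocation failure.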
A printer consists of 3 main parts: the case, the jammed paper tray and the blinking red light.
kalimuddin
Participant
Posts: 28
Joined: Tue Aug 08, 2006 8:07 am

Re: Heap Size Allocation

Post by kalimuddin »

salil wrote:Can you tell us the average size of a single record when it reaches the Aggregator?
Any idea about the system memory?
Are the inputs partitioned in each stage?
How many nodes are specified in the parameter file?
It stops at the source level itself, after the 66 millionth record, while the data is just moving to the join.
When I check with ulimit -m, it shows unlimited.
Yes, at the Join stage the input is partitioned and Perform Sort is checked.
An 8-node configuration is given in the parameter.
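
One thing worth noting about the check above: in common shells, `ulimit -m` reports the maximum resident set size, while heap allocations are more usually bounded by the data segment (`ulimit -d`) and virtual memory (`ulimit -v`) limits. The following Python sketch prints all of them using the standard `resource` module. It is only a sketch: it must run as the same user and in the same environment as the DataStage job to be meaningful (an assumption about how the engine inherits limits), and it does not prove that the error comes from these limits at all.

```python
import resource

# Limits that commonly bound memory growth on Unix-like systems.
# Some constants are missing on some platforms, hence the getattr below.
LIMITS = {
    "data segment (ulimit -d)": "RLIMIT_DATA",
    "resident set (ulimit -m)": "RLIMIT_RSS",
    "virtual memory (ulimit -v)": "RLIMIT_AS",
    "stack (ulimit -s)": "RLIMIT_STACK",
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def report_limits():
    """Return a list of (label, soft, hard) tuples for the limits above."""
    rows = []
    for label, name in LIMITS.items():
        const = getattr(resource, name, None)
        if const is None:
            rows.append((label, "n/a", "n/a"))
            continue
        soft, hard = resource.getrlimit(const)
        rows.append((label, fmt(soft), fmt(hard)))
    return rows


if __name__ == "__main__":
    for label, soft, hard in report_limits():
        print(f"{label:28s} soft={soft:>12s} hard={hard:>12s}")
```

Values are reported in bytes, whereas the shell's `ulimit` usually prints kilobytes for these limits.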
salil
Participant
Posts: 46
Joined: Thu Oct 13, 2005 5:41 am

Re: Heap Size Allocation

Post by salil »

Try putting an explicit Sort stage upstream of the Join, instead of relying on the inserted tsort, and assess the change. Also, ensure that the data is hash partitioned, and try interchanging the input streams of the Join (left and right).
samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

Post by samsuf2002 »

Try what salil suggested. If that doesn't help, try a Lookup stage instead.
hi sam here
kalimuddin
Participant
Posts: 28
Joined: Tue Aug 08, 2006 8:07 am

Post by kalimuddin »

samsuf2002 wrote:Try what salil suggested. If that doesn't help, try a Lookup stage instead.
I tried, but it doesn't work. I also tried with a Sort stage. I changed the partitioning to "set" in both Teradata stages and in the first Join stage, and in the second Join stage I set the partitioning to propagate. I have used a 16-node config file, and I have also added APT_DISABLE_COMBINATION to the job parameters. But I am still getting the same error. I checked the hard and soft limits, and everything is unlimited. If you have any better ideas, or if any changes are required in the job, kindly help me resolve it.
thebird
Participant
Posts: 254
Joined: Thu Jan 06, 2005 12:11 am
Location: India
Contact:

Post by thebird »

Have you tried sorting the data inside Teradata?

Try sorting the data inside Teradata, feed it to a Sort stage with the option inside it set to Don't Sort, and then try it out.

Regards

The Bird
kalimuddin
Participant
Posts: 28
Joined: Tue Aug 08, 2006 8:07 am

Post by kalimuddin »

thebird wrote:Have you tried sorting the data inside Teradata?

Try sorting the data inside Teradata, feed it to a Sort stage with the option inside it set to Don't Sort, and then try it out.

Regards

The Bird
How do I sort data inside Teradata when it is a source? There I only have the Output tab, and no Partitioning tab is available. When it is a target, we can see the Input tab and the Partitioning tab, where we can perform a sort. Kindly reply.
thebird
Participant
Posts: 254
Joined: Thu Jan 06, 2005 12:11 am
Location: India
Contact:

Post by thebird »

kalimuddin wrote:How do I sort data inside Teradata when it is a source? There I only have the Output tab, and no Partitioning tab is available. When it is a target, we can see the Input tab and the Partitioning tab, where we can perform a sort. Kindly reply.

You will have to give a suitable Order By clause in the SQL query so that the data gets sorted inside Teradata.
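
As a hedged illustration of what such a query might look like, the Python sketch below builds a SELECT statement with an ORDER BY clause that could be pasted into the stage as user-defined SQL. The table and column names are hypothetical, and the sketch assumes the ORDER BY columns are the join keys. Whether the order survives a parallel read from Teradata is not confirmed anywhere in this thread.

```python
def build_sorted_select(table, columns, order_by):
    """Build a SELECT statement with an ORDER BY clause."""
    if not columns or not order_by:
        raise ValueError("columns and order_by must not be empty")
    return (
        f"SELECT {', '.join(columns)} "
        f"FROM {table} "
        f"ORDER BY {', '.join(order_by)}"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(build_sorted_select(
        "sales_db.orders",
        ["order_id", "cust_id", "amount"],
        ["cust_id"],
    ))
```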

The Bird
kalimuddin
Participant
Posts: 28
Joined: Tue Aug 08, 2006 8:07 am

Post by kalimuddin »

thebird wrote:You will have to give a suitable Order By clause in the SQL query so that the data gets sorted inside Teradata.

I tried it (ORDER BY column name) within the Teradata stage, but nothing worked. I even tried a job with just one Teradata stage and a dataset, with no other stages, but when I run it I get the same error at the 66 millionth record. Is there anything else I should try?
Nageshsunkoji
Participant
Posts: 222
Joined: Tue Aug 30, 2005 2:07 am
Location: pune
Contact:

Post by Nageshsunkoji »

kalimuddin wrote:I tried it (ORDER BY column name) within the Teradata stage, but nothing worked. I even tried a job with just one Teradata stage and a dataset, with no other stages, but when I run it I get the same error at the 66 millionth record. Is there anything else I should try?

Hi Kalimuddin,

I think the problem is at the Join stage, where DataStage inserts a tsort operator by default. One thing you can do is put a Sort stage before the Join stage, hash partition the data there, and set the environment variable APT_NO_SORT_INSERTION to true; this variable stops the default sort insertion. One more thing: you have 66 million records at the source, and if you are sorting that many records, check the scratch disk space, which is dedicated to sorting, and increase it if it is not sufficient.
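
The scratch disk check can be sketched in Python with the standard `shutil.disk_usage` call. The path `"."` below is only a stand-in for the real scratch disk directory named in the configuration file, and the 200-byte record size is a guess. A sort can also need more scratch space than the raw data volume, so treat the result as a lower bound.

```python
import shutil


def scratch_headroom(path, required_bytes):
    """Return (free_bytes, enough); enough is True if free space covers required_bytes."""
    free = shutil.disk_usage(path).free
    return free, free >= required_bytes


if __name__ == "__main__":
    # Assumed figures: 66 million records at a guessed 200 bytes each.
    required = 66_000_000 * 200
    # "." is a placeholder; substitute the scratch disk path from the config file.
    free, enough = scratch_headroom(".", required)
    print(f"free={free} required={required} enough={enough}")
```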
NageshSunkoji

If you know anything SHARE it.............
If you Don't know anything LEARN it...............
kalimuddin
Participant
Posts: 28
Joined: Tue Aug 08, 2006 8:07 am

Post by kalimuddin »

Nageshsunkoji wrote:I think the problem is at the Join stage, where DataStage inserts a tsort operator by default. One thing you can do is put a Sort stage before the Join stage, hash partition the data there, and set the environment variable APT_NO_SORT_INSERTION to true; this variable stops the default sort insertion. One more thing: you have 66 million records at the source, and if you are sorting that many records, check the scratch disk space, which is dedicated to sorting, and increase it if it is not sufficient.


OK, let's forget the Join stage. Say I have just a Teradata stage and a dataset: I am reading from Teradata and writing to the dataset, and doing nothing else. Here too I get the heap size error after the 66 millionth record, during the read itself. I have written an ORDER BY clause and the partition flag is set.