Heap Size Allocation

kalimuddin · Post by **kalimuddin** » Mon Nov 13, 2006 3:56 am

Hello,
I have a job that reads from two Teradata stages and does an inner join on them, then a left outer join with a dataset, then a lookup against a dataset with the lookup condition set to Continue. After that a Transformer does some transformation work, an Aggregator does some aggregation, and the result is finally written to a sequential file.

One of the Teradata source tables has 67 million records and the other has 8 million. While reading from the 67 million record table, the job stops after 66 million records have flowed through and gives me a heap size allocation failed error.

I have tried all the steps described in the forums. I have set everything to unlimited and I have tried using Sort stages as well, but nothing works: the job stops with the error at exactly the 66 millionth record. Is there any other way I can resolve this issue? I am waiting for a reply that can resolve it.

Thanks & Regards,
kalim.

salil · Post by **salil** » Mon Nov 13, 2006 4:29 am

Can you tell us the average size of a single record when it reaches the Aggregator?
Any idea how much system memory is available?
Are the inputs partitioned in each stage?
How many nodes are specified in the parameter file?

kalimuddin · Post by **kalimuddin** » Mon Nov 13, 2006 4:51 am

It stops at the source level itself, after the 66 millionth record, with the data just moving to the Join.
When I check with ulimit -m it shows unlimited.
Yes, the input is partitioned at the Join stage, and Perform Sort is checked.
An 8-node configuration is given in the parameter.
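
On the ulimit check above: `ulimit -m` reports only the resident set size limit, while heap allocations are typically bounded by the data segment (`ulimit -d`) and virtual memory (`ulimit -v`) limits. The following is a minimal Python sketch, assuming a Unix-like host, that lists these limits side by side. It only reflects the process it runs in, so it should be run as the same user, and ideally from the same environment, that starts the job. Python reports the values in bytes, whereas the shell builtin usually reports kilobytes.

```python
import resource

# Limits that commonly bound process memory, with the matching ulimit flag.
LIMITS = {
    "data segment (ulimit -d)": "RLIMIT_DATA",
    "address space (ulimit -v)": "RLIMIT_AS",
    "resident set (ulimit -m)": "RLIMIT_RSS",
    "stack (ulimit -s)": "RLIMIT_STACK",
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def memory_limits():
    """Return {label: (soft, hard)} for each limit this platform defines."""
    out = {}
    for label, name in LIMITS.items():
        const = getattr(resource, name, None)
        if const is None:
            continue  # this limit is not available on the current platform
        soft, hard = resource.getrlimit(const)
        out[label] = (fmt(soft), fmt(hard))
    return out


if __name__ == "__main__":
    for label, (soft, hard) in memory_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

If any of these shows a finite soft limit while `ulimit -m` shows unlimited, that limit is worth raising before looking elsewhere.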

salil · Post by **salil** » Mon Nov 13, 2006 6:04 am

Try putting a Sort stage upstream of the Join instead of the tsort, and assess the change. Also ensure that the data is hash partitioned, and try interchanging the input streams of the Join (left to right).
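
To illustrate why hash partitioning on the join key matters, here is a small Python sketch. It is a generic illustration, not DataStage's actual partitioner: the hash function (`zlib.crc32`), the sample rows, and the field name `id` are all assumptions made for the example. The point is that when both inputs are partitioned with the same hash on the same key, matching rows always land in the same partition, so each node can join its own slice independently.

```python
import zlib
from collections import defaultdict


def partition_of(key, num_partitions):
    """Map a key to a partition number with a deterministic hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def hash_partition(rows, key_field, num_partitions):
    """Group rows (dicts) by the partition their key hashes to."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_of(row[key_field], num_partitions)].append(row)
    return partitions


# Hypothetical sample rows standing in for the two join inputs.
left = [{"id": i, "l": f"L{i}"} for i in range(10)]
right = [{"id": i, "r": f"R{i}"} for i in range(0, 10, 2)]

left_parts = hash_partition(left, "id", 8)
right_parts = hash_partition(right, "id", 8)

# Join each partition on its own, as each node would.
joined = []
for p, l_rows in left_parts.items():
    r_by_key = {r["id"]: r for r in right_parts.get(p, [])}
    for l_row in l_rows:
        if l_row["id"] in r_by_key:
            joined.append({**l_row, **r_by_key[l_row["id"]]})

print(sorted(row["id"] for row in joined))
```

If the two inputs were partitioned differently (for example one round robin and one hash), matching keys could sit on different nodes and the per-partition join would miss them.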

samsuf2002 · Post by **samsuf2002** » Mon Nov 13, 2006 10:11 am

Try what salil suggested. If that doesn't help, try it with a Lookup stage.

kalimuddin · Post by **kalimuddin** » Tue Nov 14, 2006 10:59 pm

I tried that but it doesn't work. I also tried a Sort stage. I changed the partitioning to "set" in both Teradata stages and in the first Join stage, set the partitioning in the second Join stage to propagate, used a 16-node config file, and also used APT_DISABLE_COMBINATION in the parameters, but I am still getting the same error. I checked the hard and soft limits and everything is unlimited. If you have any better idea, or any changes required in the job, kindly help me resolve this.

thebird · Post by **thebird** » Tue Nov 14, 2006 11:08 pm

Have you tried sorting the data inside Teradata?

Try sorting the data inside Teradata, feed it to a Sort stage with the option inside set to Don't Sort, and then try it out.

Regards

The Bird
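
The reasoning behind this suggestion can be sketched in Python. The code below is a generic sort-merge inner join, not DataStage's Join operator, and the sample tuples are hypothetical. It assumes both inputs already arrive sorted ascending by the join key, which is what sorting in the database and marking the data as previously sorted is meant to guarantee. Under that assumption the join only has to hold one key group from the right input in memory at a time, rather than buffering or re-sorting the whole stream.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables that are already sorted ascending by key."""
    left_groups = groupby(left, key)
    right_groups = groupby(right, key)
    lk, lrows = next(left_groups, (None, None))
    rk, rrows = next(right_groups, (None, None))
    while lrows is not None and rrows is not None:
        if lk < rk:
            lk, lrows = next(left_groups, (None, None))
        elif lk > rk:
            rk, rrows = next(right_groups, (None, None))
        else:
            # Only the current right-hand key group is buffered.
            rbuf = list(rrows)
            for l_row in lrows:
                for r_row in rbuf:
                    yield l_row, r_row
            lk, lrows = next(left_groups, (None, None))
            rk, rrows = next(right_groups, (None, None))


# Hypothetical pre-sorted inputs: (key, value) tuples.
left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

for pair in merge_join(left, right):
    print(pair)
```

If the input is not actually sorted, a join like this silently drops matches, which is why the sort order coming out of the database has to agree with the join key.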

kalimuddin · Post by **kalimuddin** » Tue Nov 14, 2006 11:20 pm

How do I sort data inside Teradata when it is a source? I only have the Output tab, where no Partitioning tab is available. When the stage is a target we can see the Input tab and the Partitioning tab, where we can perform a sort. Kindly reply.

thebird · Post by **thebird** » Wed Nov 15, 2006 12:15 am

You will have to give a suitable Order By clause in the SQL query so that the data gets sorted inside Teradata.

The Bird

kalimuddin · Post by **kalimuddin** » Tue Nov 21, 2006 2:46 am

I tried it (ORDER BY column name) within the Teradata stage, but nothing worked. I even tried a job with just one Teradata stage and a dataset, with no other stages, but when I run it I get the same error at the 66 millionth record. Is there anything else I should try?

Nageshsunkoji · Post by **Nageshsunkoji** » Tue Nov 21, 2006 3:53 am

Hi Kalimuddin,

I think the problem is at the Join stage, where DataStage inserts a tsort operator by default. One thing you can do is put a Sort stage before the Join stage, perform hash partitioning there, and set the environment variable APT_NO_SORT_INSERTION to true. This variable stops the default sort insertion. One more thing: you have 66 million records in the source, and if you are sorting that many records, check the scratch disk space, which is dedicated to sorting, and increase it if it is not sufficient.
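
A quick way to check the scratch disk point is sketched below in Python. The record count comes from this thread, but the average record size (200 bytes) and the path are placeholders, since neither was given: substitute the scratchdisk paths from your own configuration file and a measured record size. The estimate is only a rough lower bound (records times average size) and ignores any sort overhead.

```python
import shutil
import tempfile


def estimated_sort_bytes(record_count, avg_record_bytes):
    """Rough lower bound on the space needed to spill a full sort to disk."""
    return record_count * avg_record_bytes


def scratch_report(paths, record_count, avg_record_bytes):
    """Return (bytes needed, {path: free bytes}) for the given scratch paths."""
    needed = estimated_sort_bytes(record_count, avg_record_bytes)
    free = {path: shutil.disk_usage(path).free for path in paths}
    return needed, free


if __name__ == "__main__":
    # Placeholder path and record size; both are assumptions for the example.
    paths = [tempfile.gettempdir()]
    needed, free = scratch_report(paths, 66_000_000, 200)
    print(f"estimated sort volume: {needed / 1e9:.1f} GB")
    for path, free_bytes in free.items():
        print(f"{path}: {free_bytes / 1e9:.1f} GB free")
```

With several nodes each scratch disk only has to hold its own partition's share, and paths on the same filesystem share the same free space, so compare the numbers with that in mind.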

kalimuddin · Post by **kalimuddin** » Tue Nov 21, 2006 5:05 am

OK, let's forget the Join stage. I have just a Teradata stage and a dataset: I am reading from Teradata and writing to the dataset, and doing nothing else. Here too I get the heap size error after the 66 millionth record, while reading itself. I have written the ORDER BY clause and the partition flag is set.

DSXchange
