Job abort during AGG OR SORT

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Nagaraj
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

I have the following job design:

SEQ_FILE ----> TFM1 ----> AGG ----> TFM2 ----> ORA_TGT

The sequential file has around 700,000 rows with 11 columns.

I have to group by 9 columns and apply aggregate functions (Max and First) to the 2 remaining columns.
TFM1 does no transformations; it is a straight pass-through.

In the Aggregator I have checked all the group-by columns and the functions applied.

The job fails at the AGG stage.

OPTIONS TRIED SO FAR

1. Added a Sort before the Aggregator and set the "ignore" option for the group-by columns in the AGG stage >>>> no use, it still aborts.

2. Tried sorting on the 9 columns using UNIX sort >>>> the error says "SEGMENTATION FAULT".

3. SEQ ----> AGG ----> TFM ----> SEQ
>>>> got a call from the UNIX team saying:
PROBLEM Service Alert: hostname/VAR_DISK is CRITICAL
Files occupying space
/var/tmp/stm4595752aaaaa
/var/tmp/stm4595752aaaab
/var/tmp/stm4595752aaaab
/var/tmp/stm4595752aaaab etc etc
Something dropped 100 4M files (204 total). /var/tmp

Let me know what else I can try, or how I can get this job to process and load my target table.

Errors:
Aggregator_66: %s >>>> Even after resetting the job I didn't get much more information in the log.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Your scratch (temp) disk is filling. Clean it up.

Getting more memory in the machine would help too.

It is possible to tune the memory consumption of the in-memory table created by the Aggregator stage, but I suspect you're not ready to go there yet.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Nagaraj

Post by Nagaraj »

"It is possible to tune the memory consumption of the in-memory table created by the Aggregator stage" >>>> I believe it is a memory problem. Ray, can you please let me know how to tune the memory consumption of the in-memory table?
Nagaraj

Post by Nagaraj »

Should I wait for a reply here, or should I close this thread and start a new one? Let me know.
ray.wurlod

Post by ray.wurlod »

Occasionally I like to sleep, particularly during the night.

There are some tools for tuning the Aggregator stage memory table but, if you don't have that memory, you're only going to aggravate your disk issues. I'll need to research the command - it used to be available from the DS.TOOLS menu, but I don't believe it's there any more.

Watch this space.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You should just be more patient, especially if only Ray is allowed to answer. :roll:

I've used the DS.TOOLS entry in the past for that but have no clue what the actual "tool" itself would be if it is no longer there.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Nagaraj

Post by Nagaraj »

Okay, thanks Ray and chulett. I will keep my fingers crossed.
ray.wurlod

Post by ray.wurlod »

The DS.TUNEPROPS command applies only to a server job that contains an Aggregator stage. It allows the memory allocation model for the Aggregator to be modified. The single argument for this command is the name of the job.

Code:

>DS.TUNEPROPS SummarizeProduct

Stages in job: SummarizeProduct
Stage#   Stage name.(Type)......................................
     1 = aggrSummarizeProduct (AGGREGATOR)

Properties of stage: aggrSummarizeProduct, Job: SummarizeProduct
 Prop#   Property name......  Value..............................
     1 = ReportAfterRows      100
     2 = ReportAfterTime      15
     3 = TableSize            8192
     4 = ResizeAt             2
     5 = ResizeBy             8
     6 = Resize2Threshold     524288
Enter property number:

The first two options govern the number of input rows or seconds after which the Aggregator stage is forced to update its status record in the RT_STATUSnnn table (which is viewed by the Monitor). This is necessary because potentially no rows come out of the Aggregator until all rows are in.

The TableSize property is the initial size (in bytes) of the in-memory table used to accumulate results. It must be a power of 2. When the table size is less than or equal to the number of entries times the ResizeAt figure, the table is re-hashed.

When the table is re-hashed its size is increased by ResizeBy factor, until the Resize2Threshold (default 0.5MB) is reached. After that the size of the table is doubled each time it needs to be re-hashed, irrespective of the ResizeBy setting. With the default settings, memory allocation is 8KB (initial), 64KB, 512KB, 1MB, 2MB and so on.
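
To make the growth rule concrete, here is a small Python sketch of the allocation sequence described above. It is only an illustration of the arithmetic, not DataStage code: the function name and the `steps` argument are made up for the example, and it assumes the threshold test is applied to the current table size just before each re-hash.

```python
def allocation_schedule(table_size=8192, resize_by=8,
                        resize2_threshold=524288, steps=5):
    """Return the first `steps` table sizes (in bytes) under the rule above.

    Hypothetical helper: parameter names mirror the DS.TUNEPROPS properties,
    but this only models the arithmetic described in the post.
    """
    sizes = [table_size]
    while len(sizes) < steps:
        current = sizes[-1]
        if current < resize2_threshold:
            # Below the threshold: grow by the ResizeBy factor.
            current *= resize_by
        else:
            # Threshold reached: double, irrespective of ResizeBy.
            current *= 2
        sizes.append(current)
    return sizes


# Default settings: 8KB, 64KB, 512KB, 1MB, 2MB
print(allocation_schedule())
# -> [8192, 65536, 524288, 1048576, 2097152]
```

Under the same assumption, a larger initial TableSize simply starts the sequence further along, so fewer re-hashes are needed before a given size is reached.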

Some advantage can be gained with unsorted data by increasing the initial table size, so that more memory is allocated initially. Reducing the ResizeBy factor can reduce incremental demand for memory, at the cost of taking more incremental requests for the same volume of data.

An attempt to allocate too much memory will result in a fatal error that reports an access violation at run time.

Changes effected by this command are stored in the RT_CONFIGnnn table for the job. They are lost if the job is re-compiled.

Tip: You can find out which jobs use the Aggregator stage via Usage Analysis on the Aggregator stage type.
Nagaraj

Post by Nagaraj »

"Changes effected by this command are stored in the RT_CONFIGnnn table for the job. They are lost if the job is re-compiled" >>>> Can I resize the table while the job is running, or do I need to resize it and then run the job without recompiling?
Nagaraj

Post by Nagaraj »

Increased the TableSize to 32768 while the job is running; will post the results soon.
Nagaraj

Post by Nagaraj »

No change in the job: it has 800k rows and it has been running for 4 hours!

PS: It has one SUM function and 4 MAX functions.
chulett

Post by chulett »

So... is that good?

Your best solution to this would be to properly pre-sort the data before it comes to the Aggregator and then assert that sorted order there. It will then use minimal resources for the aggregation and greatly improve the overall performance.
Nagaraj

Post by Nagaraj »

Okay, I will add a Sort stage before the AGG and sort on the key columns. In the AGG stage there are two settings on the input tab, Sort and Sort Order; I would fill them in with a number and 'ignore' respectively, and check the Key box for all the keys. Are those the correct settings?
chulett

Post by chulett »

I don't have the docs in front of me right now. You sort by your grouping keys and then number them appropriately in the Aggregator to show their sort order. Nothing else should be changed.

You can tell it's working when rows "flow through" the stage as the job runs. If it continues to hold onto all of the rows and only spits them out at the end, then you've not got it right. And if you assert the sort order incorrectly, it will abort the job.
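
As a rough illustration of why sorted input lets rows "flow through", here is a Python sketch of a group-by over pre-sorted rows. It is not how the Aggregator stage is implemented; the function name, the row layout (group keys first, then one column for Max and one for First) and the ascending-order check are all assumptions made for the example.

```python
from itertools import groupby


def aggregate_sorted(rows, key_count):
    """Yield one output row per group from rows already sorted on the keys.

    Hypothetical layout: the first `key_count` fields are the grouping keys,
    followed by a column aggregated with Max and a column aggregated with
    First. Only one group is held in memory at a time.
    """
    previous_key = None
    for key, group in groupby(rows, key=lambda row: row[:key_count]):
        # If the asserted (ascending) order is wrong, stop immediately.
        if previous_key is not None and key < previous_key:
            raise ValueError(f"input not sorted: {key!r} after {previous_key!r}")
        previous_key = key
        members = list(group)
        max_value = max(row[key_count] for row in members)
        first_value = members[0][key_count + 1]
        # The group is complete as soon as the key changes, so emit it now.
        yield key + (max_value, first_value)


rows = [
    ("A", "x", 3, "p"),
    ("A", "x", 7, "q"),
    ("B", "y", 1, "r"),
]
for out in aggregate_sorted(rows, key_count=2):
    print(out)
# -> ('A', 'x', 7, 'p')
# -> ('B', 'y', 1, 'r')
```

With unsorted input there is no way to know a group is finished until the last row has arrived, which is why every group then has to be kept in the in-memory table.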
Nagaraj

Post by Nagaraj »

Okay, got it. Will post results soon.