Unknown Abort Error

msigal
Participant
Posts: 31
Joined: Tue Nov 26, 2002 3:19 pm
Location: Denver Metro

Post by msigal »

Can someone help me with this error? The job aborts with the following warning:
Abnormal termination of stage ds00548FacProfDrgBOR..XfrmCustDrug detected

Then after resetting I have another information log that states:

From previous run
DataStage Job 1615 Phantom 4549
UniVerse/SQL: 0 records deleted.
UniVerse/SQL: 1 record deleted.
jobnotify: Error 911 occurred.
DataStage Phantom Finished.
[252698] DSD.StageRun ds00548FacProfDrgBOR. ds00548FacProfDrgBOR.ds00548ContainerFacProfDrgC7.XfrmAllData 1 0/50 - terminated.

Thanks,
Myles
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Please confirm the release of DataStage, that XfrmCustDrug is a Transformer stage, and the names of any Routines called from that Transformer stage. The name of the job is ds00548FacProfDrgBOR; presumably ds00548ContainerFacProfDrgC7 is a Container (shared?). What exactly is XfrmAllData?
Somewhere a UV stage is involved?
Was there any additional information in &PH&?
How many warnings were logged by this job?
Are there any links that handle rejected rows and, if so, is there a limit to the number of rows that such links can handle?
msigal
Participant
Posts: 31
Joined: Tue Nov 26, 2002 3:19 pm
Location: Denver Metro

Post by msigal »

Ray, here's a very long reply along with the answers to your questions. Thanks again for your assistance. - Myles

We are running DS 5.2 on AIX 5. XfrmCustDrug is a Transformer stage. The routines called are all custom: genCharVal, genFin2Val, RptDistribution, and RptControls. There is one hashed file connected to this stage. The name of the job is ds00548FacProfDrgBOR, and ds00548ContainerFacProfDrgC7 is a shared container. XfrmAllData is a Transformer stage in the shared container. There is no UV stage in this job. I couldn't find any additional information in &PH&.
There was only the warning for abort about the abnormal termination of stage. We don't generally use the reject option for constraints.

And some additional information:

The source stage is a CFF stage that feeds the XfrmAllData transformer. Logic is applied to split records down one of 4 links. Link1 goes out of the container to the XfrmCustDrug transformer. Link2 goes out of the container to the XfrmCustProf transformer. Link3 stays in the container and goes to a sort stage, followed by a transformer stage that further splits the data down 2 more links coming out of the container. Link4 handles the "rejects", not by using the reject option on the constraint but by using the reverse logic of the other links.

We have many hashed files across this job and had some memory issues with the sort stage, in that we would run out of memory while buffering records. That was fixed by lowering Max Rows in virtual memory. We also learned that in 5.2 enabling or disabling preload of hashed files to memory has no effect on the allocation of memory to the job, so huge amounts of memory were being allocated to the job. We have since reduced the memory allocation in Administrator from 128 to 64.
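To make the link split in XfrmAllData concrete, here is a minimal Python sketch of the routing idea only. It is not DataStage code, and the field name `rec_type` and the three constraint functions are hypothetical stand-ins, since the job's real constraint expressions are not shown here. What it illustrates is the "reverse logic" reject link: Link4 receives a row only when none of the other three constraints is true.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]

# Hypothetical constraints; the real expressions in XfrmAllData are not known.
def is_cust_drug(row: Row) -> bool:
    return row.get("rec_type") == "DRUG"

def is_cust_prof(row: Row) -> bool:
    return row.get("rec_type") == "PROF"

def needs_sort(row: Row) -> bool:
    return row.get("rec_type") == "SORT"

# Output links in the order they are evaluated.
CONSTRAINTS: List[Tuple[str, Callable[[Row], bool]]] = [
    ("Link1", is_cust_drug),  # to XfrmCustDrug
    ("Link2", is_cust_prof),  # to XfrmCustProf
    ("Link3", needs_sort),    # to the sort stage inside the container
]

def route(rows: List[Row]) -> Dict[str, List[Row]]:
    """Send each row down every link whose constraint is true.

    Link4 is the reverse of the other constraints: it gets the row
    only if no other link accepted it.
    """
    links: Dict[str, List[Row]] = {name: [] for name, _ in CONSTRAINTS}
    links["Link4"] = []
    for row in rows:
        matched = False
        for name, constraint in CONSTRAINTS:
            if constraint(row):
                links[name].append(row)
                matched = True
        if not matched:
            links["Link4"].append(row)
    return links
```

Calling `route` on a list of rows returns a dictionary keyed by link name, so the row count per link can be compared against what the job monitor reports.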

Reordering the links seems to have some effect on the error message received. XfrmCustDrug is the first link. When another link is first, it is named in the error message instead. When I moved the sort stage to the first output link of the XfrmAllData transformer, the problem stopped. There are approximately 170k records passing through the XfrmAllData transformer, of which 38k go to the sort stage.

I tried to look up the error number, 911, and wasn't successful. When looking at the job monitor in Director, it appears that the later stages finish, except the sort. My wild hypothesis is that the XfrmAllData transformer wants to finish and attempts to "classify" the first link out as finished, but it can't because the sort stage is still working.

Edited by - msigal on 12/16/2002 07:07:52