Job hanging

Posted: Wed Aug 31, 2005 3:11 am
by elavenil
Hi,

I have a job that uses 9 hash files as lookups. A couple of the hash files are static and a few are dynamic. All of the dynamic hash files are very small except one.

This job usually takes less than a minute to process 5K to 10K records, but in yesterday's run it ran for more than 6 hours without finishing or aborting. It processed 2K records and then just hung. I looked at all the dynamic hash files and found that one of them had gone into overflow, so I recreated it and resubmitted the job, but the result was the same. Restarting DS did not help either. After the UNIX server was restarted, the job ran in 10 seconds.

I could not find any clue as to why the job hung. Please give me some pointers.

Also, has anyone encountered a situation like this? If so, could you share what caused the problem and how it was resolved?

Thanks in advance for any ideas/suggestions.

Regards
Saravanan

Posted: Wed Aug 31, 2005 4:12 pm
by ray.wurlod
Very hard to diagnose after the event. You have to look at things while the problem is occurring. Most probable causes are locks (one job has a record in a hashed file locked, another job is trying to use it) or system load. Sounds like locks are the most likely culprit; both the Hashed File stage and the UV stage properly set record-level update locks when writing to hashed files (though a UV stage can "promote" to a table-level lock if the number of inserts/updates is sufficiently large - which is set by the MAXRLOCK parameter in the uvconfig file).
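
To picture the lock scenario, here is a toy Python model. It is only a sketch and not DataStage or UniVerse code: the RecordLockTable class, the job names and the record key are all made up for illustration. One job takes a record-level lock and never releases it; a second job asking for the same record waits, and with a timeout it can at least report who is holding the lock instead of hanging silently.

```python
import threading
import time


class RecordLockTable:
    """Toy per-record lock table (hypothetical, for illustration only)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}  # record key -> name of the job holding the lock

    def acquire(self, key, job, timeout):
        """Try to lock `key` for `job`; give up after `timeout` seconds."""
        deadline = time.monotonic() + timeout
        while True:
            with self._guard:
                if key not in self._owners:
                    self._owners[key] = job
                    return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.01)  # wait briefly, then retry

    def release(self, key, job):
        """Release `key`, but only if `job` is the current holder."""
        with self._guard:
            if self._owners.get(key) == job:
                del self._owners[key]

    def holder(self, key):
        """Return the name of the job holding `key`, or None."""
        with self._guard:
            return self._owners.get(key)


locks = RecordLockTable()

# JobA locks a record and, in this scenario, never releases it.
locks.acquire("CUST1001", "JobA", timeout=0.1)

# JobB wants the same record. Without a timeout it would wait forever,
# which from the outside looks like a job that has simply hung.
blocked = not locks.acquire("CUST1001", "JobB", timeout=0.2)
holder = locks.holder("CUST1001")
print("JobB blocked:", blocked, "| lock held by:", holder)
```

The useful part is the last step: while the second job is stuck, the lock table still says who owns the record, which is why the locks have to be inspected during the hang rather than afterwards.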

Posted: Thu Sep 01, 2005 3:17 am
by elavenil
Thanks, Ray, for your suggestions. I do not think I can reproduce this problem. I will look into locking the next time I encounter a problem like this.

Regards
Saravanan