Job Mon Port

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

grb_garre
Participant
Posts: 17
Joined: Wed Jan 19, 2005 10:49 pm

Post by grb_garre »

Hi


Are there any clues as to what causes Job Monitor port failures?
Can anybody share some ideas?


Thanks in advance
Raj
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Not with so little information.

Can you please post actual error/warning messages? Some idea of your setup would be useful too - for example what might be trying to monitor jobs? MetaStage? Or are you working in a cluster, such that exchanges between player processes on different nodes and between player, section leader and conductor processes must occur using TCP?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
grb_garre
Participant
Posts: 17
Joined: Wed Jan 19, 2005 10:49 pm

Post by grb_garre »

Ray,

Initially we ran the job in a 4-node configuration (clustered) and then
went back to a 2-node configuration (SMP).
The first run after that gave fatal errors:

1)Error when checking operator: temp.dst has 4 partitions, but only 2 are
accessible from the nodes in the configuration file.
2)Error when checking operator: The dataset will not be deleted.
3)Could not check all operators because of previous error(s)
4)temp.dst could not be deleted

For the next run, I deleted the dataset and ran the job again. It went through, but it gave a warning about the job monitor port:

Failed to connect to JobMonApp on port 13401

Thanks
Amos.Rosmarin
Premium Member
Posts: 385
Joined: Tue Oct 07, 2003 4:55 am

Post by Amos.Rosmarin »

Hi,

I have had a very long correspondence with Ascential support regarding this error.
I am on Solaris and have already received 2 patches that made things better, but still not perfect.

The monitor is located in $APT_ORCHHOME/java
and there are logs you can see there.

The ports that the monitor is using are in
$APT_ORCHHOME/etc/jobmon_ports

The defaults are 13400 and 13401.
If you know of any other application that uses those ports, it is best to change them to avoid collisions.

If you get the msg:
Failed to connect to JobMonApp on port 13401

it is best to start the service:

$APT_ORCHHOME/java/jobmoninit start
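
Before starting the service it can help to see whether anything is already answering on the monitor ports. The following is only a rough sketch of that check, not a vendor procedure. It assumes bash (it relies on bash's /dev/tcp redirection), that the monitor runs on the local host, and that the ports are the defaults quoted above. The function names and the JOBMON_HOST variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: check the job monitor ports and start the monitor if the
# port does not answer. JOBMON_HOST and all function names are hypothetical.

JOBMON_HOST="${JOBMON_HOST:-127.0.0.1}"   # assumption: monitor runs locally

# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# Uses bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Print one "port N: open" or "port N: closed" line for each port given.
report_ports() {
    local port
    for port in "$@"; do
        if port_open "$JOBMON_HOST" "$port"; then
            echo "port $port: open"
        else
            echo "port $port: closed"
        fi
    done
}

# Run the start command from this post only when the port does not answer.
ensure_jobmon() {
    local port="${1:-13401}"
    if port_open "$JOBMON_HOST" "$port"; then
        echo "something is already listening on port $port"
    else
        "$APT_ORCHHOME/java/jobmoninit" start
    fi
}

# Example calls (commented out so that sourcing this file changes nothing):
# report_ports 13400 13401
# ensure_jobmon 13401
```

Note that an open port only shows that some process is listening there. It does not prove that the listener is JobMonApp, which is exactly the collision case described above.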

HTH,
Amos
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Thank you for the detailed solution. So it appears that the job monitor had not been started? (I should have thought of that. The first question from any support analyst is "is it switched on?")
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

There is a known issue with JobMonApp related to a job crashing in an unusual way: for some reason, the crashing job takes JobMonApp down with it.

There are a number of patches available, but again, as Amos mentioned, they do not appear to fully resolve the problem.

Go to $APT_ORCHHOME/java and take a look at the last few lines of the latest log. I bet it will be similar to the messages I have been getting on my AIX box.
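
One rough way to pull up those last few lines is sketched below. It assumes the monitor's log files end in .log, which is a guess rather than something confirmed in this thread, so adjust the pattern to whatever is actually in the directory. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: show the tail of the most recently modified log file.
# The "*.log" naming pattern is an assumption; function names are hypothetical.

# Print the newest file matching *.log in the given directory (nothing if none).
latest_log() {
    ls -t "$1"/*.log 2>/dev/null | head -n 1
}

# Print the last N lines (default 20) of the newest log in the directory.
tail_latest_log() {
    local dir="$1"
    local lines="${2:-20}"
    local log
    log="$(latest_log "$dir")"
    if [ -z "$log" ]; then
        echo "no *.log files found in $dir" >&2
        return 1
    fi
    echo "== $log =="
    tail -n "$lines" "$log"
}

# Example call (commented out so that sourcing this file changes nothing):
# tail_latest_log "$APT_ORCHHOME/java" 20
```

Running this right after a job crash makes it easier to compare the monitor's final messages against the time the job went down.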

I am still trying to nail this bug, but lately JobMonApp has been behaving, and that corresponds to developers improving their job designs so that jobs end in normal aborts instead of hard crashes.