Hi
Are there any clues as to what causes Job Monitor port failures?
Can anybody share ideas?
Thanks in advance
Raj
Job Mon Port
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Not with so little information.
Can you please post actual error/warning messages? Some idea of your setup would be useful too - for example what might be trying to monitor jobs? MetaStage? Or are you working in a cluster, such that exchanges between player processes on different nodes and between player, section leader and conductor processes must occur using TCP?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ray,
ray.wurlod wrote: Not with so little information.
Can you please post actual error/warning messages? Some idea of your setup would be useful too - for example what might be trying to monitor jobs? MetaStage? Or are you working in a cluster, such that exchanges between player processes on different nodes and between player, section leader and conductor processes must occur using TCP?
Initially we ran the job in a 4-node configuration (clustered) and then
went back to a 2-node configuration (SMP).
The first run after the change gave fatal errors:
1) Error when checking operator: temp.dst has 4 partitions, but only 2 are
accessible from the nodes in the configuration file.
2) Error when checking operator: The dataset will not be deleted.
3) Could not check all operators because of previous error(s)
4) temp.dst could not be deleted
On the next run, after I deleted the dataset, the job ran through, but it gave a warning on the job monitor port:
Failed to connect to JobMonApp on port 13401
Thanks
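The first set of errors is consistent with temp.dst having been written under the 4-node configuration: its descriptor still points at four partitions, and the 2-node configuration file can only reach two of them. One way to clean this up is to delete the data set while the configuration file it was created with is active. The following is a minimal sketch under that assumption. The function name and both arguments are hypothetical, the path to the 4-node configuration file is not given in this thread, and it assumes `orchadmin rm` is available in your parallel engine environment.

```shell
# Delete a parallel data set using the configuration file it was written with.
# Hypothetical helper: neither the function name nor the paths come from the thread.
remove_dataset_with_config() {
    config_file=$1   # e.g. the 4-node configuration file (placeholder)
    dataset=$2       # descriptor file of the data set, e.g. temp.dst

    if [ ! -f "$config_file" ]; then
        echo "configuration file not found: $config_file" >&2
        return 1
    fi

    # APT_CONFIG_FILE is set only for this one command, so the
    # 2-node configuration stays in effect for everything else.
    APT_CONFIG_FILE=$config_file orchadmin rm "$dataset"
}
```

Called as `remove_dataset_with_config /path/to/four_node.apt temp.dst`, where the configuration path is a placeholder for wherever your original file lives.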
-
- Premium Member
- Posts: 385
- Joined: Tue Oct 07, 2003 4:55 am
Hi,
I have had a very long correspondence with Ascential support regarding this error.
I use Solaris and have already received 2 patches that made things better, but still not perfect.
The monitor is located in $APT_ORCHHOME/java
and there are logs you can look at there.
The ports that the monitor uses are listed in
$APT_ORCHHOME/etc/jobmon_ports
The defaults are 13400 and 13401.
If you know of any other application that uses those ports, it is best to change them to avoid collisions.
If you get the message:
Failed to connect to JobMonApp on port 13401
it is best to start the service:
Code:
$APT_ORCHHOME/java/jobmoninit start
HTH,
Amos
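To avoid starting the monitor blindly, it may help to test first whether anything is answering on the port. The sketch below is one possible way to do that. It assumes bash (it relies on bash's /dev/tcp redirection), the function names are made up for illustration, and 13401 is simply the default port mentioned above, so pass a different port if your jobmon_ports file has been changed.

```shell
# Return 0 if something accepts TCP connections on host:port, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device; under plain sh it will always report "closed".
port_open() {
    host=$1
    port=$2
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Start the job monitor only if its port is not answering.
# Hypothetical wrapper around the jobmoninit command quoted in this thread.
start_jobmon_if_down() {
    port=${1:-13401}
    if port_open localhost "$port"; then
        echo "JobMonApp port $port is answering"
    else
        echo "JobMonApp port $port is not answering, starting the monitor"
        "${APT_ORCHHOME}/java/jobmoninit" start
    fi
}
```

Run `start_jobmon_if_down` with no argument to use the default port. Note that an answering port only shows that some process is listening there. If another application has taken the port, this check cannot tell the difference, which is the collision case described above.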
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Thank you for the detailed solution. So it appears that the job monitor had not been started? (Should've thought of that; the first question from any support analyst is "is it switched on?")
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
There is a known issue with JobMonApp related to jobs crashing in an unusual way: for some reason, such a crash takes JobMonApp down with it.
There are a number of patches available but, as Amos mentioned, they do not appear to resolve the problem.
Go to $APT_ORCHHOME/java and take a look at the last few lines of the latest log. I bet it will be similar to the messages I have been getting on my AIX box.
I am still trying to nail this bug, but lately JobMonApp has been behaving, and that corresponds to developers improving their job designs and getting normal aborts instead of hard crashes.
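Looking at the last few lines of the latest log can be scripted roughly as below. This is only a sketch: the function name is hypothetical, and the `*.log` default pattern is an assumption, since the thread does not say what the monitor's log files are called. Adjust the pattern to whatever you actually find in the directory.

```shell
# Print the last lines of the most recently modified file matching a pattern.
# Hypothetical helper; the default pattern *.log is an assumption.
tail_latest_log() {
    dir=$1
    pattern=${2:-*.log}
    lines=${3:-20}

    # ls -t sorts newest first; $pattern is left unquoted on purpose
    # so that the shell expands it inside the directory.
    latest=$(ls -t "$dir"/$pattern 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no files matching $pattern in $dir" >&2
        return 1
    fi

    echo "== $latest"
    tail -n "$lines" "$latest"
}
```

For example, `tail_latest_log "$APT_ORCHHOME/java"` prints the name of the newest matching file followed by its last 20 lines.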