80004005 error now occurring. Looking for ideas...
Our production ETL has begun aborting when scheduled. I can run the same ETL manually and it succeeds without error. It has been running for months without issue and no changes have been made since long before this began occurring. So I am confident this is not a problem with Ascential or the ETL -- but clearly something has changed. It aborts in the same job each time, although the 3rd run aborted in a different job, while still connecting to the same server. It has been doing this for about two weeks now.
Here's what I've already done and what I'm going to do:
I have run SQL Profiler on the source system (the job that aborts is one extracting data from this server). The log gives no indication that it even knew an attempt was made. This is what I actually expected given the 80004005 general connection failure.
I am going to try scheduling it at other times to see if there is some time-related cause.
Other ideas and suggestions for how to get additional info that may help me identify the cause are appreciated.
Thanks.
If you google the error code, the results point towards a few things:
1) Connection not available
2) Login info incorrect
Try investigating those. You are right: it's not a DS problem but rather a database issue.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
The 80004005 means that the server/database wasn't reachable. It is a catch-all error in that, whatever doesn't fit into one of the specific errors (e.g., 80040e4d login Failed for User '(Null)', 80040e09 access denied, etc.) ends up with an 80004005. Basically, it means "we could not connect and we haven't a clue why not."
Unfortunately, I'm more familiar with this error than I would like to be. I was just hoping someone had some secret trick that would help me look deeper into what is happening with DataStage and maybe that would give me a clue as to what is preventing the connection. Since the connection never gets made, the server can't really tell me why it didn't succeed.
Thanks for the reply.
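For anyone sorting these codes out of a job log, here is a minimal Python sketch of the distinction described above: the two specific codes named in this thread are looked up first, and 80004005 is reported as the catch-all. The function name and the description strings are my own, chosen for illustration; only the three codes come from the thread.

```python
# Specific OLE DB error codes mentioned in this thread.
KNOWN_ERRORS = {
    0x80040E4D: "authentication failed (login failed)",
    0x80040E09: "permission denied",
}

# Generic "unspecified error" code, used when nothing more specific applies.
E_FAIL = 0x80004005


def describe_hresult(code):
    """Return a short description for a code given as an int or a hex string."""
    if isinstance(code, str):
        code = int(code, 16)
    # Some tools print these codes as negative 32-bit integers; normalise them.
    code &= 0xFFFFFFFF
    if code in KNOWN_ERRORS:
        return KNOWN_ERRORS[code]
    if code == E_FAIL:
        return "unspecified failure (catch-all)"
    return "unknown code"
```

Calling `describe_hresult("80004005")` returns the catch-all description, which is a reminder that the code by itself carries no detail about why the connection failed.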
If the connection is not available from the database itself then there is not much that DataStage can do.
You mentioned that "Our production ETL has begun aborting while scheduled". That is a continuous-tense phrase, so it has been aborting at the scheduled time more than once. Have you asked your DBA whether the server was available at that time?
I am an admin for both the source server and the DataStage server so I have already confirmed that it is available.
There are more recent releases of this same ETL which are running on a different DataStage server (our dev/test server), though scheduled to run about an hour later in the morning. The dev and test ETL continue to succeed without any issues while the production ETL (running on the production server) fails. The job that aborts in the scheduled production run is exactly the same as the corresponding job that runs in the dev and test ETLs -- no changes have been made to that job. Yet both dev and test succeed without error and production fails.
I just rebooted the production server and have scheduled the ETL to run in the next while. So, we'll see if that makes any difference.
I really hate things that just stop working for no apparent reason.
I do appreciate the input.
How are you controlling the process? Is it via a Master Control Sequence or a batch job? How are you providing the parameters? Look into the parameter file to make sure it has the most current username and password. Go into the job log and verify whether the user name and password passed to the jobs are correct.
The ETL is controlled by a master sequence and job parameters are passed from there on down to the main sequence and other jobs and stages. The parameter values have not changed (they were entered during the scheduling of the job months ago). Other jobs within this ETL that access the same server succeed while this one does not. Although, I did have it abort once (the 3rd time) on a different job. All other times it has aborted on the same job.
Although, you do give me something to think about. I wonder if somehow this particular job has lost its ability to receive the parameters and so fails? As mentioned earlier, it works fine when I run it manually.
I'm curious to see what happens with the run I just scheduled. Since I just reentered the parameters (most are defaulted) it will be interesting to see if it works or not. Maybe all I need to do is reschedule the ETL.
We'll see.... Thanks.
kris007 wrote:
In that case, wouldn't it fail when run manually?
As per the OP, the jobs finished successfully when run manually, but they failed only when scheduled.
Not necessarily. An individual job can have its own parameters hard-coded by the developer. There can be human mistakes.
As mentioned in my last post, the scheduled parameters are the same ones that have been running successfully for months. And they are the same entered/default parameters used when I run it manually. There are only three entered parameters; the rest all use their defaults.
A follow-up: The ETL I scheduled today ran successfully to completion. I have rescheduled the ETL for 10 minutes later (2:10am rather than 2:00am) to see if the rescheduling will cause it to now run successfully tomorrow morning.
Interesting oddity.
Just wanted to wrap this thread up...
I still don't know why -- can't seem to find any cause -- but it turns out to be something related to the time. I moved the schedule time from 2:00am to 2:10am and it succeeds without issue. I've been poring through the logs and have yet to see anything that tells me why access by this job to this server fails at that time of the morning.
BUT, I learned a lesson: Don't discount the time as a non-issue. Next time, I will try rescheduling it sooner. Just never guessed that would be the problem.
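One way to gather evidence about a time-related cause is to probe the source server from the DataStage machine around the scheduled time and log every attempt with a timestamp. Below is a minimal Python sketch of that idea. It only tests whether a TCP connection can be opened, so it says nothing about logins or locks, and the host name, port, and log path you pass in are your own values (nothing here is taken from DataStage itself).

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Try to open one TCP connection; return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False, f"{type(exc).__name__}: {exc}"


def probe_loop(host, port, attempts, interval_s, log_path):
    """Append one timestamped result line per attempt to log_path."""
    with open(log_path, "a", encoding="utf-8") as log:
        for i in range(attempts):
            ok, detail = probe(host, port)
            status = "OK" if ok else "FAIL"
            stamp = datetime.now().isoformat()
            log.write(f"{stamp} {host}:{port} {status} {detail}\n")
            log.flush()
            if i < attempts - 1:
                time.sleep(interval_s)
```

For example, something like `probe_loop("source-server", 1433, 40, 30, "probe.log")` started at 1:55am would cover 2:00am through 2:10am. The host name is hypothetical, and 1433 is only the usual default port for SQL Server, so check what your server actually listens on. If the probes fail in the same window as the job, the problem is likely at the network or server level rather than in the job.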
Thanks for the update. I was wondering whether you had been able to solve this or not.
So a ten-minute difference did the trick, huh?
But stuff like this doesn't happen without a reason. Maybe, in your spare time, go into the &PH& directory and see what messages were created by the stages in that particular job around that time. I don't know, just a little bit of extra research.
Well, I am glad you got through the issue.
Guess the server was on a tea break at 2.
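If it helps with that extra research, here is a small Python sketch for narrowing down which files in a directory such as &PH& were written around the failure time. It is a generic file listing by modification time, not a DataStage feature; the function name is made up for illustration, and the directory path is whatever your project uses.

```python
import os
from datetime import datetime


def files_modified_between(directory, start, end):
    """Return (mtime, name) pairs for files whose modification time lies
    between start and end (inclusive), oldest first."""
    hits = []
    for entry in os.scandir(directory):
        if entry.is_file():
            mtime = datetime.fromtimestamp(entry.stat().st_mtime)
            if start <= mtime <= end:
                hits.append((mtime, entry.name))
    return sorted(hits)
```

Passing the project's &PH& path with a window of, say, 1:55am to 2:15am on a morning when the job aborted would give a short list of files worth opening.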
What else was happening in the database at 2:00am? In another project I was caught by something like this - "they" ran a series of batch updates about which they'd neglected to inform us. DataStage job just hung waiting for table-level locks to be released.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.