
Import/delete hangs in DataStage Designer

Posted: Fri Jul 04, 2014 2:28 pm
by SachinCho
Hi,
We are on v9.1.2 and trying to import a few jobs into our pre-prod environment after version control. The issue is that whenever we import a pre-existing job, the process hangs at the "Clearing job" window. Nothing happens after that and I need to kill the session through Task Manager. If I try to delete a job from Designer, the result is the same: it hangs. One observation is that a brand-new job imports successfully.

What we have done so far (command sketches after this list):
1) Executed SyncProject.sh, which reported a few corrupted jobs. We removed them using IBM's standard steps: via bin/uvsh, then deleting the RT_* files, and then the RIDs.
2) Ran DS.CHECKER; a few old files popped up asking to be deleted. We deleted those.
3) Executed DS.REINDEX ALL.
4) Ran DSImport.sh from ASBNode/bin; this command executes fine, and the job imports as many times as we want in overwrite mode.
5) We have a services tier, and our two DSEngines are hosted on one server. We are facing this issue on only one of the DSEngines; the other DSEngine works fine.
6) On the remote machine where the DS clients are installed, I can see some exceptions in a file called orbtrc.02072014.1541.36.txt: "org.omg.CORBA.OBJECT_NOT_EXIST: SERVANT_NOT_FOUND (4) for key"
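
For reference, roughly how we ran the repair tools. The paths assume a default /opt/IBM/InformationServer install, and the SyncProject flag names are from memory of the 9.1 utility, so check the tool's own usage output before running anything:

    # Report on, then repair, repository inconsistencies for one project
    cd /opt/IBM/InformationServer/ASBNode/bin
    ./SyncProject.sh -ISFile islogin.txt -project MYPROJECT -Report
    ./SyncProject.sh -ISFile islogin.txt -project MYPROJECT -Fix

    # Rebuild the project indices from inside the project directory
    . $DSHOME/dsenv
    cd /opt/IBM/InformationServer/Server/Projects/MYPROJECT
    $DSHOME/bin/uvsh "DS.REINDEX ALL"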

Any pointers ??

Thanks,
Sachin

Posted: Sat Jul 05, 2014 7:11 am
by ArndW
That is an odd problem. From the Designer/Director, can you manually delete one of the jobs that causes the hang, and then successfully import it? That might rule out internal locking issues. On a related note, did you reboot the server(s) after doing all that maintenance work?

Posted: Sat Jul 05, 2014 10:52 pm
by SachinCho
Hi,
We are not able to delete a job that is causing the problem from Designer/Director; that process hangs as well.

And I missed mentioning in my earlier post that we have already rebooted the DSEngine, and WAS on the services tier as well.

One more observation: we noticed a similar CORBA object-not-found exception in the startup logs on the services tier as well.

Posted: Sun Jul 06, 2014 6:43 am
by ArndW
In order to simplify the analysis, can you rename the job, or does that hang as well?

Posted: Sun Jul 06, 2014 4:08 pm
by SachinCho
Hi,
We are not able to rename it either. I just noticed the error below after the process hung. Rename/delete/import all fail:

IBM.DataStage.RepositoryAccess.ReposAccessException: Error opening file EDW_JPR_DWM_PRODUCT_DIM

We are talking to the official service provider as well, but haven't got any concrete solution yet.

Posted: Mon Jul 07, 2014 5:55 am
by ArndW
You have problems in your repository structure; it is still corrupted. I think your support provider might suggest a complete export, project deletion and re-creation, and I would recommend that as well. Even though it is a lot of work, the total effort will be less than spending a long time trying to find and fix the problem.
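
If it comes to that, the full-project export can be done from the istool command line. A rough sketch, assuming a default client install and istool syntax as I remember it for 9.1 (the asset-path quoting is fussy, so verify against the documentation):

    # Export every DataStage asset in the project to an .isx archive
    cd /opt/IBM/InformationServer/Clients/istools/cli
    ./istool export -domain services_host:9443 -username isadmin -password secret \
        -archive /tmp/MYPROJECT_full.isx -datastage '"engine_host/MYPROJECT/*/*.*"'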

Posted: Mon Jul 07, 2014 12:29 pm
by PaulVL
Look at your group permissions under the project directories.

Ensure that someone is not holding locks on anything under HA_STATUS as well.
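
Something like this shows both quickly; the HA_STATUS location varies by install, hence the find (substitute your own paths):

    # Group ownership/permissions on the project tree
    ls -ld /opt/IBM/InformationServer/Server/Projects/MYPROJECT

    # Locate HA_STATUS, then look for stale files and open file handles
    find /opt/IBM/InformationServer -type d -name HA_STATUS
    ls -l <HA_STATUS_dir>
    fuser <HA_STATUS_dir>/* 2>/dev/null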

Posted: Wed Jul 09, 2014 5:49 pm
by SachinCho
Hi Guys,
Thanks for your responses so far. We had a look at HA_STATUS as well and did not find anything there except a single old file; that was deleted too! The directories under the projects directory also look good in terms of group permissions etc.

Still talking to the service provider as well.

Posted: Sun Jul 20, 2014 5:38 pm
by SachinCho
Hi Guys,
It looks like we have found the root cause of this issue.

We have an SFTP script scheduled in crontab that polls for files 24/7 and internally uses the "at" command. It looks like the at daemon had failed, and because of that a huge backlog of pending jobs built up, growing by roughly 15 to 20K entries every day; when we ran at -l, it listed about 290,000 pending jobs.
In discussion with the official service provider, we concluded that a DataStage import of a job already existing in the repository scans this at -l listing, and with such a huge number of entries it was hanging.
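
For anyone hitting the same thing, counting the backlog is a one-liner (atq is equivalent to at -l):

    atq | wc -l    # ~290000 here, which pointed to a dead at daemon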

Steps followed to resolve (sketch below):
1. Cleaned up all pending jobs in the at -l listing
2. Started the at daemon again
3. DataStage import/delete now works fine
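
Roughly what steps 1 and 2 looked like; the daemon restart command is platform-dependent (shown here assuming a Linux host with service scripts):

    # Remove every pending at job; atq prints the job number in the first column
    for j in $(atq | awk '{print $1}'); do atrm "$j"; done

    # Restart the at daemon (e.g. on RHEL)
    service atd restart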

Questions unanswered at the moment:
1. Why does the DataStage job import process scan these pending at jobs?

I am marking the topic as resolved.

Thanks !