Job Corruption problem

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

sankarsadasivan
Premium Member
Posts: 39
Joined: Tue Dec 23, 2003 3:47 am
Location: India

Job Corruption problem

Post by sankarsadasivan »

We have a problem in production..

All of a sudden, one of the DS jobs is not running properly.

It has an Oracle -> hashed file stage, followed by lookups and a sequential file, and finally an update/insert Oracle stage. It's a fairly big job.

The job was hanging initially while creating the hashed file. We then cleared the
existing hashed file on Unix and renamed the hashed file in the job; now the hashed file gets created, but the job does not progress any further. It just stands still for 12 hrs, whereas it normally finishes in 3-4 hrs.

Director shows 0 rows, Designer shows the links as blue, and there are no Oracle sessions at all.

We tried all the following
1. Recompiled the job
2. Renamed the job
3. Saved a copy, renamed and compiled
4. Deleted the job from the server and imported it again
5. Imported it under a different name

Nothing worked!!

When we log into the project through Director, we frequently get this error message:
Cannot open executable job file RT_CONFIG1219

We cleared the job's logs, status file, and resources through Unix and Director. We also imported the job under a different name; again, the new job hangs.

All the other jobs are running fine.
Totally clueless...

Any ideas?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

A job that just hangs there forever without doing anything is most likely waiting on a lock - any playing around with "kill -9" or other hanky-panky with the engine process might lead to a lock left over that isn't going to get cleared. It looks like the lock is on the hashed file, which is why running a different copy of the job isn't making a difference.

I would recommend you start your deadlock daemon to clear this up, or, better yet, stop and restart DataStage to make sure your locks are "clean".
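For reference, a typical stop/restart sequence on the server host looks like the sketch below. This is an illustration only: the `$DSHOME` path shown is a placeholder that varies by install, and it assumes the standard `uv -admin` engine wrapper shipped with Server Edition. Run it as root, with all client connections closed.

```shell
# All commands run as root on the DataStage server host.
cd /opt/dsengine        # placeholder for your $DSHOME directory
. ./dsenv               # source the engine environment variables

bin/uv -admin -stop     # stop the engine; stale record locks go with it
bin/uv -admin -start    # start the engine again
bin/uv -admin -info     # confirm the engine is back up
```

After the restart, re-run the copied job; if a leftover lock on the hashed file was the cause, it should now get past the hashed file creation step.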

The RT_CONFIG error means that you should also use DS.TOOLS to clean up your repository indices. It would make sense to couple this with your restart of DataStage, since the restart requires that all users are out. You can use that window to run a quick check of the project job files from DS.TOOLS as well.
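As a rough sketch of how DS.TOOLS is reached (the exact shell name and menu numbering can differ between releases, so verify against your own installation before relying on this):

```shell
# Run on the server host, engine up, no users connected.
cd $DSHOME
. ./dsenv
bin/uvsh                # opens the engine's TCL prompt
# At the TCL prompt, type:
#   DS.TOOLS
# then choose the rebuild-repository-indices and check-integrity-of-job-files
# options from the menu it presents.
```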
sankarsadasivan
Premium Member
Posts: 39
Joined: Tue Dec 23, 2003 3:47 am
Location: India

Post by sankarsadasivan »

Hi

Will try that.

Can you please tell me which options to select in DS.TOOLS to clear the repository indices?

That would be of great help

thanks
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

Arnd,

You said: "I would recommend you start your deadlock daemon to clear this up."
Could you explain how to do this?
Jim Paradies
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Sankar,

You will need to run options (2) to rebuild the indices and (4) to check the integrity of the job files.

Jim,

The DataStage $DSHOME directory has a file called "dsdlock.config" with a line that reads either start=1 or start=0, which controls whether or not the lock daemon is fired off when DataStage is started.

You need to be root to start the deadlock daemon manually. Attach to the DataStage home directory, and enter the command "bin/dsdlock -config"
sankarsadasivan
Premium Member
Posts: 39
Joined: Tue Dec 23, 2003 3:47 am
Location: India

Post by sankarsadasivan »

I assume that the DS engine should be up while doing this.
Am I right?

Please suggest.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Yes, the engine needs to be up while running options (2) and (4), and you should ensure that no users are in DataStage.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If RT_CONFIG1219 cannot be opened, reindexing will have no benefit at all, because there are no indexes on the configuration files. Probably the best thing to do is to make a copy of the job, then compile and run that (and delete the old version of the job that has the corrupted configuration file).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sankarsadasivan
Premium Member
Posts: 39
Joined: Tue Dec 23, 2003 3:47 am
Location: India

Post by sankarsadasivan »

Sorry for the delay!
Once we stopped and restarted DataStage, the problem was resolved.

Appreciate all the help.
Thanks all of you.