DS job and the data sets will not be cleaned up

Posted: Tue Feb 17, 2009 3:57 pm
by p4paulian
Hi

I was told by IBM support that if we have Director open and are monitoring Column Analysis executions, and the job finishes while we are monitoring, then the DS job and the data sets will not be cleaned up.

Is this how it works for Information Analyzer column analysis?

I was assuming that any job, whether it is a column analysis or a normal PX job, should work the same way as far as resource allocation and cleanup go.

thanks

Posted: Tue Feb 17, 2009 4:03 pm
by ray.wurlod
Welcome aboard.

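It's not a DataStage thing. An operating system won't fully clean up a file that is still open: Windows refuses to delete it, and UNIX does not release the disk space until the last handle is closed. So it is here - if it's open (being viewed by Director) it can't be cleaned up.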
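
A side note on the open-file point, offered as a minimal sketch and not as a description of what DataStage itself does internally: on a POSIX system, removing a file that is still open succeeds, but only the directory entry goes away, and the blocks stay allocated until the last handle is closed. The function name `unlink_while_open` is made up for illustration, and the snippet assumes a UNIX-like server (on Windows the delete would typically fail instead).

```python
import os
import tempfile


def unlink_while_open():
    """Delete a file while a handle to it is still open (POSIX behaviour)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+") as handle:
        handle.write("scratch data")
        handle.flush()
        # Assumption: POSIX semantics. The name is removed, the data is not.
        os.unlink(path)
        gone_from_directory = not os.path.exists(path)
        # The still-open handle can read the data back, so the space is in use.
        handle.seek(0)
        still_readable = handle.read() == "scratch data"
    # Only now, with the handle closed, can the file system reclaim the space.
    return gone_from_directory, still_readable
```

If something like this applies to the scratch data sets, a file that looks deleted would keep occupying disk for as long as another process holds it open.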

Posted: Tue Feb 17, 2009 4:25 pm
by p4paulian
Thanks for your response Ray.

Does it skip the cleanup only if a job fails, or is it that, regardless of the end status of the job, it won't clean up while Director is open?

We do use Director to monitor a running job, right?
:roll:

Posted: Tue Feb 17, 2009 4:37 pm
by chulett
'Clean up'? :?

Posted: Tue Feb 17, 2009 4:45 pm
by p4paulian
"Clean up" as in removing the temporary data sets created during the job run.

Do we monitor a running job via Director or not :roll: :?:

Posted: Tue Feb 17, 2009 5:17 pm
by chulett
I should have been more specific. I was wondering what about the job itself, not the data sets, would be cleaned up... since you mentioned both. But that's not important if the question is strictly in regards to 'temporary' datasets and if they're removed after the job runs.

As to your eye-rolling Director question, in general the answer is yes - that's exactly what it is for. However, I can't speak to what quirks your 'Column Analysis executions' may bring to the table; you'll need to wait for Ray to come back and clarify things.

Posted: Tue Feb 17, 2009 6:00 pm
by p4paulian
Director is for monitoring jobs; that's where I am lost.

I have been using it to monitor DataStage jobs, and am wondering how it is different when I monitor Column Analysis jobs.

Posted: Tue Feb 17, 2009 7:24 pm
by ray.wurlod
All Information Analyzer tasks are run as DataStage jobs (the osh is generated directly by IA, there is no graphical job design generated) in the ANALYZERPROJECT project.

Posted: Tue Feb 17, 2009 8:59 pm
by p4paulian
Very true,

But Ray, how is viewing the logs in Director for a column analysis job different from viewing the logs for a parallel job?

We never came across issues like a full disk when we had Director open for a parallel job, but with column analysis it happens.

Because of this issue, they suggested that we not monitor while running an IA job.

Posted: Tue Feb 17, 2009 9:39 pm
by ray.wurlod
IA analyses require huge amounts of scratch space. My guess - and I don't know for sure - is that viewing logs or monitoring jobs exacerbates the total demand for disk space on the server to a point where it fills the disk. This would especially be true if you used the default configuration file supplied with the product, which places scratch disk in the same file system as the engine and the projects.
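
One way to check the scratch-space guess is to see whether the scratch directory really sits on the same file system as the engine, and how much room is left there. This is only a sketch: the helper names `same_filesystem` and `free_gib` are invented here, and the paths in the usage note are assumptions about a typical install, so substitute the ones from your own configuration file.

```python
import os
import shutil


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted file system."""
    # st_dev identifies the device (file system) that holds each path.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_gib(path):
    """Return the free space, in GiB, on the file system holding `path`."""
    return shutil.disk_usage(path).free / 2**30
```

For example, `same_filesystem("/opt/IBM/InformationServer/Server/DSEngine", "/opt/IBM/InformationServer/Server/Scratch")` returning True would mean scratch competes with the engine for the same disk, and `free_gib(...)` on the scratch path before and during an analysis shows how quickly it fills.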

Posted: Tue Feb 17, 2009 10:47 pm
by p4paulian
Thanks for your response Ray.

We are using the default configuration file, which is a single node.
So suppose we change the configuration file to 2 or 4 nodes and put the scratch disk in a file system other than the one where the engine and projects reside.

Doing all this might also help, as an alternative to not monitoring in Director.

Could we change the configuration file in IA, or can only Administrator do it?
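
For reference, a multi-node configuration file with scratch on its own file system could be generated along the lines below. Treat it as a hedged sketch: `build_apt_config` is a hypothetical helper, and the host name and directories in the usage lines are placeholders, not values from this thread. The layout follows the usual parallel engine configuration file format (node, fastname, pools, resource disk, resource scratchdisk).

```python
def build_apt_config(node_count, fastname, dataset_dir, scratch_dir):
    """Build the text of a parallel configuration (.apt) file.

    All nodes share one host and the same directories here, which is a
    simplification; real files often give each node its own directories.
    """
    blocks = []
    for i in range(1, node_count + 1):
        blocks.append(
            f'  node "node{i}"\n'
            "  {\n"
            f'    fastname "{fastname}"\n'
            '    pools ""\n'
            f'    resource disk "{dataset_dir}" {{pools ""}}\n'
            f'    resource scratchdisk "{scratch_dir}" {{pools ""}}\n'
            "  }\n"
        )
    return "{\n" + "".join(blocks) + "}\n"


if __name__ == "__main__":
    # Placeholder host and paths: scratch lives outside the engine file system.
    print(build_apt_config(2, "etlserver", "/data/ds/datasets", "/scratch/ds"))
```

The point of the exercise is the `resource scratchdisk` line: pointing it at a separate, larger file system is what takes the pressure off the disk that holds the engine and the projects.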

Posted: Wed Feb 18, 2009 12:17 am
by syrup75
Hello,

I guess you checked the 'retain script' and 'retain datasets' options.

If you don't check these options, the jobs and data sets will be cleaned up after the column analysis, I think.

Also, you can change the configuration file in the Engine tab. (You can find the 'retain script' and 'retain datasets' options in the Engine tab as well.)

However, it's not like DataStage: you have to type in a specific directory and .apt file name.

When you execute 'Run Column Analysis', you will find the Scheduler, Sample, Option and Engine tabs on the right side of the window.

Thanks,