Removing obsolete jobs

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Removing obsolete jobs

Post by Gazelle »

We have a requirement to remove unused jobs from a project.

"unused" is defined as:
- any job that has not been compiled
- any job that has not been "started" or "compiled" in the last 180 days (it would be great if "180" could be a parameter).

I am open to ideas, but the options I can see are:
1. Write a DataStage routine
2. Write a Unix script

The steps I can see are:
- Get all of the jobs in the given project.
- For each job, get the jobinfo and identify whether it is "unused".
- If a job is "unused", append it to a dsx export file, and then delete it from the project.

There's a slight preference for a Unix script, mainly because the support team currently knows Unix better than DataStage.
I've been reading through some posts here, and am tentatively exploring the uvsh commands, and various command line statements.
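A minimal sketch of the identification step, using the documented `dsjob -ljobs` and `dsjob -jobinfo` options. The "ERROR: Failed to open job" string is the marker for an uncompiled job described below; the driver portion is commented out because it needs a live server, and the 180-day check is left for a separate helper.

```shell
#!/bin/sh
# Sketch only: identify "unused" jobs via the dsjob CLI.
# Assumptions: a v7-style server with dsjob on the PATH, and that
# "ERROR: Failed to open job" marks a job that is not compiled.
is_unused() {
    # reads `dsjob -jobinfo` output on stdin
    grep -q "ERROR: Failed to open job"
}

# Illustrative driver (not run here); 180 is exposed as a parameter:
# PROJECT="$1"; DAYS_LIMIT="${2:-180}"
# dsjob -ljobs "$PROJECT" | while read -r job; do
#     if dsjob -jobinfo "$PROJECT" "$job" 2>&1 | is_unused; then
#         echo "candidate: $job"   # export + delete would follow here
#     fi
# done
```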

The concerns I have are:
a) How to check the return code after each step (such as after the export, and after the delete)?
b) Before deleting the job, how can we check whether it is referenced by another component (and do we need to delete the objects in a specific order, such as Job and then Sequence)?

To do the delete, I currently have the unix script containing:


cd ${DSHOME}
/bin/uvsh<<EOF
  LOGTO ${ds_project}
  DELETE DS_JOBS ${ds_jobname}
  LO
EOF
But uvsh seems to always have a return code of 0, regardless of the value of ds_project or ds_jobname (even when only running the LOGTO command).
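Since the uvsh exit status appears useless here, one common workaround is to capture the session transcript and scan it for error text instead of trusting `$?`. This is a sketch of the error-detection part only; the specific error strings are assumptions and should be checked against the messages the server actually emits.

```shell
#!/bin/sh
# Workaround sketch for the exit-code problem: capture the uvsh
# transcript and grep it for error text instead of trusting $?.
# The error patterns below are assumptions - verify them against
# real uvsh output on your server.
session_failed() {
    # reads a uvsh transcript on stdin; succeeds if it looks like an error
    grep -Eqi 'not found|unable to|error'
}

# out=$(/bin/uvsh <<EOF 2>&1
# LOGTO ${ds_project}
# DELETE DS_JOBS ${ds_jobname}
# LO
# EOF
# )
# printf '%s\n' "$out" | session_failed && { echo "uvsh step failed" >&2; exit 1; }
```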

Am I on the right track, or can anyone suggest a better approach?

Thanks,

- g
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You are totally on the wrong track. By deleting entries in DS_JOBS table you are creating literally zillions of orphan records in other tables (and, indeed, OF other tables). How did you propose to determine your criteria within a script?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

Thanks for the sanity-check, Ray. It looks like I failed.
I could not find a Command Line call to the GUI's delete function (which handles the dependent objects), and when I finally found the "DELETE DS_JOBS" command I got all excited... and forgot to ask "What about the orphans?"
I "could" run the DS.CHECKER to clean up the orphans, but it is pretty ugly to deliberately create orphans and then line them up and kill them afterwards. Also, I am not confident that DS.CHECKER will do everything that is required.
So I am not keen to continue down this track.

Can anyone suggest a better approach?

What I am really after is a Command Line equivalent to the GUI's delete option.
Does one exist?
If not, then what are all of the steps that the GUI delete performs (and how can we reproduce the steps from a Unix script)?
Is this level of information publicly available somewhere?

In answer to Ray's question, I was going to apply the "is the job unused" criteria by searching the output from the "dsjob -jobinfo" command.
The dsjob -jobinfo command returns, amongst other things:
- "ERROR: Failed to open job" if the job is not in a compiled state (or otherwise unusable and hence can be considered "unused")
- the Job Start Time (either the date compiled or the date last executed)
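The age test against the Job Start Time could be sketched like this. Assumptions: GNU `date` with the `-d` option is available, and the `sed` pattern for pulling the timestamp out of `dsjob -jobinfo` output is a guess at the layout, not a documented format.

```shell
#!/bin/sh
# Sketch of the "not started/compiled in the last N days" check.
# Assumes GNU date (-d); returns 2 if the date string cannot be parsed.
older_than_days() {
    # $1 = date string, $2 = age limit in days
    start=$(date -d "$1" +%s 2>/dev/null) || return 2
    now=$(date +%s)
    [ $(( (now - start) / 86400 )) -gt "$2" ]
}

# Illustrative use (the sed pattern is a guess at the jobinfo layout):
# ts=$(dsjob -jobinfo "$PROJECT" "$job" | sed -n 's/^Job Start Time[[:space:]]*: *//p')
# if older_than_days "$ts" "${DAYS_LIMIT:-180}"; then echo "candidate: $job"; fi
```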
Is there a better approach?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Enable server side tracing then open Designer and delete a job. Disable server side tracing and close Designer. Examine the trace file to see what happens when a job delete request is processed. Create a routine that makes the same calls. Don't be surprised that it's quite complex.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Also - don't be surprised if IBM refuses to assist in recovery from any problems except on a paid basis! Modifying the internal tables manually is sort of like opening the back of the TV and voiding your warranty...

and messing up the tables can have the same results as touching the spot marked "High Voltage"!

On a more serious note - anything you do at release 7 will not work at release 8 - the XMETA database complicates the situation even more.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

I found the server-side trace too difficult to use:
a) It is very large and it is hard to find the relevant parts.
b) The copy function doesn't have a "select all" feature, and only seems to allow text to be highlighted within a single screen (there is no auto-scroll, and manual scrolling deselects the text).
c) I could not find the file on the server, so I could not open it with a more powerful editor.
d) I got the feeling that there was a lot of missing detail, and I could not confidently deduce the statements to enter into a routine.

For example, the trace file included the entry:


2009-05-15 14:59:24: DSR_JOB IN
  Arg=3
  Arg=Seq_to_delete
2009-05-15 14:59:25: DSR_MESSAGE  =Deleting file RT_STATUS4155
Deleting file RT_STATUS4155
DELETEd "RT_STATUS4155", Type 30
Field 2 in file definition record "RT_STATUS4155" has been set to nothing.
2009-05-15 14:59:25: DSR_MESSAGE  =Deleting file RT_LOG4155
Deleting file RT_LOG4155
DELETEd "RT_LOG4155", Type 30
Field 2 in file definition record "RT_LOG4155" has been set to nothing.
... etc.
From this, I'm guessing that the command from the routine is something like: DSR_JOB(3,Seq_to_delete)
But "guessing" is not as good as "knowing".
I simply do not yet know enough about the trace file to confidently convert what I see in the trace file into a routine.

Andy makes a very good point.
There are plans here to upgrade to v8, so that means there is limited "payback time" for building such a housekeeping script.

So I'll put this into the "too hard" basket for now, and suggest that we manually delete jobs via the GUI until we upgrade to v8.
Or I could request that IBM write such a routine for us.

Thanks for your help.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The file on the server is in the &COMO& directory.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jatayl
Premium Member
Posts: 47
Joined: Thu Jan 19, 2006 11:20 am
Location: Rogers, AR

Post by jatayl »

I did something similar, but I used uvsh to identify the jobs that were candidates for deletion. From that list, I manually deleted the jobs from the project using DataStage Manager. I'm sure there are other ways of doing this, but dsjob -jobinfo will give you a start, and from that list you can grab the sequential file and hashed file names that could also be candidates for deletion. I deleted about half of my project, and several hashed files, which freed about 7% of the file system.
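The read-only identification step described here might look like the sketch below: a UniVerse LIST against DS_JOBS, which queries the table without modifying anything. The transcript filter makes assumptions about typical uvsh banner and summary lines, so it is tested only against sample output.

```shell
#!/bin/sh
# Read-only sketch of the identification step: LIST the DS_JOBS table
# from uvsh (no deletes). The filter below makes assumptions about
# typical uvsh banner/summary lines in the transcript.
job_names_only() {
    # drop query echo and record-count summary; keep first token of
    # each remaining non-empty line as a candidate job name
    grep -Ev '^(LIST|DS_JOBS|[0-9]+ records? listed)' | awk 'NF {print $1}'
}

# cd "$DSHOME"
# /bin/uvsh <<EOF | job_names_only > candidate_jobs.txt
# LOGTO ${ds_project}
# LIST DS_JOBS
# LO
# EOF
```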

Jason