Delete/Remove multiple instances of jobs

SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Post by SachinCho »

Hi,
In our project we have a lot of reusable multi-instance jobs. This creates hundreds of instances in a single day. We want to clean up these instances on a weekly basis through an automated process, with no manual intervention.
Any pointers?
E.g., we have a batch ID generation job which runs for every process to generate a new batch ID. The invocation ID passed to it is the process name, so as many times as a process runs in a day, that many instances would be available.
Sachin C
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

This isn't as trivial a task as one might assume. When a job is compiled or imported, the instance information is no longer displayed in the Director, so a quick method would be to schedule an unconditional import (with binaries) when the job is not active. It isn't a pretty solution, but it is functional.
I've been at a site with a similar configuration and wrote a program to cleanly purge the logs of instance run records, but the code ended up being quite a bit more complex than I had originally planned. I would not recommend manually modifying the log files. (Also, if the repository is in XMETA, this method will not work, as it deletes directly from the log hashed file.)
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Hi,

I am in exactly the same situation. I just got into a project where there are too many (7000) instances of a job, and there are a few jobs like this. Because of the number of instances, the Director crashes when launched.

It has been three years since this thread, but I am posting in the hope that a workaround has since become available to purge multi-instance log entries.

Thanks
- Zulfi
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Still the same two methods - re-import or re-compile. As mentioned earlier, doing it via BASIC code is possible, but not a trivial task.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Is there any reason you can't use the automatic log clean-up setting in the Administrator client? I do have one client that sets it to keep only 2 days' worth of logs because of the massive number of multi-instance jobs that they run.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Project-level auto-purge is set to one day. The problem is that all 7000 instances run on weekdays, so the auto-purge won't help :(

Thanks
- Zulfi
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

???

Not sure I follow that statement.

Autopurge happens every time the job executes. It purges any log entry (for that run) on that job that is older than your purge setting. It doesn't care about day of the week.
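
The age-based rule described above can be sketched in a few lines of Python. This is only a toy model of the behaviour as stated in this thread, not DataStage's actual implementation; the function name `purge_older_than` and the `(invocation_id, timestamp)` tuple layout are made up for illustration.

```python
from datetime import datetime, timedelta


def purge_older_than(entries, now, keep_days):
    """Return the entries that survive an age-based purge.

    entries   -- list of (invocation_id, timestamp) tuples (hypothetical layout)
    now       -- the moment the purge runs
    keep_days -- the "older than N days" purge setting
    """
    cutoff = now - timedelta(days=keep_days)
    # Only the age of each entry matters; the day of the week plays no part.
    return [entry for entry in entries if entry[1] >= cutoff]


now = datetime(2024, 1, 5, 18, 0)

# Entries written a week ago by three invocations.
old_entries = [(f"proc_{i}", now - timedelta(days=7)) for i in range(3)]
# Entries written two hours ago by 7000 different invocations.
todays_entries = [(f"proc_{i}", now - timedelta(hours=2)) for i in range(7000)]

survivors = purge_older_than(old_entries + todays_entries, now, keep_days=1)
print(len(survivors))
```

Under this model, a purge setting of one day removes the week-old entries but leaves every invocation that ran within the last day untouched, however many there are.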
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

PaulVL wrote:???

Not sure I follow that statement.
It appears my post was unclear.

What I meant was: the jobs run every weekday, and on every weekday all 7000 instances are triggered. This puts too much data into the log file, and whenever the Director is opened it has to look into the log file (or some other place) to fetch all of the instances and list them as separate entries in the log view. This appears to take a lot of time, and the Director crashes after hanging for a long time.

To fix this I need to keep fewer instance logs, but if I specify one day as the auto-purge criterion it would still keep all 7000 instance entries, as all of them have run on the current day.

Just wondering: if I specify keeping only the last 2-3 entries for the job, would it keep only 2-3 instance logs, or would it keep 2-3 log entries per instance?

Either way, we have to maintain log entries for each instance, just in case something blows up and we are left in the dark with no clue of what actually happened.

It looks to me as if DataStage (specifically the Director client) is unable to handle too many instances. I do understand 7000 is far too high a count, but I wish it could have handled this gracefully.

Curious to know the maximum number of instances people have had in their projects without any kind of issue.

Thanks
- Zulfi
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Ahhh...

You do not want to limit your job entries to 3. If you have 3 instances running and a 4th turns up... boom, job #1 will fail because its log is not there anymore.

You'd have to estimate how many concurrent runs you have of that multi-instance job, and even then... it would nuke any history of aborted jobs.


7000 for one job for one day is a tad much.

10 jobs (clones) with 700 entries each?

100 with 70?

job_1pm, job_2pm, job_3pm, etc...


Code updates would be a pain if you had to _1, _2, _3 them all.


The magic number of how many job entries are sustainable is influenced by how many log entries exist per job.

You are in a pickle, that's for sure.
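
The clone arithmetic above, worked through as a small Python sketch. Both helper names are hypothetical, and the hour-based naming simply follows the job_1pm, job_2pm pattern mentioned in the post; nothing here is a DataStage feature.

```python
import math


def instances_per_clone(total_instances, clones):
    """Upper bound on instances per clone if the load is spread evenly."""
    return math.ceil(total_instances / clones)


def clone_name_for_hour(base, hour_24):
    """Hypothetical naming scheme: one clone per clock hour (job_1pm, job_2pm, ...)."""
    suffix = "am" if hour_24 < 12 else "pm"
    hour_12 = hour_24 % 12 or 12  # map 0 -> 12 and 13 -> 1
    return f"{base}_{hour_12}{suffix}"


for clones in (10, 100):
    print(clones, "clones ->", instances_per_clone(7000, clones), "instances each")

print(clone_name_for_hour("job", 13))
```

Splitting by hour caps the clone count at 24, so 7000 instances would still leave roughly 292 per clone; whether that is few enough depends on how many log entries each run writes.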
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If you clear the entries for a particular invocation from the RT_STATUSnnn table, then the log entries for that invocation will also cease to be visible, and will be removed at the next auto-purge (or, perhaps, sooner, depending on how the logging agent is operating).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

ray.wurlod wrote: If you clear the entries for a particular invocation from the RT_STATUSnnn table, then the log entries for that invocation will also cease to be visible
I wish there was an API for that. This keeps me wondering how to actually do it. Is it as simple as running a delete command using the invocation ID, or is there a lot more to it (I guess there is)? And what about the XMETA side: if we are deleting directly from RT_STATUSnnn and RT_LOGnnn, is there anything to worry about?

If I can safely delete the entries in RT_STATUSnnn, would that unconditionally remove the corresponding log entries in RT_LOGnnn, or did you mean they would get deleted based on the auto-purge settings (is this what you meant by the logging agent)? I can see that many instances have run only once, and the auto-purge is set to "older than days = 1"; since they were never rerun, they still live there.
- Zulfi
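
One piece of this that can be scripted without touching RT_STATUSnnn or RT_LOGnnn is identifying which invocations are stale, i.e. have not run since the purge cutoff and so, per the observation above, never get auto-purged. The Python below is a rough sketch under assumptions: the `last_run` mapping is hypothetical input you would have to collect yourself, and the command builder assumes your version of `dsjob` supports the `-linvocations` option (check `dsjob` usage on your engine tier first). It deletes nothing.

```python
from datetime import datetime, timedelta


def stale_invocations(last_run, now, keep_days):
    """Return invocation IDs whose most recent run is older than the cutoff.

    last_run -- dict of invocation_id -> datetime of last run (hypothetical input)
    """
    cutoff = now - timedelta(days=keep_days)
    return sorted(inv for inv, ts in last_run.items() if ts < cutoff)


def list_invocations_command(project, job):
    """Build (but do not run) a command line to list a job's invocation IDs.

    Assumption: dsjob accepts `-linvocations <project> <job>` on this install.
    """
    return ["dsjob", "-linvocations", project, job]


now = datetime(2024, 1, 5, 18, 0)
last_run = {
    "proc_a": now - timedelta(hours=3),   # ran today, auto-purge will see it
    "proc_b": now - timedelta(days=30),   # ran once a month ago, never rerun
    "proc_c": now - timedelta(days=2),
}

print(stale_invocations(last_run, now, keep_days=1))
print(" ".join(list_invocations_command("MyProject", "GenBatchId")))
```

The resulting list is only a set of candidates; how to remove them safely (re-import, re-compile, or clearing status records) is the open question in this thread. `MyProject` and `GenBatchId` are placeholder names.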