Delete/Remove multiple instances of jobs

Posted: Wed Nov 03, 2010 3:23 am
by SachinCho
Hi,
In our project we have a lot of reusable multiple-instance jobs. These create hundreds of instances in a single day. We want to clean up these instances on a weekly basis through an automated process, with no manual intervention.
Any pointers?
For example, we have a batch ID generation job which runs for every process to generate a new batch ID. The invocation ID passed to it is the process name, so however many times a process runs in a day, that many instances will be available.

Posted: Wed Nov 03, 2010 4:33 am
by ArndW
This isn't as trivial a task as one might assume. When a job is compiled or imported, the instance information is no longer displayed in the Director, so a quick method would be to schedule an unconditional import (with binaries) when the job is not active. It isn't a pretty solution, but it is functional.
I've been at a site with a similar configuration and wrote a program to cleanly purge the logs of instance run records, but the code ended up being quite a bit more complex than I had originally planned. I would not recommend manually modifying the log files (plus, if the repository is in XMETA this method will not work, as it deletes directly from the log hashed file).

Posted: Tue Jul 08, 2014 5:38 am
by zulfi123786
Hi,

I am in exactly the same situation. I just joined a project where there are too many (7000) instances of a job, and there are a few jobs like this. Because of the large number of instances, the Director crashes when launched.

It has been three years since this thread, but I am posting in the hope that a workaround has since become available to purge multi-instance log entries.

Thanks

Posted: Tue Jul 08, 2014 7:51 am
by ArndW
Still the same two methods - re-import or re-compile. As mentioned earlier, doing it via BASIC code is possible, but not a trivial task.

Posted: Tue Jul 08, 2014 8:24 am
by asorrell
Is there any reason you can't use the automatic log clean-up setting in the Administrator Client? I do have one client that sets it to keep only 2 days' worth of logs because of the massive number of multi-instance jobs that they run.

Posted: Tue Jul 08, 2014 8:56 am
by zulfi123786
Project-level auto-purge is set to one day. The problem is that all 7000 instances run on weekdays, so the auto-purge won't help :(

Thanks

Posted: Tue Jul 08, 2014 9:16 am
by PaulVL
???

Not sure I follow that statement.

Autopurge happens every time the job executes. It purges any log entry (for that run) on that job that is older than your purge setting. It doesn't care about day of the week.
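
A toy Python model of the age-based purge described above, for illustration only. It is not DataStage code: the entry layout and the names (`purge_older_than`, `proc_N`, `old_N`) are hypothetical. It assumes the purge simply drops entries older than the configured number of days, which means every invocation that ran inside the window is still listed afterwards.

```python
from datetime import datetime, timedelta


def purge_older_than(entries, now, days):
    """Keep only entries whose timestamp falls within `days` of `now`."""
    cutoff = now - timedelta(days=days)
    return [e for e in entries if e["ts"] >= cutoff]


now = datetime(2014, 7, 8, 12, 0)

# Hypothetical log: five invocations that ran today, two that ran a week ago.
entries = [
    {"invocation": f"proc_{i}", "ts": now - timedelta(hours=i)} for i in range(5)
]
entries += [
    {"invocation": f"old_{i}", "ts": now - timedelta(days=7)} for i in range(2)
]

kept = purge_older_than(entries, now, days=1)
remaining_invocations = sorted({e["invocation"] for e in kept})
print(remaining_invocations)
```

In this model the week-old invocations disappear, but all five of today's invocations remain, however many there are.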

Posted: Tue Jul 08, 2014 1:50 pm
by zulfi123786
PaulVL wrote: ???
Not sure I follow that statement.

It appears my post was unclear.

What I meant was: the jobs run every weekday, and on every weekday all 7000 instances are triggered. This puts too much data into the log file, and whenever the Director is opened it has to look into the log file (or some other place) to fetch all of the instances and list them as separate entries in the log view. This appears to take a lot of time, and the Director crashes after hanging for a long time.

To fix this I need to keep fewer instance logs, but if I specify one day as the auto-purge criterion it would still keep all 7000 instance entries, as all of them have run on the current day.

Just wondering: if I specify keeping only the last 2-3 entries for the job, would it keep only 2-3 instance logs, OR would it keep 2-3 log entries for each instance?

Either way, we have to maintain log entries for each instance, just in case something blows up and we are left in the dark with no clue about what actually happened.

It looks to me as though DataStage (specifically the Director client) is unable to handle too many instances. I do understand 7000 is far too high a count, but I wish it could have handled this gracefully.

I'm curious to know the maximum number of instances people have had in their projects without any issues.

Thanks

Posted: Tue Jul 08, 2014 2:42 pm
by PaulVL
Ahhh...

You do not want to limit your job entries to 3. If you have 3 instances running and a 4th turns up... boom, job #1 will fail because its log is not there anymore.

You'd have to estimate how many concurrent jobs you have of that multi instance job, and even then... it would nuke any history of aborted jobs.
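
A rough Python sketch of that failure mode, as a toy model only. It assumes the "keep last N runs" limit is counted per job across all invocations, as this reply implies; the thread does not confirm how DataStage actually counts. The names `record_runs` and `inst_N` are hypothetical.

```python
from collections import deque


def record_runs(invocations, keep_last):
    """Toy model: the job log retains only the newest `keep_last` run entries."""
    log = deque(maxlen=keep_last)  # oldest entry is dropped once the limit is hit
    for invocation in invocations:
        log.append(invocation)
    return list(log)


# Three instances are running when a fourth one starts.
surviving = record_runs(["inst_1", "inst_2", "inst_3", "inst_4"], keep_last=3)
print(surviving)  # ['inst_2', 'inst_3', 'inst_4']
```

Under this assumption the entry for `inst_1` is gone as soon as `inst_4` is recorded, and the same rule would also discard the history of any earlier aborted run.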


7000 for one job for one day is a tad much.

10 jobs (clones) with 700 entries each?

100 with 70?

job_1pm, job_2pm, job_3pm, etc...
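
The arithmetic behind the clone idea, as a small Python sketch. The round-robin assignment and the `job_N` naming are assumptions made for illustration; nothing here is a DataStage feature.

```python
def entries_per_clone(total_instances, clones):
    """Worst-case instance entries per clone when spread evenly (ceiling division)."""
    return -(-total_instances // clones)


def assign_clone(index, clones, base="job"):
    """Map the index-th invocation to a clone name, round-robin (hypothetical scheme)."""
    return f"{base}_{index % clones + 1}"


for clones in (1, 10, 100):
    print(clones, "clone(s):", entries_per_clone(7000, clones), "entries each")

print(assign_clone(0, 10), assign_clone(9, 10), assign_clone(10, 10))
```

Spreading 7000 invocations over 10 clones gives 700 entries per job, and over 100 clones gives 70, at the cost of maintaining every clone.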


Code updates would be a pain if you had to _1, _2, _3 them all.


The magic number of how many job entries are sustainable is influenced by how many log entries exist per job.

You are in a pickle, that's for sure.

Posted: Tue Jul 08, 2014 11:36 pm
by ray.wurlod
If you clear the entries for a particular invocation from the RT_STATUSnnn table, then the log entries for that invocation will also cease to be visible, and will be removed at the next auto-purge (or, perhaps, sooner, depending on how the logging agent is operating).

Posted: Wed Jul 09, 2014 4:17 am
by zulfi123786
ray.wurlod wrote: If you clear the entries for a particular invocation from the RT_STATUSnnn table, then the log entries for that invocation will also cease to be visible

I wish there were an API for that; it leaves me wondering how to actually do it. Is it as simple as running a delete command using the invocation ID, or is there a lot more to it (I guess there is)? And what about the XMETA side: if we delete directly from RT_STATUSnnn and RT_LOGnnn, is there anything to worry about?

If I can safely delete the entries in RT_STATUSnnn, would that unconditionally remove the corresponding log entries in RT_LOGnnn, or did you mean they would get deleted based on the auto-purge settings (is this what you meant by the logging agent)? I can see many instances that have run only once. The auto-purge is set to keep Older than days = 1, but since they were never rerun they still live there.