
Strange behaviour of multiple instances

Posted: Tue Jan 15, 2008 9:25 am
by eldonp
In an effort to reduce the number of jobs that do the same thing, I enabled multiple instances. I am seeing some unexpected behaviour.

1. Only 2 instances of the job appear in the Director, irrespective of how many are kicked off.
2. If more than 2 instances are run, only 2 run successfully; the others abort. (Unfortunately I last saw this a few days ago, so I don't have the error message to hand.)
3. I use the multiple-instance jobs to create headers. Sometimes the instance runs, and the log/monitor shows that the record was created, but the file does not get created.

Has anyone else experienced this - or know what the cause is?

Posted: Tue Jan 15, 2008 10:33 am
by chulett
There's more to it than just turning it on. What have you done to make the jobs 'multi-instance'?

Posted: Tue Jan 15, 2008 10:41 am
by ds
Also, can you post the exact error message that is being logged with the aborts?

- James

Posted: Tue Jan 15, 2008 2:16 pm
by ray.wurlod
Each instance needs a unique invocation ID. I suspect that you've run one instance with no invocation ID, and all the rest with the same invocation ID, which would result in one of them running and the remainder aborting because the job is already running.

But do identify the real cause of the problem from the job log next time this situation arises.
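
For illustration, here is a minimal sketch from a Unix shell using the dsjob command line (the project and job names below are made up). Each instance is addressed as job.invocationid, so distinct ids can run side by side, while reusing an id that is still running should give the "job is already running" abort described above.

    # Two distinct instances - these can run concurrently:
    dsjob -run MyProject LoadStuff.InstA
    dsjob -run MyProject LoadStuff.InstB

    # Attempting to start InstA again while the first run of InstA is
    # still active would abort, because that instance is already running:
    dsjob -run MyProject LoadStuff.InstA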

Posted: Wed Jan 16, 2008 3:27 am
by eldonp
chulett: I have checked the "Allow Multiple Instance" box in Job Properties to make it multiple instance.

ds: I said that I do not have an error message.

ray.wurlod: I do not run the job - only the instances, each with a unique invocation id, and parameters to ensure that the sources and targets are correct.

Workaround - create duplicate jobs.

Posted: Wed Jan 16, 2008 4:21 am
by ash_singh84
This workaround seems strange!

Posted: Wed Jan 16, 2008 6:05 am
by ArndW
If I recall correctly, some instance names don't work as expected. I think I had strange issues with instance "000", which would get truncated to an empty string when used in some functions.

Posted: Wed Jan 16, 2008 7:44 am
by chulett
As noted, there is more to it than simply checking a box. You have to structure the job such that it functions properly under MI control. For example, objects like sequential and hashed files may need unique names, typically incorporating the invocation id, to ensure one instance doesn't step on another instance's data. Perhaps something like that is going on.

How about a quick overview of the job in question - stages used, flow, etc?
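
As a rough sketch of what I mean (the project, job, and parameter names here are purely hypothetical), you could pass each instance a target path that embeds its own invocation id:

    # Launch two instances, each writing to its own header file:
    dsjob -run -param TargetFile=/data/out/header_A.csv -wait MyProject CreateHeader.A
    dsjob -run -param TargetFile=/data/out/header_B.csv -wait MyProject CreateHeader.B

Inside the job, the file name in the Sequential File stage would then be #TargetFile# rather than a hard-coded path, so no two instances write to the same file.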

Posted: Mon Oct 20, 2008 12:22 pm
by jdmiceli
Hey Arnd,

Doesn't DataStage interpret 000 as the Nul indicator? That might account for the weird results.

I too am having very strange issues with V8. I have a ticket open with IBM Support now since they seem to think mine is multi-instance related as well. I will post any solutions they come up with in the hopes it will help eldonp with the problem.

V7 runs the exact same code with no issues on my systems, even with multiple companies running at the same time.

Hope this helps!

Bug in v8 with multi-instance jobs and log autopurge

Posted: Fri Oct 24, 2008 11:11 am
by jdmiceli
We have been pounding on the issue of multi-instance jobs and V8 for a couple of weeks now; here is what we have learned at my company so far:

1. There is a bug in V8 that appears when you have multi-instance jobs and autopurge turned on for the logs. IBM has issued a couple of patches and tweaks for it, but they do not appear to have fixed the issue completely.

2. There is a variable you can set to alleviate part of the problem. Add 'DS_NO_INSTANCE_PURGING=1' to your .profile on Unix (I'm not sure where it goes on Windows); see the snippet after this list. This supposedly gets rid of the problem.

3. I have found a potential workaround but I have more testing to do yet.
It may be that it will only work for my project because of the way I have it set up (maximum parameterization, invocation_id used for all job calls including the .csv files for job control, a single code base for 19 companies). I recompiled all my jobs and then modified my .csv file to remove the invocation_id for the run. I did NOT change my .ini with regard to the number of instances allowed. I set autopurge to retain 5 days' worth of logs instead of the one-day setting I had before, and ran the job control with those settings. Then I re-added the invocation_id to my .csv file and ran again, and again, and again, clearing the database each time just to make sure plenty of data and log entries were written. Basically, I was trying to make it break. So far it has held up, but I reserve the right to be pessimistic until it continues to run for weeks on end without intervention.
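
For point 2, a minimal sketch of the .profile entry on Unix (the variable name is exactly as given in point 2 above; whether it belongs in .profile or in the engine's dsenv file may depend on your installation, so treat that as an assumption to confirm with Support):

    # Suppress the problematic autopurge behaviour for multi-instance jobs
    DS_NO_INSTANCE_PURGING=1
    export DS_NO_INSTANCE_PURGING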

IBM is still looking at the problem and we are waiting for responses.

Hope this helps!

Posted: Wed Oct 29, 2008 2:45 am
by telenet_bi
Hi,
Did you get a real solution from IBM for this?

I feel we have a similar problem, but IBM tells us: "Following my investigations on the DataStage architecture, it appears that you cannot run more jobs in a second than the number of CPUs installed on the server; there would be a dispatch issue. This is valid for DataStage 8."

I think the link with the purging makes more sense.