Jobs not compiling in production, but they do in test

Posted: Thu Feb 19, 2009 7:29 am
by tracy
I've got a strange situation.

I've got code that's been running in production without any issues for a long time. Then all of a sudden, various parts started erroring. I can't Compile them from Manager. If I open the jobs in Designer, some of the "nodes" get that little yellow warning symbol.

All the parts that are failing are "nodes" that call multi-instance reusable jobs that require you to fill in the Invocation Id. Some of the invocation ids had quotes and periods; if I remove those, the problem goes away. Others were trying to use variables that didn't seem to exist, so I fixed them appropriately. I was able to fix all these pieces, and things seem to be running fine.

But I'm worried now that these jobs had no problems yesterday, nor does the issue show up in dev/test. So I may develop/test something with no issues and then move to production and it doesn't work.

Another strange part:
We have two servers: a dev/test server and a production server. We have three projects: dev, test, prod. I can technically log onto the production server and open the test project, and vice versa. Here's the strange part: If I log onto the dev/test server and open the production project (the one throwing errors when on the production server), I don't see any problems. Similarly, if I log onto the production server and open the test project (which doesn't show any issues when on the dev/test server), the errors/warnings show up. So it doesn't seem like this is a project specific issue, but more of a server/environmental issue.

For the sake of full disclosure:
None of the jobs were modified in production as far as I know. We did have a few routines that weren't set to read-only and were accidentally changed. When I discovered that problem, I changed them back, initialized them back into Version Control and then promoted them right back as read only so that that doesn't happen again. I wouldn't think this has anything to do with it, but it was a change that happened right around the time the issue popped up.

So, does anybody have any ideas on what could cause this? It's like the Invocation Ids weren't taken seriously/literally before and then all of a sudden they were.

Re: Jobs not compiling in production, but they do in test

Posted: Thu Feb 19, 2009 9:39 am
by v2kmadhav
Hello Tracy

How were your jobs run until these issues started to surface? Were they scheduled in production? Were they being run from Director?
How often is your code compiled? Again, are they just compiled or force compiled? When was the last time any code was promoted into the prod box?
What is the exact message that you get when you compile them? If you place your mouse on that warning exclamation, what does it say?

When you say a different project is opened on the server, do you think they have everything set up the same way, e.g. parameters? Do they have the same set of environment variables?

cheers

Posted: Thu Feb 19, 2009 10:04 am
by tracy
The jobs were scheduled in production.

I only compile when:
1. I've made a change in development.
2. When I promote to test or prod, I check the compile checkbox.
3. When something goes wrong. For instance, I did a full compilation through Manager yesterday to see how many jobs were suddenly having problems.

Other than the routines that I mentioned above, nothing has been promoted into production since early last week. Most of these jobs that had problems are run nightly, so I would've expected to see issues last week if that caused it.

The messages that were given when I compiled or placed my mouse on the warning said that the invocation id can only contain alphanumerics/underscores/etc., or that the parameter being used in the invocation id was invalid. That's how I knew that I could fix the Invocation Id to avoid the issue.

I would assume everything is the same on all the servers. We've never had a problem before. I've asked the admins if they did anything yesterday and they cannot think of anything.

Posted: Thu Feb 19, 2009 10:19 am
by v2kmadhav
Tracy
Hello again

Do you find that all these jobs that have started showing issues have any similarity? Do you think they all use those routines that you re-promoted?

Posted: Thu Feb 19, 2009 10:39 am
by tracy
The similarity is that they all use Invocation Ids. After the issue surfaced, it seemed as though the Invocation Ids were invalid. But prior to the issue surfacing (for instance, last week in production or currently in dev/test), everything seemed fine.

So it's like there was suddenly an Invocation Id Check that started happening.

I think all our jobs use routines and since I repromoted all the routines to make sure they were all read only, they were all somewhat affected. But only the jobs with "invalid" Invocation Ids are having problems.

Posted: Thu Feb 19, 2009 10:46 am
by ArndW
Can you compare your two uvconfig files to see if they are identical?
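One way to make that comparison less error-prone than eyeballing the two files is a small script. This is only a sketch: it assumes each setting is a `NAME value` line and that `#` starts a comment, and the function names `parse_settings` and `diff_settings` are made up for illustration.

```python
def parse_settings(text):
    """Parse 'NAME value' lines into a dict, skipping comments and blanks.

    Assumption: '#' starts a comment and each setting sits on one line.
    """
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[name] = value
    return settings


def diff_settings(a_text, b_text):
    """Return {name: (value_in_a, value_in_b)} for settings that differ.

    A setting missing from one side is reported with None on that side.
    """
    a = parse_settings(a_text)
    b = parse_settings(b_text)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Read each server's uvconfig into a string and pass both to `diff_settings`; an empty result means the parsed settings match, whatever the comments and whitespace look like.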

Posted: Thu Feb 19, 2009 10:48 am
by v2kmadhav
Tracy


If you are very confident that both servers have the same config, same versions, same patches etc...

Let me ask you one last thing before someone else familiar with a similar issue helps us understand this better.

If you say that your test/dev box still has the unchanged code,
can you please force compile those jobs on the test/dev box and try running them without changing the way those invocation ids are coded?

If they work there, it's obvious that someone has changed something on the dev box to let that work. Otherwise, if you have an older backup of the jobs, try replacing them and see if they are any different from the present ones.

cheers...all the best.

Posted: Thu Feb 19, 2009 10:53 am
by tracy
uvconfig seems to be the same on each server.

Posted: Thu Feb 19, 2009 10:53 am
by Sainath.Srinivasan
Did not read all the mails, so do not know if this has been covered...

1.) How were the jobs scheduled? i.e. via DataStage or external scheduler?
2.) What is the invocation id (prior to failure)? i.e. what format?
3.) What was the invocation id (during failure)?
4.) Was there any invocation sequence value reset which may have affected this?

Posted: Thu Feb 19, 2009 10:55 am
by chulett
No such thing as 'force compile' in Server... and I do believe they've said they've already used the Multiple Job Compiler option in the Manager with no effect.

Posted: Thu Feb 19, 2009 10:59 am
by tracy
1) The jobs were scheduled from Director.
2) These invocation ids worked before: 'a' a.a
3) During failure, they were same as #2. Once they failed, I changed them to: a a_a
4) I don't know what an invocation sequence value reset is, so I'm guessing the answer is no.
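The difference between the ids in answers 2 and 3 can be sketched as a simple character check. This is an illustration only: the error message mentions "alphanumeric/underscores/etc", so the pattern below (letters, digits and underscores) is an assumption and the real rule may allow more characters. The function names are hypothetical.

```python
import re

# Assumption: only letters, digits and underscores are allowed.
# The actual rule enforced by the compiler may be broader than this.
VALID_INVOCATION_ID = re.compile(r"[A-Za-z0-9_]+")


def is_valid_invocation_id(invocation_id):
    """Return True if the whole id matches the assumed pattern."""
    return VALID_INVOCATION_ID.fullmatch(invocation_id) is not None


def find_invalid_ids(invocation_ids):
    """Return the ids that fail the assumed check, in input order."""
    return [i for i in invocation_ids if not is_valid_invocation_id(i)]
```

Under that assumed rule, `'a'` (with the quotes) and `a.a` are flagged, while `a` and `a_a` pass, which matches what happened after the fix.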

Posted: Thu Feb 19, 2009 10:59 am
by v2kmadhav
chulett wrote: .....can't Compile them from Manager.......
I thought it never went beyond that point.

Sorry, my fault. I never realized that it was Server.

Posted: Thu Feb 19, 2009 11:04 am
by ArndW
Export a sample job from each environment and see if the .dsx files are identical.
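A minimal sketch of that comparison, using Python's standard `difflib`. It assumes the exports are plain text and that some lines (an export date, for example) will always differ between environments; the `ignore_prefixes` parameter and the function name `diff_exports` are hypothetical and exist only to skip such lines.

```python
import difflib


def diff_exports(a_text, b_text, ignore_prefixes=()):
    """Return unified-diff lines between two exports.

    Lines whose stripped text starts with any of ignore_prefixes are
    dropped from both sides before comparing.
    """
    prefixes = tuple(ignore_prefixes)

    def keep(line):
        return not line.strip().startswith(prefixes)

    a_lines = [line for line in a_text.splitlines() if keep(line)]
    b_lines = [line for line in b_text.splitlines() if keep(line)]
    return list(
        difflib.unified_diff(
            a_lines, b_lines,
            fromfile="test.dsx", tofile="prod.dsx",
            lineterm="",
        )
    )
```

An empty list means the two exports are identical apart from the ignored lines; otherwise the `-` and `+` lines show what differs.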

Posted: Thu Feb 19, 2009 11:10 am
by Sainath.Srinivasan
1.) What if you try to run now with the old format of invocation id?
2.) Was the invocation id prepared from some given set of values?
3.) Was the job/routine preparing the invocation id disturbed or corrupted?