
Unable to open project '<project_name>' - -1002

Posted: Fri Jul 31, 2009 9:36 am
by shankar_ramanath
I have a job that uses the UniVerse function SPLICE in a BASIC Transformer. When I run this job on Development, it works fine irrespective of whether I set the stage to Sequential or Parallel processing (Stage - Advanced tab). The development server is a single machine, and the job is typically configured with 4 nodes.

When I run the same job on Production, it fails with the error "Unable to open project '<project_name>' - -1002". I have tried setting the BASIC Transformer stage to both Sequential and Parallel processing, with the same result. The production configuration has four machines and is configured with 16 nodes.

Before the error message occurs, the following messages appear in the event log.

Code:

APT_PM_StartProgram: Locally - /opt/IBM/InformationServer/Server/PXEngine/etc/standalone.sh /opt/IBM/InformationServer/Server/PXEngine -APT_PMprotoSectionLeaderFlag --APTNoSetupProgram /opt/IBM/InformationServer/Server/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /opt/IBM/InformationServer/Server/PXEngine/bin/osh -APT_PMsectionLeaderFlag etl0001 10000 1 30 node1 etl0001 1249053186.862427.2cae 0 -os_charset UTF-8
There is one such message for each node. Although I do not pretend to understand this well, I think these messages set up the environment for using the BASIC Transformer.

After these messages are logged, the job fails with the error mentioned in the subject.

I have not tried to run this job using a single-machine APT configuration, and I probably cannot afford to, given the number of operators and the parallelism needed.

I looked for similar posts and found one with a different error code, for which Ray had suggested using "SELECT * FROM SYS.MESSAGE WHERE @ID = <id>". I tried the same for -1002 (and 1002). There were no results.

Please let me know how to go about it. I am in the process of raising a support ticket with IBM but thought I would check with the Gurus first.

Posted: Fri Jul 31, 2009 9:48 am
by chulett
I'm curious if you've successfully run other jobs in that Production environment with BASIC Transformer stages before this or if that is the crux of the problem. Rather than focusing on the SPLICE angle, that is.

Posted: Fri Jul 31, 2009 10:04 am
by ArndW
I think that IBM support is going to ask you the same thing that Craig just did. Try to write a simple job with a pass-through BASIC transform stage and see if you can get that to run.

Posted: Fri Jul 31, 2009 5:24 pm
by shankar_ramanath
Thanks for the replies.

I see that this job works in Production if I configure it to run on one machine. But then I lose the advantage of parallelism, because I cannot use the other three machines. I am wondering if the issue is caused by connectivity. I have a couple of related questions.

These machines are named Prod1, Prod2, Prod3, Prod4. The issue itself occurred on Prod1.

1. I tried to "rsh" from Prod1 to Prod2. It does not work. Is it necessary for rsh to work between machines?
2. The original error message was "Unable to open project <project_name>". This error message was seen in the Director logs from the machine (Prod1) that corresponds to <project_name>. I am wondering whether a different machine (Prod2, Prod3, or Prod4) actually raised the error. How would I know? Since the conductor node is on Prod1, all messages are reported by Prod1.

Many thanks,

Posted: Fri Jul 31, 2009 5:27 pm
by chulett
In order to run the BASIC Transformer on multiple servers, don't you need the Server engine installed on all of them? :?

Posted: Fri Jul 31, 2009 6:10 pm
by shankar_ramanath
I am sorry, this is a naive question: how do I know? I looked into the installation directory (/opt/IBM/InformationServer/Server) and I see the following:

branded_odbc
Datasets
Dsdk
DSParallel
JCLTemplates
PXEngine
Configuration
DSComponents
DSEngine
Estimation
MsgHandlers
Performance
Template
Projects

Posted: Fri Jul 31, 2009 10:54 pm
by chulett
The question was actually more for Arnd and/or Ray whenever they come back to the thread, or for whoever wants to chime in on the subject. And now... we wait. :wink:

Posted: Sat Aug 01, 2009 3:35 am
by ArndW
Craig - you got it correctly, the BASIC stages will only run on a node where the engine is installed; I missed that the original poster is running in a distributed environment.

Posted: Sat Aug 01, 2009 1:13 pm
by shankar_ramanath
ArndW wrote:Craig - you got it correctly, the BASIC stages will only run on a node where the engine is installed; I missed that the original poster is running in a distributed environment.
Hi ArndW,

Thanks. Could you please clarify what "the engine" means? Does it refer to the Server engine? If so, how would I know whether the Server engine is installed? I provided the list of directories in the installation directory in my previous post.

Many Thanks,

Posted: Sat Aug 01, 2009 5:09 pm
by chulett
Basically, it does refer to the Server engine, and that means the DSEngine directory. It would need to be installed on every physical server on which you want to run any job with the BASIC Transformer in it.

Posted: Sun Aug 02, 2009 4:44 pm
by ray.wurlod
And that will cost you a bomb in extra licensing charges.
Use a node pool and run the BASIC Transformer in that - you can use four nodes from the machine where the server engine is installed and at least get some degree of parallelism.
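
As a rough illustration of the node pool idea, a parallel configuration file fragment might look like the sketch below. The host names Prod1 and Prod2 come from this thread, and it is only an assumption here that Prod1 is the machine with the server engine. The pool name "basic", the node names, and the disk paths are hypothetical. Only the nodes whose fastname is the engine machine list the extra pool.

```
{
	/* Nodes on the engine machine (assumed to be Prod1) also belong to the hypothetical "basic" pool */
	node "node1"
	{
		fastname "Prod1"
		pools "" "basic"
		resource disk "/hypothetical/datasets" {pools ""}
		resource scratchdisk "/hypothetical/scratch" {pools ""}
	}
	node "node2"
	{
		fastname "Prod1"
		pools "" "basic"
		resource disk "/hypothetical/datasets" {pools ""}
		resource scratchdisk "/hypothetical/scratch" {pools ""}
	}
	/* Nodes on the other machines stay in the default pool only */
	node "node5"
	{
		fastname "Prod2"
		pools ""
		resource disk "/hypothetical/datasets" {pools ""}
		resource scratchdisk "/hypothetical/scratch" {pools ""}
	}
}
```

The BASIC Transformer stage would then be constrained to the "basic" node pool in its stage properties, while the other stages keep using the default pool across all machines.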

Posted: Sun Aug 02, 2009 8:33 pm
by shankar_ramanath
ray.wurlod wrote:And that will cost you a bomb in extra licensing charges.
Use a node pool and run the BASIC Transformer in that - you can use four nodes from the machine where the server engine is installed and at least get some degree of parallelism.
Thanks Craig/Ray.

I see that DSEngine is installed on all four machines.

Could this error be caused by some other (possibly communication-related) issue? For example, I could not perform an rsh from machine 2 to machines 1, 3 and 4 (although I could perform an ssh). Does that indicate anything? In other words, is there a generic resolution for error code -1002? I am confused because this error is specific to the BASIC Transformer, yet the BASIC Transformer itself works well in a single-machine configuration. Could you please let me know if I need to try something else?

Posted: Wed Sep 09, 2009 11:18 pm
by shankar_ramanath
IBM informed me that the BASIC Transformer cannot work in MPP environments. This is also documented (Parallel Job Developer Guide, page 189).

Closing the thread as workaround :?