
Can we run Parallel Extender on a single processor?

Posted: Wed Oct 05, 2005 8:58 pm
by king999
hi
can we run a parallel job on a single-processor machine if we configure the stages to run sequentially?

Posted: Wed Oct 05, 2005 9:43 pm
by vmcburney
Yes, you can run a parallel job on a single CPU. You can also type with Caps Lock off and start sentences with a capital letter, but that's getting off the topic.

If you have a one-CPU dev environment and you are delivering to a multiple-CPU prod environment, I recommend you configure at least two nodes to verify that your jobs are partitioning correctly. A single CPU should be able to handle two nodes.

Sometimes you get a requirement to run a job on a single node even though the default configuration for the server is multiple nodes: for example, a small job that runs faster as a single instance, or a job that reads from and writes to a sequential file. Just create a configuration file with one node in it, add the $APT_CONFIG_FILE environment variable to your job, and set it to the path of the one-node file.
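For reference, a minimal one-node configuration file might look like the sketch below; the fastname and the resource paths are placeholders, so substitute your own host name and disk locations:

```
{
    node "node1"
    {
        fastname "devserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
}
```

Save it somewhere like $APT_ORCHHOME/configurations/1node.apt and point $APT_CONFIG_FILE at that path in the job's parameters.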

Posted: Wed Oct 05, 2005 10:50 pm
by pavanns
I feel you can try it out by changing the APT configuration file under Configurations in the DataStage Manager module.

Posted: Fri Oct 07, 2005 2:06 pm
by track_star
Of course you can... but don't expect great things. You could even run it in a single process if you really want to get crazy. :shock:

Posted: Fri Oct 07, 2005 4:18 pm
by ray.wurlod
That would be a good trick! I assume you're not counting the conductor or player processes in this, and combining all operators?

Hey, let's totally defeat the functionality of this parallel architecture that we've just spent a squillion dollars on! :roll:

Posted: Sat Oct 08, 2005 3:50 am
by sudheer05
Hi,
Enterprise Edition is the version of DataStage that allows you to develop parallel jobs. These run on DataStage UNIX servers that are SMP, MPP, or cluster systems, but you can install it on a Windows server in order to develop jobs which can subsequently be run on a UNIX server.

Posted: Sat Oct 08, 2005 7:03 am
by chulett
Well, there is a specific version of DataStage EE that allows you to run PX jobs on Windows - 7.5.X2 - from what I recall.

Posted: Sat Oct 08, 2005 5:38 pm
by ray.wurlod
chulett wrote:Well, there is a specific version of DataStage EE that allows you to run PX jobs on Windows - 7.5.X2 - from what I recall.
No dot, little "x": 7.5x2.

Checking data integrity in a one-node environment?

Posted: Mon Oct 10, 2005 2:29 am
by SPI
I would like to pick up on a point from Vincent's reply: can we get problems, or different results, between development and production if we develop jobs on a single node (regardless of the number of processors) and run them in a multiple-node environment?
Is it really necessary to have a minimum of two nodes in the development environment to check data integrity?
I would like somebody to relieve me of the unpleasant feeling that I'm not doing a good job by working in a single-node development environment...

Posted: Mon Oct 10, 2005 12:47 pm
by roy
Hi,
You're better off with a minimum of two CPUs so you can verify that your partitioning logic is OK for multiple nodes.

There is no way of verifying it otherwise.
Even experts make mistakes.

IHTH,

Posted: Tue Oct 11, 2005 2:25 pm
by track_star
SPI--you definitely can (and more than likely will) have problems if you develop on one node and deploy to production on multiple nodes. Hash partitioning, joins, removing duplicates, and loading databases in parallel all come to mind...

You can run PX on a single CPU (to do UAT, for example); just don't overload that single CPU. Roy is right, you're better off with multiple CPUs, but you can run a two-node config file on a single-CPU system.

Ray... the env var to set is APT_EXECUTION_MODE, set to single-process execution (ONE_PROCESS, if memory serves). It's a pretty good tool for debugging--and not much else!

Posted: Tue Oct 11, 2005 6:15 pm
by vmcburney
I would never develop on a single node if my server could handle two. A single CPU server should be able to handle two nodes. If I had two CPUs I would consider four nodes. The more nodes you use the more likely it is that you will spot partitioning errors during your unit testing. If you cannot do it in your development environment then make sure you are doing it in your testing environment.

To move a job into production without ever checking it against multiple nodes is very dangerous.

We also keep several different configuration files in our dev environment so if we see something that we think is a partition problem we switch to a different number of nodes to see if we get a different result.
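As an illustration of that, a two-node file for a single-host box can simply repeat the node stanza with the same fastname and different resource directories (the host name and paths below are placeholders):

```
{
    node "node1"
    {
        fastname "devserver"
        pools ""
        resource disk "/data/ds/n1" {pools ""}
        resource scratchdisk "/scratch/n1" {pools ""}
    }
    node "node2"
    {
        fastname "devserver"
        pools ""
        resource disk "/data/ds/n2" {pools ""}
        resource scratchdisk "/scratch/n2" {pools ""}
    }
}
```

Switching $APT_CONFIG_FILE between files like these changes the degree of parallelism without touching the job design, which makes it easy to re-run a suspect job under a different node count.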

No multiple-node configuration with SunOS!

Posted: Wed Oct 12, 2005 2:52 am
by SPI
Thank you all for your answers; my anxiety was justified. Meanwhile, somebody gave me the following explanation: on our machine (Sun 5.8, PX 7.0.1), the jobs failed every time they were run with multiple-node configurations. The problem was supposedly identified by Ascential support as a SunOS issue. Increasing the number of processors and the RAM made no difference. I find these explanations odd, because the nodes only define directories, and thus the disk space over which the datasets are distributed; to my mind there is no direct link with the kernel parameters or the dynamic capacity of the machine. Have you ever heard of this bug on SunOS? Thank you for your answers.

Posted: Wed Oct 12, 2005 9:49 am
by track_star
If you don't have it in your LD_LIBRARY_PATH for the projects, make sure /usr/lib/lwp is in there. Sun had a known issue with some of the libraries at Solaris 2.8 and provided new ones to resolve it. I don't know whether that would help with your jobs failing under multi-node configs, but it might. You could also drop some design-specific info into a new post--there might be something else going on.
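For example, a dsenv-style shell fragment could prepend that directory to the library search path; whether it should come first in the search order is an assumption here, and the path only matters if it actually exists on your box:

```shell
# Prepend Sun's patched thread libraries (/usr/lib/lwp) so they are
# found before the stock Solaris 2.8 ones; adjust if your patch level
# puts them elsewhere.
LD_LIBRARY_PATH=/usr/lib/lwp:${LD_LIBRARY_PATH:-}
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

Put it wherever your project sources its environment so every job run picks it up.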