Logical Node vs Physical Node

Posted: Fri May 30, 2014 12:29 pm
by chandra.shekhar@tcs.com
Hi All,

I had an informal discussion with my lead today regarding DataStage architecture.
We had a bit of a disagreement about logical vs. physical nodes.
I told him that a single-CPU setup can basically be considered a physical node,
whereas multiple logical nodes can be created within a physical node.
But he wasn't impressed with my explanation.

Can somebody confirm or correct what I said?

Posted: Fri May 30, 2014 12:53 pm
by chulett
Close but it has nothing to do with how many CPUs are involved. A physical server can be considered to be a "physical node" and can support a number of "logical nodes" as they are more akin to threads than anything else.

Posted: Fri May 30, 2014 2:38 pm
by qt_ky
And don't forget virtual servers and virtual CPUs... :wink:

A physical processor chip (CPU) can have many physical processor cores, all of which can be virtualized. It can be confusing without fully qualifying everything.

A single physical node in a grid could consist of a 4 processor core SMP system.

Posted: Fri May 30, 2014 11:57 pm
by chandra.shekhar@tcs.com
@qt_ky
Can you explain what you mean by "all of which can be virtualized"?
I guess I need to revisit my architecture concepts :(

And what should I understand when somebody says that DS is installed on a physical server vs. on a virtual machine?

Posted: Sat May 31, 2014 8:26 am
by chulett
He just means that your "server", the "box" you are running DataStage or anything else on can be a dedicated physical piece of hardware or a small virtual slice of it. In today's world, more and more servers are virtual servers. Our hosting service provides a crap ton of servers for us, 100% of which are of the virtual variety. And a physical CPU can be shared across virtual servers. One such discussion of that subject is here. Looks the same from the inside (we don't treat them in any way differently) and doesn't really change the conversation in my mind when discussing DataStage nodes.

Your "physical node" is a hardware topology concept and extends well beyond DataStage or any other application. It can come into play in the Parallel world when worrying about the Conductor node versus the other player nodes and of course with a Grid setup. However, outside of a grid, all of those nodes will play happily on a single server. And as noted, a "node" in DataStage is a purely logical concept. When you run something on "four nodes" you're creating four separate processes that work on the problem together, be it all on one server or some number of servers up to five. DataStage doesn't really care, and the number of CPUs involved - virtual or otherwise, physical chips or processor cores - does not come into the definition.

That's my take on the subject, anywho.
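
To make the "four nodes" point concrete, here is a minimal Python sketch that writes out text in the usual layout of a parallel configuration file (the file APT_CONFIG_FILE points to). The helper name build_config, the host names, and the disk paths are all made up for illustration; only the node / fastname / pools / resource layout is meant to mirror a real file. The number of logical nodes is simply the number of node entries, and several entries can share one fastname, i.e. one server.

```python
def build_config(fastnames, disk="/data/ds/datasets", scratch="/data/ds/scratch"):
    """Return configuration text with one logical node per entry in fastnames.

    fastnames: list of host names, one per logical node. Repeating a host
    name places several logical nodes on the same server.
    disk, scratch: hypothetical paths, used only for illustration.
    """
    blocks = []
    for i, host in enumerate(fastnames, start=1):
        blocks.append(
            f'  node "node{i}"\n'
            "  {\n"
            f'    fastname "{host}"\n'
            '    pools ""\n'
            f'    resource disk "{disk}" {{pools ""}}\n'
            f'    resource scratchdisk "{scratch}" {{pools ""}}\n'
            "  }\n"
        )
    return "{\n" + "".join(blocks) + "}\n"


# Four logical nodes, all on one server (hypothetical host "etlhost").
single_server = build_config(["etlhost"] * 4)

# The same four logical nodes spread over two servers.
two_servers = build_config(["etlhost1", "etlhost1", "etlhost2", "etlhost2"])

print(single_server)
```

Both layouts define four logical nodes; only the fastname values differ. Nothing in either one mentions CPUs or cores.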