Fine tuning a parallel job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

bhasannew
Participant
Posts: 31
Joined: Wed Aug 06, 2008 10:54 pm

Fine tuning a parallel job

Post by bhasannew »

Hi,

When I run the job on a single node it takes around 10 minutes. If I run it using 4 nodes, it takes around 30 minutes to process all the records. I am hash partitioning the incoming records based on key fields.
Please shed some light on this. I would like to know why the performance decreases even after increasing the number of nodes.

Thanks,
bhasan.
BugFree
Participant
Posts: 82
Joined: Wed Dec 13, 2006 6:02 am

Post by BugFree »

Hi,

We need to know your exact job design: what is the source, which stages have you used, and what are the target details?
Ping me if I am wrong...
bhasannew
Participant
Posts: 31
Joined: Wed Aug 06, 2008 10:54 pm

Post by bhasannew »

The stages used in the job are:

MQRead ---> Transformer ---> Column Import ---> Oracle Stage

Hash partitioning is done in the MQRead stage.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

When the run time increases significantly after increasing the number of nodes, it generally means you are running out of resources. Increasing the number of nodes increases the CPU/memory/IO utilization.

First, try running the job on different numbers of nodes while monitoring the resource utilization on the server, to find the optimal number of nodes for that configuration.
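
For reference, the node count comes from the configuration file pointed to by $APT_CONFIG_FILE, so keeping one file per node count (1, 2, 4, ...) and switching between them makes the comparison easy. A minimal sketch of such a file is below; the hostname and paths are placeholders only, not anything from your environment:

    {
        node "node1"
        {
            fastname "etlserver"
            pools ""
            resource disk "/data/ds/node1" {pools ""}
            resource scratchdisk "/scratch/ds/node1" {pools ""}
        }
        node "node2"
        {
            fastname "etlserver"
            pools ""
            resource disk "/data/ds/node2" {pools ""}
            resource scratchdisk "/scratch/ds/node2" {pools ""}
        }
    }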
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
bhasannew
Participant
Posts: 31
Joined: Wed Aug 06, 2008 10:54 pm

Post by bhasannew »

Hi,

Thanks for your replies.
Based on my observation of the job statistics: when I significantly increase the number of nodes the time decreases, but when I run the same job on a single node it takes far less time.
May I know the reason behind this?

I think that because I am hash partitioning in the source stage, processing the records takes much longer, whereas on a single node no hashing is needed since everything runs in one partition.

Am i rite?

Thanks in advance
Bhasan
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Rite? No. Right? Yes.
-craig

"You can never have too many knives" -- Logan Nine Fingers
bhasannew
Participant
Posts: 31
Joined: Wed Aug 06, 2008 10:54 pm

Post by bhasannew »

Hi chulett,

Sorry for using informal words (e.g. "rite").

So, based on your reply, I assume that my conclusion is correct.
Am I right now? :)

thanks,
Bhasan.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As before, yes - right.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

I would like to differ from that assumption.

It is probably because your data volume is low enough that it can be processed on a single node in 10 minutes.

But when you run on 4 nodes, there is an overhead in creating and spawning that many processes, hence the extra time.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That 'delay' would also be affected by the hash partitioning, which would not be happening on a single node. So while it may not be the complete answer as to why the single-node run is faster, it is correct that no hashing is needed on a single-node job run.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

A single-node run does still do the partitioning, but all rows land up in the same bucket: with one node, the hash of the key modulo the node count always maps to the same partition.

Partitioning is extremely fast and will take less than a nano-second (in SMP).

So the impact from partitioning - if any - will be less than a second.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Less than a nano-second, eh? I suppose you've actually measured that. :wink:

Of course, one should be dumping the score in both cases and comparing startup times and the work being done, single versus multiple nodes, and contrasting that with the volume of data being processed. For all we know, a Server job would be the fastest implementation. And you're assuming that all nodes are on the same physical server; for all we know they're distributed across multiple servers.

As always, lots of factors involved.
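
As a rough sketch of how to get that score (check the exact variable names against your version's documentation), add the reporting environment variables to the job and re-run; the output then shows up in the job log:

    APT_DUMP_SCORE=True       # writes the parallel job score to the log
    APT_PM_SHOW_PIDS=True     # shows the player process IDs
    APT_RECORD_COUNTS=True    # reports per-partition record counts for each operator

Comparing the scores and record counts of the 1-node and 4-node runs should show where the extra 20 minutes is going.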
-craig

"You can never have too many knives" -- Logan Nine Fingers
priyadarshikunal
Premium Member
Premium Member
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Sainath.Srinivasan wrote: I would like to differ from that assumption.

It is probably because your data volume is low enough that it can be processed on a single node in 10 minutes.

The time is increasing from 10 minutes to 30 minutes, which is too significant a difference to be explained by startup time and the time taken in partitioning.

As Sainath said, the data is still partitioned on a single node but then lands in a single bucket, so in no scenario would that add 20 minutes. I still think it's because of resource congestion.

I agree with Craig about dumping the score; also get the time taken by each operator in both cases and compare the stats.

What about monitoring the resources in both cases as well?
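
As a sketch of what that could look like on a typical Linux/Unix box (the commands and 5-second interval are just examples, adjust for your platform):

    vmstat 5        # CPU, run queue and memory pressure
    iostat -x 5     # per-device disk utilisation and wait times
    sar -u -r 5     # CPU and memory usage (needs sysstat)

Running these alongside the 1-node and 4-node runs should show whether CPU, memory or disk is the bottleneck.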
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink: