CPU AND LINK PARTITIONERS

kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

CPU AND LINK PARTITIONERS

Post by kollurianu »

Hi All,

Can someone please shed some light on how many partitions (Link Partitioner outputs) give good results with 2 CPUs?

Thank you
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Not more than four, and that may be reduced by whatever else is happening in the job, particularly between the link partitioner and link collector stages.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Naturally, you must also consider the machine load at the time you want to run the job.
Bear in mind there is 1 process for the link partitioner and 1 for the collector, making 2, plus, depending on your design, x processes, where x is the number of links you split into; that makes a minimum total of 2 + x processes.
As Ray said, it depends on what you implement before, between and after the link partitioner/collector stages.
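
The 2 + x count above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this post; the function names are made up here, and real jobs will usually start more processes depending on what sits before, between and after the partitioner/collector pair.

```python
def min_process_count(partitions):
    """Minimum processes for a partitioner/collector pair.

    1 process for the link partitioner + 1 for the link collector
    + 1 per partitioned link (x), i.e. 2 + x.
    """
    if partitions < 1:
        raise ValueError("need at least one partitioned link")
    return 2 + partitions


def processes_per_cpu(partitions, cpu_count):
    """Rough ratio of those processes to the CPUs available."""
    if cpu_count < 1:
        raise ValueError("need at least one CPU")
    return min_process_count(partitions) / cpu_count


if __name__ == "__main__":
    for x in (2, 3, 4):
        print(x, "links ->", min_process_count(x), "processes,",
              processes_per_cpu(x, 2), "per CPU on a 2-CPU box")
```

With 4 links on 2 CPUs this gives at least 6 processes, or 3 per CPU, before counting anything else running on the machine.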

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

I am still not clear on how many partitions I need to use for the link partitioner if I have 2 CPUs, or on what basis to determine this number.

Thank you all
chucksmith
Premium Member
Posts: 385
Joined: Wed Jun 16, 2004 12:43 pm
Location: Virginia, USA
Contact:

Post by chucksmith »

When your job completes, what % CPU does the job monitor in DataStage Director show?

Let's say it says 50%. Since you have 2 CPUs, you have 200% available.

If your job is the only job running, then 3 or 4 partitions would be possible.

That is:

200% available CPU
---------------------------- = 4 partitions
50% CPU used by 1 partition

If the CPU statistics are not available in the job monitor, you can do a similar calculation based on the sum of all CPU times from the finishing records in the job log, divided by the elapsed run time of the job.
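
The same calculation as a small Python sketch, for anyone who wants to plug in their own numbers. The function names and the `other_load_pct` parameter (CPU already taken by other work on the box) are assumptions added for illustration; they are not anything DataStage provides.

```python
import math


def cpu_pct_from_log(cpu_seconds, elapsed_seconds):
    """CPU % of one run: sum of CPU times from the finishing
    records in the job log, divided by elapsed run time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * sum(cpu_seconds) / elapsed_seconds


def estimate_partitions(cpu_count, cpu_pct_per_partition, other_load_pct=0.0):
    """Upper estimate: available CPU % / CPU % used by one partition."""
    if cpu_pct_per_partition <= 0:
        raise ValueError("CPU % per partition must be positive")
    available = cpu_count * 100.0 - other_load_pct  # e.g. 2 CPUs -> 200%
    return max(1, math.floor(available / cpu_pct_per_partition))


if __name__ == "__main__":
    print(estimate_partitions(2, 50.0))  # 200 / 50
```

Treat the result as a ceiling to test against, not a guarantee; if other jobs are running, pass their share as `other_load_pct` and the estimate drops accordingly.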
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

But a job is developed in the development environment and later run in the production environment, so how do you determine how many partitions give optimal performance for that job?

Thank you very much
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

I believe the answer to your question is not as simple as some algorithm or formula that can be given to you here. I'm not sure Ascential publishes such information in a definitive way. My experience has always been that there is a point of diminishing returns, meaning that at some point the overall performance of the job will suffer as you add more links/processes. I have found that the types of transformations (if any) that happen between the partitioner and collector make a difference, as do the source and target.

I have found on a Wintel box with 8 procs (in our configuration) that 4 streams worked very well while 5 caused a significant slow down. I would think this would be different on Unix and have witnessed this to be true.

You also have to be concerned with what other processes are running on the box, like Oracle, SQL Server, or other applications. These have a direct impact on how processing takes place and will also help dictate how to partition the data.

I suggest you add links until it hurts and then back off; in other words, experiment with different configurations to find the one best suited to your environment.
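
As a rough sketch of "add links until it hurts, then back off", the loop below tries increasing stream counts and stops at the first clear slowdown. `run_job` is a hypothetical callable that you would supply yourself (for example, something that runs a test version of the job with n links and returns elapsed seconds); `max_streams` and `tolerance` are likewise assumed knobs, not DataStage settings.

```python
def find_stream_count(run_job, max_streams=8, tolerance=0.05):
    """Return the stream count with the best elapsed time seen
    before the first run that is clearly slower than the best."""
    best_n = 1
    best_time = run_job(1)
    for n in range(2, max_streams + 1):
        elapsed = run_job(n)
        if elapsed > best_time * (1 + tolerance):
            break  # it hurts: stop adding links and back off
        if elapsed < best_time:
            best_n, best_time = n, elapsed
    return best_n


if __name__ == "__main__":
    # Made-up timings purely to exercise the loop, not real measurements.
    fake_timings = {1: 100.0, 2: 60.0, 3: 45.0, 4: 40.0, 5: 70.0}
    print(find_stream_count(lambda n: fake_timings[n], max_streams=5))
```

Run it at a time that resembles the production load, since the answer changes with whatever else is using the box.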

Regards,
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

Well, thanks for all your input. This looks like variable performance: in the production environment you can never know when you will need to run the job, or how it will perform at that time, since that depends on CPU availability.

Can anyone shed light on how link partitioners and multi-instance jobs are related?

Exactly how does a multi-instance job work, and in which scenarios is it used?

Thank you all once again.
chucksmith
Premium Member
Posts: 385
Joined: Wed Jun 16, 2004 12:43 pm
Location: Virginia, USA
Contact:

Post by chucksmith »

They can provide you with similar functionality. However, with multi-instance jobs, you must be able to partition your input, and ensure you do not have any contention issues with your outputs. My opinion is that partitioner/collector pairs give you more control over the parts of a job that you parallelize.

Still, efficient routines and derivations should be your first concern.
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

Which one is better to use: multi-instance jobs, or link partitioners and collectors?

Thank you all once again