TPMC vs The Number of CPU

Moderators: chulett, rschirm, roy

dhwankim
Premium Member
Posts: 45
Joined: Mon Apr 07, 2003 2:18 am
Location: Korea

TPMC vs The Number of CPU

Post by dhwankim »

Hi Gurus.

We are considering a new server for DataStage (the DW server), so I want to know which affects job speed more: the tpmC rating or the number of CPUs.

Some additional information about our environment: we use DataStage 4.2 and do not use the PX module. At peak time we run more than 30 jobs in parallel. Oracle also runs on the same server.

We are considering buying either 16 CPUs (clock speed 1.5 GHz) or 24 CPUs (800 MHz).

We would like your opinion from an ETL point of view.

Thanks in advance.

Regards
D.H Kim
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Well, it depends on several things.
I bet Ken will say the more jobs you run in parallel, the more CPUs you want (or just: the more CPUs the better).
So basically your question becomes: what is the maximum number of parallel DS processes you need to run?

Comparing one CPU against one CPU, 1.5 GHz is better than 800 MHz, meaning the same process, if it is CPU bound, will run faster.
With 16 x 1.5 GHz you get 24 GHz of total clock, while 24 x 800 MHz gives only 19.2 GHz.
You didn't mention the CPU cache size :(
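
The arithmetic above can be sketched in a few lines of Python. This is only a rough model, not a benchmark: it assumes clock speed is the only difference between the two machines (cache, memory bandwidth, and I/O are ignored) and that each CPU-bound job keeps at most one CPU busy. The function names are made up for illustration.

```python
# Rough comparison of the two candidate configurations from the question.
# Assumption: clock speed is the only difference; cache and I/O are ignored.
CONFIGS = {
    "16 x 1.5 GHz": (16, 1.5),
    "24 x 800 MHz": (24, 0.8),
}

def aggregate_ghz(cpus, ghz_per_cpu):
    """Total clock capacity if every CPU is kept busy."""
    return cpus * ghz_per_cpu

def usable_ghz(cpus, ghz_per_cpu, cpu_bound_jobs):
    """Clock capacity actually used, assuming one CPU per job at most."""
    return min(cpus, cpu_bound_jobs) * ghz_per_cpu

for name, (cpus, ghz) in CONFIGS.items():
    total = aggregate_ghz(cpus, ghz)
    used = usable_ghz(cpus, ghz, cpu_bound_jobs=30)
    print(f"{name}: total {total:.1f} GHz, used by 30 jobs {used:.1f} GHz")
```

With fewer concurrent jobs than CPUs, `usable_ghz` shows the extra CPUs sitting idle, which is why the number of parallel processes matters as much as the totals.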

This issue, or a similar one, was discussed here in the past (search for it).

Bear in mind that performance also comes from network and disk speed, especially when large data volumes go through these resources, so a great server won't help if they are poor (meaning low network bandwidth and/or slow disks).

IMHO you'll need to check some things, but I think, surprising as it may sound, you'll probably get more from the 24-CPU configuration if you have many more small jobs running in parallel than could run on 16 CPUs.

The real question is: can you measure the resources you'll need?

The only sad thing is that you actually need to run real tests on both machines to get the real answer, with as much work as possible running in parallel, hopefully maxing out your resources.

Good Luck
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Also,
Do you have long processes?
By that I mean a process composed of many small ones that are sequential in logic,
i.e. process x is dependent on processes a + b, process y depends on processes d and x, and so on.
This would mean you have sequential dependencies that might be complex and can't run in parallel until other tasks have finished first.
In that case you want each step to finish as soon as possible, hence possibly the faster CPUs.
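
To make the dependency argument concrete, here is a minimal sketch that computes the finish time of the longest dependency chain (the critical path) at each clock speed. The dependencies are the ones named above (x needs a and b; y needs d and x). The work figures are made-up numbers, and the model assumes purely CPU-bound processes with enough free CPUs for the independent ones to run side by side.

```python
# Hypothetical work per process, in billions of CPU cycles (made-up numbers).
WORK = {"a": 30, "b": 60, "d": 45, "x": 90, "y": 30}

# Dependencies from the post: x needs a and b; y needs d and x.
DEPS = {"a": [], "b": [], "d": [], "x": ["a", "b"], "y": ["d", "x"]}

def critical_path_seconds(ghz):
    """Finish time of the last process when each one starts as soon as
    all of its prerequisites are done (assumes enough free CPUs)."""
    finish = {}

    def finish_time(proc):
        if proc not in finish:
            # A process is ready once its slowest prerequisite has finished.
            ready = max((finish_time(d) for d in DEPS[proc]), default=0.0)
            finish[proc] = ready + WORK[proc] / ghz
        return finish[proc]

    return max(finish_time(p) for p in WORK)

for ghz in (1.5, 0.8):
    print(f"{ghz} GHz: chain finishes after {critical_path_seconds(ghz):.0f} s")
```

In this toy case the chain b, x, y needs 180 billion cycles, so it takes about 120 s at 1.5 GHz and about 225 s at 800 MHz, and extra CPUs do not shorten it.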

IHTH
Roy R.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You also should seriously consider upgrading DataStage. The changes that have come in versions 5+ are not only welcome from a developer / ease of use standpoint but there were considerable speed improvements made in the underlying engine.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Excellent replies so far, folks. One thing to consider is that a higher-end machine with fewer CPUs leaves you with room to expand. A lower-end machine fully stocked leaves you with no options. You have to consider psychology: can you get a machine with fewer but faster CPUs today, and then bargain for more CPUs later?

From a single-job perspective, a faster CPU is best. But the machine has to do double duty with a database, so you'll probably want to optimize for the database and worry less about the ETL. Through divide-and-conquer job instantiation you will be able to arrive at your service level for the ETL.
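
One common reading of divide-and-conquer job instantiation is to run several instances of the same job, each handling its own slice of the rows. The sketch below shows that idea in plain Python, not in DataStage itself. It assumes rows carry an integer key, and the `partition` helper and the `id` field are hypothetical names used only for illustration.

```python
def partition(rows, instances, key=lambda row: row["id"]):
    """Split rows into one slice per job instance using key modulo N."""
    slices = [[] for _ in range(instances)]
    for row in rows:
        # Every row lands in exactly one slice, so instances never overlap.
        slices[key(row) % instances].append(row)
    return slices

# Example: 10 rows spread over 4 instances of the same job.
rows = [{"id": i} for i in range(10)]
for instance, chunk in enumerate(partition(rows, 4)):
    print(f"instance {instance}: {[r['id'] for r in chunk]}")
```

How many instances pay off depends on how many CPUs are free at that point in the schedule, which ties back to the sizing question.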
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Regarding the idea of buying more CPUs at a later date to enhance the machine you have,
the only thing I can contribute is that, most times, if you don't do it within a couple of years it turns out to be not as easy as you want it to be, and you'll probably end up buying a new server, which might not be a good idea if you have the funds.

In other words, if you plan to upgrade your server resource-wise, make sure it won't be a problem within the time frame in which you plan or estimate you'll need to.

IHTH
Roy R.