Hi Gurus,
We are considering a new server for DataStage (our DW server), so I would like to know which affects job speed more: the TPMC rating or the number of CPUs.
Some additional information about our environment: we use DataStage 4.2 and do not use the PX module. At peak time we run more than 30 jobs in parallel. Oracle also runs on the same server.
We are considering buying either 16 CPUs (clock speed 1.5 GHz) or 24 CPUs (800 MHz).
We would like your opinion from an ETL point of view.
Thanks in advance.
Regards
D.H Kim
TPMC vs The Number of CPU
Hi,
Well, it depends on several things.
I bet Ken will say that the more jobs you run in parallel, the more CPUs you want (or just the more CPUs the better).
So basically your question becomes: what is the maximum number of parallel DS processes you need to run?
In a simple one-to-one comparison, 1.5 GHz is better than 800 MHz: the same process, if it is CPU bound, will run faster.
With 16 x 1.5 GHz you get 24 GHz of aggregate power, while 24 x 800 MHz gives only 19.2 GHz.
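To make that arithmetic easy to redo for other configurations, here is a minimal Python sketch. The `ServerOption` class and `single_job_speedup` helper are names made up for illustration, and multiplying CPU count by clock speed is only a crude yardstick: it ignores cache, memory, disk, and network, and clock speed alone says little when comparing different CPU architectures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOption:
    """One candidate configuration (hypothetical helper, not a DataStage API)."""
    cpus: int
    clock_ghz: float

    @property
    def aggregate_ghz(self) -> float:
        # Crude "total power" figure: CPU count times clock speed.
        return self.cpus * self.clock_ghz


def single_job_speedup(a: ServerOption, b: ServerOption) -> float:
    """How much faster one CPU-bound process runs on a than on b (clock ratio only)."""
    return a.clock_ghz / b.clock_ghz


# The two options from the original question.
fast = ServerOption(cpus=16, clock_ghz=1.5)
wide = ServerOption(cpus=24, clock_ghz=0.8)

if __name__ == "__main__":
    for option in (fast, wide):
        print(f"{option.cpus} x {option.clock_ghz} GHz = "
              f"{option.aggregate_ghz:.1f} GHz aggregate")
    print(f"single CPU-bound job speedup: {single_job_speedup(fast, wide):.3f}x")
```

The clock ratio only applies to a process that really is CPU bound; an I/O-bound job will see little of it.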
You didn't mention the CPU cache size. :(
This issue, or a similar one, was discussed here in the past (search for it).
Bear in mind that performance also comes from network and disk speed, especially when large data volumes go through these resources, so a great server won't help if these resources are poor (meaning low network bandwidth and/or slow disks).
IMHO you'll need to check some things, but I think, surprising as it may sound, you'll probably get more from the 24 CPU configuration if you have many more small jobs running in parallel (more than could run on 16 CPUs).
The real question is: can you measure the resources you'll need?
The only sad thing is that you actually need to run real tests on both machines to get the real answer, with as much work as you can running in parallel, hopefully maxing out your resources.
Good Luck
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
![Image](http://www.worldcommunitygrid.org/images/logo.gif)
Also,
do you have long processes?
By that I mean a process composed of many small ones that are sequential in logic,
i.e. process x depends on processes a and b, process y depends on processes d and x, and so on.
This would mean you have sequential dependencies, possibly complex ones, that can't run in parallel until other tasks have finished first.
In that case you want the chain done ASAP, which means each step needs to finish faster, hence possibly faster CPUs.
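A rough way to see this is to compute the longest dependency chain (the critical path) for the example above. This is only a sketch: the `WORK` figures are invented for illustration, every job is assumed to be purely CPU bound, and there are assumed to be enough CPUs that no job ever waits for a free one.

```python
from functools import lru_cache

# Dependencies from the example: x needs a and b, y needs d and x.
DEPENDS_ON = {"a": [], "b": [], "d": [], "x": ["a", "b"], "y": ["d", "x"]}

# Hypothetical amount of CPU work per job, in GHz-seconds (made-up numbers).
WORK = {"a": 3.0, "b": 6.0, "d": 2.0, "x": 4.0, "y": 5.0}


def critical_path_seconds(clock_ghz: float) -> float:
    """Elapsed time of the whole chain when CPUs are never the bottleneck."""

    @lru_cache(maxsize=None)
    def finish(job: str) -> float:
        # A job starts when its slowest prerequisite finishes.
        start = max((finish(dep) for dep in DEPENDS_ON[job]), default=0.0)
        return start + WORK[job] / clock_ghz

    return max(finish(job) for job in DEPENDS_ON)


if __name__ == "__main__":
    for clock in (1.5, 0.8):
        print(f"{clock} GHz CPUs: chain finishes in "
              f"{critical_path_seconds(clock):.2f} s")
```

With these numbers the chain b, x, y sets the elapsed time, and it takes the same time however many CPUs sit idle beside it; only a faster clock shortens it.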
IHTH
Roy R.
You should also seriously consider upgrading DataStage. The changes in versions 5 and later are welcome not only from a developer and ease-of-use standpoint; considerable speed improvements were also made in the underlying engine.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Excellent replies so far, folks. One thing to consider is that a higher-end machine with fewer CPUs leaves you room to expand. A lower-end machine that is fully stocked leaves you with no options. You also have to consider psychology: can you get a faster machine with fewer CPUs today, and then bargain for more CPUs later?
From a single-job perspective, a faster CPU is best. But the machine has to do double duty with a database, so you'll probably want to optimize for the database and worry less about the ETL. Through divide-and-conquer job instantiation you will be able to reach your service level for the ETL.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Hi,
Regarding buying more CPUs at a later date to enhance the machine you have,
the only thing I can contribute is that, most of the time, if you don't do it within a couple of years it turns out not to be as easy as you'd like, and you'll probably end up buying a new server, which might not be a good idea if you have the funds.
In other words, if you plan to upgrade your server's resources, make sure it won't be a problem within the time frame in which you plan or estimate you'll need to.
IHTH
Roy R.