Hi All,
We are in the process of sizing a DataStage environment.
What inputs are needed to arrive at a reasonably close estimate of the hardware requirements, i.e. number of cores and memory? (I understand it cannot be exact, but a ballpark figure would help.)
e.g.
Data Volumes
Available Load Window within which we have to finish the cycle
...
Thanks,
VN
Datastage (Infosphere Information Server) Sizing
You might find Eric's reply in this thread of interest.
-craig
"You can never have too many knives" -- Logan Nine Fingers
By default no data are stored in DataStage, therefore data volumes do not affect sizing of DataStage itself. If you choose to use Data Sets for intermediate storage, then you need to allocate sufficient disk space for that. If you perform sorting, or lookups, etc., involving large data volumes then you need to ensure that you have plenty of scratch disk. And, of course, there must be sufficient space available in databases, including transaction logs (this is particularly true for Information Analyzer).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sizing the server is actually a very difficult thing to do. It's almost a "How high is Up?" kind of question because there are literally dozens of factors that come into play.
In many cases I've even seen hardware vendors fall back on "how much money do you have" to size the system because all they'll do is suggest the biggest system you can afford.
Can you engage your hardware vendor to assist?
You might want to decide if you want Windows, Linux or AIX. I would go with Linux, but you need to ask your system admins. If you are not an AIX shop, then it pretty much rules that out.
Next, ask the same group: what is the minimum number of cores one of those servers can be configured with?
I would not configure any DataStage server with less than 32GB of RAM. That stuff is cheap and useful, so go big or go home. The install guide says something like 2GB per core... that's poop. Keep it big. It's not just the DataStage jobs you have to deal with, it's the user-written shell scripts that are part of your framework. Those can get ugly.
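One way to read the memory advice above is "take whatever the per-core guideline gives you, but never go below 32GB". A rough sketch of that reading follows; the function name and the decision to combine the two rules with max() are my assumptions, and the 2GB-per-core figure is only the "something like" number quoted above, not a checked value from the install guide.

```python
def recommended_ram_gb(cores: int, gb_per_core: float = 2.0, floor_gb: float = 32.0) -> float:
    """Return a ballpark RAM figure in GB (hypothetical helper, not an IBM formula).

    Takes the larger of the per-core guideline and a fixed floor, so small
    core counts still get the 32GB minimum suggested in this thread.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max(floor_gb, cores * gb_per_core)


# Small boxes are held at the floor; larger ones scale with core count.
for cores in (4, 8, 16, 32):
    print(f"{cores} cores -> {recommended_ram_gb(cores):.0f} GB")
```

Treat the result as a minimum to start the conversation with your admins, since shell scripts and other framework overhead are not modelled here.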
They will pitch you virtual servers... cloud blah blah... You have to choose whether you want your production box on a virtual server or not.
15M rows with 10-15% growth annually is chump change.
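To put numbers on that growth rate, here is a small sketch that compounds 15M rows at 10% and at 15% per year. The five-year horizon and the function name are my assumptions, picked only for illustration.

```python
def project_rows(initial_rows: int, annual_growth: float, years: int) -> list[int]:
    """Return projected row counts for year 0 through `years` under compound growth."""
    return [round(initial_rows * (1 + annual_growth) ** year) for year in range(years + 1)]


# 15M rows today, growing 10% (low case) or 15% (high case) per year.
low = project_rows(15_000_000, 0.10, 5)
high = project_rows(15_000_000, 0.15, 5)

for year, (low_rows, high_rows) in enumerate(zip(low, high)):
    print(f"year {year}: {low_rows:,} to {high_rows:,} rows")
```

Even at the high end this is only about double the starting volume after five years, which is why the row count alone should not drive the hardware choice.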
Another question you have to ask your team is... once this DataStage environment kicks off and is a huge success... will others want to put their ETL processes into this environment? Meaning... YOUR growth is expected and planned, but do you think others will want to go swimming in your pool too? Your company should try to avoid setting up a unique install for each project that pops up. Too costly to do that, but at times politics trumps budget.