Server and client sizing
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 26
- Joined: Thu Apr 15, 2004 12:54 am
Server and client sizing
Hi,
Generally what should be the indicative server and client configuration assuming that we have around 1 TB data flowing through?
Are there any high level sizing guidelines for Ascential Server?
Thanks,
The client configuration, at least the minimums, is outlined by IBM/Ascential. Sizing the server is a bit tougher. I can flow 1 TB through my notebook with no problems, except that it will take quite a bit longer than on a top-end Solaris box. So part of the sizing question must be about time windows, not just data volumes. Is the data being processed heavily (i.e. a lot of "T" in the ETL equation) or just being moved? What are the sources and the database(s)? Are they local or remote?
Many factors play into sizing and capacity planning, making correct judgement a matter of experience (and luck).
Are you asking about the version of DataStage or the size of the server?
Assuming you are after server size and capacity, the vague answer would be: "it depends on the job designs you use." That's true. The more stages, the more resources DataStage uses. I have seen sites where ETL is used just to trigger Oracle stored procedures and to extract and load directly into another server, which may not fully utilize the DataStage server's resources. On the other hand, if you have lookups done through hashed files, the whole file needs to be placed on the DataStage server (at least for better performance).
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Since the input you have gives only the size information, I again assume that you are after disk size.
As mentioned, it all depends on the level of transformation involved. What is your source/target? Is it a daily, weekly, fortnightly, or monthly run? Based on your high-level design, how many days do you need the staging files/tables to be retained? Are they on the same server or a different one?
If you don't have any of this information, just go ahead with double or triple your input size (assuming that you are not going to process all 1 TB at the same time).
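That "double or triple your input" rule of thumb can be turned into a quick estimate. A minimal sketch, where the retention period and safety multiplier are illustrative assumptions, not DataStage defaults:

```python
# Rough staging-disk estimate: daily volume x retention days x safety multiplier.
def staging_disk_tb(daily_tb: float, retention_days: int,
                    multiplier: float = 2.0) -> float:
    """Disk (TB) needed to retain `retention_days` of staging data with headroom."""
    return daily_tb * retention_days * multiplier

# e.g. 1 TB/day retained for 7 days with 2x headroom:
print(staging_disk_tb(1.0, 7))  # -> 14.0
```

The multiplier covers intermediate files, sort work space, and reruns; pick it from your own job designs rather than this sketch.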
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
OK, I recommend using an old Intel 486 processor (or a Pentium if you feel like splurging) and the minimum amount of memory that the IBM/Ascential DataStage guide suggests, but you could probably make do with 512 MB. This will be your cheapest solution.
(Do you now understand why you've been asked twice about performance? This DS Server configuration will work, but your load might take over a week to complete.)
This is like me asking you what sort of car I should purchase while telling you only that I have 60 boxes in my garage that I need to move.
Do you have operational constraints? Is the company a UNIX or a Windows shop? The list of questions can go on quite a bit, but without any information about the general parameters you will not be able to get a good answer from anyone. Except perhaps a hardware salesman - he/she will sell you a box with pleasure.
Devyani,
Great (that rules out me being able to sell you my old PC, though).
What UNIX hardware platform is currently used in-house? Companies usually prefer homogeneous hardware.
Any UNIX system can process 1 TB from a sequential file to a sequential file in that time period. But if your database isn't on your DS server and you have a gigabit network card, you will only be able to transfer just under 3 Tb (terabits, roughly 0.37 TB) per hour - and that is if nothing else is going on and everything is perfect.
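As a sanity check on that network-bound figure, here is a back-of-envelope sketch. The 70% effective link utilisation is an assumption (protocol overhead and contention vary by site), not a measured value:

```python
# Back-of-envelope transfer time for a data volume over a network link.
def transfer_hours(data_tb: float, link_gbit_s: float = 1.0,
                   efficiency: float = 0.7) -> float:
    """Hours to move data_tb terabytes over a link_gbit_s link at the
    given effective utilisation (decimal units: 1 TB = 8e12 bits)."""
    bits = data_tb * 8 * 1000**4
    effective_bps = link_gbit_s * 1e9 * efficiency
    return bits / effective_bps / 3600

print(f"{transfer_hours(1.0):.1f} hours for 1 TB at 1 Gbit/s")  # -> 3.2 hours
```

So even in the best case, moving the full 1 TB across a gigabit link adds roughly three hours before any transformation work starts, which is why the database's location matters so much to sizing.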
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
NVLLVS ANXIETAS
If you have multiple CPUs (or even sufficient spare CPU capacity) you can create multiple instances each of which processes a subset of the rows - partition parallelism, in short.
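The multi-instance idea can be illustrated outside DataStage with a small sketch. The hash-partitioning scheme, worker count, and `transform` step here are illustrative assumptions; in DataStage itself this is done with multiple job instances, each handling a disjoint subset of the rows:

```python
from multiprocessing import Pool

def transform(row: dict) -> dict:
    # Placeholder "T" step: uppercase one field.
    return {**row, "name": row["name"].upper()}

def process_partition(rows: list) -> list:
    # One "instance" processes its own subset of the rows.
    return [transform(r) for r in rows]

if __name__ == "__main__":
    rows = [{"id": i, "name": f"item{i}"} for i in range(100)]
    n = 4  # number of instances, ideally one per spare CPU
    # Hash-partition on a key so each instance gets a disjoint subset.
    partitions = [[r for r in rows if r["id"] % n == p] for p in range(n)]
    with Pool(n) as pool:
        results = pool.map(process_partition, partitions)
    merged = [r for part in results for r in part]
    print(len(merged))  # -> 100: every row processed exactly once
```

The key property is that the partitions are disjoint and together cover all rows, so the instances never contend for the same data.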
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.