Hardware Requirements for setting up the Server Environment

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vnspn
Participant
Posts: 165
Joined: Mon Feb 12, 2007 11:42 am

Hardware Requirements for setting up the Server Environment

Post by vnspn »

Hi,

We have planned to move our platform from Server Edition to the Enterprise Edition and we are planning on the requirements needed for setting up the infrastructure for the PX 7.5 version on UNIX.

We would like to hear some suggestions on the parameters needed to establish a robust server-side hardware environment, such as:
- Number of CPUs on the server side
- Amount of memory (RAM)
- Amount of hard-disk space
- Any other parameters that are important to consider

In our application, we would be processing about 10 million records from the source.

Thanks.
nick.bond
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London

Re: Hardware Requirements for setting up the Server Environment

Post by nick.bond »

I doubt anyone will be able to give you an answer at the moment because there isn't enough information. You have said you want to process 10 million records, but how long do you want this to take? What will the transformations be like? What is the source? What is the target? How many concurrent developers will there be?...

......and, probably more importantly, what is your budget?

A single-processor machine with 1 GB of RAM would work, but not very quickly!

You're probably better off working out exactly what your system needs to handle now and over the next two years, then asking IBM or your provider to supply a spec (minimum and recommended). I don't know for sure, but I imagine that if they think they're about to make a sale, they'll provide this sort of information for free (though maybe not?).
Regards,

Nick.
vnspn
Participant
Posts: 165
Joined: Mon Feb 12, 2007 11:42 am

Post by vnspn »

OK, I'll give you the details based on the information and ideas we currently have.

- We would like the processing of the 10 million records to take under 1 hour (1 hour would be the maximum).

- The source is going to be a file on a mainframe system. We plan to use the FTP stage to fetch the data. If that does not perform well, we may instead transfer the file to the DataStage server and read it using the CFF or Sequential File stage.

- The target would be an Oracle database.

- The transformations would consist of some aggregations in certain jobs and mostly lookups in many others.

- We might have 5 concurrent developers.

- The budget might be flexible, based on whatever we critically need.
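The figures above translate into a concrete throughput target, which is a useful starting point for any sizing conversation. A rough back-of-the-envelope sketch in Python (the function name is mine; the row count and time window come straight from the requirements above):

```python
def required_rows_per_sec(total_rows: int, window_hours: float) -> float:
    """Sustained throughput the jobs must achieve to finish in the window."""
    return total_rows / (window_hours * 3600)

if __name__ == "__main__":
    # 10 million source rows, 1-hour maximum window
    rate = required_rows_per_sec(10_000_000, 1)
    print(f"Required sustained rate: {rate:,.0f} rows/sec")  # ~2,778 rows/sec
```

At roughly 2,800 rows/sec sustained, the read and load stages are as likely to be the bottleneck as CPU, so disk and network bandwidth belong on the parameter list too.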

These are the answers we have to the questions you asked. It would be great if you could share your thoughts and suggestions.

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Ten million source records once, ten million daily, or ten million per second?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vnspn
Participant
Posts: 165
Joined: Mon Feb 12, 2007 11:42 am

Post by vnspn »

Ray,

We would have 10 million records at the source. We would apply some filter conditions immediately after reading from the source, which may bring the record count down to around 1 or 2 million. That would be the number of records processed in all the other downstream jobs.

The whole process would be a weekly run.
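Putting the thread's numbers together, only the initial read has to keep pace with the full 10 million rows; everything downstream sees the filtered 1-2 million. A rough sketch of the per-partition rates this implies (the 4-way parallelism is an assumed example for illustration, not a stated requirement; the real degree would come from the parallel configuration file and CPU count):

```python
def per_partition_rate(total_rows: int, window_hours: float,
                       partitions: int) -> float:
    """Rows/sec each partition must handle, assuming an even spread."""
    return total_rows / (window_hours * 3600) / partitions

if __name__ == "__main__":
    # The read stage must keep up with the full 10M rows...
    print(f"read stage, 4-way:  "
          f"{per_partition_rate(10_000_000, 1, 4):,.0f} rows/sec/partition")
    # ...but post-filter jobs only see ~2M rows in the same window.
    print(f"downstream, 4-way:  "
          f"{per_partition_rate(2_000_000, 1, 4):,.0f} rows/sec/partition")
```

Since the run is weekly rather than continuous, the window could also be relaxed in the calculation, which would shrink the hardware needed accordingly.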