DataStage Machine Hardware Configuration Query



ashik_punar
Premium Member
Posts: 71
Joined: Mon Nov 13, 2006 12:40 am


Post by ashik_punar »

Hi Everyone,

In my current project we are moving 20 GB of data from the source to the target. During processing we sometimes expand records to form new records, so the volume of data moving through the system can reach 60 GB. Per the business requirement, most of our processing is done sequentially. Our current hardware configuration is as follows:

Stand alone Server without clustering
8 GB RAM
4 CPU (Multi threading disabled)
4 Hard drives (144GB)

The scratch space and the dataset space given in the configuration file is only 20 GB. We frequently get SIGKILL, SIGSEGV and SIGPIPE errors. We understand that there is a resource crunch.
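
To put rough numbers on the crunch, the figures quoted in this post can be compared directly. This is only a back-of-the-envelope sketch: it assumes, pessimistically, that the full in-flight volume might need to sit on scratch or dataset disk at one time, which may not hold for every job design. The function name is illustrative.

```python
def space_shortfall_gb(source_gb, peak_in_flight_gb, configured_gb):
    """Compare peak in-flight data volume with configured disk space.

    Assumption (not from DataStage documentation): in the worst case the
    whole in-flight volume has to be held on disk at once.
    """
    expansion = peak_in_flight_gb / source_gb
    shortfall = max(0, peak_in_flight_gb - configured_gb)
    return expansion, shortfall


# Figures quoted in this post: 20 GB source, up to 60 GB in flight, 20 GB configured.
expansion, shortfall = space_shortfall_gb(20, 60, 20)
print(f"expansion factor: {expansion:.1f}x, worst-case shortfall: {shortfall} GB")
```

With these figures the sketch reports a 3x expansion and a 40 GB gap under that worst-case assumption.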

It would be helpful if somebody could suggest a hardware configuration (DataStage PX), along with the scratch and dataset space, that would suit our requirement.


Thanks in advance,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

If the business requires you to process sequentially, consider using lower-impact server jobs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ashik_punar
Premium Member
Posts: 71
Joined: Mon Nov 13, 2006 12:40 am

Post by ashik_punar »

Hi Ray,

Thanks a lot for the quick reply. A few of our jobs process the data sequentially, but most of them use parallelism. The problem is that when we run a number of jobs in parallel in a sequence (and sometimes even a single job), we get a number of hardware-related issues such as 'Scratch Space Full', 'Heap Allocation Failed', SIGKILL, SIGSEGV and SIGINT. So we are considering upgrading the hardware resources, and for that we need to propose a configuration with which we will not get these kinds of errors.

That's why we need your valuable input on what the best hardware configuration would be for the kind of data processing we are doing.

Thanks in advance,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The hardware you have will support multi-instance server jobs with less demand for resources than parallel jobs. There's no way I can suggest figures without monitoring and measuring what's happening on your system. Some things are obvious: scratch space full means that you need to configure more scratch space, but how much more must depend on exactly what is demanding scratch space. But 20GB does not seem like much. Try using an approach that "gives all partitions all the disk". For example, in a two node configuration:


{
   node "firstnode"
   {
      fastname "myserver"
      pools "" "firstnodepool" "import"
      resource disk "C:\Data\DataSets"
      resource disk "D:\Data\DataSets"
      resource scratchdisk "E:\Work\Scratch"
      resource scratchdisk "F:\Work\Scratch"
   }
   node "secondnode"
   {
      fastname "myserver"
      pools "" "secondnodepool" "export"
      resource disk "D:\Data\DataSets"
      resource disk "C:\Data\DataSets"
      resource scratchdisk "F:\Work\Scratch"
      resource scratchdisk "E:\Work\Scratch"
   }
}
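
Since sizing depends on measuring what actually consumes the disk, one low-effort starting point is to log the free space on each resource directory while jobs run. The following is a minimal sketch, not a DataStage utility; the function name is hypothetical and the paths are simply the ones from the example configuration above.

```python
import shutil


def free_space_gib(paths):
    """Return {path: free GiB}, or None for a path that cannot be inspected."""
    report = {}
    for path in paths:
        try:
            report[path] = shutil.disk_usage(path).free / 1024 ** 3
        except OSError:
            # Missing directory, unmounted volume, permission problem, etc.
            report[path] = None
    return report


if __name__ == "__main__":
    # Paths taken from the example configuration; substitute your own.
    resource_dirs = [
        r"C:\Data\DataSets",
        r"D:\Data\DataSets",
        r"E:\Work\Scratch",
        r"F:\Work\Scratch",
    ]
    for path, free in free_space_gib(resource_dirs).items():
        status = "unavailable" if free is None else f"{free:.1f} GiB free"
        print(f"{path}: {status}")
```

Running this at intervals during a job sequence would show which directories fill up and roughly when, which is the kind of measurement needed before settling on a figure.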
Last edited by ray.wurlod on Fri Jan 12, 2007 3:45 pm, edited 2 times in total.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Other errors like 'Heap Allocation Failed', SIGKILL, SIGSEGV and SIGINT are OS-level errors. You can research them thoroughly on Google to find ways to avoid or resolve them.
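
One OS-level place to start, assuming the server runs a Unix-like operating system, is the per-process resource limits of the account that runs the jobs. Low limits on data segment or address space size are one possible cause of heap allocation failures, though not necessarily the cause here. A minimal sketch using Python's standard `resource` module (Unix only), with illustrative function names:

```python
import resource

# Limits that commonly matter for memory-hungry processes.
LIMITS = {
    "data segment (bytes)": resource.RLIMIT_DATA,
    "address space (bytes)": resource.RLIMIT_AS,
    "stack (bytes)": resource.RLIMIT_STACK,
    "open files": resource.RLIMIT_NOFILE,
}


def show(value):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the current process."""
    return {label: resource.getrlimit(code) for label, code in LIMITS.items()}


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={show(soft)}, hard={show(hard)}")
```

The limits reported are those of the process that runs the script, so it would need to be run as the same user, and from the same kind of shell, that starts the jobs.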
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.