Resource & Scratch Disk

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Resource & Scratch Disk

Post by nagarjuna »

Hello All -- I need your suggestions regarding setting up a configuration file.

We have a cluster environment (4 servers) and I am setting up the configuration file. I am wondering what the best option would be for pointing the scratch and resource disks. We have an NFS mount to which all 4 servers have access; there is also local disk space that is accessible only by the corresponding server.

a) As DataStage creates lots of temporary files in the scratch disk while sorting, I am planning to point the scratch disk to local storage. Please let me know if that is a good option.
b) Should I point the resource disk to the NFS or to the local disks? Please suggest.

Thanks
Nag
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

1) Local storage for your scratch disks is recommended. SAN would be second, depending upon its performance. NFS is WAY down on the list for scratch storage.

2) For the resource disks, NFS would be better as far as easy access from all nodes of your cluster is concerned, with the connection to NFS on a private network (shared with nothing else) if possible and a minimum 1 Gb link speed; 10 Gb would be better :) A sketch of such a layout follows below.
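For illustration, here is a minimal two-node sketch of a configuration file along those lines. The node names, hostnames and paths are hypothetical placeholders, so substitute your own: each server's scratchdisk sits on its local storage, while the resource disk points to the shared NFS mount so datasets are visible from every node.

    {
        node "node1"
        {
            fastname "server1"
            pools ""
            resource disk "/nfs/datastage/datasets" {pools ""}
            resource scratchdisk "/local/datastage/scratch" {pools ""}
        }
        node "node2"
        {
            fastname "server2"
            pools ""
            resource disk "/nfs/datastage/datasets" {pools ""}
            resource scratchdisk "/local/datastage/scratch" {pools ""}
        }
    }

Repeat the node stanza for servers 3 and 4. A job picks the file up through the APT_CONFIG_FILE environment variable, so you can keep several such files and choose one per job.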

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
jassu
Premium Member
Posts: 7
Joined: Wed Dec 03, 2008 4:13 am
Location: USA

Post by jassu »

Thanks James for your response. For point 2, is there any drawback if we point the resource disk to the local disks?
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Each server will have direct access only to the datasets (or portions thereof) stored on its own local disks. This will make it difficult (though not entirely impossible) to run jobs on any available server, hampering the ability to effectively balance processing loads across the cluster.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

(What James said)... and, should you ever need to re-partition, you will need to move rows across the network.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

When the data is repartitioned, does each server need access to all of the resource disks specified? Suppose I specify the configuration file like this (sketched in full below):

Node1 --- Server1 -- resource disk pointing to the local disk of server 1
Node2 --- Server2 -- resource disk pointing to the local disk of server 2
Node3 --- Server3 -- resource disk pointing to the local disk of server 3
Node4 --- Server4 -- resource disk pointing to the local disk of server 4

So, in this case, will the job abort when repartitioning occurs?
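For illustration, something like the following -- the hostnames and paths are just placeholders for my environment, with each node's resource and scratch disks on that server's own local storage (nodes 3 and 4 would follow the same pattern):

    {
        node "node1"
        {
            fastname "server1"
            pools ""
            resource disk "/local/server1/datasets" {pools ""}
            resource scratchdisk "/local/server1/scratch" {pools ""}
        }
        node "node2"
        {
            fastname "server2"
            pools ""
            resource disk "/local/server2/datasets" {pools ""}
            resource scratchdisk "/local/server2/scratch" {pools ""}
        }
    }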
Nag
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No. Communications are established between the relevant player processes (the operators on each end of the link on which repartitioning occurs), and they pass data to the other nodes using the APT_Communicator class (if I remember correctly), over TCP port numbers beginning at 11000 by default (again, if I remember correctly).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Thanks Ray & James for helping me out & clarifying this.

Resolved :D
Nag