DataStage DataSet usage in a grid environment

jwiles · Post by **jwiles** » Sat Mar 12, 2011 6:43 pm

Where your datasets are created depends on how you design your grid infrastructure and on the configuration files you run your jobs with, so it is basically up to you. Take some time to read through the grid Redbook a few times, along with the Information Server documentation, especially the chapter on Parallel Engine Configuration Files in the Parallel Job Developer Guide.

Regards,

PaulVL · Post by **PaulVL** » Sun Mar 13, 2011 10:20 am

Your dataset resource path should be accessible from all servers in your grid, so that a job that is restarted on a different server can still reach its datasets.

We use NAS storage for our project workspace.
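As a rough illustration of the idea (not an actual production file), a parallel engine configuration file in which every node points its resource disk at the same shared mount might look like the sketch below. The host names, node names, and paths are all hypothetical, and keeping the scratch disk on local storage is an assumption rather than something stated in this thread.

```
/* Sketch only: host names and paths are hypothetical placeholders. */
{
  node "conductor"
  {
    fastname "frontend01"
    pools "conductor"
    /* Shared NAS mount, visible under the same path on every server */
    resource disk "/nas/project/datasets" {pools ""}
    /* Assumed local scratch space on each server */
    resource scratchdisk "/local/scratch" {pools ""}
  }
  node "compute1"
  {
    fastname "compute01"
    pools ""
    resource disk "/nas/project/datasets" {pools ""}
    resource scratchdisk "/local/scratch" {pools ""}
  }
  node "compute2"
  {
    fastname "compute02"
    pools ""
    resource disk "/nas/project/datasets" {pools ""}
    resource scratchdisk "/local/scratch" {pools ""}
  }
}
```

Because the resource disk path is identical on every node, the data files behind a dataset stay reachable whichever server a job lands on. Scratch space only holds temporary data for a single run, which is why this sketch leaves it local.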

pneumalin · Post by **pneumalin** » Sun Mar 13, 2011 3:20 pm

Paul,
The NAS is exactly what I am looking for, and my question on DataSet usage in a grid environment is answered. I will create the datasets on a ResourceDisk that can be reached by all nodes, and I will do the same for all persistent data storage such as sequential files and hash files. I would appreciate it if you could post some of your working configuration file samples that manage the ResourceDisk and ScratchDisk across the front-end node, stand-by node, and compute nodes. Thanks again.