DataStage DataSet usage in a grid environment

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Where your datasets are created depends on how you design your grid infrastructure and on the configuration files you run the jobs with, so it is basically up to you. Take some time to read through the grid redbook a few times, along with the IS documentation, especially the chapter on Parallel Engine Configuration Files in the Parallel Job Developer Guide.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Your dataset resource path should be accessible by all servers in your grid, in order to handle job restarts that send you to a different server.

We use NAS storage for our project workspace.
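
As a rough illustration of the point above, a parallel configuration file that keeps dataset storage on a shared path could look like the sketch below. It is only a minimal example under stated assumptions: the host names (compute1, compute2) and the paths (/nas/datasets as the NAS mount visible to every server, /scratch as node-local scratch space) are hypothetical and not taken from any real environment.

```
{
	/* Hypothetical compute node 1 */
	node "node1"
	{
		fastname "compute1"
		pools ""
		/* Assumed NAS mount, reachable from every server in the grid */
		resource disk "/nas/datasets" {pools ""}
		/* Assumed node-local scratch space */
		resource scratchdisk "/scratch" {pools ""}
	}
	/* Hypothetical compute node 2, pointing at the same shared dataset path */
	node "node2"
	{
		fastname "compute2"
		pools ""
		resource disk "/nas/datasets" {pools ""}
		resource scratchdisk "/scratch" {pools ""}
	}
}
```

Because every node lists the same resource disk path, a dataset written by a job on one server should still be readable if a restart lands on a different server. Scratch space only holds temporary data for the running job, so in this sketch it is left local to each node.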
pneumalin
Premium Member
Posts: 125
Joined: Sat May 07, 2005 6:32 am

Post by pneumalin »

Paul,
The NAS is exactly what I am looking for, and my question on DataSet usage in a grid environment is answered. I will create the datasets on a ResourceDisk that can be reached by all nodes, and I will do the same for all persistent data storage such as sequential files and hash files. I would appreciate it if you could post some of your working configuration file samples that manage the ResourceDisk and ScratchDisk across the front-end node, stand-by node, and compute nodes. Thanks again.