Provided legal clearance is obtained from Ascential, a white paper on exactly this topic will appear on www.datastagexchange.com as soon as their imprimatur is received.
The particular answer in your case will depend on what release of DataStage you are running. I will assume that you have release 5.2 or later, or release 5.1 with Axcel Pack.
Partition parallelism can be accomplished by starting multiple instances of your job, with each instance processing a partition of the data. Typically you will have parameterized selection criteria in the stage that extracts the data. For example:
WHERE column BETWEEN #lowvalue# AND #highvalue#
To run multiple instances of a job, the job must have multi-instance capability enabled (a check box in the job properties window). Then, when you invoke it, you append a period and an "invocation ID", which can be any string of alphanumeric characters that uniquely identifies that instance. For example:
hJob1 = DSAttachJob("MyJob.1", DSJ.ERRNONE)
spCode = DSSetParam(hJob1, "lowvalue", 1)
spCode = DSSetParam(hJob1, "highvalue", 5000000)
hJob2 = DSAttachJob("MyJob.2", DSJ.ERRNONE)
spCode = DSSetParam(hJob2, "lowvalue", 5000001)
spCode = DSSetParam(hJob2, "highvalue", 10000000)
In this example, error checking/handling has been omitted for clarity.
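To complete the sketch, the controlling code would then start both instances and wait for them to finish. The calls below are the standard DataStage BASIC job-control functions; as above, error checking/handling is omitted for clarity:

ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob1)
ErrCode = DSWaitForJob(hJob2)
Status1 = DSGetJobInfo(hJob1, DSJ.JOBSTATUS)
Status2 = DSGetJobInfo(hJob2, DSJ.JOBSTATUS)
ErrCode = DSDetachJob(hJob1)
ErrCode = DSDetachJob(hJob2)

In production code you would, of course, test each returned value (and the final job statuses) before proceeding.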
It is also possible to create multiple streams within a single server job; if these streams are independent, they will execute in separate processes.
Should you have Parallel Extender installed and licensed (DataStage 6.0 and later), you can encapsulate your server job in a shared container, and allow the parallelism to be handled automatically in a parallel job that includes that shared container.
The Parallel Extender environment allows you to take optimal, yet controlled, advantage of all the processing nodes in a symmetric multi-processing (SMP) system, a massively parallel processing (MPP) system, or a cluster.
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518