Collection Methods

muralisankarr
Premium Member
Posts: 83
Joined: Tue Oct 28, 2008 1:55 am
Location: Chennai

Post by muralisankarr »

I have designed two jobs, each with four stages.

JobA: Oracle -> (Partition Method: Round Robin) Sorter -> Transformer -> (Collection Method: Sorted Merge) Dataset
JobB: Oracle -> (Partition Method: Round Robin) Sorter -> Transformer -> (Collection Method: Round Robin) Dataset

The output (the order of the data) is the same for both methods. But the PX guide says:
"Note that collecting methods are mostly non-deterministic. That is, if you run the same job twice with the same data, you are unlikely to get data collected in the same order each time. If order matters, you need to use the sorted merge collection method."
If I partition the sorted records by round robin and collect them by round robin, is there any chance the data will lose its order? Is there any risk in JobB when the data order needs to be preserved?
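To make the question concrete, here is a toy model in plain Python. It is not DataStage code, and every function name in it is made up for illustration. It assumes an idealized round robin collector that takes exactly one record from each partition in turn, plus a hypothetical "arrival order" collector that picks the next partition at random to stand in for a non-deterministic method. Whether the real engine behaves like either model is not established in this thread.

```python
import heapq
import random
from itertools import zip_longest

_MISSING = object()  # sentinel for partitions that run out early


def round_robin_partition(records, nodes):
    """Deal records out to `nodes` partitions, one at a time."""
    return [records[i::nodes] for i in range(nodes)]


def round_robin_collect(partitions):
    """Idealized collector: one record from each partition in turn."""
    out = []
    for row in zip_longest(*partitions, fillvalue=_MISSING):
        out.extend(r for r in row if r is not _MISSING)
    return out


def sorted_merge_collect(partitions):
    """Merge partitions that are each already sorted on the key."""
    return list(heapq.merge(*partitions))


def arrival_order_collect(partitions, rng):
    """Hypothetical non-deterministic collector: random partition next."""
    queues = [list(p) for p in partitions]
    out = []
    while any(queues):
        q = rng.choice([q for q in queues if q])
        out.append(q.pop(0))
    return out


# Case 1: input already sorted before the round robin partitioner.
presorted = list(range(10))
parts_presorted = round_robin_partition(presorted, 4)
rr_presorted = round_robin_collect(parts_presorted)

# Case 2: unsorted input, sorted per partition after partitioning
# (as in Oracle -> round robin -> Sorter).
unsorted_input = [7, 3, 9, 1, 8, 2, 6, 4, 5, 0]
parts_sorted = [sorted(p) for p in round_robin_partition(unsorted_input, 4)]
rr_result = round_robin_collect(parts_sorted)
merge_result = sorted_merge_collect(parts_sorted)
arrival_result = arrival_order_collect(parts_sorted, random.Random(1))

print(rr_presorted)
print(rr_result)
print(merge_result)
print(arrival_result)
```

In this model, round robin collection reproduces the order only when the data was already in order before it was dealt out and every partition is read in strict rotation. When each partition is sorted separately after partitioning, only the sorted merge yields a globally ordered result, and it does so no matter how the partitions are interleaved.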

Many Thanks
MSR
The minute you start talking about what you're going to do if you lose, you have lost
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

How many nodes are in your APT_CONFIG file? Also, if you aren't changing the number of nodes between the transform and output dataset, no collection will take place so the data will, of necessity, be in the same order each run.
muralisankarr

Post by muralisankarr »

ArndW wrote: How many nodes are in your APT_CONFIG file? Also, if you aren't changing the number of nodes between the transform and output dataset, no collection will take place so the data will, of necessity, be in the same order each run.
There are four nodes, and I configured the last dataset in sequential mode, so the number of nodes changes at the last stage.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Then use a text file. It will be clearer.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.