Collection Methods

**muralisankarr** » Fri Nov 20, 2009 10:35 am

I have designed two jobs, each with four stages.

JobA: Oracle -> (Partition Method: Round Robin) Sorter -> Transformer -> (Collection Method: Sorted Merge) Dataset
JobB: Oracle -> (Partition Method: Round Robin) Sorter -> Transformer -> (Collection Method: Round Robin) Dataset

The output (the order of the data) is the same for both methods. But the PX guide says:

Note that collecting methods are mostly non-deterministic. That is, if you run the same job twice with the same data, you are unlikely to get data collected in the same order each time. If order matters, you need to use the sorted merge collection method.
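To illustrate what the guide's note means, here is a toy Python model, not the DataStage engine. It assumes a collector that takes the next row from whichever partition happens to deliver first, simulated with a seeded random choice. The function name `arrival_order_collect` and the sample partitions are made up for illustration. The arrival-order result can differ from run to run, while a sorted merge of partitions that are each already sorted gives the same output every time.

```python
import random
from heapq import merge

def arrival_order_collect(partitions, rng):
    """Take the next row from a randomly chosen non-empty partition.

    This stands in for a collector that reads rows as they become
    available; the random choice models unpredictable timing.
    """
    queues = [list(p) for p in partitions]  # copy so the input is untouched
    out = []
    while any(queues):
        q = rng.choice([q for q in queues if q])
        out.append(q.pop(0))
    return out

# Four partitions (one per node), each already sorted on the key.
partitions = [[1, 5, 9], [2, 6], [3, 7], [4, 8]]

# Five simulated runs with different "timing".
runs = [arrival_order_collect(partitions, random.Random(seed)) for seed in range(5)]

# A sorted merge does not depend on timing at all.
merged = list(merge(*partitions))

for r in runs:
    print(r)
print(merged)
```

Every simulated run contains the same rows, but only the sorted merge guarantees the overall order.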

If I partition the sorted records by round robin and collect them by round robin, is there any chance that the data will lose its order? Is there any risk in JobB when the data order needs to be preserved?
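The question can be explored with a small Python sketch. It is only a model under stated assumptions: round robin partitioning deals row i to partition i mod n, the round robin collector takes one row from each partition in turn, and no stage drops or adds rows in between. The helper names and sample rows are hypothetical. Under those assumptions a round robin collect exactly undoes a round robin partition, but if the sort runs inside each partition (after partitioning), only a sorted merge yields a globally sorted result.

```python
from heapq import merge
from itertools import zip_longest

_MISSING = object()  # marks the end of a shorter partition

def round_robin_partition(rows, n):
    """Deal rows to n partitions in turn: row i goes to partition i % n."""
    return [rows[i::n] for i in range(n)]

def round_robin_collect(partitions):
    """Take one row from each partition in turn, skipping exhausted ones."""
    out = []
    for group in zip_longest(*partitions, fillvalue=_MISSING):
        out.extend(r for r in group if r is not _MISSING)
    return out

def sorted_merge_collect(partitions):
    """Merge partitions that are each already sorted into one sorted list."""
    return list(merge(*partitions))

rows = [5, 3, 1, 4, 2, 6, 7]            # unsorted source rows
parts = round_robin_partition(rows, 4)  # four nodes

# Case 1: nothing reorders rows inside a partition.
# Round robin collect restores the original input order.
print(round_robin_collect(parts))

# Case 2: each partition is sorted independently, as a parallel sort would do.
sorted_parts = [sorted(p) for p in parts]
print(round_robin_collect(sorted_parts))   # interleaved, not globally sorted
print(sorted_merge_collect(sorted_parts))  # globally sorted
```

In this model JobB keeps the order only if the rows were already in order before partitioning and every partition still holds the same number of rows at collection time. If the Transformer rejects or filters even one row, the interleaving shifts.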

Many Thanks
MSR

**ArndW** » Fri Nov 20, 2009 12:04 pm

How many nodes are in your APT_CONFIG file? Also, if you aren't changing the number of nodes between the transform and output dataset, no collection will take place so the data will, of necessity, be in the same order each run.

**muralisankarr** » Fri Nov 20, 2009 12:19 pm

ArndW wrote: How many nodes are in your APT_CONFIG file? Also, if you aren't changing the number of nodes between the transform and output dataset, no collection will take place so the data will, of necessity, be in the same order each run.

There are four nodes, and I configured the last dataset to run in sequential mode, so the number of nodes changes at the last stage.

**ray.wurlod** » Sat Nov 21, 2009 2:35 am

Then use a text file. It will be clearer.