
sort funnel and apt_grid_partition

Posted: Mon Oct 17, 2011 9:53 pm
by just4u_sharath
Hi,
I am trying to create a sample job to test the Sort Funnel property in the Funnel stage.
1st source file
column1
2
6
4

2nd file
column1
2
3
7
I am expecting the data to be sorted in ascending order, so my final file should be
2
2
3
4
6
7
But I got the rows in a different order. I added the environment variable $APT_GRID_PARTITIONS and set it to 1, and then I got the correct order, but if I change it to 4 the output is no longer in ascending order.

My question is: what is $APT_GRID_PARTITIONS, and why does it influence the outcome?

Re: sort funnel and apt_grid_partition

Posted: Mon Oct 17, 2011 10:56 pm
by deeplind07
$APT_GRID_PARTITIONS specifies the number of partitions for each compute node.
Hence when you specify 4, the data is split across 4 partitions, but when it is 1 all the records go to a single partition, and hence you get the desired output.
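
To make the effect of the partition count concrete, here is a toy Python model of it. It is only a sketch, not how the engine works internally: the round-robin distribution, the run() helper, and the "read one partition after another" collection are all assumptions made for illustration.

```python
def run(rows, partitions):
    """Toy model: partition, sort each partition, then collect in partition order."""
    # Assumption: rows are dealt out round-robin across the partitions.
    parts = [rows[i::partitions] for i in range(partitions)]
    # Each partition is sorted on its own, with no knowledge of the others.
    parts = [sorted(p) for p in parts]
    # Assumption: the collector reads partition 0 fully, then partition 1, and so on.
    return [value for p in parts for value in p]

# The six rows from the two source files in the original post.
rows = [2, 6, 4, 2, 3, 7]

one = run(rows, 1)   # a single partition, so the one sort covers every row
four = run(rows, 4)  # four partitions, each sorted only locally

print(one)
print(four)
```

With one partition the result is fully sorted. With four, every partition is sorted but the collected output is not, which matches the behaviour described in the question.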

Posted: Mon Oct 17, 2011 11:31 pm
by ray.wurlod
Sorting is performed separately on each partition. If you wish to preserve the overall sorted order when collecting into a sequential operator then you need to specify SortMerge as the collection algorithm.
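
As a rough illustration of why a sort-merge style collection restores the overall order, the sketch below merges partitions that are each already sorted, using Python's heapq.merge. The partition contents are made up from the six rows in the question, and this is a conceptual model of a k-way merge, not the engine's actual SortMerge code.

```python
import heapq

# Hypothetical contents of four partitions, each already sorted on its own.
partitions = [[2, 3], [6, 7], [4], [2]]

# Reading the partitions one after another keeps each local order
# but not the global one.
concatenated = [value for p in partitions for value in p]

# A k-way merge always takes the smallest head element among all partitions,
# so the combined stream comes out in ascending order.
merged = list(heapq.merge(*partitions))

print(concatenated)
print(merged)
```

The merge only works because every input partition is sorted beforehand; if any partition were unsorted, the merged output would not be guaranteed to be in order.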