Sort and Filter gives different results run to run

Posted: Tue Nov 13, 2012 10:47 pm
by ppp
I am performing a Sort and then Filter to remove duplicates.

However, I have noticed that the results change by one record from run to run.
In run 1, the Filter output after the Sort has 15 rows (where keychange=1) and 2 records are rejected (where keychange=0).

In run 2, the Filter output has 14 records (where keychange=1) and 3 records are rejected (where keychange=0).

Can you please help me understand why the result changes from run to run? Is there a particular partitioning method or environment variable I should be using?
Thank you

Posted: Tue Nov 13, 2012 10:58 pm
by aartlett
G'day PPP,
First thing that comes to mind is a partitioning error.

1) Are you partitioning going into the Sort (or carrying it through as Same from an earlier partitioning)?
2) Is the partitioning on the leading key(s) of the sort?

These are the two most common causes of this error that I have seen.

3) Is it affected by NULLs?
(The number 3 cause.)

I'm sure others here will have other scenarios, but these are the most common ones I have seen, and the ones to check before drilling deeper.

Posted: Tue Nov 13, 2012 11:12 pm
by ppp
Thank you for the reply.

I am using the default partitioning for both Sort and Filter - Auto.
And there are no NULLs.

Posted: Wed Nov 14, 2012 12:05 am
by ssreeni3
Hi PPP,
Please post the input, output, and rejected records so we can understand the problem better.
--------------------------------
Srini

Posted: Wed Nov 14, 2012 3:30 am
by aartlett
ppp wrote: I am using the default partitioning for both Sort and Filter - Auto.
And there are no NULLs.
"Well there's your problem".

You need to partition explicitly. I normally use Hash on the first field of the sort, if it has a large enough domain.
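
To see why key-based partitioning matters here, below is a toy Python simulation. It is not DataStage code and makes no claim about what Auto actually selects internally. It assumes that rows with the same key can land in different partitions, that each partition is sorted on its own, and that the key-change flag is set per partition. The function names, the two-partition setup, and the sample keys are all made up for illustration.

```python
import zlib


def key_change_flags(partition):
    """Sort one partition and flag the first row of each key group with 1."""
    flags = []
    prev = object()  # sentinel that never equals a real key
    for key in sorted(partition):
        flags.append((key, 1 if key != prev else 0))
        prev = key
    return flags


def round_robin(keys, n):
    """Hypothetical non-key-based partitioner: deal rows out in arrival order."""
    parts = [[] for _ in range(n)]
    for i, key in enumerate(keys):
        parts[i % n].append(key)
    return parts


def hash_partition(keys, n):
    """Key-based partitioner: equal keys always go to the same partition."""
    parts = [[] for _ in range(n)]
    for key in keys:
        parts[zlib.crc32(key.encode()) % n].append(key)
    return parts


def count_kept(parts):
    """Rows that would pass a filter on keychange=1, summed over partitions."""
    return sum(flag for part in parts for _, flag in key_change_flags(part))


# Same six rows (three distinct keys), arriving in two different orders.
arrival_1 = ["A", "B", "A", "C", "B", "A"]
arrival_2 = ["A", "B", "A", "B", "A", "C"]

print(count_kept(round_robin(arrival_1, 2)))     # 5
print(count_kept(round_robin(arrival_2, 2)))     # 3
print(count_kept(hash_partition(arrival_1, 2)))  # 3
print(count_kept(hash_partition(arrival_2, 2)))  # 3
```

In this sketch, when partitioning ignores the key, the number of rows flagged keychange=1 depends on the order in which rows happen to arrive, so it can differ between runs. When partitioning is on the key, every duplicate of a key meets its siblings in one partition and the count stays at the number of distinct keys.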

Posted: Wed Nov 14, 2012 10:37 am
by ray.wurlod
Actually the first thing that came to my mind was the question "are you processing the same data in each run?"