Different results in 8.7 job than 8.1 version job
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
Hi All,
I am facing an unusual issue while migrating jobs from version 8.1 to version 8.7. A couple of jobs that use a Remove Duplicates stage with hash partitioning produce different results when I compare the 8.1 output with the 8.7 output.
The scenario is as follows.
Input:
colA,colB,colC,colD
A,B,C,1
A,B,D,2
B,C,D,1
Keys used for removing duplicates, hash partitioning and sorting (set on the Partitioning tab of the Remove Duplicates stage), with Duplicate To Retain = First:
colA, colB
The results are as follows.
DS 8.1 job output:
A,B,C,1
B,C,D,1
DS 8.7 job output:
A,B,D,2
B,C,D,1
Each time I run the two jobs, which of the duplicate records gets retained appears to be random (only records with duplicate keys are affected).
Can anyone suggest a way out of this situation? It would be a great help.
This is very strange behavior. I can't say I've seen that problem on either of the two working 8.7 environments. It sounds like your job is configured correctly. I assume you've ensured that the new job has the sorts specified in the correct order (descending).
Have you switched to NLS on the new system? Can you subset some of the records in question and output them to a sequential file so you can look at them in a hex editor? I'm wondering if there are invisible characters in the field that are causing it to sort "higher".
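If a hex editor is not handy, the same check can be sketched in a few lines of Python. This is only an illustration, not a DataStage feature: the helper name `hex_bytes`, the UTF-8 encoding, and the sample rows (including the non-breaking space planted in the second row) are all assumptions made for the example.

```python
def hex_bytes(field: str, encoding: str = "utf-8") -> str:
    """Return the bytes of a field as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in field.encode(encoding))


# Hypothetical sample rows; the second one hides a non-breaking space
# (U+00A0) after "B", which looks identical on screen.
sample_lines = ["A,B,C,1", "A,B\u00a0,D,2"]

for line in sample_lines:
    for col in line.split(","):
        # repr() exposes escapes, hex_bytes() shows the raw byte values
        print(f"{col!r:12} {hex_bytes(col)}")
```

Any key column whose hex dump differs between two rows that look identical would hash and sort as a different key.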
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
How are you sorting? It is best to use a Sort stage and explicitly specify "Stable Sort = true" to remove the non-deterministic part of your problem.
Since the data
A,B,C,1
A,B,D,2
is sorted only on the first two columns ("A" and "B"), the order of these records might differ when a stable sort is not used.
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
If you sort the following rows that have 4 columns
A,B,C,1
B,C,D,1
A,B,D,2
on the first 2 columns using a non-stable (but faster) sort you might get a result of:
A,B,C,1
A,B,D,2
B,C,D,1
or you might get a result of:
A,B,D,2
A,B,C,1
B,C,D,1
This is due to the way the sort algorithm works internally, as it creates groups and subtrees and it might change the order of the rows for items with duplicate sort keys. Using "stable sort" guarantees that the order of rows for duplicates is identical to the source order, but a stable sort can be a lot slower and less efficient.
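The effect can be sketched outside DataStage with a small Python example. It is only an illustration under stated assumptions: `dedup_first` and `key` are hypothetical helpers standing in for "sort on colA, colB, then retain the first duplicate", and since Python's `sorted()` is always stable, the changed tie order is simulated by feeding the rows in a different arrival order.

```python
from itertools import groupby

rows = [
    ("A", "B", "C", 1),
    ("B", "C", "D", 1),
    ("A", "B", "D", 2),
]


def key(row):
    """Dedup/sort key: the first two columns (colA, colB)."""
    return (row[0], row[1])


def dedup_first(sorted_rows):
    """Keep the first row of each key group; input must be sorted on key."""
    return [next(group) for _, group in groupby(sorted_rows, key=key)]


# Stable sort on the key only: ties keep their arrival order.
as_arrived = dedup_first(sorted(rows, key=key))

# Same rows, different arrival order: a different duplicate is retained.
reordered = [rows[2], rows[1], rows[0]]
after_reorder = dedup_first(sorted(reordered, key=key))

# Sorting on all columns breaks ties explicitly, so the arrival order
# no longer matters.
deterministic_a = dedup_first(sorted(rows))
deterministic_b = dedup_first(sorted(reordered))

print(as_arrived)
print(after_reorder)
print(deterministic_a == deterministic_b)
```

The design point is that "retain first" is only repeatable when the order within each key group is itself repeatable, either through a stable sort on a fixed input order or through additional sort columns that break the ties.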
-
- Participant
- Posts: 4
- Joined: Thu Aug 15, 2013 12:54 pm
- Location: Bangalore
Re: Different results in 8.7 job than 8.1 version job
Hi nikhil_bhasin,
Is this issue resolved?
If not, can you please confirm:
1) The number of nodes you are using in 8.1 and in 8.7 for this job?
2) Whether you are using any range lookup in the job?
-
- Participant
- Posts: 10
- Joined: Sun Aug 11, 2013 10:46 pm
- Location: Dalian