I am using the Difference stage to find the differences between two files, FileA and FileB. As a test, I copied FileA to FileB. However, the Difference stage still reports most of the rows as different. The diff column in the output dataset contains these values:
0
1
2
2
0
0
1
1
2
What's interesting is that the two files have only 6 rows, while the output above has 9. The 3rd column holds numeric values and is used as the key column. All columns are defined as char type.
Difference stage problem
Have you checked that the files are the same using Unix?
diff file1 file2
Is your partitioning the same for the two inputs to the stage?
Regards,
Nick.
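For anyone wanting to take the Unix check a step further, here is a rough sketch of how the two files could be compared outside DataStage. The function names compare_files and row_count are made up for illustration, and the sketch assumes plain text files with one row per line. It only tells you whether the files themselves match; it says nothing about how the stage partitions or sorts them.

```shell
#!/bin/sh
# Hypothetical helpers for checking two flat files outside DataStage.
# Assumption: plain text, one row per line.

# Print the number of rows in a file (awk avoids the padding some wc versions add).
row_count() {
    awk 'END { print NR }' "$1"
}

# Print "identical", "same rows, different order", or "different".
compare_files() {
    # cmp -s is silent and reports byte-for-byte equality through its exit status.
    if cmp -s "$1" "$2"; then
        echo "identical"
        return 0
    fi

    # Not byte-identical: sort both files and compare again to see
    # whether only the row order differs.
    sorted_a=$(mktemp)
    sorted_b=$(mktemp)
    sort "$1" > "$sorted_a"
    sort "$2" > "$sorted_b"
    if cmp -s "$sorted_a" "$sorted_b"; then
        echo "same rows, different order"
    else
        echo "different"
    fi
    rm -f "$sorted_a" "$sorted_b"
}
```

Usage would be something like compare_files FileA FileB followed by row_count FileA and row_count FileB. If this reports "identical" and the row counts match, the files themselves are fine and the next place to look is the stage setup (keys, sorting, partitioning).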