Read very high volume sorted sequential file
Posted: Mon Nov 15, 2010 5:09 pm
Hello,
I have a sequential file with 2000 million records that is already sorted on key1, key2 and key3. I read this file with a Sequential File stage running in sequential mode and then hash partition the data on key1.
Seq stage -> Copy stage (input hash partitioned on key1) -> Dataset stage
From the test I ran, the data going into the dataset is sorted within each partition.
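To make the observation concrete, here is a minimal Python sketch of the same idea. It is not DataStage code: `hash_partition`, the CRC32 hash, the four partitions and the random sample rows are all assumptions made for illustration. It models a single sequential reader handing rows one at a time to a hash partitioner on key1, and then checks whether each partition is still in (key1, key2, key3) order.

```python
import random
from zlib import crc32


def hash_partition(rows, num_partitions, key_index=0):
    """Deal rows into partitions by hashing one key column.

    Hypothetical stand-in for a hash partitioner: rows are appended in
    arrival order, so each partition is a subsequence of the input.
    """
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is used only to get a deterministic hash for the sketch.
        p = crc32(str(row[key_index]).encode()) % num_partitions
        partitions[p].append(row)
    return partitions


# Sample input: (key1, key2, key3) rows, pre-sorted on all three keys,
# standing in for the sorted sequential file.
random.seed(0)
rows = sorted(
    (random.randint(1, 50), random.randint(1, 5), random.randint(1, 5))
    for _ in range(1000)
)

partitions = hash_partition(rows, num_partitions=4)

# Each partition should still be in sorted order, because the single
# reader never reorders rows before they are partitioned.
all_sorted = all(part == sorted(part) for part in partitions)
print(all_sorted)
```

In this model the order survives only because one sequential stream feeds the partitioner. It says nothing about what happens if the file is read in parallel or the data is repartitioned later in the job.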
Is my understanding from this test correct, that data which is already sorted remains sorted within each partition after hash partitioning?
Thanks,
Ds