Sorted Hashed Partition in DataSet
Posted: Wed Mar 12, 2008 6:18 am
Hi,
I searched the forum and sifted through the results, but couldn't find the exact answer I was looking for, so I'm hoping to get some discussion going on my question.
I want to see if my thinking is correct here. I have some files being read using a Complex Flat File stage. Each one is then loaded into a DataSet for use in a later job. Because that later job will use the Join stage heavily, I want to hash partition and sort the data when writing these initial DataSets.
What I'm looking to confirm is...
If I write 5 different DataSets in one job, all hash partitioned and sorted on the same key, can I read these DataSets in another job and use 'same' partitioning on the Join stage inputs to bring some of them back together?
Thanks in advance ;)
Jason
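
To make the idea behind the question concrete, here is a minimal sketch in plain Python. It is not DataStage code and says nothing about how the engine actually stores or reads DataSets. It only illustrates the assumption the question rests on: if two data sets are hash partitioned on the same key with the same hash function and the same number of partitions, matching keys land in the same partition number, so a partition-by-partition join gives the same result as a join over the full data. The names `NUM_PARTITIONS`, `partition_of`, `write_sorted_hashed`, `join_same_partitioning` and `join_global` are all hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, identical for every data set


def partition_of(key, n=NUM_PARTITIONS):
    """Deterministically map a key to a partition number in [0, n)."""
    return zlib.crc32(str(key).encode("utf-8")) % n


def write_sorted_hashed(rows, n=NUM_PARTITIONS):
    """Hash partition (key, value) rows on the key, then sort each partition by key."""
    partitions = [[] for _ in range(n)]
    for key, value in rows:
        partitions[partition_of(key, n)].append((key, value))
    for part in partitions:
        part.sort(key=lambda row: row[0])
    return partitions


def join_same_partitioning(left_parts, right_parts):
    """Inner join partition i of the left input only with partition i of the right."""
    joined = []
    for left, right in zip(left_parts, right_parts):
        right_by_key = defaultdict(list)
        for key, value in right:
            right_by_key[key].append(value)
        for key, left_value in left:
            for right_value in right_by_key.get(key, []):
                joined.append((key, left_value, right_value))
    return joined


def join_global(left_rows, right_rows):
    """Reference inner join over the unpartitioned rows."""
    return [
        (lk, lv, rv)
        for lk, lv in left_rows
        for rk, rv in right_rows
        if lk == rk
    ]


customers = [(1, "a"), (2, "b"), (3, "c"), (7, "d")]
orders = [(1, "x"), (3, "y"), (3, "z"), (9, "w")]

partitioned = join_same_partitioning(
    write_sorted_hashed(customers), write_sorted_hashed(orders)
)
reference = join_global(customers, orders)
print(sorted(partitioned) == sorted(reference))
```

In this sketch the shortcut only holds while both inputs use the same key, the same hash function and the same partition count. If one side were written with a different `n`, partition `i` on the left would no longer correspond to partition `i` on the right.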