I have 2 jobs. The output of the 1st job is a Dataset, which is the input to the 2nd job.
The 2nd job looks like this:
                              -------
                              | DB2 |
                              -------
                                 |
                             ---------
                             | Sort2 |
                             ---------
                                 |
| DS of 1 | --> | Sort1 | --> | Join | --> | Output |
My question is: can we avoid the Sort1 stage in the 2nd job?
In my 1st job, if I hash partition and sort the data on the keys used in the 2nd job while writing to the output Dataset, can I avoid the Sort1 stage in the 2nd job?
1. Will the partitioning and sort order be maintained in a Dataset?
2. Will the partitioning and sort order be maintained in a Dataset if it is read with a different configuration file?
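To make the two questions concrete, here is a minimal sketch of the idea in plain Python. It is not DataStage code and it says nothing about what the engine actually does with a Dataset. All names (hash_partition, merge_join, partitioned_join) are hypothetical, the "hash" is just the key modulo the partition count, and the DB2 side is assumed to have unique keys. It only shows why a join can skip a sort when both inputs are already partitioned and sorted the same way, and why a different partition count (as a different configuration file might give) breaks that alignment.

```python
def hash_partition(rows, key, num_partitions):
    """Assign each row to partition (key value mod num_partitions),
    then sort every partition on the key."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[row[key] % num_partitions].append(row)
    return [sorted(p, key=lambda r: r[key]) for p in parts]


def merge_join(left, right, key):
    """Inner join of two inputs that are already sorted on key.
    Assumes the keys on the right side are unique."""
    out, j = [], 0
    for l in left:
        # Skip right rows whose key is smaller than the current left key.
        while j < len(right) and right[j][key] < l[key]:
            j += 1
        if j < len(right) and right[j][key] == l[key]:
            out.append({**l, **right[j]})
    return out


def partitioned_join(left_parts, right_parts, key):
    """Join partition i of the left input with partition i of the right input,
    with no re-sort and no repartitioning."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(merge_join(lp, rp, key))
    return out


# Made-up sample data standing in for the Dataset and the DB2 input.
ds_rows = [{"id": i, "a": i * 10} for i in (5, 2, 7, 4, 1)]
db_rows = [{"id": i, "b": i * 100} for i in (1, 2, 3, 4, 5, 6, 7)]

# Both sides partitioned 2 ways: matching keys sit in the same partition.
aligned = partitioned_join(
    hash_partition(ds_rows, "id", 2), hash_partition(db_rows, "id", 2), "id"
)

# Dataset written 2 ways, other input partitioned 3 ways: keys no longer line up.
mismatched = partitioned_join(
    hash_partition(ds_rows, "id", 2), hash_partition(db_rows, "id", 3), "id"
)

print(sorted(r["id"] for r in aligned))
print(sorted(r["id"] for r in mismatched))
```

In this toy model the aligned case finds every matching key without a sort step in front of the join, while the mismatched case silently loses matches unless the data is repartitioned first. Whether DataStage repartitions or re-sorts automatically in that situation is exactly what question 2 is asking.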
Thanks in advance for your comments