I have a job design which looks as follows:
Ds1----
        Join 1-----
Ds2----            |
                   Funnel -------- Sort ------- Dataset
Ds3----            |
        Join 2-----
Ds4----
Ds1 and Ds2 are sorted and partitioned on 2 keys (total key length 52 characters).
Ds3 and Ds4 are partitioned on one key (length 2).
The link from the Funnel is partitioned on 3 keys, and the Sort stage uses 6 keys to define the sort.
The total length of all keys defined in the Sort stage is 66.
The link after the Sort has the same partitioning defined on it.
All the repartitions defined are required for the logic.
Issue:
At a particular point, more than 200 million rows are written out at the Funnel, and the job fails with a no-space error.
The scratch space available is 100 GB.
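For a sense of scale, the figures above can be turned into a rough estimate. This is only a sketch: it assumes each row is spilled to scratch once at its full width and that GB means decimal gigabytes, and it ignores any per-record overhead. The full row width is not stated here, so only the 66 bytes of sort keys are counted as a lower bound. The function names are made up for illustration.

```python
# Back-of-the-envelope scratch estimate for the Sort stage.
# Assumptions (not measured): each row hits scratch once, decimal GB,
# no per-record overhead.

ROWS = 200_000_000             # rows written out at the Funnel
SORT_KEY_BYTES = 66            # total length of the 6 sort keys
SCRATCH_BYTES = 100 * 10**9    # 100 GB of scratch space


def footprint_gb(rows: int, row_bytes: int) -> float:
    """Gigabytes needed if every row is written to scratch once."""
    return rows * row_bytes / 10**9


def break_even_row_bytes(rows: int, scratch_bytes: int) -> int:
    """Largest average row width (bytes) that still fits in scratch."""
    return scratch_bytes // rows


keys_only_gb = footprint_gb(ROWS, SORT_KEY_BYTES)
max_row_bytes = break_even_row_bytes(ROWS, SCRATCH_BYTES)

print(f"Sort keys alone: {keys_only_gb:.1f} GB")
print(f"Scratch fills once the average row exceeds {max_row_bytes} bytes")
```

Under these assumptions the keys alone need about 13.2 GB, and 100 GB is exhausted once the average row is wider than roughly 500 bytes.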
How do I solve this?
The environment team will not be increasing the space in the near future.
Regards
Wah