Hi All,
I need this clarification from all of you. Thanks in advance. :D
My job design is:
Dataset -----> Transformer ------> Remove Duplicate ------> Dataset
On the input link of the Transformer stage, I am performing 'Hash' partitioning.
On the input link of the Remove Duplicates stage, I am using 'Same' partitioning.
Now, my questions:
1. My code reviewer told me that the Transformer stage is not capable of retaining the partitioning on its output link, and that it automatically converts the partitioning to 'Auto'. So the Remove Duplicates stage will have 'Auto' partitioning on its input link. Is that so?
2. Remove Duplicates is a key-based stage, and (if question 1 is correct) it is getting 'Auto' partitioning on its input link. So, will DataStage optimize the partitioning to 'Hash'?
I will be delighted if you all can throw some light on my doubts.
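To make the question concrete, here is a conceptual sketch in plain Python (not DataStage; the function names and data are invented for illustration) of why Hash partitioning on the dedup key matters: rows with the same key always land in the same partition, so removing duplicates independently within each partition is globally correct.

```python
# Conceptual sketch only - plain Python, not DataStage.
# Shows why hashing on the dedup key before a per-partition
# Remove Duplicates step yields a globally correct result.

def hash_partition(rows, key, n_partitions):
    """Assign each row to a partition by hashing its key value."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key]) % n_partitions].append(row)
    return partitions

def remove_duplicates(partition, key):
    """Keep the first row seen for each key value within a partition."""
    seen, out = set(), []
    for row in partition:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"},
        {"id": 1, "v": "c"}, {"id": 3, "v": "d"}]

parts = hash_partition(rows, "id", 2)
deduped = [r for p in parts for r in remove_duplicates(p, "id")]
# Every id appears exactly once across all partitions, because
# both rows with id=1 were hashed into the same partition.
```

If the data were round-robin partitioned instead, the two id=1 rows could land in different partitions and both would survive a per-partition dedup.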
Need a clarification
Thanks,
Pratik.
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Re: Need a clarification
Bicchu wrote: "Hi All, I need this clarification from all of you."

So, you want 38309 replies. Is this correct?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Need a clarification
Visit the stage properties of the Transformer stage and tell us whether its Propagate property is set to Set, Clear or Default. It's on the Advanced tab. And, if it's not Clear, then your code reviewer is wrong, wrong, wrong.
You can prove this by inspecting the score.
As to question 2, if the partitioning is set to (Auto) and the upstream stage executes in parallel, then the partitioning algorithm used will be Same.
Then your Remove Duplicates stage will have its partitioning set to Same. Because the Transformer stage is running with Hash as its partitioning algorithm, the Remove Duplicates stage will effectively execute with Hash partitioning: it receives exactly the same distribution of rows that the Transformer stage produced.
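The point above can be sketched in plain Python (again, an illustration, not DataStage itself): 'Same' partitioning does no repartitioning at all, so whatever Hash distribution the upstream stage established still holds downstream.

```python
# Illustrative sketch only - plain Python, not DataStage.
# "Same" partitioning passes each upstream partition through
# unchanged: no rows move between partitions.

def same_partition(upstream_partitions):
    # Each downstream node reads exactly the partition produced
    # by the corresponding upstream node.
    return upstream_partitions

# Suppose the Transformer hash-partitioned on "id" across 2 nodes:
upstream = [[{"id": 2}], [{"id": 1}, {"id": 1}]]

downstream = same_partition(upstream)
# Partition boundaries are preserved, so both id=1 rows are still
# together and a per-partition Remove Duplicates works correctly.
```

This is why Same on the Remove Duplicates input link is safe here, but only because the upstream stage already partitioned on the dedup key.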
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.