Need a clarification


Bicchu
Participant
Posts: 26
Joined: Sun Oct 03, 2010 10:49 pm
Location: India


Post by Bicchu »

Hi All,

I need this clarification from all of you. Thanks in advance. :D

My job design is:

Dataset -----> Transformer ------> Remove Duplicate ------> Dataset

On the input link of the Transformer stage, I am using 'HASH' partitioning.

On the input link of the Remove Duplicates stage, I am using 'SAME' partitioning.



Now, my questions:

1. My code reviewer told me that the Transformer stage cannot retain partitioning on its output link and automatically converts the partitioning to 'AUTO', so the Remove Duplicates stage will have 'AUTO' partitioning on its input link. Is that so?

2. Remove Duplicates is a key-based stage. If it gets 'AUTO' partitioning on its input link (that is, if the answer to question 1 is yes), will DataStage optimize the partitioning to 'HASH'?

I will be delighted if you can shed some light on my doubts.
Thanks,
Pratik.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: Need a clarification

Post by ray.wurlod »

Bicchu wrote:Hi All, I need this clarification from all of you.
So, you want 38309 replies. Is this correct?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Bicchu
Participant
Posts: 26
Joined: Sun Oct 03, 2010 10:49 pm
Location: India

Post by Bicchu »

Sorry about that line.

I just want the answers to my questions.

Thanks,
Pratik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: Need a clarification

Post by ray.wurlod »

Visit the stage properties of the Transformer stage and tell us whether its Propagate property is set to Set, Clear or Default. It's on the Advanced tab. And, if it's not Clear, then your code reviewer is wrong, wrong, wrong.

You can prove this by inspecting the score.

As to question 2, if the partitioning is set to (Auto) and the upstream stage executes in parallel, then the partitioning algorithm used will be Same.
Bicchu
Participant
Posts: 26
Joined: Sun Oct 03, 2010 10:49 pm
Location: India

Post by Bicchu »

I have set that property to 'Propagate'.
Thanks,
Pratik.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Then your Remove Duplicates stage will have its partitioning set to Same. Because the Transformer stage is running with Hash as its partitioning algorithm, the Remove Duplicates stage will execute using Hash partitioning (the Same as that used in the Transformer stage).
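
To illustrate why Same after Hash is enough for a key-based stage, here is a minimal Python sketch. It is not DataStage code: the function names, the `cust_id` and `amount` columns, and the use of `zlib.crc32` as the hash are all assumptions made for the example. It only shows that when rows are hash partitioned on the key and each partition is passed downstream unchanged, removing duplicates inside each partition gives the same result as removing them globally.

```python
import zlib


def hash_partition(rows, key, n_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        # crc32 is used as a stable stand-in hash (an assumption for this sketch)
        index = zlib.crc32(str(row[key]).encode("utf-8")) % n_partitions
        partitions[index].append(row)
    return partitions


def transform(partition):
    """Stand-in for the Transformer: changes a non-key column, moves no rows."""
    return [{**row, "amount": row["amount"] * 2} for row in partition]


def remove_duplicates(partition, key):
    """Keep the first row seen for each key value within one partition."""
    seen = set()
    kept = []
    for row in partition:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept


# Hypothetical input data
rows = [
    {"cust_id": 1, "amount": 10},
    {"cust_id": 2, "amount": 20},
    {"cust_id": 1, "amount": 30},
    {"cust_id": 3, "amount": 40},
    {"cust_id": 2, "amount": 50},
    {"cust_id": 4, "amount": 60},
    {"cust_id": 1, "amount": 70},
]

# Hash partitioning on the input of the "Transformer"
partitions = hash_partition(rows, "cust_id", 4)

# Same partitioning: each partition goes to the next stage untouched
deduped = [remove_duplicates(transform(p), "cust_id") for p in partitions]

result_keys = sorted(row["cust_id"] for p in deduped for row in p)
print(result_keys)
```

Because all rows for a given `cust_id` sit in one partition, no duplicate can survive in a different partition. If the rows were redistributed without regard to the key between the two stages, that guarantee would be lost.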