Hi All,
I need this clarification from all of you. Thanks in advance. :D
My job design is:
Dataset -----> Transformer ------> Remove Duplicate ------> Dataset
On the input link of the Transformer stage, I am performing 'Hash' partitioning.
On the input link of the Remove Duplicates stage, I am using 'Same' partitioning.
Now, my questions:
1. My code reviewer told me that the Transformer stage is not capable of retaining the partitioning on its output link, and that it automatically converts the partitioning to 'Auto'. So the Remove Duplicates stage will have 'Auto' partitioning on its input link. Is that so?
2. Remove Duplicates is a key-based stage, and (if question 1 is correct) it is getting 'Auto' partitioning on its input link. So, will DataStage optimize the partitioning to 'Hash'?
I will be delighted if you all can throw some light on my doubts.
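To make the question concrete, here is a conceptual sketch in plain Python (not DataStage; the function names and data are invented for illustration) of why Hash partitioning on the dedup key matters: rows with the same key always land in the same partition, so removing duplicates independently within each partition is globally correct.

```python
# Conceptual sketch only - plain Python, not DataStage.
# Shows why hashing on the dedup key before a per-partition
# Remove Duplicates step yields a globally correct result.

def hash_partition(rows, key, n_partitions):
    """Assign each row to a partition by hashing its key value."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key]) % n_partitions].append(row)
    return partitions

def remove_duplicates(partition, key):
    """Keep the first row seen for each key value within a partition."""
    seen, out = set(), []
    for row in partition:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"},
        {"id": 1, "v": "c"}, {"id": 3, "v": "d"}]

parts = hash_partition(rows, "id", 2)
deduped = [r for p in parts for r in remove_duplicates(p, "id")]
# Every id appears exactly once across all partitions, because
# both rows with id=1 were hashed into the same partition.
```

If the data were round-robin partitioned instead, the two id=1 rows could land in different partitions and both would survive a per-partition dedup.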
Need a clarification
Thanks,
Pratik.
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Re: Need a clarification
Bicchu wrote: "Hi All, I need this clarification from all of you."

So, you want 38309 replies. Is this correct?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Need a clarification
Visit the stage properties of the Transformer stage and tell us whether its Propagate property is set to Set, Clear or Default. It's on the Advanced tab. And, if it's not Clear, then your code reviewer is wrong, wrong, wrong.
You can prove this by inspecting the score.
As to question 2, if the partitioning is set to (Auto) and the upstream stage executes in parallel, then the partitioning algorithm used will be Same.
Then your Remove Duplicates stage will have its partitioning set to Same. Because the Transformer stage is running with Hash as its partitioning algorithm, the Remove Duplicates stage will effectively execute with Hash partitioning: it receives exactly the same distribution of rows that the Transformer stage produced.
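The point above can be sketched in plain Python (again, an illustration, not DataStage itself): 'Same' partitioning does no repartitioning at all, so whatever Hash distribution the upstream stage established still holds downstream.

```python
# Illustrative sketch only - plain Python, not DataStage.
# "Same" partitioning passes each upstream partition through
# unchanged: no rows move between partitions.

def same_partition(upstream_partitions):
    # Each downstream node reads exactly the partition produced
    # by the corresponding upstream node.
    return upstream_partitions

# Suppose the Transformer hash-partitioned on "id" across 2 nodes:
upstream = [[{"id": 2}], [{"id": 1}, {"id": 1}]]

downstream = same_partition(upstream)
# Partition boundaries are preserved, so both id=1 rows are still
# together and a per-partition Remove Duplicates works correctly.
```

This is why Same on the Remove Duplicates input link is safe here, but only because the upstream stage already partitioned on the dedup key.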
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.