Join growth on distinct Account ID

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
jiegao
Premium Member
Premium Member
Posts: 46
Joined: Fri Sep 22, 2006 6:12 pm

Join growth on distinct Account ID

Post by jiegao »

Hi,

Currently we are doing a left join (hash partition on Account ID) on two Oracle tables on Account ID. The left table is a simple select statement with more than 1 unique Account ID, whereas the right statement has the 'Distinct' keyword and only selecting Account ID's from the table.

My questions is: Why do the link statistics show join growth on the output of the join? Clearly, if there is a distinct right account id the join output should be the same as the left input.

I replicated the query in Oracle and it gives me the results which I think are correct which leads me to believe something is happening in Datastage that I don't understand.

Thanks in advance!
Regards
Jie
jiegao
Premium Member
Premium Member
Posts: 46
Joined: Fri Sep 22, 2006 6:12 pm

Re: Join growth on distinct Account ID

Post by jiegao »

I think I know what is happening here. This is caused by the "Enable partitioned read" on the data source. DISTINCT and "Enable partitioned read" should not be used at the same time.
Regards
Jie
Post Reply