Regarding Partitions

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Regarding Partitions

Post by manojbh31 »

Hi,

I have a job where I am using a Join stage to do a left outer join. Each time I run the job, the output record count is different. Earlier I was not sorting the data before joining. I then tried to hash partition and sort the data in the Transformer, but this is also not working properly. Can anybody help me understand how to use partitioning in PX? The data is about 0.5 million records.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The Join stage requires sorted input to work properly. Not sure how one would sort 'in a transformer', so make sure the sort happens either in your source selects (if from a database) or via explicit sort operations before the join. The hash partitioning should be good as long as you are partitioning on the same keys you are sorting / joining on.
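
To illustrate why the partitioning keys have to match the join keys, here is a minimal sketch in plain Python. It is a toy model, not DataStage code: the function names (hash_partition, round_robin_partition, merge_left_join, parallel_left_join) and the sample rows are all made up for illustration. Each "partition" is joined on its own, the way each node only sees its own slice of the data.

```python
from zlib import crc32

def hash_partition(rows, key, n_parts):
    """Send every row with the same key value to the same partition."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[crc32(str(row[key]).encode()) % n_parts].append(row)
    return parts

def round_robin_partition(rows, n_parts):
    """Deal rows out in turn, ignoring the key entirely."""
    parts = [[] for _ in range(n_parts)]
    for i, row in enumerate(rows):
        parts[i % n_parts].append(row)
    return parts

def merge_left_join(left, right, key):
    """Left outer join of two inputs that are already sorted on key."""
    out = []
    j = 0
    for l_row in left:
        # Skip right rows whose key is smaller than the current left key.
        while j < len(right) and right[j][key] < l_row[key]:
            j += 1
        k = j
        matched = False
        while k < len(right) and right[k][key] == l_row[key]:
            out.append({**right[k], **l_row})
            matched = True
            k += 1
        if not matched:
            out.append(dict(l_row))  # keep the unmatched left row
    return out

def parallel_left_join(left_parts, right_parts, key):
    """Sort and join each pair of partitions independently."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        lp = sorted(lp, key=lambda r: r[key])
        rp = sorted(rp, key=lambda r: r[key])
        out.extend(merge_left_join(lp, rp, key))
    return out

def count_matched(rows):
    """Count output rows that picked up a value from the right input."""
    return sum(1 for r in rows if "r" in r)

left = [{"id": i, "l": f"L{i}"} for i in range(8)]
right = [{"id": i, "r": f"R{i}"} for i in range(0, 8, 2)]

hashed = parallel_left_join(
    hash_partition(left, "id", 3), hash_partition(right, "id", 3), "id"
)
round_robin = parallel_left_join(
    round_robin_partition(left, 3), round_robin_partition(right, 3), "id"
)

print("hash on join key:", count_matched(hashed), "matched rows")
print("round robin:     ", count_matched(round_robin), "matched rows")
```

With both inputs hash partitioned on id, all four matching keys meet in the same partition. With round robin, some matching rows land in different partitions and the join never sees them together, so fewer rows match. In a real job, where row arrival order can change from run to run, that kind of mismatch is one plausible way to get different results on each run.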
-craig

"You can never have too many knives" -- Logan Nine Fingers
daignault
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm

Post by daignault »

Actually, in the properties of the transformer, you can define a sort (stable or unstable) as well as partitioning info for the incoming virtual dataset.
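
For anyone unsure about the stable / unstable distinction, here is a small plain-Python sketch (not DataStage code; the rows and the seq column are invented for illustration). Python's sorted() is a stable sort, so rows with equal keys keep their arrival order. An unstable sort is free to return equal-key rows in any order.

```python
# Each row carries a sort key and its arrival sequence number.
rows = [
    {"key": 2, "seq": 1},
    {"key": 1, "seq": 2},
    {"key": 2, "seq": 3},
    {"key": 1, "seq": 4},
]

# sorted() is guaranteed stable: ties on "key" keep their input order.
stable = sorted(rows, key=lambda r: r["key"])

for r in stable:
    print(r["key"], r["seq"])
```

Within key 1 the rows come out as seq 2 then 4, and within key 2 as seq 1 then 3, which is the order they arrived in.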

When I deliver EE training, I suggest that developers use the Sort stage rather than the sort and partitioning options on the link. I like the Sort stage's ability to tailor memory usage, which is not available in the link properties.

Ray Daignault
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I figured what was meant was a sort on a transformer link, which to me isn't "in" the transformer but that could just be semantics... or I could just be wrong. :wink:
-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It's always the link.

What confuses people is that they open the link properties via the Input tab in the stage properties.

In most cases, of course, you can right click the stage and open the link properties directly.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.