mismatch in join stage

Posted: Sun May 13, 2007 12:29 am
by bgs
I have two inputs to the join stage.
1. from a dataset which is hash partitioned on col A and sorted on col B.
2. from a transformer.

In the Join stage I set the partitioning type to "Same" on the link coming from the Data Set, and on the link from the Transformer I selected hash partitioning on col A with a sort on col B.
The join key is col B. With these settings I am getting a mismatch in the join. Could someone tell me whether there is a mistake in my settings?

Thanks

Posted: Sun May 13, 2007 4:12 pm
by ray.wurlod
Take a look at the score. DataStage is probably inserting tsort operators that sort on the hash partitioning key. Add a Sort stage set to "Don't Sort (Previously Sorted)" on the link from the Data Set.

Posted: Sun May 13, 2007 4:45 pm
by nick.bond
You need to partition by your key column.

Imagine you have this

DataSet
ColA ColB
1 x
2 y

Transformer
ColA ColB
1 y
2 x

If you partition by ColA, it is likely that your matching records (x=x and y=y) will land on separate partitions, so they will not match.
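
To make the example above concrete, here is a minimal Python sketch of a partitioned join. It is a toy model only: `partition_of` and `partitioned_join` are made-up helper names, the byte-sum "hash" is just a deterministic stand-in and not DataStage's hash function, and two partitions are assumed.

```python
def partition_of(value, n_partitions):
    """Toy deterministic partitioner (assumption): byte sum modulo partition count."""
    return sum(str(value).encode()) % n_partitions


def partitioned_join(left, right, partition_col, join_col, n_partitions=2):
    """Partition both inputs on partition_col, then join on join_col
    inside each partition only, the way a parallel join works."""
    matches = []
    for p in range(n_partitions):
        left_p = [row for row in left
                  if partition_of(row[partition_col], n_partitions) == p]
        right_p = [row for row in right
                   if partition_of(row[partition_col], n_partitions) == p]
        for l_row in left_p:
            for r_row in right_p:
                if l_row[join_col] == r_row[join_col]:
                    matches.append((l_row, r_row))
    return matches


# The two inputs from the example above.
dataset = [{"ColA": 1, "ColB": "x"}, {"ColA": 2, "ColB": "y"}]
transformer = [{"ColA": 1, "ColB": "y"}, {"ColA": 2, "ColB": "x"}]

by_col_a = partitioned_join(dataset, transformer, "ColA", "ColB")
by_col_b = partitioned_join(dataset, transformer, "ColB", "ColB")

print("matches when partitioned on ColA:", len(by_col_a))
print("matches when partitioned on ColB:", len(by_col_b))
```

In this toy data, partitioning on ColA puts each x and each y on a different partition from its partner, so the join finds nothing. Partitioning on the join key ColB brings them together.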

Posted: Mon May 14, 2007 9:01 am
by bgs
Hi Nick,
The value in colA is the last character of colB, which is a digit between 0 and 9, so all the records with the same value should fall in the same partition.
I tried repartitioning the data from the Data Set and it worked. But I thought using partition type "Same" should also work.

Posted: Mon May 14, 2007 4:31 pm
by nick.bond
"value in colA is the last character of colB"
Didn't know that!

...then I would also expect 'Same' to work...

...are you running this job on the same number of nodes as the job that created the Data Set?

...can you do as Ray suggested and take a look at the score? Perhaps you could post it here...
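
Why the node count question matters can be sketched in Python. This is an illustration under assumptions, not a description of DataStage internals: it uses a plain modulo as the hash, and it assumes the Data Set was written on 4 nodes while the Transformer link is hashed across 2. Both node counts are hypothetical.

```python
def partition_of(key, n_partitions):
    """Toy partitioner (assumption): key modulo the number of partitions."""
    return key % n_partitions


keys = range(10)  # colA holds a single digit, 0-9

# Partition each key was stored in when the Data Set was written (4 nodes assumed).
written_on_4 = {k: partition_of(k, 4) for k in keys}

# Partition the same key is sent to on the freshly hashed link (2 nodes assumed).
hashed_on_2 = {k: partition_of(k, 2) for k in keys}

# Keys whose stored partition differs from the freshly hashed one.
misaligned = [k for k in keys if written_on_4[k] != hashed_on_2[k]]
print("keys that land on different partitions:", misaligned)
```

If 'Same' keeps the rows where the earlier job left them, any key in that list would sit on a different partition from its partner on the other link, which would look like a join mismatch. The score should show whether that is actually happening.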