Page 1 of 1

Again Join Stage (Data Error)

Posted: Thu Jul 17, 2008 3:44 am
by Gopinath
Hi,

Iam just Joining two Datasets with single Key column using JOIN stage. The Data has been verified for NULL, TRIM etc. The output of Join is giving 0 records but iam having 3matching records.

Iam getting the same three matching records in Backend Query, while using Lookup stage but whats the problem in Join stage.

I tried to change with sequential mode(JOIN Stage), then iam getting the matching records but i beleive this is not advisable coz of huge volume of Data. Am i right here...

One of my dataset having millions of records and another is having in thousands...

If i use Lookup stage for this scenario since it's working fine but in actual scenario the both Datasets is going to give Millions of records.


I tried giving partition but same problem like no HIT or few matching records is coming etc..


Awaiting for experts reply..

Thanks,
M Gopinath

Posted: Thu Jul 17, 2008 4:10 am
by mahadev.v
Looks like a partition problem to me. What partition have you used? Use hash partition on key fields on both links to the join.

Posted: Thu Jul 17, 2008 4:20 am
by Gopinath
mahadev.v wrote:Looks like a partition problem to me. What partition have you used? Use hash partition on key fields on both links to the join.

Hi,
NIce to get a reply from you.. Thanks..

ya its partition problem only, i have tried using Hash partition then iam getting 2records out of the 3matching records...

Any other option to be used..

Posted: Thu Jul 17, 2008 4:33 am
by ersunnys
Gopinath wrote:
mahadev.v wrote:Looks like a partition problem to me. What partition have you used? Use hash partition on key fields on both links to the join.

Hi,
NIce to get a reply from you.. Thanks..

ya its partition problem only, i have tried using Hash partition then iam getting 2records out of the 3matching records...

Any other option to be used..
Gopi,

Use Hash partition in Join stage for both input links with sort on Key column.

Sunny

Posted: Thu Jul 17, 2008 4:53 am
by Gopinath
ersunnys wrote:
Gopinath wrote:
mahadev.v wrote:Looks like a partition problem to me. What partition have you used? Use hash partition on key fields on both links to the join.

Hi,
NIce to get a reply from you.. Thanks..

ya its partition problem only, i have tried using Hash partition then iam getting 2records out of the 3matching records...

Any other option to be used..
Gopi,

Use Hash partition in Join stage for both input links with sort on Key column.

Sunny

Hi,
I have used Has partition but not getting complete matching records..
Iam getting 2records out of the 3matching records as i have replied in previous post..

Posted: Thu Jul 17, 2008 6:30 am
by keshav0307
if hash partition on join keys not working then something unusual happening there.
do you have any char or varchar column in join keys?

Posted: Thu Jul 17, 2008 7:02 am
by Gopinath
keshav0307 wrote:if hash partition on join keys not working then something unusual happening there.
do you have any char or varchar column in join keys?
HI,

The Join column is Varchar column and i have handled TRIM and NULL function in extract query itself and also in the Intermediate Transformers..

Posted: Thu Jul 17, 2008 7:26 am
by mahadev.v
Can you identify the 3rd record? See what exactly is coming into the Join Stage. Maybe a copy, filter and a peek stage on each link would do. And how are you handling the null in the key field (Since you mentioned TRIM and NULL)?

Posted: Thu Jul 17, 2008 8:10 pm
by keshav0307
does the type and length of columns in both input links are same?