Join stage only working running in Sequential mode

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
manuel.gomez
Premium Member
Premium Member
Posts: 291
Joined: Wed Sep 26, 2007 11:23 am
Location: Madrid, Spain

Join stage only working running in Sequential mode

Post by manuel.gomez »

Hello all,

I am facing a problem with Join stage.

I do have two input links, being joined in left outer join mode. Subsequent to Join stage, there is a transformer implementing the logic to detect unmatched records, and place them in a file. In the records matched, they move on in the job flow.

To detect this unmatching, I use a dummy column.
The right link in the Join is comming with a constant column, got like this:

SELECT ......, '1' AS DUMMY
FROM....

So, in transformer, I only need to do this

NullToEmpty(DUMMY) <> '1'

But when executed the job, all records were considered unmatched.

I am running the job with a two node config file, and the two input links were defined, for Join stage, as hashed partitioning method on the column used for join, and performing sort.

I defined the join stage to run in Sequential mode, and the problem was fixed.

Can anybody please give me an idea on what could be happening?

Thanks a lot
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Overkill. Why not just allow relevant fields to be nullable and test for null? (Tip: test a key column from the outer input for null - it can never genuinely be null in its table because it's a key.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

Re: Join stage only working running in Sequential mode

Post by sureshreddy2009 »

manuel.gomez wrote:Hello all,

I am facing a problem with Join stage.

I do have two input links, being joined in left outer join mode. Subsequent to Join stage, there is a transformer implementing the logic to detect unmatched records, and place them in a file. In the records matched, they move on in the job flow.

To detect this unmatching, I use a dummy column.
The right link in the Join is comming with a constant column, got like this:

SELECT ......, '1' AS DUMMY
FROM....

So, in transformer, I only need to do this

NullToEmpty(DUMMY) <> '1'

But when executed the job, all records were considered unmatched.

I am running the job with a two node config file, and the two input links were defined, for Join stage, as hashed partitioning method on the column used for join, and performing sort.

I defined the join stage to run in Sequential mode, and the problem was fixed.

Can anybody please give me an idea on what could be happening?

Thanks a lot
:D Hi,

Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
step2:
in the transformer you write the constraint as
NullToEmpty(DUMMY) <> '1', first this function on lefthand side always gives 1,because there is no null value in dummy column, if it is null it makes empty but here no null value only the values is 1 and when we compare 1<>1, it fails and all records which checked this condition becomes mismatched.is this useful or i understand your question in another way
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Re: Join stage only working running in Sequential mode

Post by Kryt0n »

sureshreddy2009 wrote: Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.

As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
manuel.gomez
Premium Member
Premium Member
Posts: 291
Joined: Wed Sep 26, 2007 11:23 am
Location: Madrid, Spain

Re: Join stage only working running in Sequential mode

Post by manuel.gomez »

Kryt0n wrote:
sureshreddy2009 wrote: Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.

As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
you are right Kryt0n, the dummy column is the right side

No, we cant use lookup, we are joining big volumes of data, and the use of memory would be huge
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

Re: Join stage only working running in Sequential mode

Post by sureshreddy2009 »

manuel.gomez wrote:
Kryt0n wrote:
sureshreddy2009 wrote: Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.

As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
you are right Kryt0n, the dummy column is the right side

No, we cant use lookup, we are joining big volumes of data, and the use of memory would be huge
Hi
I checked your query at my side, my conclusion is
i dont bother about the data input to join,but after join the field dummy contains value '1' for some records and null for some records,if this data will go to transformer then the function NullToEmpty(DUMMY) <> '1' results like this way,
step1:NullToEmpty(DUMMY) gives 1 for some records and empty(no value but not equal to null) for some records

step2:NullToEmpty(DUMMY) <> '1' constraint in transformer is giving output records which are containing null only and discarding the records containing value as '1'
let check at your side
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
manuel.gomez
Premium Member
Premium Member
Posts: 291
Joined: Wed Sep 26, 2007 11:23 am
Location: Madrid, Spain

Re: Join stage only working running in Sequential mode

Post by manuel.gomez »

sureshreddy2009 wrote:
manuel.gomez wrote:
Kryt0n wrote: What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.

As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
you are right Kryt0n, the dummy column is the right side

No, we cant use lookup, we are joining big volumes of data, and the use of memory would be huge
Hi
I checked your query at my side, my conclusion is
i dont bother about the data input to join,but after join the field dummy contains value '1' for some records and null for some records,if this data will go to transformer then the function NullToEmpty(DUMMY) <> '1' results like this way,
step1:NullToEmpty(DUMMY) gives 1 for some records and empty(no value but not equal to null) for some records

step2:NullToEmpty(DUMMY) <> '1' constraint in transformer is giving output records which are containing null only and discarding the records containing value as '1'
let check at your side
Thanks for answering sureshreddy2009

I am not sure if I properly understood you

What I said in first post was

So, in transformer, I only need to do this

NullToEmpty(DUMMY) <> '1'
The problem is this constraing was not working. All records were returning null value for DUMMY, when actually a few records should match data

Setting sequential mode for Join, the issue was fixed, but this is not desirable solution to this.......
Post Reply