Hello all,
I am facing a problem with Join stage.
I do have two input links, being joined in left outer join mode. Subsequent to Join stage, there is a transformer implementing the logic to detect unmatched records, and place them in a file. In the records matched, they move on in the job flow.
To detect this unmatching, I use a dummy column.
The right link in the Join is comming with a constant column, got like this:
SELECT ......, '1' AS DUMMY
FROM....
So, in transformer, I only need to do this
NullToEmpty(DUMMY) <> '1'
But when executed the job, all records were considered unmatched.
I am running the job with a two node config file, and the two input links were defined, for Join stage, as hashed partitioning method on the column used for join, and performing sort.
I defined the join stage to run in Sequential mode, and the problem was fixed.
Can anybody please give me an idea on what could be happening?
Thanks a lot
Join stage only working running in Sequential mode
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 291
- Joined: Wed Sep 26, 2007 11:23 am
- Location: Madrid, Spain
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Overkill. Why not just allow relevant fields to be nullable and test for null? (Tip: test a key column from the outer input for null - it can never genuinely be null in its table because it's a key.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact:
Re: Join stage only working running in Sequential mode
:D Hi,manuel.gomez wrote:Hello all,
I am facing a problem with Join stage.
I do have two input links, being joined in left outer join mode. Subsequent to Join stage, there is a transformer implementing the logic to detect unmatched records, and place them in a file. In the records matched, they move on in the job flow.
To detect this unmatching, I use a dummy column.
The right link in the Join is comming with a constant column, got like this:
SELECT ......, '1' AS DUMMY
FROM....
So, in transformer, I only need to do this
NullToEmpty(DUMMY) <> '1'
But when executed the job, all records were considered unmatched.
I am running the job with a two node config file, and the two input links were defined, for Join stage, as hashed partitioning method on the column used for join, and performing sort.
I defined the join stage to run in Sequential mode, and the problem was fixed.
Can anybody please give me an idea on what could be happening?
Thanks a lot
Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
step2:
in the transformer you write the constraint as
NullToEmpty(DUMMY) <> '1', first this function on lefthand side always gives 1,because there is no null value in dummy column, if it is null it makes empty but here no null value only the values is 1 and when we compare 1<>1, it fails and all records which checked this condition becomes mismatched.is this useful or i understand your question in another way
Suresh Reddy
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
Re: Join stage only working running in Sequential mode
What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.sureshreddy2009 wrote: Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
-
- Premium Member
- Posts: 291
- Joined: Wed Sep 26, 2007 11:23 am
- Location: Madrid, Spain
Re: Join stage only working running in Sequential mode
you are right Kryt0n, the dummy column is the right sideKryt0n wrote:What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.sureshreddy2009 wrote: Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
No, we cant use lookup, we are joining big volumes of data, and the use of memory would be huge
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact:
Re: Join stage only working running in Sequential mode
Himanuel.gomez wrote:you are right Kryt0n, the dummy column is the right sideKryt0n wrote:What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.sureshreddy2009 wrote: Based on your sentences i can say one thing based on my understanding
step1:
from query you are fetching constant column dummy which always contain the value as '1'
As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
No, we cant use lookup, we are joining big volumes of data, and the use of memory would be huge
I checked your query at my side, my conclusion is
i dont bother about the data input to join,but after join the field dummy contains value '1' for some records and null for some records,if this data will go to transformer then the function NullToEmpty(DUMMY) <> '1' results like this way,
step1:NullToEmpty(DUMMY) gives 1 for some records and empty(no value but not equal to null) for some records
step2:NullToEmpty(DUMMY) <> '1' constraint in transformer is giving output records which are containing null only and discarding the records containing value as '1'
let check at your side
Suresh Reddy
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
-
- Premium Member
- Posts: 291
- Joined: Wed Sep 26, 2007 11:23 am
- Location: Madrid, Spain
Re: Join stage only working running in Sequential mode
Thanks for answering sureshreddy2009sureshreddy2009 wrote:Himanuel.gomez wrote:you are right Kryt0n, the dummy column is the right sideKryt0n wrote: What you are missing is that the dummy column is coming from the right hand side of a left outer join. So when no match is found, this column will be null.
As for the actual problem, can't you use the lookup stage instead and define a reject link for when no match is found?
No, we cant use lookup, we are joining big volumes of data, and the use of memory would be huge
I checked your query at my side, my conclusion is
i dont bother about the data input to join,but after join the field dummy contains value '1' for some records and null for some records,if this data will go to transformer then the function NullToEmpty(DUMMY) <> '1' results like this way,
step1:NullToEmpty(DUMMY) gives 1 for some records and empty(no value but not equal to null) for some records
step2:NullToEmpty(DUMMY) <> '1' constraint in transformer is giving output records which are containing null only and discarding the records containing value as '1'
let check at your side
I am not sure if I properly understood you
What I said in first post was
The problem is this constraing was not working. All records were returning null value for DUMMY, when actually a few records should match dataSo, in transformer, I only need to do this
NullToEmpty(DUMMY) <> '1'
Setting sequential mode for Join, the issue was fixed, but this is not desirable solution to this.......