full outer join (returning defaults)

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

full outer join (returning defaults)

Post by mystuff »

Hi,

I am using a Full Outer Join in my job. All of its inputs, from the start of the stream (this is the first job in the stream), have been kept as nullable (Yes). When I join on key fields, say KeyField_1 and KeyField_2, I am getting default values instead of nulls where no match is found.

I managed to implement this once so that nulls were returned instead of default values when no match was found, but I cannot recall how I did it. Can anyone help?

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Remove the default values from the record schema. I assume you've used different names to receive the key values from left and right inputs.
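In schema terms, the aim is a record definition whose fields are nullable and carry no default subproperty. A minimal osh schema sketch of that shape (field names are illustrative, not taken from the job):

```
record (
    KeyField_1: nullable int32;
    KeyField_2: nullable int32;
    SomeValue:  nullable string[max=30];
)
```

If a field instead read something like `int32 {default = 0}`, that default is what would appear on the unmatched side of the join.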
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

I have checked whether the record schema has default values, but it doesn't have any (nor any provision for specifying default values; I right-clicked the column definition and selected Edit Row).

For the key fields I have used the same names. The Full Outer Join is creating two fields in the output for each key field (leftRec_KeyField_1 & rightRec_KeyField_1).
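For comparison, the behaviour being sought here (a full outer join that yields nulls, not defaults, on the unmatched side, with separate leftRec_/rightRec_ key columns) can be sketched in plain Python. This is only an illustration of the semantics, not DataStage code; the column names mirror the post:

```python
def full_outer_join(left, right, key):
    """Full outer join on `key`; unmatched sides yield None (modelling a
    SQL NULL), never a type default such as 0 or ''."""
    lidx = {row[key]: row for row in left}
    ridx = {row[key]: row for row in right}
    out = []
    for k in sorted(lidx.keys() | ridx.keys()):
        l, r = lidx.get(k), ridx.get(k)
        out.append({
            "leftRec_" + key: l[key] if l else None,
            "rightRec_" + key: r[key] if r else None,
            "left_val": l["val"] if l else None,
            "right_val": r["val"] if r else None,
        })
    return out

rows = full_outer_join(
    [{"KeyField_1": 1, "val": "a"}, {"KeyField_1": 2, "val": "b"}],
    [{"KeyField_1": 2, "val": "x"}, {"KeyField_1": 3, "val": "y"}],
    key="KeyField_1",
)
# Key 3 exists only on the right, so its leftRec_ key and left_val are None.
```

Seeing 0 or an empty string instead of None in the unmatched positions is exactly the "defaults instead of nulls" symptom described above.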
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Those default values have to be coming from somewhere! Check all upstream stages.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
madhukar
Participant
Posts: 86
Joined: Fri May 20, 2005 4:05 pm

Post by madhukar »

Check all stages from the input through to the Join for defaults and nullability.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

I hope you are not just using View Data, which can sometimes trick you into seeing default values based on the stage properties.
If you are sure the defaults are not coming from the source stages, add a Peek right after the Join.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

Hi,

ray.wurlod wrote: Those default values have to be coming from somewhere! Check all upstream stages.

When I go into the record schema definition for all the upstream stages, they show:

Code: Select all

No schema properties can be specified at this point

kumar_s wrote: I hope you are not just using View Data, which can sometimes trick you into seeing default values based on the stage properties.
If you are sure the defaults are not coming from the source stages, add a Peek ...

Those default values really are being created, as they are causing issues in my downstream jobs.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Try setting the environment variable that prints the record schema used by each operator into the log. From memory it's OSH_PRINT_SCHEMAS (but please do check).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

ray.wurlod wrote:Try setting the environment variable that prints the record schema used by each operator into the log. From memory it's OSH_PRINT_SCHEMAS (but please do check). ...
Hi Ray, I tried enabling OSH_PRINT_SCHEMAS by setting it to True, but I can't find the output in the Director log. I have gone through the logs but wasn't able to find the metadata definitions.

Is there anywhere in particular I need to look?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's an environment variable that exists in the Administrator client, in the Reporting group. You can add it to a job as a parameter using "Add Environment Variable" in the Parameters page of the job properties dialog. Once it's there leave its default value as False and only set it to True on those runs where you need it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

Hi Ray,

I have enabled OSH_PRINT_SCHEMAS and looked in the logs, but the log doesn't give schema information for each link. It only gives information about one of the inputs (a dataset) and not the other inputs (Teradata Enterprise).

I am looking at the OSH script in the Director log. Please let me know if I am not looking in the right place.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

If you activated $OSH_PRINT_SCHEMAS then you will have a log entry entitled "main_program: Schemas: (...)", and that will contain the schema descriptions for each and every runtime link (n.b. the optimizer might combine some stages).
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

Hi,

I have looked at the schemas. When unloading from Teradata, the schema defines a few columns as not nullable, even though I specified them as nullable in the metadata. This gets carried forward through most of the stages until a Transformer is used.

Is there a better way to change this without using a Transformer? The Modify stage is also not letting me change them to nullable.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The Modify stage should do this. What happens when you change the nullability from not-nullable to nullable on the output link in a Modify stage?
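A sketch of what that conversion might look like as a Modify stage specification, based on the underlying modify operator's `output = input` form. The exact nullability syntax here is an assumption and worth checking against the Parallel Job Developer's Guide; the output column must also be defined as nullable on the stage's output link:

```
KeyField_1:nullable int32 = KeyField_1
```

(The reverse direction, nullable to not-nullable, is the one that needs an explicit `handle_null` conversion to supply a replacement value.)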