Page 1 of 2

full outer join (returning defaults)

Posted: Sun Jul 25, 2010 8:13 pm
by mystuff
Hi,

I am using FullOuterJoin in my job. All its inputs from start of the stream (this is the first job in the stream) have been kept as nullable (Yes). When I try to join based on Key fields say KeyField_1, KeyField_2. I am getting default values instead of nulls when no match is found.

I managed to implement it once to obtain nulls if match is not found instead of default values. But not able to recollect how to do it. Can anyone help.

Thanks

Posted: Sun Jul 25, 2010 8:25 pm
by ray.wurlod
Remove the default values from the record schema. I assume you've used different names to receive the key values from left and right inputs.

Posted: Sun Jul 25, 2010 8:29 pm
by mystuff
I have checked whether the record schema has default values, but it doesn't have any default values (Neither provision for providing default values, I pressed right click on the column def and selected Edit Row).

For the KeyFields I have used the same names. Full Outer Join is creating two fields (leftRec_keyfield_1 & rightRec_keyfield_2) in output for each Key field

Posted: Mon Jul 26, 2010 12:07 am
by ray.wurlod
Those default values have to be coming from somewhere! Check all upstream stages.

Posted: Mon Jul 26, 2010 12:12 am
by madhukar
check all stages from input till join for defaults and nullability

Posted: Mon Jul 26, 2010 1:41 am
by kumar_s
Hope you are not just doing the Veiw data, which might some times trick you with the default values with the given stage properties.
If you are sure that, the defaults are not from source stages, Peek it right after the Join.

Posted: Mon Jul 26, 2010 5:26 am
by mystuff
Hi,

ray.wurlod wrote: Those default values have to be coming from somewhere! Check all upstream stages.

when I go into the records schema definition for all the upstream stages, they show up as "

Code: Select all

No schema properties can be specified at this point
"

kumar_s wrote:Hope you are not just doing the Veiw data, which might some times trick you with the default values with the given stage properties.
If you are sure that, the defaults are not from source stages, Peek ...
Those default values are actually getting created, as they are creating issues with my downstream jobs.

Posted: Mon Jul 26, 2010 6:52 am
by ray.wurlod
Try setting the environment variable that prints the record schema used by each operator into the log. From memory it's OSH_PRINT_SCHEMAS (but please do check).

Posted: Wed Jul 28, 2010 8:24 pm
by mystuff
ray.wurlod wrote:Try setting the environment variable that prints the record schema used by each operator into the log. From memory it's OSH_PRINT_SCHEMAS (but please do check). ...
Hi Ray I tried to enable OSH_PRINT_SCHEMAS by making it TRUE, but can't find it in director log. I have gone through the logs, but wasn't able to check the metadata definitions.

Is there anywhere in particular that I need to search.

Posted: Wed Jul 28, 2010 8:25 pm
by mystuff
ray.wurlod wrote:Try setting the environment variable that prints the record schema used by each operator into the log. From memory it's OSH_PRINT_SCHEMAS (but please do check). ...
Hi Ray I tried to enable OSH_PRINT_SCHEMAS by making it TRUE, but can't find it in director log. I have gone through the logs, but wasn't able to check the metadata definitions.

Is there anywhere in particular that I need to search.

Posted: Wed Jul 28, 2010 8:36 pm
by ray.wurlod
It's an environment variable that exists in the Administrator client, in the Reporting group. You can add it to a job as a parameter using "Add Environment Variable" in the Parameters page of the job properties dialog. Once it's there leave its default value as False and only set it to True on those runs where you need it.

Posted: Fri Jul 30, 2010 6:47 am
by mystuff
Hi Ray,

I have enabled the OSH_PRINT_SCHEMA and tried to look in the logs. But the log doesn't give schema information of each link. It just gives info about one of the inputs (a dataset) and not for the other inputs (teradata enterprise).

I am looking at the OSH Script in the director log. Please let me know if I am not looking at the right place.

Posted: Fri Jul 30, 2010 6:49 am
by ArndW
If you activated $OSH_PRINT_SCHEMAS then you will have a log entry entitled "main_program: Schemas: (...)." and that will contain the schema desciriptions for each and every runtime link (n.b. that the optimizer might combine some stages)

Posted: Tue Aug 03, 2010 4:51 am
by mystuff
Hi,

I have looked at the schema's. While unloading from teradata, the schema is defined as Not Nullable for few columns, even though I have specified them as nullable in metadata. This got carried forward to most of the stages until a transformer is used.

Is there a best way to change it without using transformer. Modify stage is as well not allowing to change them to nullable.

Posted: Tue Aug 03, 2010 5:51 am
by ArndW
The modify stage should do this. What happens when you change the nullability from not-nullable to nullable in the output link in a modfy stage?