What is the best practice for joining Nullable keys
Posted: Tue Jul 22, 2014 3:49 am
Hello,
I have a job design where source and target are oracle tables. There are two additional tables to be joined with the source. Job design is source ->join1->join2->copy->target. In both the joins the key column identified is nullable and the existing sample records has no null values in the data though these join keys has been defined as nulls. At the pre join step, data has been partitioned (hash) and sorted as per the join keys in each join stage as the join key varies between join1 and join2.
I would like to clarify how to handle this case as per the best practice as i have applied modify stage on both links of join considering the rule that key column should not be null.Point to be noted is the target column to be mapped is also null in target.
Also this scenario can be handled without modify stage keeping all keys right from source to target as nullable and i get the same result in both cases (with and without modify stages). My assumption is DS will drop records with null values in join keys and even i tried to simulate the join keys as null and still i could see the data is processing successful
Recommending the table definition to be changed for nullability of key columns is not possible in current scenario and hence no changes could be made to tables. Expected data volume in each table is around 100K records in each table
please share your thoughts. i have checked all previous mail threads but couldnt get this clarified and hence created a new post
I have a job design where source and target are oracle tables. There are two additional tables to be joined with the source. Job design is source ->join1->join2->copy->target. In both the joins the key column identified is nullable and the existing sample records has no null values in the data though these join keys has been defined as nulls. At the pre join step, data has been partitioned (hash) and sorted as per the join keys in each join stage as the join key varies between join1 and join2.
I would like to clarify how to handle this case as per the best practice as i have applied modify stage on both links of join considering the rule that key column should not be null.Point to be noted is the target column to be mapped is also null in target.
Also this scenario can be handled without modify stage keeping all keys right from source to target as nullable and i get the same result in both cases (with and without modify stages). My assumption is DS will drop records with null values in join keys and even i tried to simulate the join keys as null and still i could see the data is processing successful
Recommending the table definition to be changed for nullability of key columns is not possible in current scenario and hence no changes could be made to tables. Expected data volume in each table is around 100K records in each table
please share your thoughts. i have checked all previous mail threads but couldnt get this clarified and hence created a new post