Page 1 of 1

Join Stage - Produces More Records Than Expected

Posted: Wed Jul 11, 2007 8:26 pm
by dtatem
Hello To All:

I created a job where I am reading in 20 records.In these 20 records, I have created a stage variable, groupkey and assigned it a value of "1" with the data and this result gets moved to a copy stage.

I am also using the DB2 Store procedure stage where I am passing these 20 rows to the store procedure to generate row_id, which it is doing correctly and using the transformer stage. Also in the transformer stage, I have also declared a stagevar ,Groupkey and also assigned it a value of "1".

The output from the db2 stored procedure have 2 columns, the Groupkey column and the row_ids which were generated, no other rows.

The first output, the initial 20 records have groupkey plus the other columns a total of 10 columns, being copied to the copy stage.

Now using the Join Stage to bring the records back, I am performing an inner join and joining on Groupkey. The result of the JOIN is producing 400 records. I would have thought that the join would produce only 20 records, not 400 records. So can someone tell me what I am doing wrong?

Any help and or ideas would be greatly appreciated.

Regards,

dtatem

Re: Join Stage - Produces More Records Than Expected

Posted: Wed Jul 11, 2007 10:23 pm
by pravin1581
dtatem wrote:Hello To All:

I created a job where I am reading in 20 records.In these 20 records, I have created a stage variable, groupkey and assigned it a value of "1" with the data and this result gets moved to a copy stage.

I am also using the DB2 Store procedure stage where I am passing these 20 rows to the store procedure to generate row_id, which it is doing correctly and using the transformer stage. Also in the transformer stage, I have also declared a stagevar ,Groupkey and also assigned it a value of "1".

The output from the db2 stored procedure have 2 columns, the Groupkey column and the row_ids which were generated, no other rows.

The first output, the initial 20 records have groupkey plus the other columns a total of 10 columns, being copied to the copy stage.

Now using the Join Stage to bring the records back, I am performing an inner join and joining on Groupkey. The result of the JOIN is producing 400 records. I would have thought that the join would produce only 20 records, not 400 records. So can someone tell me what I am doing wrong?

Any help and or ideas would be greatly appreciated.

Regards,

dtatem

The join stage is producing a cartesian product on basis of the key which has been set to 1 in both the input links.