SCD stage not detecting that a record already exists
Posted: Mon Jun 08, 2015 10:17 am
Running version 9x on Linux.
Starting last week, jobs using the SCD stage have randomly appeared not to partition the data correctly. Some input records are not detected as already present; the stage treats each one as a new record and inserts it into the target dimension table, causing duplicate records on the natural keys.
Also, despite throwing the warning "Ignoring duplicate key entry trying to be inserted; no further warnings will be issued for this table", the stage still inserts the record into the target dimension.
On the SCD stage, both the incoming source data set link and the database reference link are manually set to Hash partitioning with a sort on the same 3 keys.
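To show why I suspect partitioning, here is a minimal Python sketch (not DataStage code, and not a claim about how the SCD stage works internally). It assumes a partition-local lookup: each source record is only compared against reference records that landed in the same partition. The names `hash_partition` and `detect_inserts` and the CRC-based hash are made up for illustration.

```python
from collections import defaultdict
from zlib import crc32


def hash_partition(keys, num_partitions):
    """Map a tuple of natural-key values to a partition number (illustrative hash)."""
    joined = "|".join(str(k) for k in keys)
    return crc32(joined.encode("utf-8")) % num_partitions


def detect_inserts(source_keys, reference_keys, num_partitions,
                   source_partitioner=hash_partition,
                   reference_partitioner=hash_partition):
    """Return the source keys that would be treated as new records.

    Assumption: a source record only "sees" reference records that were
    sent to the same partition, so a partitioning mismatch hides matches.
    """
    ref_by_partition = defaultdict(set)
    for keys in reference_keys:
        ref_by_partition[reference_partitioner(keys, num_partitions)].add(keys)

    inserts = []
    for keys in source_keys:
        partition = source_partitioner(keys, num_partitions)
        if keys not in ref_by_partition[partition]:
            inserts.append(keys)
    return inserts


existing = [("A", 1, "X"), ("B", 2, "Y")]   # already in the dimension
incoming = [("A", 1, "X"), ("C", 3, "Z")]   # one update, one genuine insert

# Both links partitioned the same way: only the genuinely new key is an insert.
print(detect_inserts(incoming, existing, 4))  # [('C', 3, 'Z')]
```

If the reference link ends up partitioned differently from the source link for any reason, the existing record ("A", 1, "X") sits in another partition and would be reported as an insert too, which matches the duplicates we are seeing.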
Does anyone have any idea what might be going on all of a sudden? The server has been rebooted and the issue still occurs. Again, it's random. We can delete the duplicates from a table and rerun the job, and it then detects that the record exists and treats the inputs as updates, not inserts. We tried this both with and without recompiling the job, and it works OK either way. It only seems to happen during normal automated runs, when other jobs are running in the batch schedule.
Thanks,
Glenn