Running version 9.x on Linux.
Starting last week, jobs using the SCD stage randomly appear not to partition the data correctly. Some input records are not detected as already present, so the stage treats them as new records and inserts them into the target dimension table, causing duplicate records on the natural keys.
Also, despite throwing the warning "Ignoring duplicate key entry trying to be inserted; no further warnings will be issued for this table", the stage still inserts the record into the target dimension.
On the SCD stage, both the incoming source data set link and the database reference link are manually set to Hash with a sort on the same 3 keys.
Does anyone have any idea what might be going on all of a sudden? The server has been rebooted and the issue still occurs. Again, it's random. We can delete the duplicates from a table and rerun the job, and it then detects that the record exists and treats the inputs as updates, not inserts. We tried this both with and without recompiling, and it works OK either way. It only seems to happen during normal automated runs, when other jobs are running in the batch schedule.
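For anyone following along, here is a rough sketch of why both links need to hash on the same keys. It is plain Python, not DataStage code, and the column names and the CRC32 hash are assumptions made up for illustration; the engine's real hash function is different. The point is only that a source row and its matching reference row land on the same partition when the hash is computed over identical key values.

```python
import zlib

def partition_for(record, keys, num_partitions):
    """Pick a partition by hashing only the listed key columns."""
    key_bytes = "|".join(str(record[k]) for k in keys).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions

# Hypothetical natural key columns, standing in for the 3 keys in the post.
NATURAL_KEYS = ["cust_id", "region", "product"]

source_row = {"cust_id": 101, "region": "EU", "product": "A7", "amount": 10}
reference_row = {"cust_id": 101, "region": "EU", "product": "A7", "sk": 5001}

# Same key values and same key list on both links give the same partition,
# so the lookup against the reference data can find the existing row.
same_node = (partition_for(source_row, NATURAL_KEYS, 4)
             == partition_for(reference_row, NATURAL_KEYS, 4))
```

If the key bytes differed in any way between the two links (different key list, or a different physical representation of the same value), the rows could end up on different partitions and the existing record would look "new".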
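In case it helps others check their own tables after a batch run, this is a minimal sketch of a duplicate check on the natural keys. It uses Python's built-in sqlite3 with a made-up table (dim_demo) and made-up key columns (k1, k2, k3), so treat the names as placeholders; against a real dimension you would run the equivalent GROUP BY ... HAVING query on your own database.

```python
import sqlite3

def find_duplicate_natural_keys(conn, table, keys):
    """Return (key values..., count) for every natural key that occurs more than once."""
    cols = ", ".join(keys)
    sql = (f"SELECT {cols}, COUNT(*) AS n FROM {table} "
           f"GROUP BY {cols} HAVING COUNT(*) > 1")
    return conn.execute(sql).fetchall()

# Hypothetical in-memory dimension table with a surrogate key and 3 natural keys.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_demo (sk INTEGER PRIMARY KEY, k1 TEXT, k2 TEXT, k3 TEXT)")
conn.executemany(
    "INSERT INTO dim_demo (k1, k2, k3) VALUES (?, ?, ?)",
    [("A", "1", "X"), ("A", "1", "X"), ("B", "2", "Y")],
)

dups = find_duplicate_natural_keys(conn, "dim_demo", ["k1", "k2", "k3"])
```

For a Type 2 dimension, where several history rows per natural key are expected, the query would need an extra filter on the current-row indicator or the expiry date.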
Thanks,
Glenn
SCD stage not detecting a record already exists
One of the difficulties here is that the problem is not consistent. You could delete the duplicates that got inserted and then rerun the ETL job, and the job would detect that the record existed and just update it.
Anyway, we removed APT_OLD_BOUNDED_LENGTH from the environment variable section, and last night no job using the SCD stage threw the warning and then inserted a duplicate record on the natural key.
A colleague found this article, which seems to describe a bug in how this environment variable and the SCD stage work together. The article points to the variable being "turned on", which we assume means the value is set to "True", but we removed it altogether because it still looks like it caused an issue even though it was set to 'False' in the environment.
http://www-01.ibm.com/support/docview.w ... wg1JR45634
Hopefully this was the cause and we'll continue to monitor.
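Since the difference between "set to False" and "not defined at all" seems to matter here, below is a small sketch for telling the two apart. It is generic Python that inspects a process environment; it is an assumption that the variable (APT_OLD_BOUNDED_LENGTH in IBM's spelling) would be visible that way on your system, because DataStage variables can also be defined at project or job level, where this check will not see them.

```python
import os

def describe_env_var(name, env=None):
    """Report whether a variable is absent or present, whatever its value is."""
    env = os.environ if env is None else env
    if name not in env:
        return "unset"
    # Present counts as defined, even if the value is 'False' or empty.
    return f"set to {env[name]!r}"

# Illustrative environments, not taken from a real server.
removed = describe_env_var("APT_OLD_BOUNDED_LENGTH", {})
defined_false = describe_env_var("APT_OLD_BOUNDED_LENGTH",
                                 {"APT_OLD_BOUNDED_LENGTH": "False"})
```

Calling describe_env_var("APT_OLD_BOUNDED_LENGTH") with no second argument checks the environment of the current process.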