Data loss potential ODBC stage issue.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
highpoint
Premium Member
Premium Member
Posts: 123
Joined: Sat Jun 19, 2010 12:01 am
Location: Chicago

Data loss potential ODBC stage issue.

Post by highpoint »

Hi

We have a datastage job which reads from a table and loads in to target table using 2 odbc stages.


Code: Select all

Diagram
				--------Transformer-----ODBC

ODBC ----------processing

				------Transformer-------ODBC




NOTE: Both the target ODBCs load the same table.


We are implementing Slowly changing Dimension 1.
To implement Slowly changing Dimension 1 both the Target ODBC stages use the "delete statement" in the open command of Target ODBC stages.


The issue we are facing is there are some records missing in the target table after ETL is executed.
The loss of data happens only in PROD environment. We could not recreate the same scenario in lower environments.
For debugging we added the datasets at the transformers just before the Target ODBC stages and could see data in datasets but not in table.


Is this being caused because the "delete statements" are being executed after one another.

Technically, to my knowledge they should execute "delete statements" the moment job starts.

Any help resolving this issue is highly appreciated
MarkB
Premium Member
Premium Member
Posts: 95
Joined: Fri Oct 27, 2006 9:13 am

Post by MarkB »

Since you are loading the same table twice in the same job, my guess would be that your deletes are stepping on each other. My question is, why the deletes? If you are treating a dimension as a SCD1, you don't care about history and are overwriting existing rows and inserting new ones, so shouldn't you be doing upserts (or update then insert) and not use a delete?
Post Reply