
Two ODBC stages' Open commands not deleting each other's data

Posted: Mon Aug 30, 2010 11:19 am
by highpoint
Hi,

Looks like we found a bug in our code, but it is not affecting the data. I would appreciate the gurus' advice on why this is not affecting the data:

In my job we have two ODBC stages as targets, both writing to the same table.

Each ODBC stage uses an "Open Command" that deletes from the table before anything is loaded into it.


The process as I understood it:

1. The 1st ODBC stage runs, i.e. executes its open command (the delete statement), then inserts.
2. The 2nd ODBC stage runs, i.e. executes its open command (the delete statement), then inserts.

My understanding was that whichever ODBC stage executes later would delete the data inserted by the earlier stage, but that does not seem to be happening.
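If the open commands really did run in strict sequence with the inserts, the behaviour I expected is easy to sketch (SQLite here is just a stand-in for the real target database, and the table and values are made up):

```python
import sqlite3

# In-memory SQLite stands in for the target table; names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE target (col TEXT)")

# 1st ODBC stage: open command (delete), then insert
db.execute("DELETE FROM target")
db.execute("INSERT INTO target VALUES ('from_stage_1')")

# 2nd ODBC stage: open command (delete), then insert
db.execute("DELETE FROM target")  # wipes stage 1's rows
db.execute("INSERT INTO target VALUES ('from_stage_2')")

rows = [r[0] for r in db.execute("SELECT col FROM target")]
print(rows)  # ['from_stage_2'] -- stage 1's data would be gone
```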

Appreciate your help in understanding this.

Posted: Mon Aug 30, 2010 5:59 pm
by ray.wurlod
Open command executes when the stage opens. Unless there's some designed-in reason not to, every stage opens when the job opens.

Posted: Mon Aug 30, 2010 10:35 pm
by highpoint
ray.wurlod wrote:Open command executes when the stage opens. Unless there's some designed-in reason not to, every stage opens when the job opens.
Thanks Ray Wurlod!!

I was under the assumption that the ODBC stage which receives records first would open first. Thanks for the clarification.

So, in my case, having the same delete statement as the open command in one ODBC stage or in several makes no difference, since they are executed at the same time, right? Does having the delete in one ODBC stage serve the same purpose as having it in multiple stages, or is there a difference?

And how does the close command work? Do all stages close at the same time?

Appreciate your reply.

Posted: Tue Aug 31, 2010 12:26 am
by ray.wurlod
Again, that depends on the job design. Stages such as Sort and Aggregator (Hash mode) can block execution of downstream stages. In an unblocked design stages close as close to simultaneously as whatever is controlling the processes can manage.

Posted: Tue Aug 31, 2010 9:25 am
by highpoint
ray.wurlod wrote:Again, that depends on the job design. Stages such as Sort and Aggregator (Hash mode) can block execution of downstream stages. In an unblocked design stages close as close to simultaneously as whatever is controlling the processes can manage.
Thanks Ray!! I am still not completely clear.

Say we don't have any Sort or Aggregator stage in the job; then ideally both ODBC open commands (the delete statements) would run simultaneously. How can two delete statements run on the same table at the same time? Each statement has to get an exclusive lock on the table before it can process the delete. How is this handled with simultaneous deletes?

And if one runs after the other, say a few milliseconds later, then we run the risk of losing the data inserted by the 1st ODBC stage when the 2nd ODBC stage's delete happens.


So what would be the best option to handle this?
We want to delete the data before any ODBC stage in the job writes to the table.
And once any ODBC stage has written data to the table, we don't want to lose it.

Posted: Tue Aug 31, 2010 9:30 am
by chulett
Too many variables with your current approach. Why not do the delete / truncate as a separate operation before this job runs?
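That separate pre-load step could be sketched like this (again with SQLite standing in for the real database; the table name, predicate, and connection details are all placeholders, and in practice the function would run against the real target before the DataStage job starts):

```python
import sqlite3

def pre_job_cleanup(conn):
    """Delete the target rows once, in its own transaction,
    before any stage in the load job writes to the table.
    Hypothetical sketch -- table and predicate are placeholders."""
    conn.execute("DELETE FROM target WHERE col = 'abc'")
    conn.commit()

# Demo with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (col TEXT)")
conn.executemany("INSERT INTO target VALUES (?)", [("abc",), ("keep",)])
conn.commit()

pre_job_cleanup(conn)
remaining = [r[0] for r in conn.execute("SELECT col FROM target")]
print(remaining)  # ['keep'] -- only the targeted rows were deleted
```

Because the delete happens exactly once and finishes before any load begins, no later stage can wipe out rows an earlier stage has already inserted.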

Posted: Tue Aug 31, 2010 9:57 am
by highpoint
chulett wrote:Too many variables with your current approach. Why not do the delete / truncate as a separate operation before this job runs?
Shall we do it in a before-job subroutine?
If so, do we have to write a separate shell script for this?

Or is there any other way to achieve this?

Posted: Tue Aug 31, 2010 12:24 pm
by ray.wurlod
Nothing's different in DataStage - it's just another client as far as the database is concerned. If you issue two DELETE statements on the same table, the first to start executing will take an exclusive table lock and the second will wait until that lock is released (or a timeout error occurs).

Posted: Tue Aug 31, 2010 4:42 pm
by highpoint
ray.wurlod wrote:Nothing's different in DataStage - it's just another client as far as the database is concerned. If you issue two DELETE statements on the same table, the first to start executing will take an exclusive table lock and the second will wait until that lock is released (or a timeout error occurs).
So, in this case, say the 1st ODBC stage takes the exclusive table lock, deletes the records, issues a commit, and then starts loading data.

Then the 2nd ODBC stage opens later, perhaps because it has an Aggregator and a Sort on its link. The 2nd ODBC stage's open command will then delete the data loaded by the 1st.

Is this scenario possible? If yes,
which way would you suggest to avoid any risk of losing data:
continue having the open command in each ODBC stage, use a before-job subroutine, or is there another way to reduce the risk?

Appreciate your reply.

Posted: Tue Aug 31, 2010 8:15 pm
by ray.wurlod
Yet again, it depends. If process 1 is busily updating, then it will be taking at least record level locks that will prevent process 2 from obtaining a table level lock. So how are the deletes being accomplished? Through SQL (which will start with record level locks but will probably end up promoting them) or truncate command (which is not transactional and will take a table level lock as it starts)?

Posted: Tue Aug 31, 2010 10:49 pm
by highpoint
The deletes are being done using an SQL DELETE command, something like this:

DELETE FROM table WHERE col = 'abc';

Appreciate your reply.

Posted: Wed Sep 01, 2010 1:26 am
by ray.wurlod
Sentence #2 of my previous reply.