Abort the job from "After job subroutine"


Sudhindra_ps
Participant
Posts: 45
Joined: Thu Aug 31, 2006 3:13 am
Location: Bangalore

Abort the job from "After job subroutine"

Post by Sudhindra_ps »

hi,

I have a shell script invoked from the "After-job subroutine" which returns a success/failure status based on some logic. When the script returns a failure status (i.e. exit 1), the job doesn't abort; it just logs the warning message shown below.
---------------------------------------------------------------------------------
After-job routine returned error: Error 1 returned from AFTER routine DSU.ExecSH
---------------------------------------------------------------------------------

To get around this I tried "dsjob -stop" to stop the job execution, but still no luck aborting the job. Could anybody please help me with how to abort the job based on the failure status of the after-job subroutine?

Thanks & regards
Sudhindra P S
balajisr
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

Set your warning limit to 1.
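
If the job is started from the command line, the limit can be set at run time with dsjob's -warn option; a quick sketch only, with the project and job names as placeholders:
---------------------------------------------------------------------------------
dsjob -run -warn 1 YourProject YourJobName
---------------------------------------------------------------------------------
The same limit can also be set in the Director job run options.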
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Build a routine that calls the script and checks the result. Call DSLogFatal() if the script fails.
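
A minimal sketch of such a routine in DataStage BASIC, assuming the script command arrives as the routine's input argument; the routine name ("RunScriptOrAbort") and variable names are illustrative, not from this thread:
---------------------------------------------------------------------------------
* Standard before/after routine arguments are InputArg and ErrorCode
ErrorCode = 0
Call DSExecute("UNIX", InputArg, ScriptOutput, SysReturnCode)
If SysReturnCode <> 0 Then
   * DSLogFatal writes a fatal log entry and aborts the calling job
   Call DSLogFatal("Script failed with exit status ":SysReturnCode:" - ":ScriptOutput, "RunScriptOrAbort")
End
---------------------------------------------------------------------------------
Attach it as the After-job subroutine in place of ExecSH and pass the script command as the input value.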
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sudhindra_ps
Participant
Posts: 45
Joined: Thu Aug 31, 2006 3:13 am
Location: Bangalore

Post by Sudhindra_ps »

hi Chulett,

I have two ETL jobs, one for delete and one for insert. I perform the delete in the first ETL job and call the second ETL job from its after-job subroutine to insert the data into the table. If the second job fails, I have to roll back the delete done by the first job, so I was aborting the first job by killing its process ID from the shell script, so that when the first job aborts I could roll back the data by setting the "Ora_upsert Commit Row/Time interval" properties of the first ETL job.
I was able to abort the first ETL job, based on the failure status of the second, by killing its PID. But even when I abort the first job, the data it deleted stays deleted and is not rolled back. Could you please suggest a solution for this issue?

Thanks & regards
Sudhindra P S
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Once the first job completes, it commits the changes. You would need to do both in a single job if you want the failure of one to roll back the other.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sudhindra_ps
Participant
Posts: 45
Joined: Thu Aug 31, 2006 3:13 am
Location: Bangalore

Post by Sudhindra_ps »

hi Chulett,

Thanks for replying to my query. The problem is that performing the Delete and the Insert in the same job led to various complexities, which we have faced, such as:
(1) Sometimes the tables ran into deadlocks.
(2) Sometimes the job hung because both the Delete and the Insert operations took locks on the table.
(3) Sometimes data was deleted after it had been inserted into the table. The behaviour was completely inconsistent even after we had taken care of Sequential partitioning and link ordering in the Transformer stage, which had two output links, one for the Delete and one for the Insert.
I suspect performing Delete/Insert operations on the same table from a single job is going to be very inconsistent when there is a huge volume of data to process. Have you ever come across such a scenario?
If so, could you please suggest how you handled it in a single job?

Thanks & regards
Sudhindra P S
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not a PX guy, so I haven't handled your situation in a Parallel job. We do, however, handle that kind of thing quite often in Server, and I've had no such issues as you've noted in (3). It has been nothing but consistent; proper link ordering, division of labor across the links and the choice of update action are the key. Others would need to address the proper architecture of a PX job to do that kind of work, but I can't imagine it's all that uncommon. :?

Think about what keeping them as two jobs means. This would not be any different from having two separate stored procedures and expecting a failure in the second to trigger a rollback of all the work the first had already committed; that wouldn't be a reasonable expectation either. How would you arrange that to happen in any tool? No can do.

A common solution would be to ensure your first load was timestamped or 'marked' in such a way (audit columns) so that it could be 'undone' in the event of a failure by running yet another job. Of course, in this case that would be quite a trick. Gone is gone, so perhaps you'd need more steps - your delete could include writing the deleted records to a shadow table of some kind so they could be put back if need be. Or simply leave them as two related but unconnected processes and if the second job fails, just fix the darn problem and rerun it. :lol:
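
For what it's worth, a rough SQL sketch of that shadow-table idea, with the table and column names purely illustrative:
---------------------------------------------------------------------------------
-- park the rows about to be removed, then delete them
INSERT INTO target_shadow SELECT * FROM target_table WHERE batch_id = 123;
DELETE FROM target_table WHERE batch_id = 123;
-- if the insert job later fails, put them back:
-- INSERT INTO target_table SELECT * FROM target_shadow WHERE batch_id = 123;
---------------------------------------------------------------------------------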

Multiple ways to handle this.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sudhindra_ps
Participant
Posts: 45
Joined: Thu Aug 31, 2006 3:13 am
Location: Bangalore

Post by Sudhindra_ps »

hi Ray,

Do you have any suggestions on how to handle Delete/Insert against the same Oracle table in parallel jobs?

Thanks & regards
Sudhindra P S
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Precisely what is the relationship between that last question and the subject of the thread - abort the job from "After job subroutine"?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well, that would require actually reading the thread. :wink:

The OP doesn't believe there is a way to handle both deletes and inserts in a single job in a consistent fashion, so two jobs were crafted: a delete job that called an insert job 'after job'. The original question was how to abort from the after-job area if the inserts fail, so as to roll back the deletes. Since that was a non-starter, we're back to doing everything in one job.
Sudhindra_ps wrote:The problem is that performing the Delete and the Insert in the same job led to various complexities, which we have faced, such as:
(1) Sometimes the tables ran into deadlocks.
(2) Sometimes the job hung because both the Delete and the Insert operations took locks on the table.
(3) Sometimes data was deleted after it had been inserted into the table. The behaviour was completely inconsistent even after we had taken care of Sequential partitioning and link ordering in the Transformer stage, which had two output links, one for the Delete and one for the Insert.
I suspect performing Delete/Insert operations on the same table from a single job is going to be very inconsistent when there is a huge volume of data to process. Have you ever come across such a scenario?
If so, could you please suggest how you handled it in a single job?
I'm guessing (inferring) the biggest issue was the use of two output stages, rather than one stage with two links.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sudhindra_ps
Participant
Posts: 45
Joined: Thu Aug 31, 2006 3:13 am
Location: Bangalore

Ray/Chulett, please suggest a solution

Post by Sudhindra_ps »

hi Ray,

I apologize for not explaining properly what I was trying to do. I have two jobs: the first deletes data, while the second inserts data into the same table. I trigger the second job from the first using an after-job subroutine (a dsjob script), because I need to delete the existing data from the table and then load the new data coming in from the source.
(1) If the second job fails for some reason, the first job will already have completed its delete on the table, and there is no way to roll it back.
(2) To force a rollback, I killed the PID of the first job based on the second job's execution status, so that when the first job aborts the "Ora_upsert Commit Row/Time interval" property would let me roll back the delete transaction. But this doesn't work: even after I abort the first job by killing its PID, the delete is committed in the database.
(3) Before moving to the above design, I tried doing the delete and insert operations on the same table in one job. This works fine with small volumes of data, but with large volumes it does not behave consistently. Sometimes the delete runs after the insert has completed; sometimes the job hangs due to a deadlock on the table. I had followed all the steps properly, such as link ordering in the Transformer stage so that the delete link runs before the insert link, and I had also set the Transformer stage, Oracle Enterprise (Insert) and Oracle Enterprise (Delete) stages to Sequential partitioning.
Do you have any suggestions on how we can handle this efficiently?

Chulett,
Thanks for clarifying my query for Ray. I didn't understand what you mean by "one stage with two links". Could you please explain precisely?

Thanks & regards
Sudhindra P S
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I was guessing that, when you were doing all of the work in one job, that your two links - the one for deletes and the one for inserts - were going to two separate Oracle stages. If so, this creates two separate sessions / transactions that have no knowledge of each other and is the typical reason for the deadlocks and ordering issues you've experienced. I was suggesting you attach both links to a single Oracle stage so that everything is done in a single session / transaction / unit of work and see if that doesn't work out better for you.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You could try DSLogFatal if you really want to abort the job from within a called subroutine. This is definitely not something that would be in my "best practice" category!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sudhindra_ps
Participant
Posts: 45
Joined: Thu Aug 31, 2006 3:13 am
Location: Bangalore

Post by Sudhindra_ps »

hi Chulett,

In parallel jobs we cannot have two input links to the Oracle Enterprise stage, so I used a single Dynamic RDBMS stage to do both the Delete and the Insert on the table. Still no luck: the job tried to insert the data first and then delete it, which resulted in a database constraint violation. Is there any other way you would suggest to carry out this Delete and Insert in the same stage (a target stage other than the Oracle Enterprise and Dynamic RDBMS stages)?

Thanks & regards
Sudhindra P S