Aborting job if too few rows are processed


Moderators: chulett, rschirm, roy

Richard615
Participant
Posts: 12
Joined: Tue Mar 27, 2007 11:08 am

Aborting job if too few rows are processed

Post by Richard615 »

Does anyone have an elegant way to do this?

I have a job that pulls data from a database, runs through a single transform to do some formatting, then writes to a dataset. I would like this job to abort if the number of records passed is less than a certain amount.

I just don't see any way to make this happen directly in the existing transform (the amount of data is small enough I'm willing to force the job to process sequentially if need be).

What I've done is have that transform also write to an aggregator stage for a total row count, which then links to a second transform. That transform has an output link with a constraint that the passed count is less than the cut-off point - and then that constraint has "Abort after 1 row" set.

This works fine enough, but just seems a bit slap-dash.

Any other ideas/suggestions on a cleaner way to make this happen? Or some obvious approach that I'm overlooking?

-> Richard
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Re: Aborting job if too few rows are processed

Post by sud »

What if you also return a count(*) in the select statement you run against the database? Then you could apply the strategy you are already using directly in the first transformer.
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
ccatania
Premium Member
Posts: 68
Joined: Thu Sep 08, 2005 5:42 am
Location: Raleigh

Post by ccatania »

In your first transform, use a stage variable to count the rows, then have that value tested in a constraint. You can then eliminate the other transform and the aggregator stage.
Richard615
Participant
Posts: 12
Joined: Tue Mar 27, 2007 11:08 am

Post by Richard615 »

Sud:

I could certainly access the database and get a row count, that's true. But since I'm already pulling the rows into DataStage, I was hoping to avoid any major change and just make a few minor modifications to the existing job.

CCatania:

I was looking at doing a count or looking at the @ruwnum - but where I run into a wall with this approach is WHEN/HOW do I look at that count? I can't look at that count until the transform is "done" - since when the process first starts running, my counts will initially be under the threshold. But that's obviously not an abort situation since if the job keeps running, it will cross that threshold eventually.

Plus with the parallel job, the concept of a transform being "done" is a bit fuzzy - which is why I felt the need to go into an aggregator for a final count.
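To make the timing problem concrete, here is a minimal sketch in plain Python (not DataStage code). The names `JobAbort`, `check_per_row` and `check_at_end` are hypothetical and exist only for this illustration. It contrasts testing the running count on every row with testing it once after the last row has been seen.

```python
from typing import Iterable


class JobAbort(Exception):
    """Hypothetical stand-in for a job abort; not a DataStage API."""


def check_per_row(rows: Iterable[str], threshold: int) -> int:
    """Test the running count on every row.

    The count starts below the threshold, so this aborts on the
    first row whenever threshold > 1, even if enough rows follow.
    """
    count = 0
    for _ in rows:
        count += 1
        if count < threshold:
            raise JobAbort(f"only {count} rows so far, need {threshold}")
    return count


def check_at_end(rows: Iterable[str], threshold: int) -> int:
    """Count every row first, then test the final total once."""
    count = 0
    for _ in rows:
        count += 1
    if count < threshold:
        raise JobAbort(f"only {count} rows processed, need {threshold}")
    return count
```

With five input rows and a threshold of three, `check_at_end` returns the total, while `check_per_row` aborts as soon as the first row arrives. That is the reason for collecting a final count (here, via the aggregator) before applying the test.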

Thanks for the replies,

-> Richard
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

You could write a short after-job routine in which you get the row count from your database link and, if the number is below your threshold, call DSLogFatal() to force the job to abort.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Simply setting ErrorCode to 1 in the after-job routine will cause the job to abort.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Yes, setting the error code to 1 will do that, but won't put a message in the log stating why, just that the after-job triggered an abort.
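The difference between the two after-job approaches can be sketched in plain Python. This shows the logic only and is not a DataStage BASIC routine: the function names and the logger are hypothetical, and in a real routine the logged variant would correspond to the DSLogFatal() call mentioned above.

```python
import logging

# Hypothetical logger standing in for the job log.
logger = logging.getLogger("after_job_sketch")


def after_job_silent(row_count: int, threshold: int) -> int:
    """Return a nonzero error code when below threshold; logs no reason."""
    return 1 if row_count < threshold else 0


def after_job_logged(row_count: int, threshold: int) -> int:
    """Same check, but record why the abort was triggered first."""
    if row_count < threshold:
        logger.error(
            "Aborting: %d rows processed, minimum is %d",
            row_count,
            threshold,
        )
        return 1
    return 0
```

Both variants signal the abort in the same way; only the second leaves an explanation behind for whoever reads the log afterwards.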
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Ya hafta maintain the mystique!!
:lol:
ccatania
Premium Member
Posts: 68
Joined: Thu Sep 08, 2005 5:42 am
Location: Raleigh

Post by ccatania »

Unless I'm missing something.

Your stage variable is incremented with each record processed, and the constraint tests that value during the same pass. The Transform stage doesn't have to complete before you can test the stage variable's value.

Charlie