
Aborting job if too few rows are processed

Posted: Tue Aug 28, 2007 1:05 pm
by Richard615
Does anyone have an elegant way to do this?

I have a job that pulls data from a database, runs through a single transform to do some formatting, then writes to a dataset. I would like this job to abort if the number of records passed is less than a certain amount.

I just don't see any way to make this happen directly in the existing transform (the amount of data is small enough that I'm willing to force the job to process sequentially if need be).

What I've done is have that transform also write to an aggregator stage for a total row count, which then links to a second transform. That transform has an output link with a constraint that the passed count is less than the cut-off point - and that link has "Abort after 1 row" set.
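
Roughly, the second transform's output link constraint is just (the link, column and parameter names here are only examples):

   lnk_count.TotalRows < MinimumRows

with "Abort after 1 row" set on that link, so the single row coming out of the aggregator aborts the job whenever the count falls short.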

This works well enough, but it seems a bit slap-dash.

Any other ideas/suggestions on a cleaner way to make this happen? Or some obvious approach that I'm overlooking?

-> Richard

Re: Aborting job if too few rows are processed

Posted: Tue Aug 28, 2007 1:24 pm
by sud
What if the select statement you run against the database also returned a count(*)? Then you could apply the strategy you're using now directly in the first transformer.

Posted: Tue Aug 28, 2007 1:35 pm
by ccatania
In your first transform, use a stage variable to count the rows, then test that value in a constraint. You can then eliminate the second transform and the aggregator stage.

Posted: Tue Aug 28, 2007 2:04 pm
by Richard615
Sud:

I could certainly go back to the database and get a row count, that's true. But since I'm already pulling the rows into DataStage, I was hoping to avoid any major changes - just make a few minor modifications to the existing job and be done with it.

CCatania:

I was looking at doing a count or using @INROWNUM - but where I run into a wall with this approach is WHEN/HOW do I look at that count? I can't look at it until the transform is "done", because when the process first starts running, my counts will initially be under the threshold. That's obviously not an abort situation, since if the job keeps running it will cross that threshold eventually.

Plus, with a parallel job the concept of a transform being "done" is a bit fuzzy - which is why I felt the need to feed an aggregator for a final count.

Thanks for the replies,

-> Richard

Posted: Tue Aug 28, 2007 3:26 pm
by ArndW
You could write a short after-job routine in which you get the row count from your database link and, if the number is below your threshold, issue a call to DSLogFatal() to force the job to abort.
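
A rough sketch of such a routine (the routine, stage and link names and the threshold are placeholders - substitute your own; the threshold could also come from a job parameter):

   Subroutine AbortIfTooFewRows(InputArg, ErrorCode)
   * Fetch the row count from the output link of the database source stage.
      RowCount = DSGetLinkInfo(DSJ.ME, "DB_Source", "lnk_out", DSJ.LINKROWCOUNT)
      Threshold = 1000   ;* minimum acceptable number of rows
      If RowCount < Threshold Then
         * DSLogFatal writes a fatal message to the log and aborts the job.
         Call DSLogFatal("Only " : RowCount : " rows processed; expected at least " : Threshold, "AbortIfTooFewRows")
      End
      ErrorCode = 0   ;* zero means no error - the job finishes normally
   Return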

Posted: Tue Aug 28, 2007 4:57 pm
by ray.wurlod
Simply setting ErrorCode to 1 in the after-job routine will cause the job to abort.
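
That is, in the sketch above you could simply replace the DSLogFatal call with:

   ErrorCode = 1   ;* any non-zero ErrorCode from an after-job subroutine aborts the job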

Posted: Tue Aug 28, 2007 7:17 pm
by ArndW
Yes, setting the error code to 1 will do that, but it won't put a message in the log stating why - just that the after-job subroutine triggered an abort.

Posted: Tue Aug 28, 2007 9:39 pm
by ray.wurlod
Ya hafta maintain the mystique!!
:lol:

Posted: Wed Aug 29, 2007 6:18 am
by ccatania
Unless I'm missing something.

Your stage variable is incremented with each record processed, and the constraint tests that value during that same process. The Transform stage doesn't have to complete before you can test the stage variable's value.

Charlie