
Rerunning the job from fail point

Posted: Tue May 25, 2010 4:37 am
by agpt
Hi,

I have a job which aborts after processing 1,000 records. The input file has a total of 10,000 records. How do I make sure that, when the job runs next time, it doesn't reprocess the already successfully processed records and starts from the 1,001st record?

Posted: Tue May 25, 2010 5:06 am
by ray.wurlod
Depends.

What does "processing" involve?

Are you keeping track of which records have been processed?

What is the source? How are you extracting rows from that source?

Posted: Tue May 25, 2010 8:51 am
by agpt
The source is a DB2 table.

Processing involves some simple transformations.

No, I am not keeping track of how many records have been processed, and I don't even know how this can be done. So I just wanted to know: is there any way in DataStage for it to identify the last successfully processed record and, in case of a restart, start from the first failed record?

Would a checkpoint serve this purpose? As I understand it, a checkpoint can restart the whole failed job, but I am not sure how to apply it at the record level within a job.

Posted: Tue May 25, 2010 5:20 pm
by ray.wurlod
YOU have to put it in. This does not happen out of the box.

Posted: Tue May 25, 2010 8:20 pm
by chulett
And no, Sequence job checkpointing is at that level, the job level. Record-level 'checkpointing' you have to build in yourself, as noted.
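As an illustration of what "building it in" might look like, here is a minimal sketch in Python rather than DataStage, assuming the source rows carry a monotonically increasing key (called ROW_ID here) and that the last committed key can be persisted between runs; the file name and function names are hypothetical, not an existing DataStage facility.

# Sketch of hand-rolled record-level checkpointing (illustrative only).
# Assumes each source row has a monotonically increasing key and that the
# last successfully processed key can be saved to a small checkpoint file.

import os

CHECKPOINT_FILE = "job_checkpoint.txt"   # hypothetical path

def read_checkpoint():
    """Return the last successfully processed key, or 0 if there is none."""
    if not os.path.exists(CHECKPOINT_FILE):
        return 0
    with open(CHECKPOINT_FILE) as f:
        return int(f.read().strip() or 0)

def write_checkpoint(key):
    """Persist the last successfully processed key."""
    with open(CHECKPOINT_FILE, "w") as f:
        f.write(str(key))

def run(rows, process):
    """Process only rows with keys greater than the checkpoint, recording progress."""
    last_done = read_checkpoint()
    for key, row in rows:
        if key <= last_done:
            continue            # already processed in an earlier run
        process(row)            # the transformations plus the load
        write_checkpoint(key)   # a restart will skip everything up to this key

In DataStage terms the same idea would typically be expressed as a WHERE clause in the DB2 source stage (for example, key greater than the last saved value), driven by a value written out at the end of the previous run.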

Posted: Tue May 25, 2010 11:40 pm
by agpt
Thanks Ray and Craig. Please give me some more hints on how I can do it.

Posted: Wed May 26, 2010 12:15 am
by gaurav_shukla
This is what I did when I faced a situation like yours.

I put in a lookup against the same target table (in my case the target was a table) to check whether the already loaded 1,000 rows are present. If a row is found, I leave it as it is, and the remaining 9,000 rows go down the reject link for the rest of the processing.

I am not sure how this would work if you are using a sequential file or dataset instead of a table.

This does add extra processing, but in the case of a table load it avoids an abort due to duplicates.
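A minimal sketch of that lookup idea, in Python rather than DataStage, assuming the keys already present in the target can be extracted into a set; the field name "key" and the function name are illustrative only.

# Illustrative sketch of the lookup-against-the-target approach.
# already_loaded_keys would come from selecting the keys already in the target table.

def rows_not_yet_loaded(input_rows, already_loaded_keys):
    """Keep only the rows whose key is not already in the target."""
    loaded = set(already_loaded_keys)        # acts like the Lookup reference link
    return [row for row in input_rows if row["key"] not in loaded]

# Example: if 1,000 of the 10,000 input keys are already loaded,
# only the remaining 9,000 rows continue through the transformations.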

Posted: Wed May 26, 2010 10:21 am
by agpt
Thanks for the info, Gaurav. But I am looking for a more robust solution if possible, as we have millions of records.

Has anybody else done it in some other way?

Posted: Fri Sep 03, 2010 1:34 am
by agpt
Hi All,

Just wanted to follow up on this in case anybody can help me out.

Posted: Fri Sep 03, 2010 4:05 am
by ramsubbiah
agpt wrote: Hi All,

Just wanted to follow up on this in case anybody can help me out.

If you want to avoid the lookup approach, then I have one solution: extract the records that were loaded in the previous run, funnel those previous-run records together with the new input records, do an aggregation to find the count per key, and then keep only the records whose count is 1 (those are the rows that were not loaded in the previous run). I know this looks like a lengthy job, but I am not sure of any other way; if there is one, I am eager to know! :o
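A minimal sketch of that funnel-and-aggregate idea, in Python rather than DataStage, assuming the previous run's keys can be extracted; the field and function names are illustrative.

# Illustrative sketch of the funnel + aggregate + filter approach.
# Keys loaded in the previous run are counted once; every input row adds one more.
# A final count of 1 therefore means the row was not loaded in the previous run.

from collections import Counter

def rows_still_to_process(previous_run_keys, input_rows):
    """Return the input rows whose keys were not loaded in the previous run."""
    counts = Counter(previous_run_keys)      # the 'previous run' leg of the funnel
    for row in input_rows:
        counts[row["key"]] += 1              # the 'new records' leg of the funnel
    return [row for row in input_rows if counts[row["key"]] == 1]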

Posted: Fri Sep 03, 2010 7:02 am
by HariK
agpt wrote: Thanks for the info, Gaurav. But I am looking for a more robust solution if possible, as we have millions of records.

Has anybody else done it in some other way?
Merge or Join can be used in place of Lookup if volume is the only concern.
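For completeness, a minimal sketch of why a join or merge tends to scale better than an in-memory lookup: with both streams sorted on the key, already-loaded rows can be discarded in a single pass without holding all the reference keys in memory. Python rather than DataStage, and the names are illustrative.

# Illustrative sketch of a sorted-merge anti-join: both inputs are sorted on the key,
# so already-loaded rows can be skipped in one pass without an in-memory reference set.

def anti_join_sorted(input_rows, loaded_keys):
    """Yield input rows whose key is not among loaded_keys; both inputs sorted by key."""
    loaded_iter = iter(loaded_keys)
    current = next(loaded_iter, None)
    for row in input_rows:
        while current is not None and current < row["key"]:
            current = next(loaded_iter, None)
        if current != row["key"]:
            yield row                        # not in the target yet, keep processing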