Rerunning the job from fail point

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

agpt
Participant
Posts: 151
Joined: Sun May 16, 2010 12:53 am

Rerunning the job from fail point

Post by agpt »

Hi,

I have a job that aborts after processing 1,000 records. The input file has 10,000 records in total. How do I make sure that on the next run the job skips the records that were already processed successfully and starts from the 1,001st record?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Depends.

What does "processing" involve?

Are you keeping track of which records have been processed?

What is the source? How are you extracting rows from that source?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
agpt
Participant
Posts: 151
Joined: Sun May 16, 2010 12:53 am

Post by agpt »

The source is a DB2 table.

Processing involves some simple transformations.

No, I am not keeping track of which records have been processed, and I don't even know how that could be done. So I just wanted to know: is there any way in DataStage to identify the last successfully processed record and, on restart, begin from the first failed record?

Would a checkpoint serve this purpose? My understanding is that a checkpoint can restart the whole failed job, but I am not sure how to apply it at the record level.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

YOU have to put it in. This does not happen out of the box.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And no, Sequence job checkpointing operates at exactly that level, the job level. Record-level 'checkpointing' you have to build in yourself, as noted.
-craig

"You can never have too many knives" -- Logan Nine Fingers
agpt
Participant
Posts: 151
Joined: Sun May 16, 2010 12:53 am

Post by agpt »

Thanks, Ray and Craig. Please give me some more hints on how I can do it.
gaurav_shukla
Participant
Posts: 12
Joined: Wed Jun 13, 2007 2:12 am

Post by gaurav_shukla »

This is what I did when I faced a situation like yours.

I put in a lookup against the same target table (in my case it was a table) to check whether the already-loaded 1,000 rows are present. If they are, I leave them as they are and pass the remaining 9,000 rows down the reject link for the rest of the processing.

I am not sure whether you are using a sequential file or a dataset.

This adds extra processing, but for a table load it avoids aborts due to duplicates.
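The lookup-with-reject-link idea above amounts to filtering the source against the keys already in the target. A minimal Python sketch of that filter (names hypothetical):

```python
def filter_unloaded(source_rows, loaded_keys):
    """Yield only the (key, payload) rows whose key is not already in the
    target -- the equivalent of sending lookup misses down a reject link."""
    for key, payload in source_rows:
        if key not in loaded_keys:
            yield key, payload
```

A set lookup is cheap per row, but the whole reference key set has to fit in memory, which is the concern raised later in the thread.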
agpt
Participant
Posts: 151
Joined: Sun May 16, 2010 12:53 am

Post by agpt »

Thanks for the info, Gaurav, but I am looking for a more robust solution if possible, as we have millions of records.

Has anybody done it some other way?
agpt
Participant
Posts: 151
Joined: Sun May 16, 2010 12:53 am

Post by agpt »

Hi All,

Just wanted to follow up: can anybody help me out on this?
ramsubbiah
Participant
Posts: 40
Joined: Tue Nov 11, 2008 5:49 am

Post by ramsubbiah »

agpt wrote:Hi All,

Just wanted to follow up: can anybody help me out on this?
If you want to avoid the lookup approach, I have one solution: extract the records that were loaded in the previous run, funnel them together with the new records, and aggregate to get a count per record; then keep only the records whose count is 1, since those appear only in the new extract. This will look like a lengthy job, but I am not sure of any other way; if there is one, I am eager to know! :o
Knowledge is Fair,execution is matter!
HariK
Participant
Posts: 68
Joined: Thu May 17, 2007 1:50 am

Post by HariK »

agpt wrote:Thanks for the info, Gaurav, but I am looking for a more robust solution if possible, as we have millions of records.

Has anybody done it some other way?
A Merge or Join stage can be used in place of the Lookup if volume is the only concern.
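The reason a Join scales better is that two key-sorted streams can be anti-joined in a single pass without holding the reference table in memory. A hedged sketch of that merge-style anti-join (hypothetical names, both inputs assumed sorted by key):

```python
def anti_join_sorted(source_rows, loaded_keys):
    """Yield source rows with no matching key on the right, via one pass
    over two key-sorted streams -- the Join-stage analogue of the lookup
    filter, needing only one reference key in memory at a time."""
    loaded = iter(loaded_keys)
    current = next(loaded, None)
    for key, payload in source_rows:
        while current is not None and current < key:
            current = next(loaded, None)
        if current != key:
            yield key, payload
```

The trade-off is the up-front sort on both inputs, which is usually worth it at the multi-million-row volumes mentioned above.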