Is there any Automation Process to handle Data

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

John Daniel
Participant
Posts: 42
Joined: Mon Oct 15, 2007 10:35 pm
Location: Charlotte

Is there any Automation Process to handle Data

Post by John Daniel »

Hi All,

Please give me some inputs on this

I have a source of 100 records, and 80 of them were processed to the target. The job got an error at the 81st record in the source, so I need to restart the job from the 82nd record onwards.

Is there any automatic handling process for this kind of issue in DataStage (PX)?

Looking forward to your kind reply on this.

Regards,
John
Pagadrai
Participant
Posts: 111
Joined: Fri Dec 31, 2004 1:16 am
Location: Chennai

Re: Is there any Automation Process to handle Data

Post by Pagadrai »

Hi,
I am not very clear on what your idea of automatic handling is.
There are a number of approaches you could take anyway:

You can use a 'flag' column to identify unprocessed records,
or use a Change Data Capture stage to process only the changed ones.
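Outside DataStage itself, here is a minimal Python sketch of the flag idea, assuming a hypothetical source table with a processed_flag column (all names here are illustrative, not from any real system):

import sqlite3  # stand-in for your real source database

def load_to_target(row_id, payload):
    # Stand-in for the real target load step.
    print("loaded", row_id, payload)

conn = sqlite3.connect("source.db")
cur = conn.cursor()

# Pick up only the rows that have not been processed yet.
cur.execute("SELECT id, payload FROM src WHERE processed_flag = 'N'")
for row_id, payload in cur.fetchall():
    load_to_target(row_id, payload)
    # Flag the row only after a successful load, so a failed run
    # simply picks it up again on the next execution.
    cur.execute("UPDATE src SET processed_flag = 'Y' WHERE id = ?", (row_id,))
    conn.commit()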
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

There's nothing automatic, but you can design recovery.

As noted, designing recovery will require that you keep track of how far the job has got before it fails.
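As a rough illustration of that bookkeeping (this is not a DataStage feature; checkpoint.txt and process_row are made-up names), a Python sketch:

import os

CHECKPOINT = "checkpoint.txt"

def last_checkpoint():
    # Row number of the last successfully processed record, 0 if none.
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return int(f.read().strip())
    return 0

def process_row(n):
    print("processed row", n)  # stand-in for the real load step

start = last_checkpoint()
for n in range(1, 101):        # the 100 source records from the example
    if n <= start:
        continue               # already handled in a previous run
    process_row(n)
    with open(CHECKPOINT, "w") as f:
        f.write(str(n))        # record progress after each success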
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
BugFree
Participant
Posts: 82
Joined: Wed Dec 13, 2006 6:02 am

Post by BugFree »

ray.wurlod wrote:Welcome aboard.

There's nothing automatic, but you can design recovery.

As noted, designing recovery will require that you keep track of how far the job has got before it fails. ...
Ray, are you referring to the other new post with subject line "Disigning Recovery handling" :? :D
Ping me if I am wrong...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Look at the timestamps on the two posts. John has responded with a separate thread to ask a separate question, which is the way things should be.

There is no need to quote everything - it only wastes space.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

It's very risky to try to restart an ETL process that stopped midway through, and it's best avoided. For starters, it is hard to work out how much data was actually delivered to the target: for a database target you need to take array sizes and transaction sizes into account to discover how many rows were actually committed and how many were rolled back. While your job ended at the 81st row, it may have left the target table back at the 60th row. Second, the parallel job design and partitioning mean you can't know for sure which rows from the source have been processed. You may have one partition that is up to the 81st row while another has already processed the 82nd.
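To make that arithmetic concrete, a tiny worked example (the transaction size here is a made-up value):

rows_processed = 81     # row the job was on when it aborted
transaction_size = 30   # rows per database commit (hypothetical value)

# Only whole transactions survive; the partial one is rolled back.
rows_committed = (rows_processed // transaction_size) * transaction_size
print(rows_committed)   # 60 -- the target is further back than you think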

It is safer to roll back your changes and process from the beginning, or better yet, trap your bad data rows on a reject link and write them to an exceptions file so your job keeps processing. You can put reject links on Sequential File stages, Lookups, Transformers, or database target stages to trap errors rather than aborting the job.
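The same idea in plain Python, for anyone who wants to see the shape of it (the file names and the transform rule are invented for the sketch):

def transform(line):
    return int(line)  # stand-in rule; raises ValueError on bad data

with open("source.txt") as src, \
     open("target.txt", "w") as tgt, \
     open("rejects.txt", "w") as rej:
    for line in src:
        try:
            tgt.write(str(transform(line.strip())) + "\n")
        except ValueError:
            rej.write(line)  # trapped on the 'reject link'; run continues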