Hi,
Suppose a job is processing billions of records and it fails in the middle of execution for some runtime reason. We need to reset the job and run it again. What happens in this situation?
Most of the data has already been moved to the target. Is all the data in the target deleted so the job starts again from the beginning, or does it resume from the previous checkpoint? How does the job identify the previous run's checkpoint / transaction commit record location in the target?
I have been on a career break and have forgotten many real-time situations, so kindly give me a clear picture of this scenario. It will be helpful. Thanks in advance.
This was an interview question I could not answer; kindly guide me.
Thanks for sharing your response. Usually when a job fails with a runtime error, I first reset the job in Director and then restart it. But I am concerned about situations where part of the data has already been loaded into the target (say, a database). Do we need to delete those rows externally, or will DataStage do it internally? If so, how does it do it?
Not going to give you the answer (see KDUKE above).
But think about how DataStage handles the data on a restart.
Think about your target system (DBMS) or other repository (FILE, XML, MQ, special build op, etc...). How do you think IT deals with what DataStage does during a restart?
If you know how DataStage works, and you know how your Target System reacts... maybe you can craft code to overcome certain error scenarios?
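As one example of "crafting code" for this, here is a minimal sketch in plain Python over a generic DB-API driver (not DataStage itself): stamp every loaded row with a batch id, and run a before-job cleanup step that deletes the leftovers of a failed run so the reload is idempotent. The names target_table and batch_id are invented for illustration.
Code:
import sqlite3  # stand-in for any DB-API 2.0 driver (DB2, Oracle, ...)

def cleanup_failed_run(conn, batch_id):
    """Remove rows left behind by a failed run so the restart starts clean."""
    cur = conn.cursor()
    cur.execute("DELETE FROM target_table WHERE batch_id = ?", (batch_id,))
    conn.commit()
    return cur.rowcount  # how many partial rows were thrown away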
Usually a database stage has a transaction-commit setting where you can specify that the transaction should commit, for example, every 3,000 records. If this count is set to a low integer value, it will reduce the performance of the job.
I think there is also an environment variable for it.
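As a rough illustration of what that commit interval trades off (plain Python, not the DB2 stage itself; target_table and its two columns are assumptions for the example): rows are inserted continuously, but COMMIT is issued only every N rows.
Code:
import sqlite3  # stand-in for any DB-API 2.0 driver

def load_with_commit_interval(conn, rows, interval=3000):
    cur = conn.cursor()
    for i, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO target_table VALUES (?, ?)", row)
        if i % interval == 0:  # commit point: rows 1..i are now durable
            conn.commit()
    conn.commit()              # commit the final partial batch

A smaller interval gives finer-grained restart points at the cost of more commit overhead; a larger one runs faster but leaves more rows to roll back and re-process after a failure.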
My understanding goes up to a point.
Now take a DBMS stage, like DB2: suppose the transaction commit is set to every 3,000 records and the job fails at record 6,040. Then 6,000 records are committed, and the 40 records after that are rolled back, not saved in the database. When we restart the job, it would need to pick up again from the 6,001st record.
What I need to recollect is how this complexity is handled in the second run with partitioning and parallel pipelines, because I cannot work out which data from the source will flow in the second run... wish me good luck.
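To make that restart arithmetic concrete, here is a hedged sketch (plain Python, not DataStage internals) of one way a rerun can find its resume point: ask the target for the highest committed key and filter the source feed accordingly. The record_id key and target_table are assumptions for the example; real jobs often track this in a separate audit/checkpoint table instead.
Code:
import sqlite3  # stand-in for any DB-API 2.0 driver

def resume_point(conn):
    """Highest committed key in the target, or 0 if the target is empty."""
    cur = conn.cursor()
    cur.execute("SELECT COALESCE(MAX(record_id), 0) FROM target_table")
    return cur.fetchone()[0]

def rows_to_reload(source_rows, last_committed):
    # In the 6,040-record example: last_committed is 6,000, so rows
    # 6,001 onward (including the 40 rolled-back ones) are re-extracted.
    return (r for r in source_rows if r["record_id"] > last_committed)

Note that this simple picture assumes a single ordered stream. In a parallel job each partition commits on its own, so the resume point generally has to be tracked per partition, or the load made idempotent so that re-running the whole uncommitted tail is harmless.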