Looping Batch Job fails for No reason

sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

Hi,

We have an event table that is populated by another system. The DataStage jobs need to read this table, pick up all available events and process them. Once processing is done, they go back to the event table and repeat the process. So the batch job keeps looping the whole day, executing the sub jobs, and stops when the value of a variable in the ini file changes.

The batch job loops successfully and also exits the loop when we change the value.

The PROBLEM is that the batch stops randomly for no apparent reason. A sub job is executed and is successful, and after that we get a message saying the batch job aborted and is attempting cleanup.

Any pointers or examples on how this can be fixed, or on how a looping batch job can be built to run successfully for 24 hours?

Thanks
Sarju
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard. :D

There isn't really enough information to enable us to help. If you were to post the batch code - surrounded by Code tags - then we could inspect that.

Also post the English text of what the looping batch is supposed to do, so we can check the code against that. This text should exist in your design documentation, and ideally in the long description of the batch as well.

Does any error/warning message get logged when these unexpected stops occur?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

Hi,

I did some more research and would like to provide more details on the problem.

What are we trying to achieve:
We have a driver table that is the input for the complete process. This table is populated continuously, 24/7, by other systems. We need to build a process that reads this table, picks up whatever is available for processing and processes it. When we are done processing, we need to go back, read the driver table again, pick up all the records ready for processing, process them, and keep repeating. If we don't find anything to process in the driver table, we skip all the steps and make the job sleep for an amount of time defined in the ini file. This whole looping process stops when we change the value in a file to "Y".

Options tried:
We built a couple of batch jobs:
1) A single step looping by itself. It fails after executing 255-256 GOSUB statements.
2) A full-fledged batch job with complete functionality, including sleep/wait times and exiting on request (based on a value in a file). When this job was executed, some loops had data and others didn't. The result was 35 loops in total, 24 of them with data. So this one also executed in the range of 250-260 GOSUB statements before aborting.

When the batch job aborts after executing approximately 256 GOSUB statements, we get the following message as the reason for the abort (please note that all sub jobs were successful): "Attempting to Cleanup after ABORT raised in stage Batch::D9DCFCMPODS004AD_resv..JobControl".

Input needed:
We need to know whether DataStage has a maximum limit on the number of GOSUBs that can be executed. If there is a limit, what is the alternative way of designing a batch to achieve our goal?
If there is no maximum limit, why is this batch job aborting?

Please find below the code for the batch jobs we are executing.
##########################################
*************** Code Start for Option 1 ****************

JobName = DSJobName
Delim='/'
LF = char(010)
***************************************************************************************
* Define Functions
***************************************************************************************
Deffun GetIniVar (Arg1) Calling 'DSU.FBGetIniVar'
Deffun GetApplVar (Arg1) Calling 'DSU.FBGetApplVar'
Deffun GetSwitchValue (Arg1) Calling 'DSU.FBGetSwitchValue'
Deffun GetPwd (Arg1, Arg2) Calling 'DSU.FBGetPasswd'
Deffun ReadRestartStep (Arg1, Arg2) Calling 'DSU.FBReadRestartStep'
Deffun WriteRestartStep (Arg1, Arg2, Arg3) Calling 'DSU.FBWriteRestartStep'
Deffun GetLinkStats (Arg1) Calling 'DSU.FBGetLinkStats'
Deffun GetLogMsg (Arg1, Arg2) Calling 'DSU.FBGetLogMsg'
Deffun ResetJobStatus (Arg1) Calling 'DSU.FBResetJobStatus'
Deffun ReadLog (Arg1, Arg2) Calling 'DSU.FBReadLog'
Deffun WriteLog (Arg1, Arg2, Arg3, Arg4) Calling 'DSU.FBWriteLog'
Deffun ValidateLogDir (Arg1) Calling 'DSU.FBValidateLogDir'
Deffun ExecUnix (Arg1) Calling 'DSU.FBExecUnix'
Deffun SendEmail (Arg1, Arg2, Arg3) Calling 'DSU.FBSendEmail'
Deffun SendEmailFile (Arg1, Arg2, Arg3) Calling 'DSU.FBSendEmailFile'
Deffun AuditRowCnt (Arg1, Arg2, Arg3) Calling 'DSU.FBAuditRowCnt'
Deffun GetCurrTimeStamp (Arg1) Calling 'DSU.FBCurrDateTimeStamp'
Deffun CreateFileBackup (Arg1, Arg2, Arg3, Arg4) Calling 'DSU.FBCreateFileBackup'
* Deffun GetRowCount (Arg1, Arg2) Calling 'DSU.FBLinkRowCount'

***************************************************************************************
* Equate Statements that will remain constant for the job
***************************************************************************************
EQU MailSubj LIT "AuditStatus:DSJobName:' job in project ':PROJECT"
EQU AUDIT.HDR LIT "'Job: ':fmt(JobName,'33L'):spaces(8):AuditStart:GetCurrTimeStamp(''):LF"
EQU AUDIT.TRLR LIT "AuditStatus:'Job finished : ':GetCurrTimeStamp(''):LF"
EQU STATUS.FAIL TO '*** ABORTED: '

***************************************************************************************


* Get all the required INI variables from the initialization file passed as a parameter
***************************************************************************************
ARCHIVEDIR = GetIniVar('Env.ARCHIVEDIR')
AUDITDIR = GetIniVar('Env.AUDITDIR')
ACTVTODT = GetIniVar('Env.ACTVTODT')
LOGDIR = GetIniVar('Env.LOGDIR')
REJDIR = GetIniVar('Env.REJDIR')
PARMDIR = GetIniVar('Env.PARMDIR')
HASHDIR = GetIniVar('Env.HASHDIR')
OUTPUTDIR = GetIniVar('Env.OUTPUTDIR')
TEMPDIR = GetIniVar('Env.TEMPDIR')
RESTARTDIR = GetIniVar('Env.RESTARTDIR')
MailTo = GetIniVar('Env.EMAIL_GROUP2')
SrcSystem = JobName[11,2] ;* extract subsystem name from Batch::<jobname>
INPUTDIR = GetIniVar('Env.':SrcSystem:'_INPUTDIR')

* Source, Target and Table params
SRC.DBSID = GetIniVar('PortalProject.SRC_SID')
SRC.DBUSER = GetIniVar('PortalProject.SRC_USERID')
SRC.DBPWD = GetPwd(SRC.DBSID, SRC.DBUSER)
SRC.DBOWNER = GetIniVar('PortalProject.SRC_OWNER')
TGT.DBSID = GetIniVar('PortalProject.TGT_SID')
TGT.DBUSER = GetIniVar('PortalProject.TGT_USERID')
TGT.DBPWD = GetPwd(TGT.DBSID, TGT.DBUSER)
CycleWaitTime1 = GetIniVar('PortalProject.WAIT_TIME1')
CycleWaitTime2 = GetIniVar('PortalProject.WAIT_TIME2')
CycleWaitTime3 = GetIniVar('PortalProject.WAIT_TIME3')
CycleWaitTime4 = GetIniVar('PortalProject.WAIT_TIME4')
AttemptWaitTime = GetIniVar('PortalProject.ATTEMPT_WAIT_TIME')
CycleWaitCount = 0


* Miscellaneous Params
PROJECT = DSGetProjectInfo(DSJ.PROJECTNAME)
BatchJob= "Batch::D9DCFCMPODS004AD"

* Validate Log Directory
NewLOGDIR = ValidateLogDir(LOGDIR)
NewARCHIVEDIR = ValidateLogDir(ARCHIVEDIR)

***************************************************************************************
* Initialize variables
***************************************************************************************
STEPFILE = JobName
CURR.DATE = oconv(date(), 'D-YMD[4,2,2]')
CURR.YYYYMMDD = convert('-','',CURR.DATE)
LogFileName = 'Log_':JobName:'_':CURR.YYYYMMDD:'.txt'
AuditFile = JobName
AuditPath = AUDITDIR:Delim:AuditFile
AuditRec = ''
AuditStatus = ''

***************************************************************************************
* Read RESTART step
***************************************************************************************
gosub ReadRestartStep
gosub WriteAuditHdr

***************************************************************************************
* Main Program start
***************************************************************************************
ON STEP GOTO Step10,CatchAll

********
Step10:
STEP=1 ; gosub WriteRestartStep
CURRSTEP = 'STEP10' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD010ext"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD010ext, run it, wait for it to finish, and test for success
hJob1 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob1) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob1, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob1, "BatchJob", "D9SCFCMPODS004AD")
ErrCode = DSSetParam(hJob1, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob1, "HashDir", HASHDIR)
ErrCode = DSSetParam(hJob1, "Instance", SRC.DBSID)
ErrCode = DSSetParam(hJob1, "UserName", SRC.DBUSER)
ErrCode = DSSetParam(hJob1, "Password", SRC.DBPWD)
ErrCode = DSSetParam(hJob1, "EVENT_TYPE_CD", 'GROUP')
ErrCode = DSSetParam(hJob1, "ATTEMPT_WAIT_TIME", AttemptWaitTime)
ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob1)
Status = DSGetJobInfo(hJob1, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End


**********
* ... other subjobs
**********


**********
* Start next sampling period
gosub Step10


**********
NormalEnd:

STEP=1 ; gosub WriteRestartStep
CURRSTEP = 'STEP999'
Call DSLogInfo('CURRSTEP=':CURRSTEP, DSJobName)
gosub WriteAuditTrlr
MailFile = AuditPath ;
gosub SendMailFile

*********
CatchAll:
* Cleanup and delete workfiles
* return
* Start next sampling period
* gosub Step10

********
StepLog:
Call DSLogInfo(CURRSTEP, DSJobName)
return

********
ReadRestartStep:
STEP = ReadRestartStep(RESTARTDIR, STEPFILE)
If STEP = 1 Then
AuditStart = 'Started: '
AuditAction='O'
End Else
Call DSLogWarn('Restarting STEP=':STEP:', reading RESTARTLOG file', DSJobName)
gosub ReadLog
AuditStart = 'Re-Started: '
AuditAction='A'
End
return

********
WriteRestartStep:
Rslt = WriteRestartStep(RESTARTDIR, STEPFILE, STEP)
return

********
WriteAuditHdr:
AuditRec = AUDIT.HDR
Gosub WriteLog
return

********
WriteAuditTrlr:
AuditRec = AUDIT.TRLR
AuditAction='A'
Gosub WriteLog
return

********
ReadLog:
return

********
WriteLog:
If AuditRec # '' Then
Rslt = WriteLog(AUDITDIR, AuditFile, AuditRec, AuditAction)
End
AuditRec=''
return

********
ErrorSub:
FailedMsg = "Job Failed: ":ThisJob
AuditRec = GetLogMsg(LOGDIR, ThisJob):LF ; * Get log messages for subjob job
gosub ErrorHandler
return

********
ErrorHandler:
FailedMsg = "Job Failed: ":ThisJob
AuditRec<-1> = FailedMsg
AuditStatus = STATUS.FAIL
AuditAction = 'A'
gosub WriteLog
gosub WriteAuditTrlr; MailFile = AuditPath ; * Mail the Audit Log
gosub SendMailFile
Call DSLogFatal(FailedMsg, "JobControl")
return

********
SendMailFile: Rslt = SendEmailFile(MailTo, MailSubj, MailFile)

return

********
SendMail:
Rslt = SendEmail(MailTo, MailSubj, MailBody)


*************** Code End for Option 1 ****************
#########################################


##########################################
*************** Code Start for Option 2 ****************
JobName = DSJobName
Delim='/'
LF = char(010)
***************************************************************************************
* Define Functions
***************************************************************************************
Deffun GetIniVar (Arg1) Calling 'DSU.FBGetIniVar'
Deffun GetApplVar (Arg1) Calling 'DSU.FBGetApplVar'
Deffun GetSwitchValue (Arg1) Calling 'DSU.FBGetSwitchValue'
Deffun GetPwd (Arg1, Arg2) Calling 'DSU.FBGetPasswd'
Deffun ReadRestartStep (Arg1, Arg2) Calling 'DSU.FBReadRestartStep'
Deffun WriteRestartStep (Arg1, Arg2, Arg3) Calling 'DSU.FBWriteRestartStep'
Deffun GetLinkStats (Arg1) Calling 'DSU.FBGetLinkStats'
Deffun GetLogMsg (Arg1, Arg2) Calling 'DSU.FBGetLogMsg'
Deffun ResetJobStatus (Arg1) Calling 'DSU.FBResetJobStatus'
Deffun ReadLog (Arg1, Arg2) Calling 'DSU.FBReadLog'
Deffun WriteLog (Arg1, Arg2, Arg3, Arg4) Calling 'DSU.FBWriteLog'
Deffun ValidateLogDir (Arg1) Calling 'DSU.FBValidateLogDir'
Deffun ExecUnix (Arg1) Calling 'DSU.FBExecUnix'
Deffun SendEmail (Arg1, Arg2, Arg3) Calling 'DSU.FBSendEmail'
Deffun SendEmailFile (Arg1, Arg2, Arg3) Calling 'DSU.FBSendEmailFile'
Deffun AuditRowCnt (Arg1, Arg2, Arg3) Calling 'DSU.FBAuditRowCnt'
Deffun GetCurrTimeStamp (Arg1) Calling 'DSU.FBCurrDateTimeStamp'
Deffun CreateFileBackup (Arg1, Arg2, Arg3, Arg4) Calling 'DSU.FBCreateFileBackup'
Deffun GetRowCount (Arg1, Arg2) Calling 'DSU.FBLinkRowCount'
Deffun StopBatchJob (Arg1) Calling 'DSU.FBExecUnix'
***************************************************************************************
* Equate Statements that will remain constant for the job
***************************************************************************************
EQU MailSubj LIT "AuditStatus:DSJobName:' job in project ':PROJECT"
EQU AUDIT.HDR LIT "'Job: ':fmt(JobName,'33L'):spaces(8):AuditStart:GetCurrTimeStamp(''):LF"
EQU AUDIT.TRLR LIT "AuditStatus:'Job finished : ':GetCurrTimeStamp(''):LF"
EQU STATUS.FAIL TO '*** ABORTED: '

***************************************************************************************


* Get all the required INI variables from the initialization file passed as a parameter
***************************************************************************************
ARCHIVEDIR = GetIniVar('Env.ARCHIVEDIR')
AUDITDIR = GetIniVar('Env.AUDITDIR')
ACTVTODT = GetIniVar('Env.ACTVTODT')
LOGDIR = GetIniVar('Env.LOGDIR')
REJDIR = GetIniVar('Env.REJDIR')
PARMDIR = GetIniVar('Env.PARMDIR')
HASHDIR = GetIniVar('Env.HASHDIR')
OUTPUTDIR = GetIniVar('Env.OUTPUTDIR')
TEMPDIR = GetIniVar('Env.TEMPDIR')
RESTARTDIR = GetIniVar('Env.RESTARTDIR')
MailTo = GetIniVar('Env.EMAIL_GROUP2')
SrcSystem = JobName[11,2] ;* extract subsystem name from Batch::<jobname>
INPUTDIR = GetIniVar('Env.':SrcSystem:'_INPUTDIR')

* Source, Target and Table params
SRC.DBSID = GetIniVar('DentalWebIntegration.SRC_SID')
SRC.DBUSER = GetIniVar('DentalWebIntegration.SRC_USERID')
SRC.DBPWD = GetPwd(SRC.DBSID, SRC.DBUSER)
SRC.DBOWNER = GetIniVar('DentalWebIntegration.SRC_OWNER')
TGT.DBSID = GetIniVar('DentalWebIntegration.TGT_SID')
TGT.DBUSER = GetIniVar('DentalWebIntegration.TGT_USERID')
TGT.DBPWD = GetPwd(TGT.DBSID, TGT.DBUSER)
TGT.DBOWNER = GetIniVar('DentalWebIntegration.TGT_OWNER')
DRIVER.DBSID = GetIniVar('DentalWebIntegration.DRIVER_SID')
DRIVER.DBUSER = GetIniVar('DentalWebIntegration.DRIVER_USERID')
DRIVER.DBPWD = GetPwd(DRIVER.DBSID, DRIVER.DBUSER)
DRIVER.DBOWNER = GetIniVar('DentalWebIntegration.DRIVER_OWNER')
WAITTIME1 = GetIniVar('DentalWebIntegration.WAIT_TIME1')
WAITTIME2 = GetIniVar('DentalWebIntegration.WAIT_TIME2')
WAITTIME3 = GetIniVar('DentalWebIntegration.WAIT_TIME3')
WAITTIME4 = GetIniVar('DentalWebIntegration.WAIT_TIME4')
AttemptWaitTime = 1200
ATTEMPT_NO = 0
Total_Loop_Number = 0
Loops_With_Data = 0

* Miscellaneous Params
PROJECT = DSGetProjectInfo(DSJ.PROJECTNAME)
BatchJob= "Batch::D9DCFCMPODS004AD_resv"

* Validate Log Directory
NewLOGDIR = ValidateLogDir(LOGDIR)
NewARCHIVEDIR = ValidateLogDir(ARCHIVEDIR)

***************************************************************************************
* Initialize variables
***************************************************************************************
STEPFILE = JobName
CURR.DATE = oconv(date(), 'D-YMD[4,2,2]')
CURR.YYYYMMDD = convert('-','',CURR.DATE)
LogFileName = 'Log_':JobName:'_':CURR.YYYYMMDD:'.txt'
AuditFile = JobName
AuditPath = AUDITDIR:Delim:AuditFile
AuditRec = ''
AuditStatus = ''

***************************************************************************************
* Read RESTART step
***************************************************************************************
gosub ReadRestartStep
gosub WriteAuditHdr
Begin Case
Case STEP=1
gosub Step10 ;
Case STEP=2
gosub Step20 ;
Case STEP=3
gosub Step30 ;
Case STEP=4
gosub Step40 ;
Case STEP=5
gosub Step50 ;
Case STEP=6
gosub Step60 ;
Case STEP=7
gosub Step70 ;
Case STEP=8
gosub Step80 ;
Case STEP=9
gosub Step90 ;
Case @True
gosub Finish ;
End Case
***************************************************************************************
* Main Program start
***************************************************************************************
ON STEP GOTO Finish

Step10:
STEP=1 ;
gosub WriteRestartStep
CURRSTEP = 'STEP10' ; gosub StepLog
Total_Loop_Number = Total_Loop_Number + 1
Call DSLogInfo('Total_Loop_Number :':Total_Loop_Number, DSJobName)
ThisJob = "D9SCFCMPODS004AD010ext"
Rslt = ResetJobStatus(ThisJob)

* Setup D9SCFCMPODS004AD010ext, run it, wait for it to finish, and test for success
hJob1 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob1) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob1, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob1, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob1, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob1, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob1, "Instance", DRIVER.DBSID)
ErrCode = DSSetParam(hJob1, "UserName", DRIVER.DBUSER)
ErrCode = DSSetParam(hJob1, "Password", DRIVER.DBPWD)
ErrCode = DSSetParam(hJob1, "EVENT_TYPE_CD", 'GroupAddress')
ErrCode = DSSetParam(hJob1, "ATTEMPT_WAIT_TIME", AttemptWaitTime)
ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob1)
Status = DSGetJobInfo(hJob1, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

If Rslt # 0 Then gosub ErrorHandler

RowCount = GetRowCount ('D9SCFCMPODS004AD010ext','fetch_event_out')

If RowCount= '0' Then
ATTEMPT_NO = ATTEMPT_NO + 1
gosub CALCWAITTIME
End else
ATTEMPT_NO = 0
gosub Step20
End

*********
CatchAll:
*CURRSTEP = 'CatchAll'
* Call DSLogInfo('CURRSTEP=':CURRSTEP, DSJobName)
*
*Status = StopBatchJob('/opt/datastage/Ascential/DataStage/DSEngine/bin/dsjob -stop WAREHOUSE_DEV Batch::D9DCFCMPODS004AD_resv')
* If Status <>'0' Then
* gosub ErrorSub
* End
* If Status = '0' Then
* gosub StepLog
* End
*SLEEP 90
return

********
StepLog:
Call DSLogInfo(CURRSTEP, DSJobName)
return

********
Step20:
STEP=2 ;
gosub WriteRestartStep

CURRSTEP = 'STEP20' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD020lkp"

Loops_With_Data = Loops_With_Data + 1
Call DSLogInfo('Loops_With_Data :':Loops_With_Data, DSJobName)


Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD020cdc, run it, wait for it to finish, and test for success
hJob2 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob2) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob2, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob2, "ExtractJob", "D9SCFCMPODS004AD010ext")
ErrCode = DSSetParam(hJob2, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob2, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob2, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob2, "Instance", TGT.DBSID)
ErrCode = DSSetParam(hJob2, "UserName", TGT.DBUSER)
ErrCode = DSSetParam(hJob2, "Password", TGT.DBPWD)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob2)
Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step30 ;
return

********
Step30:
STEP=3 ; gosub WriteRestartStep
CURRSTEP = 'STEP30' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD030ext"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD030ext, run it, wait for it to finish, and test for success
hJob3 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob3) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob3, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob3, "ExtractJob", "D9SCFCMPODS004AD020lkp")
ErrCode = DSSetParam(hJob3, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob3, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob3, "HashDir", HASHDIR)
ErrCode = DSSetParam(hJob3, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob3, "Instance", SRC.DBSID)
ErrCode = DSSetParam(hJob3, "UserName", SRC.DBUSER)
ErrCode = DSSetParam(hJob3, "Password", SRC.DBPWD)
ErrCode = DSRunJob(hJob3, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob3)
Status = DSGetJobInfo(hJob3, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step40 ;
return

********
Step40:
STEP=4 ; gosub WriteRestartStep
CURRSTEP = 'STEP40' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD040upd"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD040upd, run it, wait for it to finish, and test for success
hJob4 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob4) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob4, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob4, "ExtractJob", "D9SCFCMPODS004AD030ext")
ErrCode = DSSetParam(hJob4, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob4, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob4, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob4, "Instance", TGT.DBSID)
ErrCode = DSSetParam(hJob4, "UserName", TGT.DBUSER)
ErrCode = DSSetParam(hJob4, "Password", TGT.DBPWD)
ErrCode = DSRunJob(hJob4, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob4)
Status = DSGetJobInfo(hJob4, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step50 ;
return


********
Step50:
STEP=5 ; gosub WriteRestartStep
CURRSTEP = 'STEP50' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD050load"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD050load, run it, wait for it to finish, and test for success
hJob5 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob5) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob5, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob5, "ExtractJob", "D9SCFCMPODS004AD030ext")
ErrCode = DSSetParam(hJob5, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob5, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob5, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob5, "Instance", TGT.DBSID)
ErrCode = DSSetParam(hJob5, "UserName", TGT.DBUSER)
ErrCode = DSSetParam(hJob5, "Password", TGT.DBPWD)
ErrCode = DSRunJob(hJob5, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob5)
Status = DSGetJobInfo(hJob5, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step60 ;
return

********
Step60:
STEP=6 ; gosub WriteRestartStep
CURRSTEP = 'STEP60' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD060upd"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD060upd, run it, wait for it to finish, and test for success
hJob6 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob6) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob6, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob6, "ExtractJob", "D9SCFCMPODS004AD030ext")
ErrCode = DSSetParam(hJob6, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob6, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob6, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob6, "Instance", TGT.DBSID)
ErrCode = DSSetParam(hJob6, "UserName", TGT.DBUSER)
ErrCode = DSSetParam(hJob6, "Password", TGT.DBPWD)
ErrCode = DSRunJob(hJob6, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob6)
Status = DSGetJobInfo(hJob6, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step70 ;
return

********
Step70:
STEP=7 ; gosub WriteRestartStep
CURRSTEP = 'STEP70' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD080xfm"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD080xfm, run it, wait for it to finish, and test for success
hJob7 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob7) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob7, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob7, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob7, "HashDir", HASHDIR)
ErrCode = DSSetParam(hJob7, "RejDir", REJDIR)
ErrCode = DSRunJob(hJob7, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob7)
Status = DSGetJobInfo(hJob7, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step80 ;
return

********
Step80:
STEP=8 ; gosub WriteRestartStep
CURRSTEP = 'STEP80' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD090upd"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS003AD090upd, run it, wait for it to finish, and test for success
hJob8 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob8) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob8, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob8, "ExtractJob", "D9SCFCMPODS004AD010ext")
ErrCode = DSSetParam(hJob8, "RefJob", "D9SCFCMPODS004AD080xfm")
ErrCode = DSSetParam(hJob8, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob8, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob8, "HashDir", HASHDIR)
ErrCode = DSSetParam(hJob8, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob8, "Instance", DRIVER.DBSID)
ErrCode = DSSetParam(hJob8, "UserName", DRIVER.DBUSER)
ErrCode = DSSetParam(hJob8, "Password", DRIVER.DBPWD)
ErrCode = DSRunJob(hJob8, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob8)
Status = DSGetJobInfo(hJob8, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler

gosub Step90 ;
return


********
Step90:

STEP=9 ; gosub WriteRestartStep
CURRSTEP = 'STEP90' ; gosub StepLog
ThisJob = "D9SCFCMPODS004AD100load"
Rslt = ResetJobStatus(ThisJob)
* Setup D9SCFCMPODS004AD100load, run it, wait for it to finish, and test for success
hJob9 = DSAttachJob(ThisJob, DSJ.ERRFATAL)
If NOT(hJob9) Then
gosub ErrorHandler
Abort
End
ErrCode = DSSetParam(hJob9, "ThisJob", ThisJob)
ErrCode = DSSetParam(hJob9, "ExtractJob", "D9SCFCMPODS004AD080xfm")
ErrCode = DSSetParam(hJob9, "BatchJob", BatchJob[8,16])
ErrCode = DSSetParam(hJob9, "TempDir", TEMPDIR)
ErrCode = DSSetParam(hJob9, "HashDir", HASHDIR)
ErrCode = DSSetParam(hJob9, "RejDir", REJDIR)
ErrCode = DSSetParam(hJob9, "Instance", DRIVER.DBSID)
ErrCode = DSSetParam(hJob9, "UserName", DRIVER.DBUSER)
ErrCode = DSSetParam(hJob9, "Password", DRIVER.DBPWD)
ErrCode = DSRunJob(hJob9, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob9)
Status = DSGetJobInfo(hJob9, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
gosub ErrorSub
End
LogMsg = GetLogMsg(LOGDIR, ThisJob)
LinkArray = GetLinkStats(ThisJob)
Rslt = AuditRowCnt(ThisJob, AUDITDIR, JobName)
If Rslt # 0 Then gosub ErrorHandler
gosub CALCWAITTIME ;
return

********
ReadRestartStep:
STEP = ReadRestartStep(RESTARTDIR, STEPFILE)
If STEP = 1 Then
AuditStart = 'Started: '
AuditAction='O'
End Else
Call DSLogWarn('Restarting STEP=':STEP:', reading RESTARTLOG file', DSJobName)
gosub ReadLog
AuditStart = 'Re-Started: '
AuditAction='A'
End
return

********
WriteRestartStep:
Rslt = WriteRestartStep(RESTARTDIR, STEPFILE, STEP)
return

********
WriteAuditHdr:
AuditRec = AUDIT.HDR
Gosub WriteLog
return

********
WriteAuditTrlr:
AuditRec = AUDIT.TRLR
AuditAction='A'
Gosub WriteLog
return
********
ReadLog:
return

********
WriteLog:
If AuditRec # '' Then
Rslt = WriteLog(AUDITDIR, AuditFile, AuditRec, AuditAction)
End
AuditRec=''
return

********
ErrorSub:
FailedMsg = "Job Failed: ":ThisJob
AuditRec = GetLogMsg(LOGDIR, ThisJob):LF ; * Get log messages for subjob job
gosub ErrorHandler
return

********
ErrorHandler:
FailedMsg = "Job Failed: ":ThisJob
AuditRec<-1> = FailedMsg
AuditStatus = STATUS.FAIL
AuditAction = 'A'
gosub WriteLog
gosub WriteAuditTrlr; MailFile = AuditPath ; * Mail the Audit Log

gosub SendMailFile
Call DSLogFatal(FailedMsg, "JobControl")
return

********
SendMailFile: Rslt = SendEmailFile(MailTo, MailSubj, MailFile)

return

********
SendMail:
Rslt = SendEmail(MailTo, MailSubj, MailBody)

**********
CALCWAITTIME:

EXIT_CODE = GetSwitchValue('PortalProject.ExitSwitch')
Call DSLogInfo('Exit_Code ':EXIT_CODE , DSJobName)

If EXIT_CODE <> 'Y' Then
Call DSLogInfo('ATTEMPT_NO :':ATTEMPT_NO, DSJobName)

Begin Case
Case ATTEMPT_NO=0
SLEEPTIME=WAITTIME1
Case ATTEMPT_NO=1
SLEEPTIME=WAITTIME2
Case ATTEMPT_NO=2
SLEEPTIME=WAITTIME3
Case @True
SLEEPTIME=WAITTIME4
End Case

SLEEP SLEEPTIME
gosub Step10 ;
End ELSE

gosub NormalEnd

END

return

**********
NormalEnd:

Call DSLogInfo('Entered Normal End', DSJobName)
STEP=1 ;
gosub WriteRestartStep
CURRSTEP = 'STEP999'
Call DSLogInfo('CURRSTEP=':CURRSTEP, DSJobName)
gosub WriteAuditTrlr
MailFile = AuditPath ;
gosub SendMailFile
Call DSLogInfo('After Send Mail', DSJobName)
gosub Finish

*return

************
Finish:
*return


*************** Code End for Option 2 ****************
#########################################
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You seem to be building but not unwinding a call stack with your GoSub statements - as highlighted by the Return statements commented out at the bottom of Option 2 for example.

One way to unwind the call stack is the following technique at your final return point.

Code:

Main.Exit:
      Return To Main.Exit
Ideally, however, each internal subroutine should effect a clean return to the statement immediately following the GoSub statement that invoked it.

I have only glanced through your code thus far, but that point jumped up at me. Hopefully I'll get some more time later. It would have been far nicer had you posted formatted code, surrounded by Code tags (as per my example above).
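
As a rough illustration of the "clean return" idea, the cycle can be driven by a flat Loop, so that no GoSub is ever issued from inside a step to restart the cycle. This is only a sketch under assumptions, not a tested replacement for the posted job: RunOneCycle and PickSleepTime are hypothetical labels, while GetSwitchValue (with its Deffun), the 'PortalProject.ExitSwitch' key, ATTEMPT_NO and WAITTIME1 to WAITTIME4 are taken from the posted Option 2 code.

```basic
* Sketch only: a flat loop instead of "gosub Step10" from inside the steps.
* RunOneCycle and PickSleepTime are hypothetical labels.
      ExitCode = ''
      Loop
         GoSub RunOneCycle          ;* runs the sub jobs once, ends with Return
         ExitCode = GetSwitchValue('PortalProject.ExitSwitch')
      Until ExitCode = 'Y' Do
         GoSub PickSleepTime        ;* sets SLEEPTIME, ends with Return
         Sleep SLEEPTIME
      Repeat
      GoSub NormalEnd               ;* NormalEnd must also end with Return
      GoTo Finish

RunOneCycle:
      * Attach, run and check each sub job here, one after the other.
      * Set ATTEMPT_NO to 0 when rows were found, else add 1 to it.
      Return

PickSleepTime:
      Begin Case
         Case ATTEMPT_NO = 0
            SLEEPTIME = WAITTIME1
         Case ATTEMPT_NO = 1
            SLEEPTIME = WAITTIME2
         Case ATTEMPT_NO = 2
            SLEEPTIME = WAITTIME3
         Case @True
            SLEEPTIME = WAITTIME4
      End Case
      Return

Finish:
```

With this shape every GoSub is matched by a Return before the next cycle starts, so the depth of the call stack stays the same no matter how many cycles run.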