Fatal Error via email

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Post by vijayrc »

Friends,
Just wondering how to get this one done.
When a job fails, is there a way to capture just the 'Fatal' entry in the job log and send it out as an email?
:roll:
-Vijay
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you happy to write a routine?
You can call it from a Routine activity in the controlling job sequence.
Use DSGetNewestLogId() to get the most recent fatal error, or DSGetLogSummary() to get all of them.
Then use DSSendMail() to send the email.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

And at the Unix level you can detect job failure and get the FATAL error message by using dsjob -logsum with the type set to FATAL (-type FATAL), I believe. Store that in a file and then send an email using mailx.
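A minimal sketch of that approach, assuming dsjob is on the PATH (for example after sourcing dsenv) and that mailx is installed. The function name, project, job and recipient are hypothetical, and whether dsjob prints anything at all when there are no fatal entries should be checked on your own server.

```shell
#!/bin/sh
# Sketch only: mail the FATAL entries of a job log.
# Project, job and recipient are hypothetical values supplied by the caller.

mail_fatal_entries() {
    project=$1
    job=$2
    recipient=$3
    outfile=${TMPDIR:-/tmp}/${job}_fatal.$$

    # -type FATAL limits the log summary to fatal entries
    dsjob -logsum -type FATAL "$project" "$job" > "$outfile" 2>/dev/null

    # Send mail only when something was captured
    if [ -s "$outfile" ]; then
        mailx -s "DataStage job $job: fatal log entries" "$recipient" < "$outfile"
        sent=yes
    else
        sent=no
    fi
    rm -f "$outfile"
    echo "$sent"
}

# Example call (hypothetical names):
# mail_fatal_entries MyProject MyJob someone@example.com
```

The function prints yes or no, so a calling script can tell whether a mail went out.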
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

I just wrote a script last week. It may be useful for you.
You can use this code to fetch the latest warning or fatal error from the job's most recent run. The output can go to standard output or to a file.

Code:

#!/usr/bin/env ksh
#Script to extract the Latest Fatal / Warning log from the job to a file.
#In addition to that, Oracle errors in ORA-nnnn format will be extracted

#Input parameters

Project=$1
Job=$2
Filename=$3

#Internal Parameter declaration
FatalEvent='Yes'
WarningEvent='Null'

#Clear the file if it already exists.
: > "$Filename"

#Function to fetch the timestamp in YYYYMMDDHHMISS format
#Expects the words of a dsjob date line as arguments:
#$6 = month name, $7 = day, $8 = HH:MM:SS, $9 = year
TimeStamp() {
Year=$9
Month=$6
case $Month in
    Jan) Month=01 ;;
    Feb) Month=02 ;;
    Mar) Month=03 ;;
    Apr) Month=04 ;;
    May) Month=05 ;;
    Jun) Month=06 ;;
    Jul) Month=07 ;;
    Aug) Month=08 ;;
    Sep) Month=09 ;;
    Oct) Month=10 ;;
    Nov) Month=11 ;;
    Dec) Month=12 ;;
esac

Day=$7
Hour=`echo $8|cut -c'1-2'`
Minute=`echo $8|cut -c'4-5'`
Second=`echo $8|cut -c'7-8'`

JobTime=${Year}${Month}${Day}${Hour}${Minute}${Second}

echo $JobTime
}

cd `cat /.dshome`
#Job Start Time stamp for the current run
JobStartTime=`dsjob -jobinfo $Project $Job | grep 'Job Start Time'`
JobStartTime=`TimeStamp $JobStartTime`
echo JobStartTime :  $JobStartTime

#Job End Time stamp for current run
JobEndTime=`dsjob -jobinfo $Project $Job | grep 'Last Run Time'`
JobEndTime=`TimeStamp $JobEndTime`
echo JobEndTime :  $JobEndTime

#Latest available Fatal event id for the given job
FatalEventId=`dsjob -lognewest $Project $Job FATAL | awk '{print $4}'`
echo Fatal Event id $FatalEventId

if (( ${FatalEventId:-0} != 0 )) #If there is a fatal event in the log...
then
     #Timestamp of Fatal error event
     dsjob -logdetail $Project $Job $FatalEventId > ${Filename}_temp
     FatalTime="Dummy1 Dummy2 "`grep "Time" ${Filename}_temp` #Padded so the date words land in $5..$9, as TimeStamp expects
     FatalTime=`TimeStamp $FatalTime`

     #Check whether the event belongs to the current run
     if (( $FatalTime >= $JobStartTime ))
     then
          MainEventId=$FatalEventId
     else
          FatalEvent='No'
     fi
else
     FatalEvent='No'
fi

#If there is no Fatal error for the current run...
if [ "$FatalEvent" = 'No' ]
then
     #Latest available Warning event id
     WarningEventId=`dsjob -lognewest $Project $Job WARNING | awk '{print $4}'`
     if (( $WarningEventId != 0 ))
     then

          #Check whether the event belongs to the current run
          dsjob -logdetail $Project $Job $WarningEventId > ${Filename}_temp
          LogDetail=`grep "Time" ${Filename}_temp`
          WarningTime="Dummy1 Dummy2 "$LogDetail
          WarningTime=`TimeStamp $WarningTime`
          if (( $WarningTime >= $JobStartTime ))
          then
               MainEventId=$WarningEventId
          else
               echo Warning is not for current run.
               WarningEvent='No'
          fi
     else
          WarningEvent='No'
     fi
fi


if [ "$FatalEvent" = 'No' ] && [ "$WarningEvent" = 'No' ]
then
     echo There is neither Fatal error nor Warning for this current run.
else
      #To explore 2 events up and down in addition to the current FATAL or WARNING event to get more detail
      EndLoop=`expr $MainEventId + 2`
      EndJobCheck=`dsjob -lognewest $Project $Job | awk '{print $4}'`
      #If the FATAL or WARNING event is the last or last but one
      if (( $EndJobCheck < $EndLoop ))
      then
          EndLoop=$EndJobCheck
      fi
      i=`expr $MainEventId - 2`
      #Do not start before the first event in the log
      if (( $i < 0 ))
      then
          i=0
      fi
      while (( $i <= $EndLoop )); do
          dsjob -logdetail $Project $Job $i > ${Filename}_temp
          Type=`grep "Type" ${Filename}_temp | awk '{print $3}'`
          if [ "$Type" = 'FATAL' ] || [ "$Type" = 'WARNING' ]
          then
              cat ${Filename}_temp >> $Filename
              #echo `echo $LogDetail >> ${Job}${JobStartTime}-${JobEndTime}\n`
              #Pull any ORA-nnnnn message out of this event's detail
              echo Oracle Error If Any...`sed -n 's/.*\(ORA-[0-9]\{5\}:.*\)/\1/p' ${Filename}_temp`
          fi
          i=`expr $i + 1`
      done
fi

#Removing temp file created
rm -f ${Filename}_temp
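A note on the positional parameters in TimeStamp: the function is handed one line of dsjob output unquoted, so the shell splits it into words and $6 to $9 land on the month, day, time and year. The sketch below replays that with a hard-coded sample line. The sample text is an assumption about what dsjob -jobinfo prints (a label, a colon, then a date such as Mon Apr 02 10:31:00 2006), and show_fields is a hypothetical helper, so compare it with the output on your own server.

```shell
#!/bin/sh
# Sketch only: show how an unquoted line is split into $1..$9.
# The sample line is an assumed format, not captured from a real server.

show_fields() {
    # $1..$4 = "Job Start Time :", $5 = weekday, $6 = month name,
    # $7 = day of month, $8 = HH:MM:SS, $9 = year
    echo "month=$6 day=$7 time=$8 year=$9"
}

sample='Job Start Time : Mon Apr 02 10:31:00 2006'
fields=$(show_fields $sample)   # deliberately unquoted so the shell splits it
echo "$fields"
```

The log detail line starts with fewer label words, which is why the script pads it with two dummy words before calling TimeStamp.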
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
abc123
Premium Member
Posts: 605
Joined: Fri Aug 25, 2006 8:24 am

Post by abc123 »

Hello everyone, I am looking to do the exact same thing as the original poster. I have a few questions about Kumar_s's script.

Q1: Lines 5, 6, 7: Your script expects input parameters of Project, Job and Filename. If I have a job sequence with 10 jobs and the 5th one failed, how do I know what "Job" is?

Q2: Line 81: What is $4, the 4th parameter?

Q3: Line 15,16,42: You are expecting the year, month, timestamp and day values to be passed. Shouldn't you be just taking the current system values?

Q4: If I want to call this when a sequence job aborts, where would I put this code?

I would appreciate it if anybody could answer these questions. Thanks.

-------------------------------------------------------------------------------------
1)#!/usr/bin/env ksh
2)#Script to extract the Latest Fatal / Warning log from the job to a file.
3)#In addition to that, Oracle errors in ORA-nnnn format will be extracted

4)#Input parameters

5)Project=$1
6)Job=$2
7)Filename=$3

8)#Internal Parameter declaration
9)FatalEvent='Yes'
10)WarningEvent='Null'

11)#Clear the file if already exist.
12)echo > $Filename

13)#Function to fetch the timestamp in YYYMMDDHHMISS format
14)TimeStamp() {
15)Year=$9
16)Month=$6
17)if [ $Month = 'Jan' ]
18) then Month=01
19)elif [ $Month = 'Feb' ]
20) then Month=02
21)elif [ $Month = 'Mar' ]
22) then Month=03
23)elif [ $Month = 'Apr' ]
24) then Month=04
25)elif [ $Month = 'May' ]
26) then Month=05
27)elif [ $Month = 'Jun' ]
28) then Month=06
29)elif [ $Month = 'Jul' ]
30) then Month=07
31)elif [ $Month = 'Aug' ]
32) then Month=08
33)elif [ $Month = 'Sep' ]
34) then Month=09
35)elif [ $Month = 'Oct' ]
36) then Month=10
37)elif [ $Month = 'Nov' ]
38) then Month=11
39)elif [ $Month = 'Dec' ]
40) then Month=12
41)fi

42)Day=$7
43)Hour=`echo $8|cut -c'1-2'`
44)Minute=`echo $8|cut -c'4-5'`
45)Second=`echo $8|cut -c'7-8'`

46)JobTime=${Year}${Month}${Day}${Hour}${Minute}${Second}

47)echo $JobTime
48)}

49)cd `cat /.dshome`
50)#Job Start Time stamp for the current run
51)JobStartTime=`dsjob -jobinfo $Project $Job | grep 'Job Start Time'`
52)JobStartTime=`TimeStamp $JobStartTime`
53)echo JobStartTime : $JobStartTime

54)#Job End Time stamp for current run
55)JobEndTime=`dsjob -jobinfo $Project $Job | grep 'Last Run Time'`
56)JobEndTime=`TimeStamp $JobEndTime`
57)echo JobEndTime : $JobEndTime

58)#Latest available Fatal even id for the given job
59)FatalEventId=`dsjob -lognewest $Project $Job FATAL | awk '{print $4}'`
60)echo Fatal Evenid $FatalEventId

61)if (( $FatalEventId != 0 )) #If there is no fatal error at all...
62)then
63) #Timestamp of Fatal error event
64) `dsjob -logdetail $Project $Job $FatalEventId > ${Filename}_temp`
65) FatalTime="Dummy1 Dummy2 "`cat ${Filename}_temp | grep "Time"` #Padded with required number of parameters
FatalTime=`TimeStamp $FatalTime`

66) #Check for the event whether it belongs to current run
67) if (( $FatalTime >= $JobStartTime ))
68) then
69) MainEventId=$FatalEventId
70) else
71) FatalEvent='No'
72) fi
73)else
74) FatalEvent='No'
75)fi

76)#If there is no Fatal error in the current run...
77)if ([ $FatalEvent = 'No' ])
78)then
79) #Latest available Warning event id
80) WarningEventId=`dsjob -lognewest $Project $Job WARNING |
81)awk '{print $4}'`
82) if (( $WarningEventId != 0 ))
83) then

84) #Check for the event wehter its belongs to current run
85) `dsjob -logdetail $Project $Job $WarningEventId > ${Filename}_temp`
86) LogDetail=`cat ${Filename}_temp | grep "Time"`
87) WarningTime="Dummy1 Dumm2 "$LogDetail
88) WarningTime=`TimeStamp $WarningTime`
89) if (( $WarningTime >= $JobStartTime ))
90) then
91) MainEventId=$WarningEventId
92) else
93) echo Warning is not for current run.
94) WarningEvent='No'
95) fi
96) else
97) WarningEvent='No'
98) fi
99)fi


100)if ([ $FatalEvent = 'No' ] && [ $WarningEvent = 'No' ])
101)then
102) echo There is neither Fatal error nor Warning for this current run.
103)else
104) #To explore 2 events up and down in addition to the current FATAL or WARNING event to get more detail
105) EndLoop=`expr $MainEventId + 2`
106) EndJobCheck=`dsjob -lognewest $Project $Job | awk '{print $4}'`
107) #If the FATAL or WARNING event is last but one
108) if (( $EndJobCheck < $EndLoop ))
109) then
110) EndLoop=$EndJobCheck
111) fi
112) i=`expr $MainEventId - 2`
113) while (( $i <= $EndLoop )); do
114) `dsjob -logdetail $Project $Job $i > ${Filename}_temp`
115) Type=`cat ${Filename}_temp | grep "Type" | awk '{print $3}'`
116) if ([ $Type = 'FATAL' ] || [ $Type = 'WARNING' ])
117) then
118) `cat ${Filename}_temp >> $Filename`
119) #echo `echo $LogDetail >> ${Job}${JobStartTime}-${JobEndTime}\n`
120) echo Oracle Error If Any...`echo $LogDetail | sed 's/\(.*\)\(ORA-[0-9]\{5,5\}:\)\(.*\)/\2\3/'`
fi
121) i=`expr $i + 1`

122) #Removing temp file created
123) rm ${Filename}_temp
124) done
125)fi
-------------------------------------------------------------------------------------
Roopanwita
Participant
Posts: 125
Joined: Mon Sep 11, 2006 4:22 am
Location: India

Post by Roopanwita »

Hi,
I got a query for the post...
which parameter is $4?
Can I use DSRUNJOB script of Ascential to run the sequence?
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Post by sud »

Roopanwita wrote:Hi,
I got a query for the post...
which parameter is $4?
Can I use DSRUNJOB script of Ascential to run the sequence?
The "$4" used with awk just prints the fourth column of the output, which in this case is only the fatal or warning event ID, so you don't have to supply it as input. The script (kumar's post) is a shell script; it needs to be invoked from the sequence in the event of a failure, so it has to be incorporated into your job sequence.
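A quick way to see this for yourself, assuming dsjob -lognewest prints a line of the form "Newest id = 123" (the sample line below is a stand-in, not captured output):

```shell
#!/bin/sh
# Sketch only: the sample line stands in for real dsjob -lognewest output.
sample='Newest id = 123'

# awk splits on whitespace: $1=Newest  $2=id  $3="="  $4=123
event_id=$(echo "$sample" | awk '{print $4}')
echo "$event_id"
```

If the output on your server has a different shape, the column number in the awk command has to change to match.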

By the way, did you try including the job status in the email using the built in functionality?
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
Roopanwita
Participant
Posts: 125
Joined: Mon Sep 11, 2006 4:22 am
Location: India

Post by Roopanwita »

Thanks for the clarification. Using the built-in stage, I am able to send only the job status (finished / finished with warnings / aborted), so I am trying to write a routine or script that will also capture the error message from the log.
While searching I came across this script.

I have one more question: if the script captures the job's log, it will basically be reading from the DataStage log file, so should I also specify the path in the parameter file?
I may be wrong, so can you please clarify?

Thanks in advance..
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Roopanwita wrote: By using the built-in stage, I am able to send only the job status (finished/finished with warning/aborted).
I honestly don't believe this to be true. It's been a while, but from what I remember it actually does include the failure message(s) from any problem job when you enable that option. If I get a chance, I'll test that out on my system.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Roopanwita
Participant
Posts: 125
Joined: Mon Sep 11, 2006 4:22 am
Location: India

Post by Roopanwita »

Thanks for reply.
Then I am not aware of it. Can you please guide me?
Thanks ...
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Post by sud »

Roopanwita wrote:Thanks for reply.
Then I am not aware of it. Can you please guide me?
Thanks ...
Well, do this: in your job sequence, take a Notification activity from the palette and open its properties (where you specify the to and from addresses as well as the subject and body), look for the "Include job status in email" check box, switch it on, and then see what happens.
Roopanwita
Participant
Posts: 125
Joined: Mon Sep 11, 2006 4:22 am
Location: India

Post by Roopanwita »

Thanks for your help.
This process is working fine, but my requirement is different.
Here the error message sent is static. I want to capture the error message from the job log and send it through the mail notification.

Can you please help me out.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, while the body text you supply in the stage would be 'static', checking the 'Include job status' option would dynamically include the error text as well.

Have you actually tried it with a failing sequence? Did your experience not match the advice given?