
Log files of a DataStage job

Posted: Fri Nov 26, 2004 3:01 am
by rajeev_prabhuat
Hi,

I am working in DS 7.x using Parallel Extender, and I am new to DS 7.x.

I developed four jobs, ran them yesterday evening, and went home. Each job loads about 1 million rows. When I checked this morning, only 400,000 rows had been loaded and all four jobs had aborted. I tried to view the log for one of the jobs, but Director just says "Retrieving" and hangs, possibly because of a huge log file.

My question is: apart from Director, can we view the log manually? Is the log located on the server? If so, please let me know in which directory it is located.

Regards,
Rajeev Prabhu

Posted: Fri Nov 26, 2004 3:28 am
by ray.wurlod
DataStage job logs are located on the server.

However, they are not log files as you have assumed; they are database tables in the repository.

If you determine the job number of your job, from the DS_JOBS table, then the log for that job is a table called RT_LOGnnn, where nnn is the job number.
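For example, a query along these lines should return the number (the job name here is a placeholder):

Code: Select all

SELECT JOBNO FROM DS_JOBS WHERE NAME = 'MyJobName';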

You can view them from the operating system using the command line interface dsjob with its -log... options.
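For example, a summary of the log can be pulled with something like this (the project and job names are placeholders):

Code: Select all

dsjob -logsum -max 20 myproject myjob
There are also -logdetail and -lognewest options for retrieving a single event or the newest event id.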

You can view them from the Administrator client command window with a regular query, for example

Code: Select all

SELECT * FROM RT_LOGnnn;
Search the forum for more information about viewing and archiving DataStage job logs.

Posted: Fri Nov 26, 2004 4:29 am
by rajeev_prabhuat
Hi,

Thank you, Ray.

But when I was going through the DataStage documents that we get during installation, I saw that the logs are stored in hashed files. Is there any way to reduce the size of the log files?

Regards,
Rajeev Prabhu

Posted: Fri Nov 26, 2004 2:52 pm
by ray.wurlod
Purge the log files of old entries on a regular basis. Enable auto-purge wherever possible.

Create job designs that don't generate warnings or unnecessary informational messages.

Periodically (ideally after purging), compress the internal space in the hashed files using the command

Code: Select all

RESIZE RT_LOGnnn * * *
either from the Administrator client command window or from the DataStage shell (dssh).
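From the operating system it might look like this (a sketch; the project path is a placeholder, and nnn is your job number):

Code: Select all

cd /path/to/your/project
$DSHOME/bin/dssh
>RESIZE RT_LOGnnn * * *
>QUIT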

Posted: Tue Nov 30, 2004 10:51 pm
by rajeev_prabhuat
Hi,

Ray, as you said, there is a log table for each job. Can we take the data from that table and load it into an Oracle table to monitor our jobs?

My requirement is something like this: we have 42 jobs, all one-to-one mappings, and we want to store each job's name, start time, end time, status and, if it failed, the error that occurred. Then I thought of what you said about the table in UniVerse. If this is possible, please let me know how to extract the data.

Regards,
Rajeev

Posted: Tue Nov 30, 2004 11:54 pm
by ray.wurlod
A hashed file is a method of implementing database tables in DataStage, UniVerse, UniData and other databases.

Columns in the log files are:

Code: Select all

@ID          Numeric event identifier (or "//..." for control records)
TYPE         Enumerated severity level of message
WAVE.NO      Internal number to identify run instance
TIMESTAMP    Date and time when event logged
MSG.ARGS     Parameters to add to message text (multi-valued)
MSG.TEXT     Fixed part of message text
FULL.TEXT    MSG.TEXT with parameters substituted from MSG.ARGS
SEVERITY     TYPE decoded into a text string
The first three are INTEGER, the remainder are VARCHAR.

Posted: Wed Dec 01, 2004 12:34 am
by rajeev_prabhuat
Hi,

Ray, how can we access this table in UniVerse? Can we access it with the same user ID and password, or is there some other way?

Can I get the start and end times of the job from the log, from the TIMESTAMP column as specified above? If so, that would be good.

Regards,
Rajeev Prabhu

Posted: Wed Dec 01, 2004 2:40 am
by ray.wurlod
You can create a DataStage job that retrieves the log information, using the metadata I supplied, through either a UV stage or a hashed file stage. If you use a hashed file stage, the key column should be called AT.ID rather than @ID. Either stage type can apply a constraint on any column in the usual way.

From there you can do whatever you like with the contents of the log file.

If you feel up to the challenge, you can learn about the log interrogation functions from on-line help. They also have equivalents in the dsjob command line options.
  • DSGetLogSummary
  • DSGetNewestLogId
  • DSGetLogEntry
If you are getting large log files, remember to purge them regularly. If a log reaches 2GB, and preventive action (switching to 64-bit internal addressing) has not been taken, the log file becomes corrupted and your job is guaranteed to abort. :cry:
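The preventive command is along these lines (a sketch; assuming your UniVerse release accepts the 64BIT keyword on RESIZE):

Code: Select all

RESIZE RT_LOGnnn * * * 64BIT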

Posted: Mon Dec 13, 2004 7:20 am
by rajeev_prabhuat
Hi Ray,

If I am using DSGetLogSummary to get the log details, how do I retrieve them? Should I write a routine?

Regards,
Rajeev Prabhu

Posted: Mon Dec 13, 2004 2:34 pm
by ray.wurlod
You WILL need to write a routine to use the DSGetLogEntry() function. The function can only retrieve the details of a single logged event.

However, why are you using these functions at all in a DataStage job? Use a UV stage or hashed file stage to select from the log, and you automatically get everything useful that's in the log. Send this to any destination of your choice.
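If you do want the routine, a minimal sketch might look like this (the job name arriving in InputArg is an assumption, and error handling is omitted):

Code: Select all

$INCLUDE DSINCLUDE JOBCONTROL.H

* Attach to the job whose log is to be read
JobHandle = DSAttachJob(InputArg, DSJ.ERRNONE)

* Id of the newest event of any type in that job's log
EventId = DSGetNewestLogId(JobHandle, DSJ.LOGANY)

* Details of that single logged event
EventDetail = DSGetLogEntry(JobHandle, EventId)

ErrCode = DSDetachJob(JobHandle)
Ans = EventDetail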

Posted: Tue Dec 14, 2004 3:04 am
by rajeev_prabhuat
Hi Ray,

What are the basic requirements for accessing the table through the UV database? I tried creating the metadata from what you had given, but I could not get the SELECT to work successfully; it showed an error saying that LOG_RT26 (the log table for a particular job) is not available. I tried it through the command line and got the log name.

Regards,
Rajeev Prabhu

Posted: Tue Dec 14, 2004 2:08 pm
by ray.wurlod
The fundamental requirement is to get the table name correct! :oops:

Code: Select all

SELECT TIMESTAMP, SEVERITY, FULL.TEXT FROM RT_LOG26;

Posted: Tue Dec 14, 2004 9:51 pm
by rajeev_prabhuat
Hi Ray,

Thank you.

Yesterday I created a routine which gives the details of the job (job name, start time, end time, elapsed time, rows read, rows written, and job status). Along with that, I created an array variable in the routine to collect the DSGetLogSummary data, but it gives me error value -13. The variable I have declared is as follows:

LogEntries[10] = DSGetLogSummary(JobHandle,DSJ.LOGINFO,JobStarted,JobEnded,0)

I think this is not the right way to get the details into the array. Can you tell me how to get the details into an array, and how to split them out?

Regards,
Rajeev Prabhu

Posted: Tue Dec 14, 2004 10:14 pm
by ray.wurlod
Need to see your whole routine. Post it here, surrounded by Code tags (select the whole code, then click on the Code button above).

The JobHandle variable must have been assigned by an earlier DSAttachJob() call.
The result from DSGetLogSummary is a "dynamic array". Fields in dynamic arrays are referred to using angle bracket notation.

Result = DSGetLogSummary(...)
FirstField = Result<1>
SecondField = Result<2>

and so on.

You can also use the MATPARSE statement (refer to DataStage BASIC manual) to load the contents of a dynamic array into a dimensioned array. Dimensioned array elements are accessed with the index value in parentheses (not square brackets).
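Putting that together, a minimal sketch (JobHandle, JobStarted and JobEnded are assumed to be already set up; note that DSGetLogSummary expects its start and end times in the original "YYYY-MM-DD HH:MM:SS" form, and, assuming the standard DSJE return codes, your -13 is DSJE.BADTIME, an invalid StartTime or EndTime argument):

Code: Select all

* Summary of all event types between the two timestamps
Summary = DSGetLogSummary(JobHandle, DSJ.LOGANY, JobStarted, JobEnded, 0)

* Fields of the dynamic array, one per logged event
FirstEvent = Summary<1>
SecondEvent = Summary<2>

* Or load the whole dynamic array into a dimensioned array
DIM Events(100)
MATPARSE Events FROM Summary

* Dimensioned array elements take parentheses
FirstEventAgain = Events(1)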

Posted: Wed Dec 15, 2004 12:31 am
by rajeev_prabhuat
Hi Ray,

I am attaching the code for your perusal.

Code: Select all

$IFNDEF JOBCONTROL.H
$INCLUDE DSINCLUDE JOBCONTROL.H
$ENDIF

ErrorCode = 0  ;* set this to non-zero to stop the stage/job

ReportType = 0
LogEntries = 0
DirName = InputArg
Suffix = ".txt"
JobHandle = DSJ.ME
JobName = DSGetJobInfo(JobHandle, DSJ.JOBNAME)
JobInvId = DSJobInvocationId
If JobInvId ne "" Then
   JobName := ".":JobInvId
End

JobStarted = Convert(" :-", "_", DSGetJobInfo(JobHandle, DSJ.JOBSTARTTIMESTAMP))
JobEnded = Convert(" :-", "_", DSGetJobInfo(JobHandle, DSJ.JOBEOTTIMESTAMP))

Alias = ""
If JobInvId eq "" Then
   Alias = DSGetIdForJob(JobName, ErrorCode)
End

If Alias ne "" Then
   FileName = JobName:"_":Alias
End Else
   FileName = JobName:"_":JobStarted
End

FileName := Suffix

ReportText = DSMakeJobReport(JobHandle, ReportType, "LF")

* Get the required counts from the job links
varSrcCount1 = DSGetLinkInfo(JobHandle, 'Sequential_File_4', 'In', DSJ.LINKROWCOUNT)

* NB: square brackets here are BASIC substring syntax, not array indexing
LogEntries[10] = DSGetLogSummary(JobHandle, DSJ.LOGINFO, JobStarted, JobEnded, 0)

varSrcCount2 = 0
varInsCount = 0
varSrcCount = varSrcCount1 + varSrcCount2
varUpdCount = 0
varDelCount = 0

* Build the comma-separated audit record from the job report text
Audit_info = '"1",'
Audit_info := '"':Field(Trims(Field(ReportText, Char(10), 3)), ' ', 5):'",'
Audit_info := '"':'LOADING':'",'
Audit_info := '"':Right(Trims(Field(ReportText, Char(10), 5)), 19):'",'
Audit_info := '"':Right(Trims(Field(ReportText, Char(10), 6)), 19):'",'
Audit_info := '"':Right(Trims(Field(ReportText, Char(10), 7)), 8):'",'
Audit_info := '"':Right(Field(Trims(Field(ReportText, Char(10), 8)), ' ', 2), 0):'",'
Audit_info := varSrcCount:','
Audit_info := varInsCount:','
Audit_info := varUpdCount:','
Audit_info := varDelCount:','
Audit_info := LogEntries[4]:'"'

* Write the audit record to a file in DirName
OpenPath DirName To t.fvar Then
   Write Audit_info To t.fvar, FileName Else
      Call DSLogWarn("Failed to write file ":FileName:" to directory ":DirName, "Routine_call_check")
   End
End Else
   Call DSLogWarn("Failed to open directory ":DirName, "Routine_call_check")
End
Close t.fvar

* Audit code: write the full job report as well
AuditText = DSMakeJobReport(JobHandle, 1, "LF")
AuditFileName = "Audit_":FileName
OpenPath DirName To t.fvar Then
   Write AuditText To t.fvar, AuditFileName Else
      Call DSLogWarn("Failed to write file ":AuditFileName:" to directory ":DirName, "Routine_call_check")
   End
End Else
   Call DSLogWarn("Failed to open directory ":DirName, "Routine_call_check")
End
Close t.fvar
I hope you can give me some input on this.

Regards,
Rajeev Prabhu