
How can we import data from RT_LOG files

Posted: Tue Jul 09, 2013 11:35 am
by jreddy
Looking for suggestions on how to retrieve and view data from a job's RT_LOGXXX file - thanks in advance.

It looks like our log has exceeded the 2GB limit, so it takes forever and multiple attempts to open it from Director.

Posted: Tue Jul 09, 2013 11:51 am
by ArndW
The RT_LOGnnn files are DataStage hashed files. If you've gone over the 2GB limit you will have a corrupted file, which is most likely easier to truncate than to repair. The quickest way to fix the file is to delete all the entries from it in Director (Job -> Clear Log -> immediate purge / clear all entries).

If you need to read the entries you could either write a server job to read this hashed file, or go through the server API function DSGetLogInfo() to retrieve the messages.
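
For what it's worth, here is a rough sketch of the API route in DataStage BASIC (as job control code or a server routine). It leans on DSAttachJob and DSGetLogSummary from the documented BASIC interface; the job name and output path are placeholders and I haven't run this exact code, so treat it as a starting point only.

    $INCLUDE DSINCLUDE JOBCONTROL.H

    * Attach to the job whose log we want to dump ("MyJob" is a placeholder).
    JobHandle = DSAttachJob("MyJob", DSJ.ERRFATAL)
    If NOT(JobHandle) Then
       Call DSLogFatal("Could not attach to MyJob", "DumpLog")
    End

    * Summary of every log entry: all types, no time window, no entry limit.
    Summary = DSGetLogSummary(JobHandle, DSJ.LOGANY, "", "", 0)

    * Entries come back as a dynamic array - turn field marks into newlines.
    Convert @FM To Char(10) In Summary

    * Dump the lot to a flat file (path is a placeholder).
    OpenSeq "/tmp/MyJob_log.txt" To LogFile Else
       Create LogFile Else Call DSLogFatal("Cannot create output file", "DumpLog")
    End
    WeofSeq LogFile              ;* truncate any previous contents
    WriteSeq Summary To LogFile Else Call DSLogWarn("Write failed", "DumpLog")
    CloseSeq LogFile

    ErrCode = DSDetachJob(JobHandle)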

Posted: Tue Jul 09, 2013 4:01 pm
by ray.wurlod
Of course it takes a long time. How long does it take you to read through a 2GB document? And the log has to be sorted.

The moral of this story is to purge old entries from job logs regularly.

Incidentally, if you can eventually read the log, then it's not corrupted. So it may not yet have reached its 2GB limit.

Posted: Tue Jul 09, 2013 5:00 pm
by rameshrr3
Incidentally, even a single job run processing a large number of rows with 2 or 3 warning entries for every record will push the log file beyond the 2GB limit, and you may see a strange error mentioning the BLINK (backwards link) in Director if you attempt to open that job's log. The solution is to kill all client sessions, log on to the server UV command line (UVSH), find the log file for that job and clear it from UV using CLEAR.FILE. To see what warning was generated, run the job in debug mode with all stages set to stop after a few rows (10 or 50) using the Limits tab of the job run dialog, and keep increasing this threshold until the first warning (or set of warnings) appears. Sometimes bad rows start only much later in the input data.
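
If it helps, the command sequence looks roughly like this (paths, project, job name and the RT_LOG number are all placeholders, and it assumes the usual DS_JOBS dictionary entries - double-check the job number before you CLEAR.FILE anything):

    cd /path/to/YourProjectDirectory
    $DSHOME/bin/uvsh

    LIST DS_JOBS JOBNO WITH NAME = "MyJob"
    CLEAR.FILE RT_LOG1234
    QUIT

The LIST against DS_JOBS maps the job name to its number nnn, and CLEAR.FILE RT_LOGnnn then empties that job's log file.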

Posted: Wed Jul 17, 2013 10:23 am
by jreddy
Thank you all - every once in a while this job seems to have data issues, all 400K or 700K input rows get rejected, and that is when the logs fill up. Because of this the load would abort with a -99 error, Director cannot open the log, etc. But unless I know what those errors are we won't be able to find the root cause and fix it for good, which is the reason I asked this question.

However, the solution I figured out was to read the RT_LOGXXXX file as a source and dump it into a sequential file. I looked for ways to import the metadata of the RT_LOG hashed file but couldn't, so I built a table definition myself, trying different data types and sizes until I was able to read the data. I then dumped it into a sequential file and was able to see all the error messages.
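
As an aside, the dsjob command line client can apparently pull the same information without reading the hashed file directly - something like this, where the project, job name and event id are placeholders (going from the documented -logsum / -logdetail options; I haven't tried these exact switches myself):

    dsjob -logsum -type WARNING -max 200 MyProject MyJob
    dsjob -logdetail MyProject MyJob 1234

-logsum lists event ids with message summaries and -logdetail prints the full text of a single event.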

I am posting this solution in case anyone else has a similar problem. I will mark this post resolved as well. Thanks again for all your responses.

Posted: Wed Jul 17, 2013 11:54 am
by chulett
One of my rules is to never run anything with unlimited errors. Keep the warning limit small so you can get some error information logged but you don't blow out the logs in the process.
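
For example, when starting a job from the command line, something like this caps the warnings (and optionally the rows) for a run - the numbers, project and job names are just placeholders:

    dsjob -run -warn 50 -rows 1000 MyProject MyJob

The Limits tab of the Director run dialog does the same thing for interactive runs.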

Posted: Wed Jul 17, 2013 1:32 pm
by rameshrr3
Although not recommended by anyone [myself included :P], you can possibly create a Q pointer in your project's VOC file to the job's log file and then import it as a UniVerse file definition to read its records, using a Hashed File stage.
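
For the record, a Q pointer is just a three-field record in the VOC (created with ED VOC or written from BASIC). Everything below - the record name, account and log number - is a placeholder, so double-check it against your own setup:

    MYJOB.LOG
    0001: Q
    0002: MYPROJECT
    0003: RT_LOG1234

Field 2 is the account (project) the file lives in and field 3 is the file it points to; once the pointer resolves you can import it as a UniVerse table definition as described above.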

Posted: Wed Jul 17, 2013 4:45 pm
by ray.wurlod
:shock:
There should already be an "F" pointer in the VOC for each log hashed file.
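
To check, from the UV prompt in the project directory, something like this (the job number is a placeholder) should display a record with F in field 1 pointing at the RT_LOGnnn file under the project directory:

    CT VOC RT_LOG1234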