Using a routine to write to Oracle tables

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Yohanes
Participant
Posts: 2
Joined: Wed Jan 29, 2003 10:12 pm
Location: Australia

Using a routine to write to Oracle tables

Post by Yohanes »

Hi, All!

I'm using DataStage 6. I'd like to store some statistics of a job into an Oracle table (8.1.7). The statistics would be job name, start date/time, end date/time, job status, number of processed rows (main input).

I'm thinking to use Before/After-job subroutine. Ideally, Before-job subroutine will insert a new record containing job name, start date/time and job status into the table, and After-job subroutine will update the record with end date/time, job status and number of processed rows.

Questions:
1. Is this the best way to achieve this? Any other suggestions?
If so..
2. Can I use a routine to insert/update into an Oracle table? How exactly is this done? Examples would be helpful.
3. How do I obtain job status and number of processed rows?

Thanks in advance.
- Yohanes -
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

There are several ways to do this, most of them can be found by doing searches through the archive for SQLPLUS or similar strings.

1) Output your information to a text file from a routine using the sequential file commands, then run a job that loads the file into Oracle.
2) Shell out to the operating system and run the SQLPLUS command, passing it an insert statement:
strLogin = dbLogin : "/" : dbPassword : "@" : dbName
strInsert = "insert into blah blah blah;"
Call DSExecute("UNIX", "echo " : strInsert : " | sqlplus -s " : strLogin, strOutput, intReturnCode)
3) Use the BASIC SQL Client Interface (BCI) to run database and recordset commands. Do a search through the forum for previous messages on this.
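For option 1, here is a minimal sketch of what the routine side might look like using the sequential file commands. The file path, field layout, and the JobName/StartTime/EndTime/RowCount variables are only placeholders for whatever you gather:

```
* Append one statistics line to a text file; a separate job
* can then load the file into Oracle.
StatsFile = "/tmp/job_stats.txt"   ;* example path only

OpenSeq StatsFile To hFile Then
   * Position at end of file so existing lines are kept
   Seek hFile, 0, 2 Else Null
End Else
   * File does not exist yet; Create makes it
   Create hFile Else Call DSLogFatal("Cannot create " : StatsFile, "JobStats")
End

Line = JobName : "," : StartTime : "," : EndTime : "," : RowCount
WriteSeq Line To hFile Else Call DSLogWarn("Write to " : StatsFile : " failed", "JobStats")
CloseSeq hFile
```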

Vincent McBurney
Data Integration Services
www.intramatix.com
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You could also look at doing this with Job Control with an optional second job.

The job control can get the information you need from the original job and/or the job's log. For example, look at DSGetLinkInfo() for a way to pull processed rows from a job. Check the online help for all DSGet* routines to give you a better idea of what kind of information can be gathered rather easily. Then either code something as Vincent suggests (BCI/SQL), or pass the information into a second job as parameters and have the second job update your statistics table.
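As a rough sketch of that job-control approach (the job, stage, link, and parameter names here are invented for illustration only):

```
* In the controlling code, after the first job has finished
hJob1 = DSAttachJob("LoadCustomers", DSJ.ERRFATAL)
Status = DSGetJobInfo(hJob1, DSJ.JOBSTATUS)
Rows   = DSGetLinkInfo(hJob1, "xfmMain", "lnkOutput", DSJ.LINKROWCOUNT)
ErrCode = DSDetachJob(hJob1)

* Pass the figures into a second job that does the Oracle update
hJob2 = DSAttachJob("UpdateStatsTable", DSJ.ERRFATAL)
ErrCode = DSSetParam(hJob2, "pJobStatus", Status)
ErrCode = DSSetParam(hJob2, "pRowCount", Rows)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob2)
ErrCode = DSDetachJob(hJob2)
```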

-craig
Yohanes
Participant
Posts: 2
Joined: Wed Jan 29, 2003 10:12 pm
Location: Australia

Post by Yohanes »

Actually, our requirement is to log the job-execution info in real time, or at least upon job completion.

I've searched the archive on job logging. Others have had the same or similar queries, but I couldn't find any answers. If anybody has dealt with this requirement before, I'd appreciate it if you would let me know.

Anyway, I'd first like to explore the option of using a Before/After routine to achieve this. So, can a routine write to Oracle tables? If it can, how exactly is that done?

Thanks again!
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi,

Use the Aggregator stage in your job and, in the transformation, capture the required info you need to log. Inside the Aggregator use the Last function to insert into the Oracle table.

I am using this technique and it is quite efficient.

Thanks
Rasi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Updating an Oracle table can be performed using BCI functions within BASIC, as Vincent suggested. However, to use these, you require a valid ODBC licence (on UNIX, the supplied ODBC drivers are only licensed to be used by DataStage, and generate a warning to this effect if used).
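A minimal BCI sketch, assuming a configured ODBC data source called "OraDSN" and example credentials and table names (most status checks abbreviated; in real code test every return against SQL.SUCCESS):

```
$INCLUDE UNIVERSE.INCLUDE ODBC.H

Status = SQLAllocEnv(hEnv)
Status = SQLAllocConnect(hEnv, hConn)
Status = SQLConnect(hConn, "OraDSN", "scott", "tiger")
If Status = SQL.SUCCESS Then
   Status = SQLAllocStmt(hConn, hStmt)
   SQLStmt = "INSERT INTO job_stats (job_name, started) VALUES ('MyJob', SYSDATE)"
   Status = SQLExecDirect(hStmt, SQLStmt)
   Status = SQLFreeStmt(hStmt, SQL.DROP)
   Status = SQLDisconnect(hConn)
End
Status = SQLFreeConnect(hConn)
Status = SQLFreeEnv(hEnv)
```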

Here's one suggestion about obtaining the information needed.

$IFNDEF JOBCONTROL.H
$INCLUDE DSINCLUDE JOBCONTROL.H
$ENDIF

hJob = DSJ.ME ; * if current job

JobName = DSGetJobInfo(hJob, DSJ.JOBNAME)
JobStart = DSGetJobInfo(hJob, DSJ.JOBSTARTTIMESTAMP)
JobFinishEvent = DSGetNewestLogId(hJob, DSJ.LOGSTARTED)
LogEntry = DSGetLogEntry(hJob, JobFinishEvent)
JobFinished = Field(LogEntry, "\", 1, 1)  ;* first field of the log entry is its timestamp
RowCount = DSGetStageInfo(hJob, StageName, DSJ.STAGEINROWNUM)
* You need to provide the name of the active stage that is
* accepting the "main input".


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
ariear
Participant
Posts: 237
Joined: Thu Dec 26, 2002 2:19 pm

Post by ariear »

You can also check the DSMakeJobReport function.
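For example, something along these lines in an after-job routine or job control code (report type 0 is the basic text summary; check the online help for the exact arguments):

```
* Build a text report for the current job and write it to the log
Report = DSMakeJobReport(DSJ.ME, 0, "CRLF")
Call DSLogInfo(Report, "JobReport")
```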
nigel
Premium Member
Posts: 35
Joined: Tue May 07, 2002 2:48 am
Location: South Africa/UK
Contact:

Post by nigel »

Hi

Looks like all the info you require is stored in the RT_STATUSxx file for the job where xx is the job number.
This file contains several types of entries, the valid types for your requirement being:

JOB => which gives you job stats of start time and end time
STAGE => number of rows processed on each link into the stage

The key to these files is a combination of JobName + StageName (if applicable) and wave number. The wave number can, I think, be seen as the job run number: the greater the wave number, the more recent the job run.

You could then use a UV stage to access this file and with some clever logic extract the info and then populate the Oracle target.

This is just an idea (which I have not fully explored) and based on my understanding of your problem. I could of course be very mistaken [:I] and I'm sure Ray (the master guru) would highlight any errors in my logic.

I hope this helps in some way

Regards

Nigel
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Two glitches in the logic spring immediately to mind.

First is that, while generally true, the highest wave number is not necessarily the most recent run.

Second, which is more of a show stopper, is that the metadata for the RT_STATUSxx table is not published, which makes it kinda difficult (without hacking) to use a UV stage.

A further difficulty that might arise is in the case where a job has aborted and been unable to update its RT_STATUSxx table. (This is often seen in Director as a job whose status remains Running for a long time, even though all of its active stages have finished.) You would not necessarily be retrieving accurate information until you'd reset the job (which affects the wave number).

As a general rule, "peeking under the covers" is discouraged. There is no certainty that the DataStage Repository structure remains constant from release to release; it underwent substantial revision between 4.x and 5.x, and again between 5.x and 6.x. Who knows what is in store in 7.x?


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
Post Reply