DataStage Job

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

shyju
Participant
Posts: 58
Joined: Thu May 19, 2005 1:00 am

DataStage Job

Post by shyju »

Hi all!

I designed a DataStage job that does a simple load from a sequential file, with a hash file as a lookup, into one of the tables in a Red Brick database. When I run the job, the data gets loaded completely into the table, but when I view the same job in Designer with "Show Performance Statistics" enabled, the job still shows as running. It does complete after some time, but the information about job completion in DataStage Designer is quite misleading. Can anyone explain this?
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: DataStage Job

Post by ogmios »

Can you supply the actual "end messages" that you see in your job, and what you would actually expect to see? The original post is a bit confusing.

Ogmios
In theory there's no difference between theory and practice. In practice there is.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The row counts are captured at regular intervals while the stage is running. However, the job cannot actually finish until rb_tmu (the Red Brick bulk loader) returns an exit status. So there is a period during which rb_tmu is still executing and DataStage is not processing any more rows, but is merely awaiting the exit status from rb_tmu. It's this that you're observing, and it's perfectly normal.

If you want to prove this, change "automatic load" to false. DataStage will finish promptly, but you will need to make some other arrangement (such as an Execute Command activity in a job sequence) for performing the actual load into Red Brick.

You might also compare with the timestamps in the Red Brick activity log for the load.
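To make the waiting period concrete, here is a minimal Python sketch of the same pattern. It is not DataStage or rb_tmu code: the child process is a hypothetical stand-in loader that reads every row and then keeps working for half a second before exiting. The names LOADER and run_job are made up for the illustration.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a bulk loader (NOT rb_tmu): it reads all rows
# from stdin, then spends extra time "loading" before exiting with status 0.
LOADER = (
    "import sys, time\n"
    "rows = sys.stdin.read().splitlines()\n"
    "time.sleep(0.5)\n"
    "sys.exit(0)\n"
)


def run_job(rows):
    """Send rows to the loader, then wait for its exit status.

    Returns (seconds until the last row was sent,
             seconds until the loader exited,
             loader exit status).
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", LOADER],
        stdin=subprocess.PIPE,
        text=True,
    )
    for row in rows:
        proc.stdin.write(row + "\n")
    proc.stdin.close()  # no more rows will be processed after this point
    rows_sent_at = time.monotonic() - start

    status = proc.wait()  # the "job" cannot finish until the loader exits
    finished_at = time.monotonic() - start
    return rows_sent_at, finished_at, status


if __name__ == "__main__":
    sent, done, status = run_job(["1,alpha", "2,beta", "3,gamma"])
    print(f"rows sent after {sent:.2f}s, finished after {done:.2f}s, status {status}")
```

The gap between the two times is the interval in which the row counts have stopped moving but the job is still reported as running.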
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
shyju
Participant
Posts: 58
Joined: Thu May 19, 2005 1:00 am

Post by shyju »

Thanks, Ray, for the detailed explanation. One more question: when I use the ODBC stage instead of the Red Brick bulk load stage, I see the same kind of behaviour. Can you please explain this?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Assuming that you're using 0 rows per transaction, the explanation is much the same. You send all the rows, but hold off sending a "commit" until the end. It's not until that point that Red Brick can start actually loading rows into tables and updating indices. Your DataStage job waits for the "all OK".
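The single-commit pattern can be sketched in Python as follows. This uses SQLite from the standard library purely as a stand-in, not Red Brick or the ODBC stage, and it only demonstrates that rows sent inside one open transaction are not visible to another connection until the commit. The function name load_with_single_commit and the table name target are made up for the illustration.

```python
import sqlite3


def load_with_single_commit(db_path, rows):
    """Insert all rows in one transaction and commit once at the end.

    Returns the row counts a second connection sees before and after
    the commit. SQLite is only a stand-in here; it is not Red Brick.
    """
    writer = sqlite3.connect(db_path)
    reader = sqlite3.connect(db_path)
    try:
        writer.execute("CREATE TABLE IF NOT EXISTS target (id INTEGER, name TEXT)")
        writer.commit()

        # All rows are sent, but the transaction is still open.
        writer.executemany("INSERT INTO target VALUES (?, ?)", rows)
        visible_before = reader.execute(
            "SELECT COUNT(*) FROM target"
        ).fetchall()[0][0]

        # Only now is the work made permanent and visible to others.
        writer.commit()
        visible_after = reader.execute(
            "SELECT COUNT(*) FROM target"
        ).fetchall()[0][0]
    finally:
        writer.close()
        reader.close()
    return visible_before, visible_after


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        print(load_with_single_commit(path, [(1, "a"), (2, "b"), (3, "c")]))
```

How much work a real database defers until the commit depends on the engine, so treat this as an illustration of the timing only.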
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Best scenario, especially if you have lots of indexing on the Red Brick table, is to use the bulk load stage with automatic load disabled, then use the parallel bulk loader (rb_ptmu) from an after-stage or after-job subroutine. This allows separate processes to load the table and the indexes, in parallel.