Hi all!
I designed a DataStage job that does a simple load: data from a sequential file, with a hash file as a lookup, loaded into a table in a Red Brick database. The data does get loaded completely into the table, but when I watch the same job in Designer with "Show Performance Statistics" enabled, the job still appears to be running, and only shows as finished some time later. The information about the job's completion in DataStage Designer is quite misleading. Can anyone explain this?
DataStage Job
Re: DataStage Job
Can you supply the actual "end messages" that you see in your job, and what you would actually expect? The original post is a bit confusing.
Ogmios
In theory there's no difference between theory and practice. In practice there is.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The row counts are captured at regular intervals while the stage is running. However, the job cannot actually finish until rb_tmu (the Red Brick bulk loader) returns an exit status. So there is a period during which rb_tmu is executing in which DataStage is not processing any more rows, but is merely awaiting the exit status from rb_tmu. It's this that you're observing, and it's perfectly normal.
If you want to prove this, change "automatic load" to false. DataStage will finish promptly, but you will need to make some other arrangement (such as an Execute Command activity in a job sequence) for performing the actual load into Red Brick.
You might also compare with the timestamps in the Red Brick activity log for the load.
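If you go the manual route, the Execute Command activity in the sequence would invoke rb_tmu directly. A minimal sketch of such a command follows; the control file path, database user, and variable names here are hypothetical, and the exact argument layout varies by Red Brick version, so check the TMU reference for your install:

```sh
# Hypothetical Execute Command activity body: run the Red Brick TMU
# against a pre-built control file. All paths/credentials are placeholders.
rb_tmu /etl/control/load_sales.tmu $RB_USER $RB_PASS
```

With automatic load disabled, the DataStage job's finish time then reflects only the time to write the staging file, and the load itself shows up separately in the sequence log.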
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Assuming that you're using 0 rows per transaction, much the same kind of explanation. You send all the rows, but hold off sending a "commit" until the end. It's not until that point that Red Brick can start actually loading rows into tables and updating indices. Your DataStage job waits for the "all OK".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Best scenario, especially if you have lots of indexing on the Red Brick table, is to use the bulk load stage with automatic load disabled, then use the parallel bulk loader (rb_ptmu) from an after-stage or after-job subroutine. This allows separate processes to load the table and the indexes, in parallel.
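As a sketch of that setup: the after-job subroutine (ExecSH on UNIX) would run the parallel TMU against the file the bulk load stage produced. The file names and credentials below are placeholders, and I'm assuming rb_ptmu takes arguments in the same general shape as rb_tmu, so verify against your Red Brick documentation:

```sh
# Hypothetical ExecSH after-job command: parallel bulk load with rb_ptmu.
# Control file, user, and password are placeholders for your environment.
rb_ptmu /etl/control/load_sales.tmu $RB_USER $RB_PASS
```

The win here is that the table segments and the index builds proceed in separate processes, which matters most when the table carries several indexes.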
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.