One minute to finish a job. Isn't it long?

Archive of postings to DataStageUsers@Oliver.com. This forum is intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Locked
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

One minute to finish a job. Isn't it long?

Post by admin »

Hello,

There is an interesting (at least for me :) issue related to the time DataStage spends while finishing a job. I can see in the job log:

3:58:18 AM 7/9/01 Info euShipmentProcess.ProcessShipmentInfo:
DSD.StageRun Active stage finishing. (...)
3:59:24 AM 7/9/01 Control Finished Job euShipmentProcess.
3:59:24 AM 7/9/01 RunJob (euShipmentControl)
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

There's a lot that might happen after the last active stage finishes and before the job finishes. These include, but are not limited to:
execution of the after-job subroutine
freeing of memory allocated for hashed files, aggregators, sort, etc.
cleaning up of disk files associated with allocated memory
updating of status and log tables
closing of DataStage repository tables, including updating DTM and DTA in the O/S
closing of open files
closing connections to databases (perhaps waiting for network connections)
deleting active stage records from &PH& (based on pid, so needing a search)
deleting old run records from &PH&
auto-purging the job log depending on purge settings
updating process metadata into the MetaStage hub

So there's a lot of work that has to occur. Closing is one of the most expensive operations, since the entire directory tree has to be traversed to update the date/time accessed. So, no, the 66 seconds shown in the log (3:58:18 to 3:59:24) is not long. Jobs with many and/or large memory-based hashed files can take tens of minutes.


admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

However, one thing worth checking is the &PH& directory. An excessive number of old files here can slow down the job close-down time (or at least it can in 3.5, anyway).

I suggest that you look in the &PH& directory at a time when no jobs are running. If there are a lot of files there, delete them. It might help.
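
A quick way to do that check on a UNIX server, sketched here with a placeholder project path and an arbitrary seven-day cutoff (the '&' characters must be quoted, because '&' is special to the shell):

# Placeholder path: substitute your own project directory.
cd /u1/dsadm/Projects/MyProject

# Count the files in &PH&.
ls '&PH&' | wc -l

# With no jobs running, review and then remove files older than 7 days.
find './&PH&' -type f -mtime +7 -print
find './&PH&' -type f -mtime +7 -exec rm -f {} \;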

admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Thank you, Ray and David!

I removed about 300 old files (out of 310) from &PH& in development, and the "finishing" time dropped from 58 seconds back to 5 seconds! In production I have already removed about 100 old files, and I am going to remove another 250 files created during yesterday's and today's runs; I will check the "finishing" time tomorrow.

Then I searched for "&PH&" in the documentation and found this in the "Troubleshooting" section of the "DataStage Administrator's Guide":

Job Termination Problems
If you experience delays in the termination of a DataStage job when it is run, empty the &PH& directory. There is a &PH& directory in each DataStage project directory, which contains information about active stages that is used for diagnostic purposes. The &PH& directory is added to every time a job is run, and needs periodic cleaning out.

Looks more like maintenance than troubleshooting.

Regards

Vardan


admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

If you do a CleanupJob jobname from TCL in a project, it clears the job's status file and log file, and also the &PH& for that project. You can issue a CLEAR.FILE &PH& from TCL in a project at just about any time you want. If you clear &PH& from NT or UNIX, you have to deal with the directory paths underneath &PH&, not to mention that dealing with the "&" character in UNIX is fun. I recommend that you set up a cron or NT batch script to periodically empty this directory; I have most of my clients do so on a nightly basis, when no DataStage jobs are running.

You have to be aware that &PH& captures any phantom messages that a job generates, usually due to programming errors on your part. Clearing it mid-run can lose important messages; most of the time these relate to unassigned variables.

-Ken
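
A minimal sketch of the nightly OS-level cleanup Ken describes, assuming a UNIX server and a quiet period with no jobs running; the script name, projects root, and two-day retention are placeholders, and CleanupJob or CLEAR.FILE &PH& from TCL remain the in-DataStage alternatives:

#!/bin/sh
# clean_ph.sh (hypothetical name): empty old run records from every
# project's &PH& directory. Run only when no DataStage jobs are active,
# since clearing &PH& mid-run can lose phantom/diagnostic messages.

PROJECTS_ROOT=/u1/dsadm/Projects    # placeholder: your projects directory

for phdir in "$PROJECTS_ROOT"/*/'&PH&'
do
    [ -d "$phdir" ] || continue
    # Keep the last two days of run records for diagnostics.
    find "$phdir" -type f -mtime +2 -exec rm -f {} \;
done

A crontab entry such as 0 3 * * * /usr/local/bin/clean_ph.sh (again a placeholder path) would run it nightly at 3 a.m.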
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

In release 4.x of DataStage there is much more automatic cleanup of &PH&. At these releases you should only need to retain one file per job, the one relating to the most recent successful completion. If a job aborts, there may be additional files, which (again at 4.x) are cleaned up when the job is reset.
Locked