High Quantity of Jobs In One Project
Moderators: chulett, rschirm, roy
I'm curious whether anyone else is running a lot of jobs in a single project. Here are some of my observations:
1. 500+ jobs in a project causes a long refresh time in the DataStage Director. During this refresh, your Director client is completely locked up. Any edit windows open are hung until the refresh completes.
2. Increasing the refresh interval to 30 seconds reduces how often the refresh occurs, but does not lessen the impact of each refresh.
3. The usage analysis links on import add a lot of overhead to the import process.
4. Compiling a routine can take minutes, even a one-line routine, depending on how many jobs there are and how many of them use the routine.
5. A Director refresh will hang a Monitor dialog box until the refresh completes.
The client with the problem is on Solaris 2.7, running DS 5.1. There are no users on the server, and there are ample CPUs (12) and memory (30 GB) with no load.
My questions would be:
1. Anyone on a release with similar observations?
2. Anyone know of a workaround?
3. Any releases that address any of the noted observations?
Dividing jobs across multiple projects would not be a favorable choice. It's the only solution I see so far to lessen the unworkable refresh, but I'd like to keep job streams together to make packaging release versions easier. This client does not use MetaStage, so there's no worry about breaking linkages.
Thanks,
-Ken
Director refresh
I'm currently using DS 5.1 and have similar problems any time there is a large number of jobs in a project. I don't think the server's OS and configuration have any effect; I believe it is purely the client machine, so more CPU or RAM on your client might help, but as a developer you are still mostly doomed once you go over 500 jobs.
The best advice I can give is to break your jobs down into multiple categories. Apart from the refresh interval setting, it appears Director will automatically refresh whenever the status of a job in your view changes; i.e., if another user runs a job (its status changes to running), Director refreshes your screen regardless of how close you are to your refresh interval. Thus, even at 30 seconds there are times when it refreshes three times in 10 seconds, and I know of nothing you can do to eliminate your pain.
I actually have all of the jobs categorized, and it makes no difference. Any job running activity outside the immediate folder still causes the horrendous refresh hang time. Apparently the focus is not on the jobs in the immediate category folder, so a job starting or finishing somewhere else triggers this entire project refresh.
Using filters is an option, if I could figure out the exact pattern-matching string that will display the jobs I want. This works; unfortunately, I haven't thought up a clever way to deal with the "AND AND AND OR" scenario.
It seems that the Director client absolutely needs some optimization.
We have close to 500 jobs in a single project and did not have any issues on 5.0 or 6.0. We are running on 4-CPU and 8-CPU AIX boxes. Our clients are 1 GHz with 256 MB RAM and 100 Mb/s Ethernet cards. The AIX boxes use fibre channel, and some of the network switches are probably 1 Gb/s.
We can run 5-second refresh rates in Director without a problem.
There are many components that could contribute to an issue like this.
One thing to try might be doing a project "cleanup" in DS Admin. From what I can tell, it reorganizes project information to improve performance. If there are a lot of changes going on in the project, maybe it will help; I use it periodically. It is a command button on the "Projects" tab in DS Admin. All users need to be out of the project for it to work. Log in to DS Admin, select a project from the list, and then click "Clean Up".
**** You may want to back up your DataStage project to a .dsx, just in case something strange happens. ****
I hope some of this helps. I thought some network infrastructure details would give you something to compare against.
Kevin Janes
I've eliminated what I believe are all of the variables. After several replies on the Oliver mailing list pointed me to the archives on the www.tools4datastage.com companion website, I think the problem is related to release 5.2 and earlier.
Just to give an idea, my problem is with a 1400-job project. There's not a network problem, because you can watch the dsapi_slave tied to your Director session and it goes nuts. It will sit there consuming a whole CPU while it refreshes. If you don't know what I'm talking about, take a look at the Unix Survival Guide I have published on this site.
Figure out which dsapi_slave process is connected to your Director session. Fire up either top or prstat and watch this process. When you do a Ctrl-R, you see this process consume a CPU for quite a while. I timed it on a fresh install into a project, with no server load, and it took almost 2 minutes. The dsapi_slave went to 8.3% (100% of 1 CPU on a 12-CPU box), indicating that this process was fully utilizing a CPU: no network wait, no I/O bottlenecks.
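If it helps, here's a rough shell sketch of that procedure. pid_of is a made-up helper, not a DataStage tool, and it assumes the usual `ps -ef` layout with the PID in the second column:

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PIDs
# of lines matching the given process name (skipping the awk line itself).
pid_of() {
    awk -v name="$1" '$0 ~ name && $0 !~ /awk/ { print $2 }'
}

# On the server you would then do something like:
#   ps -ef | pid_of dsapi_slave
#   prstat -p <pid> 5      # Solaris; use `top -p <pid>` on Linux
```

There may be one dsapi_slave per connected client, so pick the one whose start time matches your login.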
This leads inescapably to the conclusion that the Director client's logic was requesting data and crunching it in such a way as to produce a long refresh time, even though I was focused on a folder containing only 10 jobs. My next conclusion was that Director was doing more than just getting the current state of those 10 jobs.
My only recourse is to recommend that an upgrade be raised in priority. The problem there is that upgrades are destructive, and that's a whole other can of worms. I don't see an upgrade anytime soon, so any peeks and pokes into things would sure be appreciated. I'm looking at optimizing some of the hash files (DS_JOBOBJECTS, etc.) to see if this gives any improvement.
In addition, the massive number of jobs and their thousands of supporting hash files must induce a huge overhead: the Director having to open/select/scan/close these over and over has to be putting a ton of time into the refresh. I'm going to investigate manually cleaning out some of the supporting hash files for jobs to see if this mitigates the issue.
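Before cleaning anything out, I'd want a quick count of what's there. This is a hypothetical helper, not anything official; it assumes a Server project layout where each job gets RT_LOG* and RT_STATUS* files under the project directory:

```shell
# Hypothetical helper: count the per-job RT_LOG* / RT_STATUS* hash files
# under a project directory, to gauge how much Director has to touch.
count_rt() {
    ls -d "$1"/RT_LOG* "$1"/RT_STATUS* 2>/dev/null | wc -l
}

# e.g. count_rt /u01/dsadm/Ascential/DataStage/Projects/MyProject
# (path is an assumed example); also worth a look for the worst offenders:
#   du -sk /path/to/Project/RT_LOG* | sort -rn | head
```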
I'll let everyone know my findings...
-Ken
FYI, the project cleanup performs two main tasks. It identifies and removes any orphan files from incompletely deleted jobs and so on, and it rebuilds all the indexing on the DataStage Repository. No actual re-organization takes place and, in general, the indexes are robust and don't actually need re-building. Cleanup project is really only needed if there is evidence that an index has become corrupted, for example you can't find a component that you are certain is there.
project cleanup
Is the project cleanup / index rebuild safe? For some reason I've always been afraid to run it, fearing it would cause more trouble than help.
It can't hurt unless users are designing jobs, and I think it tells you if users are in. It's pretty innocuous, though not a fix-all. If you ever delete a job and get a lot of messages saying it is unable to delete RT_LOGxxxx or the like, this helps clean that up. It's not critical to clean up improperly removed jobs, because importing a new copy of the job assigns a new job number, and therefore different supporting hash file names.
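For the improperly-removed-job case, one rough way to spot leftovers is to compare the RT_ files on disk against the job numbers the project still knows about. This is a hypothetical sketch, not a supported procedure: orphan_rt is an invented helper, you'd have to obtain the live job-number list from the repository yourself, and the supported fix remains the Clean Up button in Administrator:

```shell
# Hypothetical helper: list RT_LOG<nnn> entries under a project directory
# whose job number is NOT in the list of live job numbers read from stdin
# (one number per line).
orphan_rt() {
    proj=$1
    live=$(cat)
    for d in "$proj"/RT_LOG*; do
        [ -e "$d" ] || continue        # glob matched nothing
        n=${d##*RT_LOG}                # strip path and prefix, keep number
        printf '%s\n' "$live" | grep -qx "$n" || printf '%s\n' "$d"
    done
}

# usage sketch, however you produce the live list:
#   orphan_rt /path/to/Project < live_job_numbers.txt
```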
Good luck!
-Ken
After upgrading from 5.1 to 6.0, the performance issues surrounding a large number of jobs in a project have all improved greatly, with one exception: in sequencers it takes even longer to open Job Activity stages. My only guess is the time required to select all jobs and sequencers and sort them for the drop-down box.
I can't think of any user workarounds or ways to tune in order to fix this issue (at least Director is much faster).