Maximum number of jobs per project...

Posted: Wed Jul 20, 2011 6:07 pm
by soportesis
Hello,

I have a project in DataStage 7.5.2 EE on UNIX with more than 1000 jobs, but I'm not sure whether having so many jobs in one project is good practice.
  • Are there any recommendations about the maximum number of jobs per project?
  • What risks are involved in a project that has many jobs?
  • Where can I learn more about this topic?
Thanks a lot.

Posted: Wed Jul 20, 2011 9:15 pm
by ray.wurlod
It depends to some extent upon what flavour of UNIX you are on, and on the limit (if any) on the number of subdirectories that can be created in a directory.

1000 jobs is probably OK - you can go up to somewhere near 4000. However, performance problems start to appear in areas such as job startup and job creation, which take longer than in a project with fewer jobs.

Another risk is the total volume of entries in job logs, which are all stored within the project directory - the danger is that of filling the disk that contains the project directory. Splitting the project into multiple projects on the same disk will not ameliorate this situation; splitting into multiple projects on separate disks definitely will.
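
To make the same-disk point concrete, here is a minimal Python sketch (not part of DataStage) that reports how full the filesystem holding a project directory is, and whether two project directories share a filesystem. It assumes Python is available on the server; the paths in the usage note are hypothetical.

```python
import os
import shutil


def disk_fill_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a zero size; avoid dividing by zero.
        return 0.0
    return usage.used / usage.total


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same filesystem (same device ID)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

If `same_filesystem("/opt/projects/A", "/opt/projects/B")` returns True for two (hypothetical) project directories, splitting jobs between them would not reduce the risk of job logs filling that disk.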

Posted: Thu Jul 21, 2011 12:43 am
by pandeesh
ray.wurlod wrote:It depends to some extent upon what flavour of UNIX you are on, and on the limit (if any)
What about SunOS?
Is there any limit for that?

Posted: Thu Jul 21, 2011 12:49 am
by ray.wurlod
I believe the limit on Solaris generally to be 32K subdirectories per directory.
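
The figure can also be checked on the machine itself. POSIX systems expose a per-filesystem link limit, and on many traditional filesystems this also bounds the number of subdirectories, because each subdirectory's `..` entry counts as a link to its parent. A hedged Python sketch, assuming Python is available on the server; whether the reported value really is the subdirectory limit depends on the filesystem in use:

```python
import os


def link_max(path):
    """Return the filesystem's PC_LINK_MAX for path, or None if unavailable."""
    if not hasattr(os, "pathconf") or "PC_LINK_MAX" not in os.pathconf_names:
        return None
    try:
        return os.pathconf(path, "PC_LINK_MAX")
    except (OSError, ValueError):
        # The platform or filesystem does not report this limit.
        return None
```

Calling `link_max()` on the project directory gives a number to compare against the "32K" estimate above.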

Posted: Thu Jul 21, 2011 8:55 am
by soportesis
Thanks Ray, that was very good information.

Posted: Mon Jul 25, 2011 4:55 am
by PhilHibbs
ray.wurlod wrote:I believe the limit on Solaris generally to be 32K subdirectories per directory.
And each job requires 6 directories, which gives the ~5000 job limit mentioned earlier.
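
The arithmetic behind that estimate, as a quick Python check. It assumes "32K" means 32 × 1024 = 32,768 and ignores any other subdirectories the project itself needs, so treat the result as a rough ceiling:

```python
def max_jobs(subdir_limit, dirs_per_job):
    """Upper bound on jobs if every job consumes dirs_per_job subdirectories."""
    return subdir_limit // dirs_per_job


print(max_jobs(32 * 1024, 6))  # 5461
```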

Posted: Mon Jul 25, 2011 6:38 am
by pandeesh
PhilHibbs wrote:And each job requires 6 directories, which gives the ~5000 job limit mentioned earlier.
What are those 6?
I remember RTSTATUS, RTCONFIG and RTLOG, which is a hashed file.
Correct me if I am wrong.
And what will happen if we exceed the maximum limit (for example, when attempting to create the 5001st job if the limit is 5000)?

Thanks

Posted: Mon Jul 25, 2011 6:42 am
by chulett
What are the six? Why not simply check and answer the question yourself?

Posted: Mon Jul 25, 2011 6:51 am
by pandeesh
Yes, Craig.

I can see the following inside the project directory:

1) RT_BP
2) RT_BP.O
3) RT_CONFIG (hashed file)
4) RT_STATUS (hashed file)
5) RT_LOG (hashed file)

Am I missing anything?
Thanks

Posted: Mon Jul 25, 2011 7:38 am
by PhilHibbs
Er, it's possible that I got it wrong and that it's only 5 directories. Or, there may be a sixth possibility that doesn't always get created for all jobs. I'll see if I can dig up the thread where I learned about all this.

Posted: Mon Jul 25, 2011 8:41 am
by pandeesh
PhilHibbs wrote:Er, it's possible that I got it wrong and that it's only 5 directories. Or, there may be a sixth possibility that doesn't always get created for all jobs. I'll see if I can dig up the thread where I learned about all this.
I am really curious to know in which scenario the sixth one is created.
If you find a link, please share it with me.
Thanks

Posted: Mon Jul 25, 2011 10:13 am
by PhilHibbs
I found a couple of threads that mention this but none of them explain exactly what the 6 directories are.

e.g. viewtopic.php?t=103027, viewtopic.php?t=116443

Posted: Mon Jul 25, 2011 11:01 am
by pandeesh
I guess DS_TEMP is the 6th one, but I don't know under which circumstances it is created. Correct me if I am wrong.

Posted: Mon Jul 25, 2011 5:00 pm
by ray.wurlod
There could be as many as eight:
  • RT_BPnnn - source code from BASIC components
  • RT_BPnnn.O - compiled BASIC components
  • RT_CONFIGnnn - run-time configuration
  • RT_LOGnnn - job logs
  • RT_SCnnn - osh and C++ components
  • RT_STATUSnnn - run-time status of jobs and resources
  • DS_TEMPnnn - transient components during design
  • RT_QSnnn - QualityStage components
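
To see which of these actually exist for each job in a given project, something like the following Python sketch could be run against the project directory. It is not a DataStage tool. It assumes the names follow the patterns listed above, with nnn being the job number, and it counts only entries that are directories on disk.

```python
import os
import re
from collections import defaultdict

# Prefixes taken from the list above; the digits are the job number.
JOB_DIR = re.compile(
    r"^(RT_BP|RT_CONFIG|RT_LOG|RT_SC|RT_STATUS|DS_TEMP|RT_QS)(\d+)(\.O)?$"
)


def per_job_directories(project_dir):
    """Map job number -> sorted list of per-job directory names found."""
    found = defaultdict(list)
    with os.scandir(project_dir) as entries:
        for entry in entries:
            match = JOB_DIR.match(entry.name)
            if match and entry.is_dir():
                found[int(match.group(2))].append(entry.name)
    return {job: sorted(names) for job, names in found.items()}
```

Summing `len(names)` over the result gives the number of subdirectories these jobs consume, which can be compared with the filesystem's subdirectory limit.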

Posted: Mon Jul 25, 2011 10:25 pm
by pandeesh
Good info, Ray! So generally it varies from 5 to 8.
From my understanding, RT_SCnnn is created only if the job contains shared containers (SC), and RT_QSnnn is created only if the DataStage jobs are integrated with QualityStage.
Correct me if my understanding is incorrect.
Could you elaborate a little on DS_TEMPnnn? Under which circumstances is DS_TEMPnnn created?
Thanks