Heavy I/O - need help with job design
Posted: Tue Jul 13, 2010 6:49 am
We have a master sequence that calls a number of sub-sequences, each of which contains five jobs. The first job creates a dataset, then the next four jobs read that dataset and create 16 files (complex flat files) each. The last four jobs of each sub-sequence are identical except that the constraints are hard coded (where a key = 1, 2, 3, 4, etc., all the way up to 64). The file name has the key number hard coded on the end as well (e.g., #target_dir##file_name#_1).
Obviously this is heavy on the I/O, and we are having issues with resource constraints. (I don't have the exact error messages because the job logs have been deleted, and the last couple of attempts to run this have failed due to source data not being found, but I'm assured that it does fail with resource constraints.) So I'm looking at the design. In my head, it would make more sense to have one write job per sub-sequence that takes a parameter to use in the constraint and in the target file name, and then have the sequence loop from 1 to 64, calling that job and passing the current loop number as the parameter value.
While this makes sense to me from a design perspective (less hard coding, only have to change in one place), I'm not sure if it would help or hinder in terms of I/O usage. Does anyone have any thoughts that they could share?
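To make the looping idea concrete, here is a rough sketch of the kind of driver I have in mind, expressed as a Python script around the `dsjob` command-line interface (the project, job, and parameter names below are placeholders, not our real ones; in practice this would more likely be a StartLoop/EndLoop in the sequence itself):

```python
import subprocess

def build_dsjob_command(key: int) -> list[str]:
    """Build the dsjob invocation for one key value.
    Project name, job name, and parameter names are hypothetical."""
    return [
        "dsjob", "-run",
        "-mode", "NORMAL",
        "-param", f"KeyValue={key}",      # used in the job's constraint
        "-param", f"FileSuffix=_{key}",   # appended to the target file name
        "-wait",                          # run the keys serially, one at a time
        "MyProject", "WriteFilesJob",
    ]

def run_all_keys(dry_run: bool = True) -> list[list[str]]:
    """Loop over the 64 key values that were previously hard coded
    into 64 separate job copies."""
    commands = []
    for key in range(1, 65):
        cmd = build_dsjob_command(key)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands

if __name__ == "__main__":
    # dry_run=True just builds the commands without invoking dsjob
    print(len(run_all_keys()))
```

Running the keys serially like this (one invocation at a time) is exactly what I suspect might ease the I/O pressure, compared with the current design where multiple hard-coded jobs hit the dataset at once.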
(And yes, I have done searches here, but haven't found anything yet that helps. I cannot get to the operating system myself on this development server (you'd need to see the rolling of the eyes that went with that statement), and was hoping to gain some ground before trying to find someone who can look at some statistics while this is running.)
Thanks.