Is one big parallel job with many stages more efficient than the same work divided into 2 or 3 separate jobs?
I have always felt that simpler is better, and easier to debug and maintain. Is that true for parallel jobs too, or is it the other way around?
Performance question
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
Hi amsh,
I feel that one big parallel job is better than splitting it into 2 or 3 smaller jobs. If you split the big job into smaller jobs, you have to run them in sequence, possibly using a job sequencer, to achieve the same result. Each job adds overhead for initializing, starting, processing and writing the log, not to mention the overhead of passing the status of the previous job to the next one. I am pretty sure there are exceptions, but this is my take on it.
Thanks,
Naveen
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
In a big parallel job you are not parking temporary data in a database, dataset, sequential file, etc. between steps. This makes a big difference because there is less I/O. Whenever you want to debug the big job, you can break it into small pieces, fix the problem, and put the pieces back into your big job.
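To put rough numbers on the overhead argument in this thread, here is a minimal back-of-the-envelope sketch in Python. It is not DataStage code, and the function name `total_overhead` and all the timings are hypothetical; it only assumes that every job pays a fixed startup cost and that every boundary between two jobs pays for landing the intermediate data plus a sequencer hand-off.

```python
def total_overhead(num_jobs, startup_s, landing_s, handoff_s):
    """Fixed overhead, in seconds, of running one flow as num_jobs sequential jobs.

    Assumptions (hypothetical, for illustration only):
      startup_s  - cost to initialize and start each job
      landing_s  - cost to write and re-read intermediate data at each job boundary
      handoff_s  - cost for the sequencer to check status and trigger the next job
    """
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    boundaries = num_jobs - 1  # a single job has no boundaries, so nothing is landed
    return num_jobs * startup_s + boundaries * (landing_s + handoff_s)


# Made-up timings: 20 s startup, 90 s to land and re-read, 5 s hand-off.
one_big_job = total_overhead(1, startup_s=20, landing_s=90, handoff_s=5)
three_jobs = total_overhead(3, startup_s=20, landing_s=90, handoff_s=5)
print(one_big_job, three_jobs)
```

With these made-up figures the landing cost dominates, which matches the point about I/O; for small data volumes the startup and hand-off terms would matter more.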
Regards
Siva
Listening to the Learned
"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
I have found parallel jobs to have more stages in them than server jobs. Better performance is achieved by not writing to disk, so our jobs try to carry the data further. There is also a tendency to move functions away from external products such as the RDBMS engine and Unix scripts into parallel stages. Where I might have done a join in a DB stage in a server job, I find the parallel sort and join stages can be more efficient. The same goes for a parallel sort versus a Unix sort.
There is still a requirement for a robust approach to rollback and recovery. We still land our data to a staging area prior to processing and land them again to load ready files prior to database loads.
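The recovery side of that trade-off can be sketched the same way. This is only an illustrative model in Python, not DataStage functionality: the function `seconds_to_recover` and the segment timings are hypothetical, and it assumes that landed data lets a failed run restart from the failed segment instead of from the beginning.

```python
def seconds_to_recover(segment_seconds, failed_index, landed):
    """Seconds of processing needed to finish the flow after one segment fails.

    segment_seconds - run time of each segment, in order (hypothetical values)
    failed_index    - index of the segment that failed
    landed          - True if each segment's input was landed to disk, so the
                      restart can begin at the failed segment; False if the
                      whole flow must be rerun from the start
    """
    if not 0 <= failed_index < len(segment_seconds):
        raise IndexError("failed_index is out of range")
    start = failed_index if landed else 0
    return sum(segment_seconds[start:])


# Made-up timings for extract, transform and load; the load segment fails.
segments = [600, 1200, 300]
with_landing = seconds_to_recover(segments, failed_index=2, landed=True)
without_landing = seconds_to_recover(segments, failed_index=2, landed=False)
print(with_landing, without_landing)
```

Under these assumptions the landed files cost extra I/O on every run but save the extract and transform time whenever the final load has to be repeated.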
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn