Ver 4 and Ver 6 Performance

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Define performance!

Release 6.0's Parallel Extender lets you run (and control) multiple data streams in parallel, which means that jobs, particularly those handling large data sets, can finish faster.
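As a rough illustration of the idea only, here is a minimal sketch in plain Python (not DataStage code). The function names, the round-robin split and the doubling transform are all assumptions made up for the example. It shows the shape of the approach: split the rows into streams, process the streams concurrently, then collect the results.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Split rows round-robin into n streams (hypothetical partitioner)."""
    return [rows[i::n] for i in range(n)]


def transform(stream):
    """Hypothetical per-row transformation applied to one stream."""
    return [value * 2 for value in stream]


def run_parallel(rows, n_streams=4):
    """Process each stream in its own worker and merge the outputs."""
    streams = partition(rows, n_streams)
    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        results = list(pool.map(transform, streams))
    # Collect the outputs of all streams back into one list.
    # Row order follows the streams, not the original input order.
    return [row for stream in results for row in stream]
```

Threads are used here only to keep the sketch short; in CPython they will not speed up CPU-bound work, so treat this as a picture of the structure rather than a benchmark.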


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
kjanes
Participant
Posts: 144
Joined: Wed Nov 06, 2002 2:16 pm

Post by kjanes »

Parallel Extender under version 6 costs extra if you do not already own it, and it is not cheap. Development on the Parallel canvas is also a little different: designing Parallel jobs uses different components and methods from traditional DataStage programming. Also, you cannot simply upgrade existing server jobs to Parallel jobs; to some extent, they would need to be rewritten. There is a way to reduce the rewrite using shared containers, but I am not sure that it is beneficial.

Aside from Parallel Extender, v6 brings overall DataStage enhancements that should be beneficial. Upgrading from v4 to v6, you may see some performance improvement from the DataStage Engine enhancements made between those releases. The usability improvements alone are a good reason to upgrade, and v6 has some very useful features.

Row buffering may help performance, as may some of the new stages in v6.



Kevin Janes
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Version 5 introduced the following:
Hashed File Stage and Shared Memory Disk Caching
The Hashed File stage has been enhanced to use a write-shared disk cache available in the DataStage server. This improves performance by using in-memory cached files and lets a single instance of a file be shared between any number of links and processes in a DataStage job. By default, the shared disk cache is not enabled.
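To picture what "a single instance of a file shared between links" means, here is a minimal sketch in plain Python. It is not how DataStage implements the cache; the class name, the `loads` counter and the loader callback are hypothetical and exist only for the example.

```python
class SharedLookupCache:
    """Toy in-memory cache: each named file is loaded once and then shared."""

    def __init__(self):
        self._files = {}
        self.loads = 0  # how many times a file was actually read from "disk"

    def get(self, name, loader):
        """Return the cached copy of a file, loading it on first use only."""
        if name not in self._files:
            self._files[name] = loader()
            self.loads += 1
        return self._files[name]


cache = SharedLookupCache()

# Two hypothetical links ask for the same hashed file.
link_a = cache.get("customers", lambda: {1: "Smith"})
link_b = cache.get("customers", lambda: {1: "Smith"})

# A write made through one link is visible through the other,
# because both hold the same in-memory instance.
link_a[2] = "Jones"
```

In the sketch the file is read once however many links request it, which is the saving the cache is meant to provide.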

There may also be performance improvements from the newer versions of the ODBC drivers, the native database stages and UniVerse.

There are also plenty of benefits for developers, such as multiple-instance jobs, shared containers, version control and the drag-and-drop Designer repository.


Vincent McBurney
Data Integration Services
www.intramatix.com