White Paper on DataStage Enterprise Edition

Posted: Sun Nov 07, 2004 7:45 pm
by ecclesr
Hi

I am just starting the process of creating a short position paper on DataStage Enterprise Edition for my manager.

We are not currently an Enterprise Edition site.

I would like to try to cover the following in the paper:
- Platforms that it runs on
- An example design of a job before and after conversion to a parallel job design
- Effort to migrate and convert to parallel job designs
- Some of the issues people have encountered in such a migration
- Benefits

I have searched the Ascential library, but found that a frustrating exercise.

Any content or pointers that members can provide would be much appreciated in helping me put something on paper on this topic.

You can forward it to my email address.

Thanking you in advance

Ross Ecclesfield

Posted: Sun Nov 07, 2004 9:16 pm
by vmcburney
Firstly, you are on Windows, which means you cannot use Enterprise Edition until version 7.5.1 is released at the start of next year. That will be the first version of parallel jobs that runs on Windows platforms.

Secondly, you need multiple CPUs to get any benefit, preferably four or more. You also need a very high data volume to justify the move. You may be able to remove your current bottlenecks with a more efficient design and multiple instance jobs.

Third, compare the move to Enterprise with a move to RTI. This might get you the scalability you are looking for without a big hardware or development effort.

Now to answer your questions. DataStage Enterprise is certified for Unix and Linux platforms. At the start of next year there will be versions for Windows and Mainframe Unix System Services. I usually view the platforms from e.services but it's currently down.

For example jobs, go to Ascential devnet and look through the upload/download lists.

The effort to migrate is considerable, because you are rewriting all server jobs as parallel jobs. You may be able to limit this work by migrating only those jobs that are a bottleneck and handle a lot of data, and running a combination of server and parallel jobs.

Details of the benefits of the parallel architecture are at http://www.ascential.com/products/platf ... ility.html
This page has further links to Ascential Grid Computing, ETL Benchmark etc.

Posted: Mon Nov 08, 2004 12:15 am
by ray.wurlod
A white paper on PX, its differences from server and related issues was written almost two years ago by BigPoppa (since departed from the DataStage world) and me. Its publication on the then Tools4DataStage site was blocked by Ascential on "legal grounds".

Posted: Mon Nov 08, 2004 1:04 am
by ecclesr
Hi Ray and Vincent

Thank you for your replies. So far I have been unable to find any documents showing even the simplest example of a sequential job converted to a parallel job, with a simple explanation of the steps and effort required.

I have worked my way around the Ascential site and devnet pages and found them to be the most frustrating I have ever come across. I have come up with nothing; it is like trying to get blood out of a stone.

Ross Ecclesfield

Posted: Mon Nov 08, 2004 5:23 am
by vmcburney
Yep, that's Ascential documentation. You'll be pleased to hear they have hired technical writers for how-to documents in upcoming releases. The Parallel Job Developer's Guide is probably the best piece of documentation, with a good description of parallel architecture. If you don't have a copy (it gets installed with Enterprise Edition), I'm sure we can get one to you. That will answer most of your questions about how parallel processing works.

If you have a DataStage install CD you might find it in the documentation directory.

Posted: Mon Nov 08, 2004 2:33 pm
by ray.wurlod
You will find it there. The best bits are Chapters 1 and 2 of parjdev.pdf.