DataStage vs. COBOL performance benchmarks

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

ptc3bluedevil
Participant
Posts: 4
Joined: Wed Mar 26, 2008 2:34 pm

DataStage vs. COBOL performance benchmarks

Post by ptc3bluedevil »

I'm doing a high-level plan for a historical data conversion from mainframe to DB2. I have used DataStage in the past, but my current client has often used custom COBOL code to perform conversions. I want to outline some pros and cons for my client before a decision is made.

Some of the common disadvantages of COBOL do not apply here. The client has a sufficient number of COBOL programmers to perform a conversion. It is a one-time, historical conversion, so maintainability over the years is not a huge concern. On the other hand, other divisions of the organization use DataStage, so it is an option.

I believe that performance at high volumes should be an important factor in the decision. Some of the datasets to be converted will contain > 100 million records. Does anyone have performance benchmarks for loading to DB2 using COBOL code vs. DataStage? Or does IBM have some information available?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Don't have any benchmarks, but my experience suggests that well-written COBOL would probably beat DataStage fairly easily. So the question boils down to just how good these COBOL programmers are, particularly with respect to accessing (remote?) DB2 databases.
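For what it's worth, "accessing DB2" from COBOL usually just means embedded SQL, along these lines. This is purely a sketch with a made-up table and host variables; a serious high-volume conversion would use multi-row inserts or feed the LOAD utility rather than single-row INSERTs:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. HISTINS.
      * Sketch only: insert one converted record into a hypothetical
      * DB2 table via embedded SQL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
       01  WS-CUST-ID          PIC S9(9) COMP.
       01  WS-CUST-NAME        PIC X(40).
       PROCEDURE DIVISION.
      *    ... populate the host variables from the converted record ...
           EXEC SQL
               INSERT INTO HIST.CUSTOMER (CUST_ID, CUST_NAME)
                    VALUES (:WS-CUST-ID, :WS-CUST-NAME)
           END-EXEC
           IF SQLCODE NOT = 0
              DISPLAY 'INSERT FAILED, SQLCODE = ' SQLCODE
           END-IF
           GOBACK.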
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

IBM have a product called Gladstone (sp?) that IIRC is at heart a COBOL code generator.
So they might just have the sort of benchmarks you need.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

IBM have a product called DataStage (mainframe edition) that can generate good quality COBOL as well as the JCL to compile and run it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

The phrase "mainframe to DB2" leaves me a bit confused. Is this sequential files to IBM DB2 v9? Are they ISAM or VSAM files? And the biggest question of all, for me: What is your maintenance interface for DB2? Is it BMC, or something else?

Nothing beats the bulk unload/load utilities for performance with DB2 tables. So unless you can clarify to show my error, you really need to consider this third option.

My not-so-humble opinion about the code base is that there is no advantage either way. I don't know anything about the run-time environment for DS on the mainframe, but if it's at all similar to Unix, I would choose COBOL, and use it to prepare the data for a formatted bulk load. If you don't have the bulk option, then I must defer to those with DS-mainframe experience.
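To make "prepare the data for a formatted bulk load" concrete, here is a rough sketch. The DD names, record layout, and column positions are all invented; the matching LOAD control statement would simply map the same columns by position:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. HISTPREP.
      * Sketch: reformat legacy records into a fixed layout that a
      * DB2 LOAD step can pick up by column position
      * (e.g. CUST-ID in columns 1-9, CUST-NAME in columns 10-49).
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT LEGACY-FILE ASSIGN TO LEGACYIN.
           SELECT LOAD-FILE   ASSIGN TO LOADOUT.
       DATA DIVISION.
       FILE SECTION.
       FD  LEGACY-FILE.
       01  LEGACY-REC           PIC X(200).
       FD  LOAD-FILE.
       01  LOAD-REC.
           05  LR-CUST-ID       PIC 9(9).
           05  LR-CUST-NAME     PIC X(40).
       WORKING-STORAGE SECTION.
       01  WS-EOF               PIC X VALUE 'N'.
       PROCEDURE DIVISION.
           OPEN INPUT LEGACY-FILE OUTPUT LOAD-FILE
           PERFORM UNTIL WS-EOF = 'Y'
               READ LEGACY-FILE
                   AT END MOVE 'Y' TO WS-EOF
                   NOT AT END
      *                ... apply the conversion/edit rules here ...
                       MOVE LEGACY-REC(1:9)   TO LR-CUST-ID
                       MOVE LEGACY-REC(10:40) TO LR-CUST-NAME
                       WRITE LOAD-REC
               END-READ
           END-PERFORM
           CLOSE LEGACY-FILE LOAD-FILE
           GOBACK.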
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Because this is a migration, I would expect that code written by the legacy team, who already have the knowledge of the data and the libraries to extract it, will be faster at getting the data out of the legacy database, so COBOL would be the way to go for the extract. In terms of loading to the target (where we have incomplete information), you may be able to accelerate it via DataStage, especially if you can use COBOL copybooks to read complex flat files and deliver them to relational DB2 tables. The extract team would identify filtering and archiving and deliver the data in a format you can use.
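By "complex flat files" I mean the sort of layout a legacy copybook typically describes, for example packed fields and a repeating group, which the Complex Flat File stage can read straight from the copybook. Field names and sizes below are invented, just to illustrate:

       01  CUSTOMER-HIST-REC.
           05  CH-CUST-ID            PIC S9(9)      COMP-3.
           05  CH-RECORD-TYPE        PIC X.
           05  CH-NAME               PIC X(40).
           05  CH-BALANCE            PIC S9(11)V99  COMP-3.
           05  CH-HISTORY-CNT        PIC S9(4)      COMP.
           05  CH-HISTORY            OCCURS 1 TO 12 TIMES
                                     DEPENDING ON CH-HISTORY-CNT.
               10  CH-TXN-DATE       PIC X(8).
               10  CH-TXN-AMT        PIC S9(9)V99   COMP-3.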

Any method you use will probably load the data into the target DB2 the same way: bulk loads. DataStage may handle the intermediate transformation and the parallelism of those loads more efficiently, and it can synchronise the partitioning of the data transformation to match the partitioning of the target DB2 tables. I would use COBOL to get the data out of the legacy system, then use scripts to load it to the target if there is minimal data quality cleansing and transformation, or use DataStage if there is a lot.