I'm doing a high-level plan for a historical data conversion from mainframe to DB2. I have used DataStage in the past, but my current client has often used custom COBOL code to perform conversions. I want to outline some pros and cons for my client before a decision is made.
Some of the common disadvantages of COBOL do not apply here. The client has enough COBOL programmers to perform a conversion, and because it is a one-time historical conversion, long-term maintainability is not a huge concern. On the other hand, other divisions of the organization use DataStage, so it is an option.
I believe that performance at high volumes should be an important factor in the decision. Some of the datasets to be converted will contain > 100 million records. Does anyone have performance benchmarks for loading to DB2 using COBOL code vs. DataStage? Or does IBM have some information available?
DataStage vs. COBOL performance benchmarks
Don't have any benchmarks but my experience suggests that well-written COBOL would probably beat DataStage fairly easily. So the question boils down to just how good these COBOL programmers are, particularly with respect to accessing (remote?) DB2 databases.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
The phrase "mainframe to DB2" leaves me a bit confused. Are these sequential files going to IBM DB2 v9? Are they ISAM or VSAM files? And the biggest question of all, for me: what is your maintenance interface for DB2? Is it BMC, or something else?
Nothing beats the bulk unload/load utilities for performance with DB2 tables, so unless you can clarify and show where I'm wrong, you really need to consider this third option.
My not-so-humble opinion about the code base is that there is no advantage either way. I don't know anything about the run-time environment for DataStage on the mainframe, but if it's at all similar to Unix, I would choose COBOL and use it to prepare the data for a formatted bulk load. If you don't have the bulk option, then I must defer to those with mainframe DataStage experience.
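To make the bulk-load route concrete, here is a minimal sketch of a DB2 for z/OS LOAD utility control statement reading a fixed-position input dataset. All dataset, table, and column names (CONV.CUSTOMER.UNLOAD, CONV.CUSTOMER, CUST_ID, and so on) are hypothetical placeholders, not anything from this thread, and the exact field specifications would come from the site's copybooks:

```
//LOADCUST EXEC DSNUPROC,UID='CONVLOAD',UTPROC=''
//SYSREC   DD DSN=CONV.CUSTOMER.UNLOAD,DISP=SHR
//SYSIN    DD *
  LOAD DATA REPLACE LOG NO
    INTO TABLE CONV.CUSTOMER
    ( CUST_ID    POSITION(1:6)    INTEGER EXTERNAL,
      CUST_NAME  POSITION(7:26)   CHAR(20),
      BALANCE    POSITION(27:35)  DECIMAL EXTERNAL(9,2) )
/*
```

With LOG NO and REPLACE, the utility avoids per-row logging, which is a large part of why it outruns row-at-a-time inserts from either COBOL or an ETL tool at the 100-million-record scale the original poster mentions.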
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
Because this is a migration, I would expect code written by the legacy team, who already know the data and the libraries needed to extract it, to be faster at getting the data out of the legacy database, so COBOL would be the way to go for extraction. For loading to the target (where we have incomplete information) you may be able to accelerate things with DataStage, especially if you can use COBOL copybooks to read complex flat files and deliver them to relational DB2 tables. The extract team would handle filtering and archiving and deliver the data in a format you can use.
Whichever method you use will probably load the data into the target DB2 the same way: bulk loads. DataStage may handle the intermediate transformation and the parallelism of those loads more efficiently, and it can synchronise the partitioning of the data transformation with the partitioning of the target DB2 tables. I would use COBOL to get the data out of the legacy system; to load it into the target, I would use scripts if there is minimal data-quality cleansing and transformation, or DataStage if there is a lot.
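To illustrate the "use copybooks to read flat files and prepare them for a bulk load" idea, here is a small Python sketch that slices fixed-width records according to a hand-derived copybook layout and emits delimited rows suitable as bulk-load input. The copybook fields (CUST-ID, CUST-NAME, BALANCE) and their PICTUREs are invented for illustration; real layouts would come from the legacy team:

```python
# Hypothetical sketch: turn fixed-width records described by a COBOL
# copybook into delimited rows for a DB2 bulk LOAD. Layout assumed:
#   05 CUST-ID    PIC 9(6).
#   05 CUST-NAME  PIC X(20).
#   05 BALANCE    PIC 9(7)V99.   (V = implied decimal point, 2 digits)
from decimal import Decimal

# (column name, offset, length, kind) derived by hand from the copybook.
LAYOUT = [
    ("CUST_ID", 0, 6, "num"),
    ("CUST_NAME", 6, 20, "char"),
    ("BALANCE", 26, 9, "dec2"),  # 7 integer digits + 2 implied decimals
]
RECORD_LEN = 35

def parse_record(rec: str) -> dict:
    """Slice one fixed-width record into typed fields."""
    if len(rec) != RECORD_LEN:
        raise ValueError(f"expected {RECORD_LEN} chars, got {len(rec)}")
    row = {}
    for name, off, length, kind in LAYOUT:
        raw = rec[off:off + length]
        if kind == "num":
            row[name] = int(raw)
        elif kind == "dec2":
            row[name] = Decimal(raw) / 100  # re-insert the implied decimal
        else:
            row[name] = raw.rstrip()        # trim PIC X trailing spaces
    return row

def to_delimited(rec: str, sep: str = "|") -> str:
    """Emit one delimited line for the load utility's input dataset."""
    row = parse_record(rec)
    return sep.join(str(row[name]) for name, *_ in LAYOUT)
```

Note this sketch assumes the records have already been converted to text; real mainframe extracts would also involve EBCDIC-to-ASCII translation and packed-decimal (COMP-3) handling, which is exactly the knowledge the legacy COBOL team brings.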
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney