What are the recommendations for upgrading v8.5 to v9.1?

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

What are the recommendations for upgrading v8.5 to v9.1?

Post by jfarber1 »

My project is scoping out the upgrade and I am in charge. I've searched the forum for the technical obstacles involved in upgrading.

Should I upgrade to 8.7 first and then use the "in-place" tool to upgrade to 9.1 OR just dive right into 9.1? Those who have done the upgrade from 8.5 to 9.1, what issues did you find as you performed the upgrade?
Jason
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

What is your current DataStage version?
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

currentVersion = 8.5.0.2
Jason
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Jump in head first.

You are most likely going to be upgrading your OS as well, right?

The majority of your time will be spent on regression testing of the jobs. 9.1 is a lot stricter on metadata rules. You need to invest time into fixing sloppy legacy practices (if you had sloppy code). Nobody from the peanut gallery can guesstimate that for you.

Because it's a major upgrade, you might not want to have downtime on your existing PROD environment, so that might mean a new set of servers. That is the safer option, albeit one that costs money.
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

Thanks for the response.

We don't need to upgrade our OS. The OS is compatible (Windows 2008 Server Enterprise x64 SP2).

Can someone recommend a data integrity testing solution? (i.e., how would I go about comparing the data results for each job in 8.5 versus 9.1 accurately and efficiently?)
Jason
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I'd say a clean 9.1 install is always preferred. You can't do an upgrade in place from 8.5 to 8.7 anyway, so if you have to do a clean install, just go for 9.1.

As far as testing, that depends a lot on the application design and business requirements. I've upgraded healthcare and banking sites where it has to be signed off as "bit-for-bit" compatible for millions of rows (or if not, each exception has to be documented and justified). I've done others where the critical jobs were highly scrutinized, and others were only spot-checked.

If your data sources all support multiple data pulls (ie: they don't change between pulls and the pulls are non-destructive), then you can run jobs in parallel and compare results at the sequential file (UNIX cmp or equivalent) or database table level (using SQL).
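The file-level comparison described above can be scripted. Here is a minimal Python sketch (illustrative only; the function names are my own, not part of any DataStage tooling) that checks two sequential-file outputs byte for byte by hashing them, similar in spirit to UNIX cmp:

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks
    so large sequential files don't have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def outputs_match(old_path, new_path):
    """True if the old (8.5) and new (9.1) job outputs are
    byte-for-byte identical."""
    return file_digest(old_path) == file_digest(new_path)
```

Hashing also lets you record digests from the old environment once and compare later, rather than keeping both sets of files around.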

Comparison is relatively easy if jobs have parameters set up for schemas / tables and paths / filenames. This allows for substitution of alternatives for targets. If the job details are hardcoded then you'll need to add time to the schedule to make them parameters (which they should have been in the first place!).
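To see why parameterized targets make side-by-side testing cheap, here is a small Python sketch of parameter substitution. The #NAME# syntax mirrors how DataStage references job parameters, but this resolver itself is purely illustrative:

```python
import re


def resolve(template, params):
    """Expand #NAME# references in a template from a dict of
    job-parameter values, in the style of DataStage parameters."""
    return re.sub(r"#(\w+)#", lambda m: str(params[m.group(1)]), template)


# The same job design writes to different targets per environment:
old_target = resolve("#SCHEMA#.#TABLE#", {"SCHEMA": "PROD85", "TABLE": "CUSTOMER"})
new_target = resolve("#SCHEMA#.#TABLE#", {"SCHEMA": "TEST91", "TABLE": "CUSTOMER"})
```

With hardcoded targets, every comparison run would instead require editing and recompiling the job, which is exactly the schedule time referred to above.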

If your pulls are destructive (ie: deletes occur after successful pulls), then you'll need to either comment those out temporarily (always risky if you accidentally forget to restore later) or set up a duplicate source environment to use for testing.

If your testing source is constantly changing, and you can't get static copies for testing, then you'll have to get very creative. I was able to overcome that in the past by adding timestamp restrictions, etc., but it isn't easy.

No matter what, you are going to need a lot of additional database table space during testing. Your DBA will also need to make sure that the setup process is repeatable, because you'll probably need to "re-synch" several times during testing.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I would jump straight to 9.1.2 with fix packs and skip 8.7.

You can do a co-exist upgrade where 8.5 and 9.1 run on the same servers for a period of time so you can regression test the results. The strength here is that you have time to test and your production server does not have a long outage. The drawbacks are a stressed machine, the need for a new set of port numbers that may not be accessible, and possible differences in the pre-requisites and database between the two versions.

You can do the uninstall re-install. This works okay in a test environment where you have an 8.7 test environment and a separate 9.1 test environment and you do parallel runs for regression testing. This has the benefit of retaining your port numbers. It has the drawback that your production server may have to be unavailable for a couple of days as you go through the uninstall and reinstall process.
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

Thank you for the recommendations, vmcburney and asorrell. I'm currently attempting to replicate the environment on a local VM. So far, I've hit one obstacle that I think I will overcome. I could not install because of this error: "information server plug-in "com.ibm.cic.agent.core.nativeinstalladapter.win32" was unable instantiate". I think I've fixed this by installing IBM Installation Manager; there are plugins/tools that the server requires for the install.

Anyway, I have another question: Is it better to re-write the jobs from scratch? This way we could redefine best practices and perhaps use new features. This is a unique opportunity to refactor DataStage code, but I'm curious whether y'all think it's simply not worth doing.
Jason
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

As part of the upgrade, we are also looking into source control, which we currently don't have. What is the level of effort for using RCC and CVS? Are there other source control solutions that are easier? I want to keep a historical record of the job and sequence versions.
Jason
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

jfarber1 wrote:Is it better to re-write the jobs from scratch?
Forgot about this question - no. Goodness no. Unless... perhaps... you have a very small number of jobs and you focus more on 'best practices' because you currently live in the Wild West and you need a lot of that refactoring you mentioned. :wink:

If you think a job would benefit from any 'new features', then update it.

My .02 on that subject.
-craig

"You can never have too many knives" -- Logan Nine Fingers
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

-However-

It is a good time to make Best Practice changes since:

1) You can bury some of the conversion costs as part of the upgrade
2) You are going to have to test every job anyway.
3) Some changes (ie: parameter sets) are easier to do in an entire job group.


So do what you can, but I agree with Craig, don't just rewrite everything. Concentrate on areas where you'll get a lot of impact and leave the small, static feeds that don't do much alone.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

The pain area while I was performing an upgrade was jobs coded to pick an arbitrary row from among duplicates; that feed was used in multiple places, and we had to document every such instance.
- Zulfi