What are the recommendations for upgrading v8.5 to v9.1?

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

What are the recommendations for upgrading v8.5 to v9.1?

Post by jfarber1 »

My project is scoping out the upgrade and I am in charge. I've searched the forum for the technical obstacles involved in upgrading.

Should I upgrade to 8.7 first and then use the "in-place" tool to upgrade to 9.1 OR just dive right into 9.1? Those who have done the upgrade from 8.5 to 9.1, what issues did you find as you performed the upgrade?
Jason
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

What is your current DataStage version?
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

currentVersion = 8.5.0.2
Jason
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Jump in head first.

You are most likely going to be upgrading your OS as well, right?

The majority of your time will be spent on regression testing of the jobs. 9.1 is a lot stricter on metadata rules. You need to invest time into fixing sloppy legacy practices (if you had sloppy code). Nobody from the peanut gallery can guesstimate that for you.

Because it's a major upgrade, you might not want to have downtime on your existing PROD environment, so that might mean a new set of servers. That is the safer option, albeit one that costs money.
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

Thanks for the response.

We don't need to upgrade our OS. The OS is compatible (Windows 2008 Server Enterprise x64 SP2).

Can someone recommend a data integrity testing solution? (i.e., how would I go about comparing the data results for each job in 8.5 versus 9.1 accurately and efficiently?)
Jason
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I'd say a clean 9.1 install is always preferred. You can't do an upgrade in place from 8.5 to 8.7 anyway, so if you have to do a clean install, just go for 9.1.

As far as testing, that depends a lot on the application design and business requirements. I've upgraded healthcare and banking sites where it has to be signed off as "bit-for-bit" compatible for millions of rows (or if not, each exception has to be documented and justified). I've done others where the critical jobs were highly scrutinized, and others were only spot-checked.

If your data sources all support multiple data pulls (ie: they don't change between pulls and the pulls are non-destructive), then you can run jobs in parallel and compare results at the sequential file (UNIX cmp or equivalent) or database table level (using SQL).
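The file-level comparison described above can be scripted. Here is a minimal Python sketch (illustrative only; the function names are my own, not part of any DataStage tooling) that checks two sequential-file outputs byte for byte by hashing them, similar in spirit to UNIX cmp:

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks
    so large sequential files don't have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def outputs_match(old_path, new_path):
    """True if the old (8.5) and new (9.1) job outputs are
    byte-for-byte identical."""
    return file_digest(old_path) == file_digest(new_path)
```

Hashing also lets you record digests from the old environment once and compare later, rather than keeping both sets of files around.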

Comparison is relatively easy if jobs have parameters set up for schemas / tables and paths / filenames. This allows for substitution of alternatives for targets. If the job details are hardcoded then you'll need to add time to the schedule to make them parameters (which they should have been in the first place!).
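To see why parameterized targets make side-by-side testing cheap, here is a small Python sketch of parameter substitution. The #NAME# syntax mirrors how DataStage references job parameters, but this resolver itself is purely illustrative:

```python
import re


def resolve(template, params):
    """Expand #NAME# references in a template from a dict of
    job-parameter values, in the style of DataStage parameters."""
    return re.sub(r"#(\w+)#", lambda m: str(params[m.group(1)]), template)


# The same job design writes to different targets per environment:
old_target = resolve("#SCHEMA#.#TABLE#", {"SCHEMA": "PROD85", "TABLE": "CUSTOMER"})
new_target = resolve("#SCHEMA#.#TABLE#", {"SCHEMA": "TEST91", "TABLE": "CUSTOMER"})
```

With hardcoded targets, every comparison run would instead require editing and recompiling the job, which is exactly the schedule time referred to above.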

If your pulls are destructive (ie: deletes occur after successful pulls), then you'll need to either comment those out temporarily (always risky if you accidentally forget to restore later) or set up a duplicate source environment to use for testing.

If your testing source is constantly changing, and you can't get static copies for testing, then you'll have to get very creative. I was able to overcome that in the past by adding timestamp restrictions, etc., but it isn't easy.

No matter what, you are going to need a lot of additional database table space during testing. Your DBA will also need to make sure that the setup process is repeatable, because you'll probably need to "re-synch" several times during testing.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I would jump straight to 9.1.2 with fix packs and skip 8.7.

You can do a co-exist upgrade where 8.5 and 9.1 run on the same servers for a period of time so you can regression test the results. The strength here is that you have time to test and your production server does not have a long outage. The drawbacks are a stressed machine, the need for a new set of port numbers that may not be accessible, and possible differences in the pre-requisites and database between the two versions.

You can do the uninstall re-install. This works okay in a test environment where you have an 8.7 test environment and a separate 9.1 test environment and you do parallel runs for regression testing. This has the benefit of retaining your port numbers. It has the drawback that your production server may have to be unavailable for a couple of days as you go through the uninstall and reinstall process.
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

Thank you for the recommendations, vmcburney and asorrell. I'm currently attempting to replicate the environment on a local VM. So far, I've hit one obstacle that I think I will overcome. I could not install because of this error: "information server plug-in "com.ibm.cic.agent.core.nativeinstalladapter.win32" was unable instantiate". I think I've fixed this by installing IBM Installation Manager; there are plugins/tools that the server requires for the install.

Anyway, I have another question: Is it better to re-write the jobs from scratch? This way we could redefine best practices and perhaps use new features. This is a unique opportunity to refactor DataStage code, but I'm curious whether y'all think it's simply not worth doing.
Jason
jfarber1
Premium Member
Posts: 6
Joined: Mon Oct 08, 2012 11:34 am

Post by jfarber1 »

As part of the upgrade, we are also looking into source control, which we currently don't have. What is the level of effort for using RCC and CVS? Are there other source control solutions that are easier? I want to keep a historical record of the job and sequence versions.
Jason
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

jfarber1 wrote:Is it better to re-write the jobs from scratch?
Forgot about this question - no. Goodness no. Unless... perhaps... you have a very small number of jobs and you focus more on 'best practices' because you currently live in the Wild West and you need a lot of that refactoring you mentioned. :wink:

If you think a job would benefit from any 'new features', then update it.

My .02 on that subject.
-craig

"You can never have too many knives" -- Logan Nine Fingers
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

-However-

It is a good time to make Best Practice changes since:

1) You can bury some of the conversion costs as part of the upgrade
2) You are going to have to test every job anyway.
3) Some changes (ie: parameter sets) are easier to do in an entire job group.


So do what you can, but I agree with Craig, don't just rewrite everything. Concentrate on areas where you'll get a lot of impact and leave the small, static feeds that don't do much alone.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

The pain area while I was performing an upgrade was jobs coded to pick an arbitrary row from among duplicates; that feed was used in multiple places, and we had to document every such instance.
- Zulfi