Where to run IA?

Posted: Fri Apr 15, 2016 5:13 am
by qt_ky
We have all the Information Server products, including Information Governance Catalog, running all tiers on a single SMP server (one per environment).

We are following an old IBM recommendation (perhaps from 8 to 10 years ago, pre-IGC), which is to run all IA tasks from a non-production Information Server, so that the resource-intensive IA tasks will not disrupt production DataStage, QualityStage, and ISD jobs.

Does that logic still hold true today? I have the impression that it is now best to do all the IGC work and run the IA tasks on the production server only, so that when you publish IA results, the IGC users can view them, as they are all in the same metadata repository.

What is the modern answer to get all the benefits in production without IA tasks being disruptive?

Posted: Fri Apr 15, 2016 10:51 am
by PaulVL
I'd be more worried about running IA against a PROD target database / table.

Sure your IADB will be used to juggle the numbers, but that interaction with the initial data source can be heavy.

DBAs will slap you if you do that in PROD without their consent. Political rain will fall.

They will frown at you if you do it in non-prod.

Security waivers would be required to copy the prod data to a non-prod database, of course. Data masking concerns may apply... but... the impact on the target source is the big issue IMHO.

Posted: Fri Apr 15, 2016 12:21 pm
by JRodriguez
Nowadays the strategy for IA tasks is the same as 10 years ago: it depends on how important the results or findings from IA tasks are to the company. The tool is more agile these days and you have a bit more control, but the pain is still present.

I would say that the strategy should be based on the importance of the governance process in your company and on how fast you need to react to results, exceptions, metrics, etc. from IA tasks. If you need to react to exceptions thrown by a DQ rule because they would cause a major issue for the company (payment systems, compliance, fines from federal regulators, loss of credibility, etc.), then I would say that all IA tasks should be executed in production against the production source/target, with proper communication to all parties involved. Of course, this will require more resources on the box hosting your IGC/IA tool.

The most disruptive ones are the profiling tasks, and those could be scheduled at times when they are less disruptive. DQ rules could be called/executed directly from DataStage processes, where we could leverage the DS advantages....