Adding a DataStage Engine to an existing InfoSphere Service

Posted: Wed Jun 11, 2014 10:21 am
by nvalia
Hi All,

Enterprise Edition - v8.7

We are hitting a performance bottleneck because multiple projects use the same ETL server for data integration, loading both OLTP and DW databases.
The current topology has all 3 tiers on the same server!

Approach 1
(post job optimization and environment upgrades)
Move the Engine tier to a separate server (or maybe have 2 Engines instead of 1)

Is this doable in an existing environment, or does it need a full re-install?
Do we need to alter existing job designs in any way?
(I believe recompilation may be required)
Any other considerations if we go this route?

Approach 2
Separate the ETL servers themselves: one catering to the OLTP cycle, which currently takes most of the time and resources, and the other to the DW cycles
(Using this approach we could add more Datastage projects in the DW cycle in the future as needed)

Will this require purchasing a new InfoSphere Information Server license?
Would migrating jobs to the new server be a major challenge, or more like an export/import/compile of the jobs (and test!)?
Any other considerations if we go this route?

Thanks,
VN

Posted: Wed Jun 11, 2014 4:45 pm
by ray.wurlod
No matter what you do, if you increase the number of processors (CPU cores) you will be up for increased licensing cost. Licensing is based on "processor value units" which also take into account processor type and speed.
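As a rough illustration of the arithmetic only: the total is the number of cores multiplied by the PVUs assigned per core. The core counts and the per-core value below are hypothetical placeholders, not figures from this thread; the real per-core value has to be looked up for your processor in IBM's PVU table.

```python
def total_pvus(cores: int, pvus_per_core: int) -> int:
    """Total PVUs = licensed cores x PVUs assigned per core."""
    if cores < 0 or pvus_per_core < 0:
        raise ValueError("cores and pvus_per_core must be non-negative")
    return cores * pvus_per_core


# Hypothetical figures for illustration only.
PVUS_PER_CORE = 70          # assumed value; check IBM's PVU table
current_cores = 8           # assumed size of the existing single server
extra_engine_cores = 8      # assumed size of an additional engine server

current = total_pvus(current_cores, PVUS_PER_CORE)
proposed = current + total_pvus(extra_engine_cores, PVUS_PER_CORE)

print("current PVUs: ", current)
print("proposed PVUs:", proposed)
print("additional PVUs to license:", proposed - current)
```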

Job designs should not need to be changed, or even re-compiled. What will need to be changed is the configuration file(s).

Posted: Wed Jun 11, 2014 7:25 pm
by qt_ky
Having all 3 server tiers on one computer (default install) is not necessarily bad.

Do you know for certain what is causing the bottleneck? If it's a database I/O bottleneck, then investing in additional DataStage servers and PVU licenses may not help your situation much. Many times, job designs can be optimized, database settings tuned, and then all of a sudden the existing server has room to grow.

Posted: Thu Jun 12, 2014 9:13 am
by nvalia
Yes, we plan to validate job optimization and the other techniques you mentioned before we decide on either approach.

If we separate the Services and Engine tiers, I assume we should give the Engine more CPU and physical memory than the Services tier, since it will do most of the processing. Or should both be sized the same? Are there any guidelines on this?
(I do understand this will need more licenses)

Thanks,
VN

Posted: Thu Jun 12, 2014 2:52 pm
by qt_ky
See the middle of this topic for a sizing estimate process/document description.

See the middle of this topic for a related IBM Redbook link.

Posted: Mon Jun 16, 2014 8:39 am
by nvalia
1. If we have 2 Engine tiers, can we use the configuration file to restrict jobs from a specific project to run only on one engine, while the other project(s) run on the 2nd engine?

2. If we want to add a new Engine tier, does this need a Re-Install?

3. Any other considerations if we take this approach?

Thanks,
VN

Posted: Mon Jun 16, 2014 5:07 pm
by ray.wurlod
1. Yes.

2. Yes on the new engine tier (obviously). You would also have to configure the services tier so that it "owns" the additional engine machine.

3. The main other consideration is setting up a trusted relationship between the engine tiers if you're going to use the cluster to run parallel jobs. A secondary consideration will be to avoid too much re-partitioning, since that now means that rows have to be moved across the network.
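
To make point 1 a little more concrete, here is a minimal sketch (in Python, purely to generate the text) of two parallel configuration files, each listing nodes on only one engine host. The host names (engine1, engine2), node names, and directory paths are all hypothetical. The assumption is that each project sets APT_CONFIG_FILE to its own file, so its parallel jobs only use the nodes listed there.

```python
def render_config(nodes):
    """Render a parallel configuration file.

    nodes: list of (node_name, fastname, disk_dir, scratch_dir) tuples.
    """
    lines = ["{"]
    for name, fastname, disk, scratch in nodes:
        lines += [
            f'  node "{name}"',
            "  {",
            f'    fastname "{fastname}"',
            '    pools ""',
            f'    resource disk "{disk}" {{pools ""}}',
            f'    resource scratchdisk "{scratch}" {{pools ""}}',
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical hosts and paths; substitute your own.
oltp_config = render_config([
    ("oltp1", "engine1", "/data/ds/datasets", "/data/ds/scratch"),
    ("oltp2", "engine1", "/data/ds/datasets", "/data/ds/scratch"),
])
dw_config = render_config([
    ("dw1", "engine2", "/data/ds/datasets", "/data/ds/scratch"),
    ("dw2", "engine2", "/data/ds/datasets", "/data/ds/scratch"),
])

print(oltp_config)
print(dw_config)
```

Keeping each file to a single host also sidesteps the re-partitioning concern in point 3, because no rows have to cross the network between nodes of the same job.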