Some IA tasks invoke Datastage?

This forum contains ProfileStage posts and now focuses on newer versions of InfoSphere Information Analyzer.

truenorth
Participant
Posts: 139
Joined: Mon Jan 18, 2010 4:59 pm
Location: San Antonio

Post by truenorth »

Don't know if the IA Redbook December 2007 edition is obsolete, but it states on page 28:
The IBM WebSphere Information Analyzer Handler is an implementation of the ISF Handler/Agent framework. It implements the interfaces required to participate in that pattern. Its primary responsibility is to receive messages from the Authoring Service, establish the connection to the DataStage instance, invoke the OSHGenerator for the creation of the appropriate analysis flow, and provide command and control against that analysis flow.

The IBM WebSphere Information Analyzer Handler uses the proprietary Java API (DS4J) into the DataStage Engine database (previously known as Universe). The DataStage job executes as a Run Now job initiated by the IBM WebSphere Information Analyzer Handler. In other words, the ISF Scheduling Service is invoked by IBM WebSphere Information Analyzer (either as Run Now or scheduled) for the analysis task, which is then executed as a DataStage Run Now job initiated by the IA Handler through the DS4J API.

The Orchestrate Shell (OSH) Generator is responsible for taking the definition of the analysis flow to be executed (with all of the parameters used to fill in the corresponding execution flow template) and creating the appropriate OSH script, which is then laid down within the PXEngine execution environment and executed as part of the IA Handler command and control structure. In other words, it puts a script wrapper around the IBM WebSphere Information Analyzer job created by the Authoring Service so that it can be delivered to DataStage as a DataStage parallel job.

Note: The OSHGenerator is invoked for the column analysis, multi-column primary key analysis, and data sampling because these tasks require an IBM WebSphere DataStage job to be run. The other analyses, such as foreign key analysis, cross domain analysis, and baseline analysis, take advantage of the column analysis data that is stored in the IADB and do not require an IBM WebSphere DataStage job to be run.
Is this accurate?
Todd Ramirez
Sr Consultant, Data Quality
San Antonio TX
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It is accurate. It is still accurate.

Of course, all DataStage parallel jobs run osh. Information Analyzer bypasses the graphical design interface and instead generates appropriate osh scripts directly.

Because all required frequencies are in IADB once column analysis has been completed, the other analyses can be performed as relatively simple queries directly against IADB. The results, as always, are stored in XMETA.
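
To illustrate why the later analyses need no DataStage job, here is a minimal sketch of a cross-domain style overlap check run purely against stored value frequencies. It uses an in-memory SQLite database as a stand-in for IADB. The table name column_freq, its columns, the sample data, and the overlap function are all invented for illustration; the real IADB schema is not shown in this thread and will differ.

```python
import sqlite3

# Hypothetical stand-in for IADB: one row per (column, distinct value, frequency).
# Table and column names are invented for this sketch, not the real IADB schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE column_freq (col_name TEXT, value TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO column_freq VALUES (?, ?, ?)",
    [
        ("ORDERS.CUST_ID", "C1", 40),
        ("ORDERS.CUST_ID", "C2", 25),
        ("ORDERS.CUST_ID", "C9", 5),
        ("CUSTOMER.ID", "C1", 1),
        ("CUSTOMER.ID", "C2", 1),
        ("CUSTOMER.ID", "C3", 1),
    ],
)


def overlap(conn, child_col, parent_col):
    """Return (matched, total) distinct values of child_col found in parent_col."""
    # Join the two stored frequency distributions on value; no source data is re-read.
    matched = conn.execute(
        "SELECT COUNT(*) FROM column_freq c "
        "JOIN column_freq p ON c.value = p.value "
        "WHERE c.col_name = ? AND p.col_name = ?",
        (child_col, parent_col),
    ).fetchone()[0]
    total = conn.execute(
        "SELECT COUNT(*) FROM column_freq WHERE col_name = ?",
        (child_col,),
    ).fetchone()[0]
    return matched, total


matched, total = overlap(conn, "ORDERS.CUST_ID", "CUSTOMER.ID")
print(f"{matched} of {total} distinct values match")
```

The point of the sketch is only that once the per-column frequency distributions exist, a foreign key or cross-domain comparison reduces to a join over those distributions rather than another pass over the source tables.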
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post by truenorth »

Many thanks, Ray!
Todd Ramirez
Sr Consultant, Data Quality
San Antonio TX