Issue in design metadata for DataStage jobs


arnabdey
Participant
Posts: 50
Joined: Wed Jan 10, 2007 5:56 am


Post by arnabdey »

Hi

I have a set of DataStage jobs. Some of them access a DB2 database through the ODBC stage and others through the DB2 Enterprise stage. I have observed that design-time metadata for the tables is stored only for tables accessed through the ODBC stage, not for those accessed through the DB2 Enterprise stage.

I arrived at this conclusion after trying to view the lineage of these jobs in Metadata Workbench. We can trace lineage only for the jobs that use the ODBC stage, not for the others. Please help with this.

Thanks
Arnab
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

You are (probably) simply observing the difference between "Table Definitions" and "Shared Table Definitions". In 8.x a new object, called a Shared Table and also known as a Physical Data Resource, was created to provide a "collection" of columns that is used among DataStage, Information Analyzer, FastTrack, Metadata Workbench and Business Glossary. The newer import mechanisms (Connectors & Bridges) automatically create these objects. The older import mechanisms (Plugin imports, XML, etc.) still create just Table Definitions. [If desired, you can use "Create Shared Table from Table" to "push" these collections of columns into a Shared Table.]

It seems confusing at first, but ultimately the Shared Table is the identifier that is used across the tools, and enables lineage to "other" tooling that is outside of Information Server (Cognos, BI, etc.).

DataStage is unique because of its past: the Table Definition is not even necessary. It is, in reality, a temporary holding place for columns. The other tools live and breathe on these Shared Tables for their columnar metadata.

What does that mean for your lineage?

For starters, there are many applications in DataStage where Shared Tables are unimportant for lineage. Perform Advanced Services/Automated Services against your DataStage project, and go into a Job that you know is linked to many upstream Jobs. Click on it and then click on one of its final Stages (say, DB2 API). Perform a data lineage report directly from there. This is a "Stage" level report. It works only on DataStage Job design metadata and doesn't care whether you have Table Definitions or Shared Table Definitions.

Shared Tables are important when you want to see lineage from a "specific" physical location (this table in this database on this host), or when you want lineage to or from a Cognos report or other imported BI tool.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>

Post by arnabdey »

Hi Ernie, thanks for your response. I have two questions:

1. How should I create or import metadata into Shared Table Definitions and use it for the source/target DB2 stages in my jobs?

2. If I export this whole set of jobs to another environment, what steps do we need to follow to ensure that the lineage works well?

Thanks
Arnab

Post by eostic »

What sort of lineage are you looking for? If you are looking for lineage "through the jobs" to identify transformations and jobs called one after the other, then you do not need to do anything. That lineage works regardless of the tables. Export the jobs and the lineage will work the same, with or without any tables.

If you want lineage through to the "physical" tables (needed mostly only if you want to connect to a BI report, as noted above), today you will need to import those tables via bridges, or use "Create Shared Table from Table" in the new environment.

I know that istool is being updated later this year to handle the movement of Shared Tables.

Ernie