Usage analysis in Datastage

dr.murthy · Post by **dr.murthy** » Thu Jul 12, 2012 12:03 am

Hi,

I have a requirement to identify the list of tables and sequential files that are used in DataStage jobs across the project.
For example, TableA is used in five jobs: as a source in two jobs, as a target in one job, and as a reference in two jobs.

Is there any tool or technique to identify this information?

Thanks in advance

ArndW · Post by **ArndW** » Thu Jul 12, 2012 1:54 am

Since table names can be parameters there isn't an easy generic way to do this. If you always use the same table column definitions then you can see where that has been used (but it won't tell you whether as source or target).

I would think about doing an export of the project into either .dsx or .xml and then parse that file looking for the keywords and table names.

You could program part of this using the API calls:
- Select all jobs
- For each job, get the list of passive stages of type sequential and <your database>

This list could be used to scan the export file.
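
The export-scanning idea could be sketched roughly as below. This is only a minimal sketch, not a tested tool: it assumes a .dsx export in which each job sits between `BEGIN DSJOB` and `END DSJOB` lines and the first `Identifier "..."` line after `BEGIN DSJOB` holds the job name, so check that against your own export first. The function name `find_table_usage` and the table list are made up for illustration. It only reports which jobs mention a name; it cannot tell source from target or reference, and parameterized table names will not match literally.

```python
import re
from collections import defaultdict

# Assumed DSX layout (verify against a real export): every job is wrapped in
# BEGIN DSJOB ... END DSJOB, and the first Identifier "..." line is the job name.
JOB_START = re.compile(r'^\s*BEGIN DSJOB')
JOB_END = re.compile(r'^\s*END DSJOB')
IDENTIFIER = re.compile(r'^\s*Identifier "([^"]+)"')


def find_table_usage(lines, table_names):
    """Map each table name to the sorted list of jobs whose export text mentions it."""
    # Match whole names only, so searching for TableA does not hit TableAB.
    patterns = {
        name: re.compile(
            r'(?<![A-Za-z0-9_])' + re.escape(name) + r'(?![A-Za-z0-9_])',
            re.IGNORECASE,
        )
        for name in table_names
    }
    usage = defaultdict(set)
    in_job = False
    current_job = None
    for line in lines:
        if JOB_START.match(line):
            in_job, current_job = True, None
            continue
        if JOB_END.match(line):
            in_job, current_job = False, None
            continue
        if not in_job:
            continue
        if current_job is None:
            # Still waiting for the job's own Identifier line.
            match = IDENTIFIER.match(line)
            if match:
                current_job = match.group(1)
            continue
        for name, pattern in patterns.items():
            if pattern.search(line):
                usage[name].add(current_job)
    return {name: sorted(jobs) for name, jobs in usage.items()}
```

To try it, pass an open file handle for the export (or any iterable of lines) plus the table and file names you care about, for example `find_table_usage(open("project_export.dsx", errors="replace"), ["TableA"])`, where the file name is hypothetical. The result still needs a manual look at each job to decide whether the table is a source, target or reference.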

dr.murthy · Post by **dr.murthy** » Thu Jul 12, 2012 2:20 am

Thanks ArndW, my table names are not hardcoded.
Can I find this information from Metadata Workbench?

ArndW · Post by **ArndW** » Thu Jul 12, 2012 4:19 am

I'm not sure about the metadata workbench in this case; if your table names are not hardcoded but parameterized, then you could get the table names from the runtime job logs.

ray.wurlod · Post by **ray.wurlod** » Thu Jul 12, 2012 4:39 am

If you have been diligent with handling your metadata, only ever loading table definitions from the Repository, then you can quite simply perform a usage analysis on the table definition and learn from that which jobs have loaded it (and therefore, presumably, are using it).