
DataStage Internal Tables

Posted: Sun May 07, 2006 5:22 am
by RCMURTHY
How do I get details of all the internal tables, like DS_AUDIT and DS_JOBS, and how do I describe them?

Posted: Sun May 07, 2006 6:57 am
by chulett
Welcome!

Keeping in mind that these 'table details' are deliberately not published, you can find a lot of information about them posted to these forums. Searching on the individual table names would be one way to start. You'll soon see who the Rock Stars* are. :wink:

Oh, and you would use LIST DICT tablename to describe them.
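For example, from the Administrator client's Command window (or a telnet/dssh session into the project account), something like this should show the dictionary entries (field names, types and descriptions) for a couple of the commonly asked-about files; the exact entries vary by release, so treat the output as a guide rather than a published interface:

    LIST DICT DS_JOBS
    LIST DICT DS_AUDIT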

* hint: not me

Posted: Sun May 07, 2006 2:33 pm
by ray.wurlod
The tables that comprise the repository are deliberately incompletely described. "They" reserve the right to change them, and do so at most major releases. Further, some tables contain multiple record structures.
On this forum, and by inspecting Kim Duke's ETL toolkit, you can find out pretty much all that it is possible to know. Beware that this structure ceases to exist at the next release ("Hawk"); the Repository will henceforth be "open" with your choice of database, provided (initially) that you choose one of DB2, Oracle or SQL Server.

Posted: Mon May 08, 2006 4:46 am
by kumar_s
So is the Hawk repository built on a 3rd party database, and not on UV, like Informatica?

Posted: Mon May 08, 2006 6:54 am
by ArndW
The Hawk repository is no longer the DataStage flavor of UniVerse. Informatica has never used UniVerse files and cannot access them (without the 3rd party ODBC drivers).

Posted: Mon May 08, 2006 6:57 am
by kumar_s
Thanks for the response.
BTW, I actually meant that Informatica uses a 3rd party database like Oracle to build and maintain its repository.

Posted: Mon May 08, 2006 2:33 pm
by ray.wurlod
In Hawk the same repository is used not only by DataStage but also by InformationAnalyzer (formerly AuditStage, ProfileStage and some of the QualityStage investigation pieces), QualityStage and MetaStage. And you won't have direct access to the repository at all - there is a layer of services between you and it.

Posted: Tue May 23, 2006 8:45 am
by bradh
I am creating a DataStage job that will extract the job name and last modification date from the DS_AUDIT file, sort the data, and create a .csv file that I will mail to my manager. He needs the modification date for auditing. I understand that DS_AUDIT may change in the future. Does anyone see anything wrong with getting a list of jobs and change dates this way?


Thanks,

Brad

Posted: Tue May 23, 2006 9:13 am
by gateleys
Use the command LIST DICT DS_AUDIT to describe the DS_AUDIT file; you will see the fields it contains. Then you can use either a Hashed File or a UV stage to get the required fields, which in your case would be KEY (for the job) and DTM (for the modification timestamp).
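As a rough sketch (the dictionary items and record layout are release-dependent, so verify them with LIST DICT DS_AUDIT on your own system first), the user-defined SQL in a UV stage would look something like:

    SELECT KEY, DTM FROM DS_AUDIT;

Then just sort on the modification timestamp and write the two columns out to your .csv.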

gateleys

Posted: Tue May 23, 2006 10:22 am
by ArndW
Brad,

As long as you and/or the customer are aware that this method will cease to function (without modification) at the next release, go ahead.

Posted: Tue May 23, 2006 11:47 am
by gateleys
Hey Arnd,
Is there any other way we can collect these statistics other than from DS_AUDIT? If not, and since the current repository will be massively revised in the next version, we are left with 2 options:
1. Wait for the client to upgrade to Hawk, and then figure out a way by going through its docs. Until then, tell the client that we cannot mess with DS_AUDIT since it will cease to exist in the next release.
2. Tell our client that the next release is no good/too expensive, and use DS_AUDIT.


gateleys

Posted: Tue May 23, 2006 6:44 pm
by kduke
DS_AUDIT has been explained in several posts by Ray. EtlStats includes about 60 scripts; most report on tables like DS_JOBS and DS_JOBOBJECTS, but a few also look at RT_LOG and RT_STATUS. There are lots of jobs included which extract this information to create your own metadata repository. These are all examples of how to use this information to make your ETL more powerful, because you can integrate row counts or job metadata.
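For example, one of the simplest starting points is a query like this from the Administrator command window (NAME and CATEGORY have been stable dictionary items for a while, but as Ray says none of this is published, so check LIST DICT DS_JOBS on your release):

    SELECT NAME, CATEGORY FROM DS_JOBS;

The EtlStats jobs build on the same idea, also reading DS_JOBOBJECTS, RT_LOG and RT_STATUS to pick up design details and run statistics.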

I promise the next version will be much easier to install and use. I plan on cleaning up the jobs as well. A lot of these jobs you may only use once in a while. I have some jobs, which I will share, that extract all the table names that are cleared or truncated. I also extract all the table names from the system tables, so by comparing the results of the 2 jobs I can find which tables are not truncated.

I did similar things with job sequences. I wanted to know which parameters were used in a sequence. You can extract this and write your own SQL against the results to see if all developers set up each sequence the same way. How consistent are your developers? Why not find out for sure instead of guessing?
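For example, here is a rough sketch of a server routine (not part of EtlStats, and the argument name is my own) that pulls the parameter list of a job or sequence through the documented BASIC API instead of reading DS_JOBOBJECTS directly:

    * Input argument: JobName - the job or sequence to inspect
    $INCLUDE DSINCLUDE JOBCONTROL.H

    Ans = ''
    JobHandle = DSAttachJob(JobName, DSJ.ERRNONE)
    If Not(JobHandle) Then
       Ans = 'Could not attach to ' : JobName
    End Else
       * Comma-separated list of the parameter names defined on the job
       Ans = DSGetJobInfo(JobHandle, DSJ.PARAMLIST)
       Dummy = DSDetachJob(JobHandle)
    End

Run it across your sequences, load the results into a table, and then write your SQL against that.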