Capturing Job Information? Pls Help asap!
Moderators: chulett, rschirm, roy
Hi,
In our project, we need to develop an ETL auditing process as part of ensuring Data Quality. For that we need an automated process (either an ETL job or a routine) that captures information about all the jobs under our project. The information is to be stored in separate database tables, as below (immediately after the development phase is over).
ETL Job
Job Name
Job Description
Stage
Stage Name
Stage Type (Aggregator, transformer, etc.)
Stage Description
Stage Link Relationship
Link Type (input, output)
Reject Link Flag
Link
Link Name
How to achieve this? Anybody done it in your project? Please share your thoughts!
Vignesh.
"A conclusion is simply the place where you got tired of thinking."
Hi,
You can try checking the DSMakeJobReport routine. In case it is not yet there in 7.0.2, go over DSGetJobInfo, DSGetLinkInfo and such for the relevant options to get the info you need.
Depending on the info you need, you might also need to use DSGetLogEntry/DSGetLogSummary and such to extract things like the last run time of the job in question.
On top of this, try searching for more info in this forum (I don't have 7.0.2 available to check your exact options).
IHTH,
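To give a feel for how those calls fit together, here is a rough sketch (my own, not from the product docs, so check the exact InfoType constants in your 7.0.2 BASIC guide) of a server routine that attaches to a job and walks its stages and links. AuditJobInfo is a hypothetical routine name.

Code: Select all
FUNCTION AuditJobInfo(JobName)
      hJob = DSAttachJob(JobName, DSJ.ERRNONE)
      If NOT(hJob) Then
         Call DSLogWarn("Cannot attach to ":JobName, "AuditJobInfo")
         Ans = -1
         RETURN(Ans)
      End

      * Comma-separated list of the stage names in the job
      StageList = DSGetJobInfo(hJob, DSJ.STAGELIST)
      NumStages = DCOUNT(StageList, ",")

      For i = 1 To NumStages
         StageName = FIELD(StageList, ",", i)
         StageType = DSGetStageInfo(hJob, StageName, DSJ.STAGETYPE)
         LinkList  = DSGetStageInfo(hJob, StageName, DSJ.LINKLIST)
         * Write StageName, StageType and each link to your audit tables here
         Call DSLogInfo(StageName:" (":StageType:") links: ":LinkList, "AuditJobInfo")
      Next i

      ErrCode = DSDetachJob(hJob)
      Ans = 0
RETURN(Ans)

You would loop over the project's job list (e.g. from DSGetProjectInfo) and call something like this for each job, writing the results out instead of just logging them.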
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
![Image](http://www.worldcommunitygrid.org/images/logo.gif)
Vignesh,
this is not difficult to do using the user interface calls.
The main workhorse routine to get this information is DSGetJobInfo, plus the other DSGet... routines to find your detail information. I have done this in a BASIC function to populate the client's tables (the tables are of a different setup and format, but the content is essentially the same).
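On the "populate the client's tables" side, one common approach is to write the captured details into a hashed file first and load it into the database from a job. A rough sketch of the write step follows; JOB_AUDIT is a hypothetical hashed file name (create it first, e.g. CREATE.FILE JOB_AUDIT DYNAMIC), and the record layout here is just illustrative.

Code: Select all
      OPEN "JOB_AUDIT" TO AuditFile Else
         Call DSLogFatal("Cannot open JOB_AUDIT", "AuditWrite")
      End
      Rec = ""
      Rec<1> = JobDescription
      Rec<2> = StageName
      Rec<3> = StageType
      * Key combines job and stage so each stage gets its own record
      WRITE Rec ON AuditFile, JobName:"|":StageName
      CLOSE AuditFile

Import the hashed file's metadata and a simple job can then move the rows into your audit tables.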
Roy and Arnd,
Thanks for your quick response! To be honest, I have never tried these in-built DataStage BASIC routines. I just want to know if these routines can give details like the type of a stage (Transformer, Sorter, Aggregator, etc.) and stage-link relationships (like In_Trns being the input link and Out_Trns the output link of the transformer Trns_record). If that is possible, then I can go ahead and implement the idea. Also, this is the first time I will be working with these routines, so I just want to know if there is anything specific to consider before using them.
Thanks!
Vignesh.
"A conclusion is simply the place where you got tired of thinking."
If you would upgrade then you could use EtlStats. This is a set of jobs and routines to get most of this. It updates a set of tables in the target like ETL_JOB, ETL_ROW_HIST and ETL_LINK. Everything you need except the 2 items below is already built for you:
Stage Link Relationship
Link Type (input, output)
Reject Link Flag
These should be easy to add. I am not sure if all of these can be loaded from the routine calls supplied by Ascential. If I get some time this week, I will help isolate the last 2 fields you need. There is a post, maybe from you, on trying to get EtlStats to import into 7.0. Do a search. There are lots of posts about EtlStats. Nobody else seems to have a solution which they are willing to share with DSX. Shame really.
If you download DwNav from my web site it will create a series of hash files. The auth code was posted yesterday on here. To get row counts execute a TCL command:
Code: Select all
DWN_RT_ROW_COUNT.uvb JobName
This will populate these hash files:
Code: Select all
DWN_RT_JOB_STAT_HIST
DWN_RT_JOB_STAT
DWN_RT_ROW_COUNT_HIST
DWN_RT_ROW_COUNT
DWN_RT_PARAM_HIST
DWN_RT_PARAM
These all have dictionary items. Import their metadata and load them to a table of your choice. This code will work in versions 5, 6 and 7 of DataStage. It stores the Link Type above as well. I have posted code from these if you still want to build it yourself. Do a search. The hash files with HIST in the name have the job run time as part of the key. This allows you to compare one run to the previous run. This is much easier in your target database, so copy the data there. If you go this route then let me know and I will show you how to get all of what you need.
This is valuable information; keep trying until you get it all. Post your solution and what you are still missing, and we will help finish it.
Mamu Kim
All of the information Vignesh seeks can be found in the DataStage Repository - and therefore can be reported on using the Reporting Tool (Documentation Assistant) or MetaStage.
In current versions, indeed, all the information is in a single table, but this is one of those tables for which complete column definitions have not been published.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I don't think parallel jobs can be exported to the Reporting Assistant. They certainly do not turn up on the "Jobs" tab when I run it. MetaStage is a better option and by the next release we will probably see the Reporting Assistant disappearing altogether to be replaced by better metadata reporting straight out of DataStage.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Using shared container....?
An easy way to achieve this would be to create a shared container with a triggered input from the jobs you are trying to audit. I'm afraid this would be just patchwork, but it still works. Include a few parameters which you can tune to the stages and links that are in the job, and then use the BASIC job functions in a transformer stage inside the shared container to capture details like the job name, transform stage, link row count, etc. Also include a flat file or a database stage inside the shared container to write the data into the tables in the database. But this would entail adding this shared container stage to all the existing jobs as well as to all jobs developed in the future.
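As an illustration of the "BASIC job functions in a transformer" part, a small routine like the one below (GetLinkCount is a hypothetical name of mine) could be called from a derivation in the shared container's transformer, with the stage and link names passed in via the container parameters mentioned above:

Code: Select all
FUNCTION GetLinkCount(StageName, LinkName)
      * DSJ.ME refers to the currently running job, so no attach is needed
      Ans = DSGetLinkInfo(DSJ.ME, StageName, LinkName, DSJ.LINKROWCOUNT)
RETURN(Ans)

The macro DSJobName can supply the job name column in the same transformer, and the output link then feeds your audit table.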
Hope this helps.
Cheers.