I am currently working on DataStage 6.0 with Parallel Extender. The job is required to produce a report in a text file with the following format: the number of records processed, the number of records rejected (rejection can be based on business rules), and the number of records passed to the next job in the queue. I have gone through the DataStage Reporting tool and couldn't find anything that meets my criteria. Any pointers on how I can implement this in a DataStage 6.0 Parallel Extender job will be highly appreciated.
Thanks in advance for your help
Regards
Pinkesh Kothana
Technical Specialist
Infosys Technologies Ltd.
Datastage reports
Pinkesh,
There is no built-in mechanism for such statistics in DataStage, but there are methods you can use to write it yourself, such as DSGetJobInfo and DSGetLinkInfo. From these you can get information such as start time, end time, and the number of rows processed on each link.
Another option is using the dsjob command to query the logs. But again, the information is buried in there, so you will have to dig. :)
The only tool I know that can give you such functionality is Metastage.
HTH,
Amos
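A minimal sketch of how DSGetLinkInfo might be used in an after-job subroutine to produce the requested text report. The stage name (xfm_Validate), link names (lnk_Output, lnk_Reject), and report path are all placeholder assumptions; substitute the names from your own job design:

```
* Sketch only: an after-job subroutine that writes a simple count report.
* The stage name, link names, and report path are placeholders.
SUBROUTINE WriteCountReport(InputArg, ErrorCode)
$INCLUDE DSINCLUDE JOBCONTROL.H

ErrorCode = 0  ;* zero tells DataStage the subroutine succeeded

* DSJ.ME refers to the current job; row counts are read per stage/link
Passed   = DSGetLinkInfo(DSJ.ME, "xfm_Validate", "lnk_Output", DSJ.LINKROWCOUNT)
Rejected = DSGetLinkInfo(DSJ.ME, "xfm_Validate", "lnk_Reject", DSJ.LINKROWCOUNT)
Processed = Passed + Rejected

* Write the report as a plain text file
OPENSEQ "/tmp/job_report.txt" TO Report THEN
   WEOFSEQ Report  ;* file exists: truncate the previous report
END ELSE
   CREATE Report ELSE ErrorCode = 1  ;* file did not exist: create it
END
WRITESEQ "Records processed: " : Processed TO Report ELSE ErrorCode = 1
WRITESEQ "Records rejected : " : Rejected TO Report ELSE ErrorCode = 1
WRITESEQ "Records passed   : " : Passed TO Report ELSE ErrorCode = 1
CLOSESEQ Report

RETURN
```

Attach the routine on the job's Properties page as the after-job subroutine; the counts become available once all links have finished processing.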
MetaStage would only be useful for this if the Process MetaBroker is installed.
Regards,
Michael
Mike Hester
mhester@petra-ps.com
A DIY approach is to create a server job consisting purely of job control code that interrogates either the job, link, and stage properties or the log file in order to produce the requisite report. This is a fairly straightforward DataStage BASIC programming task.
In Ascential's Programming with DataStage BASIC class (available on demand) you are shown how to read the job log file from the repository.
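As an illustration of that job-control approach, here is a sketch of a controlling server job that runs another job and then pulls its statistics. The job, stage, and link names are placeholder assumptions, not anything from the original post:

```
* Sketch only: job control code in a server job that runs another job
* and then reports on it.  All names are placeholders.
$INCLUDE DSINCLUDE JOBCONTROL.H

hJob = DSAttachJob("MyParallelJob", DSJ.ERRFATAL)
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)

* Job-level information via DSGetJobInfo
StartTime = DSGetJobInfo(hJob, DSJ.JOBSTARTTIMESTAMP)
EndTime   = DSGetJobInfo(hJob, DSJ.JOBLASTTIMESTAMP)

* Link-level row count via DSGetLinkInfo
RowsOut = DSGetLinkInfo(hJob, "xfm_Validate", "lnk_Output", DSJ.LINKROWCOUNT)

Call DSLogInfo("Started " : StartTime : ", ended " : EndTime : ", rows " : RowsOut, "Report")
ErrCode = DSDetachJob(hJob)
```

Instead of logging the values with DSLogInfo, the same variables could be written to a sequential file, as in the after-job subroutine sketch earlier in the thread.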
Datastage reports
Thanks a lot for your information.
Is there any simple way to get these results? For example, could the counts be stored in some variables (I don't know where), which an after-job subroutine could then provide to a shell script?
Regards
Pinkesh
Datastage reports
Thanks Amos.
I would appreciate it if you could tell me how to use these methods.
Actually, I am new to the DataStage tool.
Again thanks a lot for your time.
Regards
Pinkesh
Re: Datastage reports
pkothana wrote: Is there any simple way to get these results, for example storing the counts in some variables and then providing those values to a shell script in an after-job subroutine?
Why do that? Just create a Buildop stage, do all the counting in C++, and write the new data to a separate flat file.
Instead of rejecting records by dropping them, use the reject link, or send a record with a dummy value to this Buildop stage.
Utilizing the Buildop stage opens up a lot of possibilities for you, though it requires some careful design to optimize the flow. But this is the complete parallel solution you are seeking. (Of course, the Buildop stage output will have to be aggregated if you don't want per-node data.)
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).