Datastage reports

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
pkothana
Participant
Posts: 50
Joined: Tue Oct 14, 2003 6:12 am

Datastage reports

Post by pkothana »

I am currently working on DataStage 6.0 with Parallel Extender. The job requires producing a report in a text file with the following format, stating the number of records processed, the number of records rejected (rejection can be based on business rules), and the number of records passed to the next job in the queue. I have also gone through the DataStage Reporting tool and couldn't find anything which meets my criteria. Any pointers as to how I can implement this in a DataStage 6.0 Parallel Extender job will be highly appreciated.

Thanks in advance for your help

Regards
Pinkesh Kothana
Technical Specialist
Infosys Technologies Ltd.
Amos.Rosmarin
Premium Member
Posts: 385
Joined: Tue Oct 07, 2003 4:55 am

Post by Amos.Rosmarin »

Pinkesh ,

There is no built-in mechanism for such statistics in DataStage, but there are methods you can use to build it yourself, such as DSGetJobInfo and DSGetLinkInfo. From those you can get information such as the start time, end time, and number of rows processed on each link.
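To illustrate, here is a minimal DataStage BASIC sketch of those calls, run from job control code. The job name ("MyLoadJob"), stage name ("xfmValidate"), and link names are placeholders you would replace with your own:

```
* Sketch: attach a job and read its timings and row counts.
* "MyLoadJob", "xfmValidate", and the link names are placeholders.
hJob = DSAttachJob("MyLoadJob", DSJ.ERRFATAL)

* Job-level information
StartTime = DSGetJobInfo(hJob, DSJ.JOBSTARTTIMESTAMP)
EndTime   = DSGetJobInfo(hJob, DSJ.JOBLASTTIMESTAMP)

* Rows down specific links of a specific stage
RowsOut = DSGetLinkInfo(hJob, "xfmValidate", "lnkOutput", DSJ.LINKROWCOUNT)
RowsRej = DSGetLinkInfo(hJob, "xfmValidate", "lnkReject", DSJ.LINKROWCOUNT)

Call DSLogInfo("Processed ":RowsOut:", rejected ":RowsRej, "RowCounts")
Call DSDetachJob(hJob)
```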

Another option is using the dsjob command to query the logs. But again, the information is buried there so you have to dig :))
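For example, something along these lines from the shell (the project and job names are placeholders, and the exact options available depend on your release):

```
# Sketch: query a job from the command line.
# "dsproject" and "MyLoadJob" are placeholder names.
dsjob -jobinfo dsproject MyLoadJob                        # status, start/end times
dsjob -linkinfo dsproject MyLoadJob xfmValidate lnkReject # row count on one link
dsjob -report dsproject MyLoadJob BASIC                   # summary report, if supported
```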

The only tool I know that can give you such functionality is Metastage.


HTH,
Amos
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

MetaStage would only be useful for this if the Process MetaBroker is installed.

Regards,

Michael
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

A DIY approach is to create a server job that consists purely of job control code that interrogates either the job, link and stage properties or the log file in order to produce the requisite report. This is a fairly straightforward DataStage BASIC programming task.
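A sketch of what that job control code might look like, gathering the counts and writing them to the requested text file. All names and the report path are placeholders, and the reject rule (processed minus rejected) is one assumption about how the counts relate:

```
* Sketch of a job-control-only server job: gather counts, write a text report.
* Job, stage, and link names and the report path are placeholders.
hJob = DSAttachJob("MyLoadJob", DSJ.ERRFATAL)
Processed = DSGetLinkInfo(hJob, "xfmValidate", "lnkInput",  DSJ.LINKROWCOUNT)
Rejected  = DSGetLinkInfo(hJob, "xfmValidate", "lnkReject", DSJ.LINKROWCOUNT)
Call DSDetachJob(hJob)

ReportPath = "/tmp/load_report.txt"
OpenSeq ReportPath To fReport Else
   Create fReport Else Call DSLogFatal("Cannot create ":ReportPath, "Report")
End
WriteSeq "Records processed : ":Processed To fReport Else Null
WriteSeq "Records rejected  : ":Rejected To fReport Else Null
WriteSeq "Records passed on : ":(Processed - Rejected) To fReport Else Null
CloseSeq fReport
```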

On Ascential's Programming with DataStage BASIC class (available on demand) you are shown how to read the job log file from the repository.
Amos.Rosmarin
Premium Member
Premium Member
Posts: 385
Joined: Tue Oct 07, 2003 4:55 am

Post by Amos.Rosmarin »

Another idea is to create an after-job routine that does what Ray suggested... again, some DataStage BASIC programming plus some overhead on the job.
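An after-job subroutine can use DSJ.ME to refer to the job it is attached to, so no attach step is needed. A minimal sketch, with the stage and link names as placeholders:

```
* Sketch of an after-job subroutine; set it as the job's after-job routine.
* DSJ.ME refers to the current job; stage/link names are placeholders.
Subroutine ReportCounts(InputArg, ErrorCode)
   ErrorCode = 0
   Processed = DSGetLinkInfo(DSJ.ME, "xfmValidate", "lnkOutput", DSJ.LINKROWCOUNT)
   Rejected  = DSGetLinkInfo(DSJ.ME, "xfmValidate", "lnkReject", DSJ.LINKROWCOUNT)
   Call DSLogInfo("Processed ":Processed:", rejected ":Rejected, "ReportCounts")
Return
```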

It makes the task much easier if you have a good naming convention, so when you query the links and stages you can identify the different objects.
pkothana
Participant
Posts: 50
Joined: Tue Oct 14, 2003 6:12 am

Datastage reports

Post by pkothana »

Thanks a lot for your information.

Is there any simple way to get these results, for example to store the counts in some variables (I don't know where) and then, in an after-job subroutine, call a shell script passing these values?
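One way to do exactly that is to fetch the counts inside an after-job subroutine and hand them to a script via DSExecute. A sketch, where the script path and the stage/link names are placeholders:

```
* Sketch: fetch counts in an after-job subroutine and pass them
* as arguments to a shell script. Script path and names are placeholders.
Subroutine CallReportScript(InputArg, ErrorCode)
   ErrorCode = 0
   Processed = DSGetLinkInfo(DSJ.ME, "xfmValidate", "lnkOutput", DSJ.LINKROWCOUNT)
   Rejected  = DSGetLinkInfo(DSJ.ME, "xfmValidate", "lnkReject", DSJ.LINKROWCOUNT)
   Cmd = "/opt/scripts/write_report.sh ":Processed:" ":Rejected
   Call DSExecute("UNIX", Cmd, Output, SystemReturnCode)
   If SystemReturnCode <> 0 Then ErrorCode = SystemReturnCode
Return
```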


Regards
Pinkesh
pkothana
Participant
Posts: 50
Joined: Tue Oct 14, 2003 6:12 am

Datastage reports

Post by pkothana »

Thanks Amos.
I would appreciate it if you could tell me how to use these methods.
Actually, I am new to the DataStage tool.

Again thanks a lot for your time.

Regards
Pinkesh
Amos.Rosmarin wrote:Pinkesh ,

There is no built-in mechanism for such statistics in Datastage but there are methods that you can use to write it yourself such as DSGetJobInfo and DSGetLinkInfo. From there you can get information such as starting time, end time and number of rows processed in each link.

Another option is using the dsjob command to query the logs. But again , the information is buried there so you have to dig :))

The only tool I know that can give you such functionality is Metastage.


HTH,
Amos
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Re: Datastage reports

Post by Teej »

pkothana wrote:Is there any simple way to get these results for ex. to store the counts in some variables (i don't know where) and then in an after job subroutine we can write a shell script providing these values?
Why do that? Just create a Buildop stage, and do all the counting in C++, and spit out the new data to a separate flat file.

Instead of rejecting the records by dropping them, use the reject link, or spit out a record with a dummy value to this buildop stage.

Utilizing the buildop stage opens up a lot of possibilities for you, though it requires some careful design to optimize the flow. And this is the complete Parallel solution you're seeking. (Of course, the buildop stage output will have to be aggregated if you don't care about per-node data.)

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
Post Reply