Row Count

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Row Count

Post by pravin1581 »

Hi All,

We need to develop a job which will give us the information of the number of records loaded in the final table.
The design will be something like this:

Code: Select all


  Source table -----------Transformer------------------Target Table
                                        |
                                        |
                                        |
                                      File

The information will be trapped in the file , such as table name, count of records inserted and count of records rejected.

The file will be generated for all the tables and finally it will loaded in a table which will hold infomation of the table names, record inserted and record rejected.

Thanks.
turash
Premium Member
Premium Member
Posts: 51
Joined: Fri Apr 06, 2007 10:09 am
Location: Edison, NJ

Re: Row Count

Post by turash »

pravin1581 wrote:Hi All,

We need to develop a job which will give us the information of the number of records loaded in the final table.
The design will be something like this:

Code: Select all


  Source table -----------Transformer------------------Target Table
                                        |
                                        |
                                        |
                                      File

The information will be trapped in the file , such as table name, count of records inserted and count of records rejected.

The file will be generated for all the tables and finally it will loaded in a table which will hold infomation of the table names, record inserted and record rejected.

Thanks.

in count and out count can be obtained by querying source and target table respectively.

You can provide another link from transformer and put the rejected rows in flat file. count these records either by unix shell script or create a stage variable and sum those rejected.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Re: Row Count

Post by pravin1581 »

turash wrote:
pravin1581 wrote:Hi All,

We need to develop a job which will give us the information of the number of records loaded in the final table.
The design will be something like this:

Code: Select all


  Source table -----------Transformer------------------Target Table
                                        |
                                        |
                                        |
                                      File

The information will be trapped in the file , such as table name, count of records inserted and count of records rejected.

The file will be generated for all the tables and finally it will loaded in a table which will hold infomation of the table names, record inserted and record rejected.

Thanks.

in count and out count can be obtained by querying source and target table respectively.

You can provide another link from transformer and put the rejected rows in flat file. count these records either by unix shell script or create a stage variable and sum those rejected.
I want both the records , the inserted data count and rejected count, if any.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pravin,

I prefer to let the job finish and then use the DSGetJobInfo(), DSGetLinkInfo() and DSGetStageInfo() routines from a job sequence or DS/BASIC routine to get this information.

If you really wish to do this using just the stages shown and no after-job routines or other calls, you would need to write the output row number for each row down your file link. If this row number were being written as data to a hashed file then you could use the destructive write aspect of hashed file and just have the highest value stored, along with the number of rejected writes detected as well.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ArndW wrote:Pravin,

I prefer to let the job finish and then use the DSGetJobInfo(), DSGetLinkInfo() and DSGetStageInfo() routines from a job sequence or DS/BASIC routine to get this information.

If you really wish to do this using just the stages shown and no after-job routines or other calls, you would need to write the output row number for each row down your file link. If this row number were being written as data to a hashed file then you could use the destructive write aspect of hashed file and just have the highest value stored, along with the number of rejected writes detected as well.
We are trying to use @INROWNUM and @OUTROWNUM to capture the count of records coming in the Transformer and going out of it , but it is not giving the desired result.
syed_subhaan
Participant
Posts: 8
Joined: Sat Apr 22, 2006 2:28 am
Location: Bangalore
Contact:

Post by syed_subhaan »

Best thing to do would be to have a after job routine.
Some thing like

Code: Select all

Source table -----------Copy------------------Target Table 
                                                        | 
                                                        | 
                                                        | 
                                                      Copy 
Get the Link counts and write it into a flat file in the format as required
Thanks & Regards,
Subhani Basha Syed.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

syed_subhaan wrote:Best thing to do would be to have a after job routine.
Some thing like

Code: Select all

Source table -----------Copy------------------Target Table 
                                                        | 
                                                        | 
                                                        | 
                                                      Copy 
Get the Link counts and write it into a flat file in the format as required

I want the count from the Transformer and want to avoid the usage of any customised routine.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

pravin1581 wrote:
syed_subhaan wrote:Best thing to do would be to have a after job routine.
Some thing like

Code: Select all

Source table -----------Copy------------------Target Table 
                                                        | 
                                                        | 
                                                        | 
                                                      Copy 
Get the Link counts and write it into a flat file in the format as required

I want the count from the Transformer and want to avoid the usage of any customised routine.

We have tried with @INROWNUM and @OUTROWNUM and it is giving appropriate result, but the catch is Transformer needs to be in Sequential mode and i dont want that.
dsnovice
Participant
Posts: 38
Joined: Thu Jul 22, 2004 11:56 pm
Location: Mclean, VA
Contact:

Post by dsnovice »

The routines DSGetJobInfo(), DSGetLinkInfo() and DSGetStageInfo() are data stage given routines. these are much better way to get information. and they will not hinder performance. This is kind of a standard practice in most places.

One way of implementing this is to use a multiple instance job with defined parameters such as the jobname, link name , reject link name, stage name etc defined. You can use these routines directly in the transformer and insert the outputs into target table.

This multiple instance job can be called in the job sequence that is calling the job under consideration. Or you can set up the Audit job as final step and run the job multiple times.

Thank you,
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

dsnovice wrote:The routines DSGetJobInfo(), DSGetLinkInfo() and DSGetStageInfo() are data stage given routines. these are much better way to get information. and they will not hinder performance. This is kind of a standard practice in most places.

One way of implementing this is to use a multiple instance job with defined parameters such as the jobname, link name , reject link name, stage name etc defined. You can use these routines directly in the transformer and insert the outputs into target table.

This multiple instance job can be called in the job sequence that is calling the job under consideration. Or you can set up the Audit job as final step and run the job multiple times.

Thank you,
Thanks for the reply , but these Macros are not available in PX Transformer.
gateleys
Premium Member
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Post by gateleys »

pravin1581 wrote: Thanks for the reply , but these Macros are not available in PX Transformer.
Try to call those DataStage functions from outside regular PX jobs, that is via Routine Activitiy in a sequence job.
gateleys
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

gateleys wrote:
pravin1581 wrote: Thanks for the reply , but these Macros are not available in PX Transformer.
Try to call those DataStage functions from outside regular PX jobs, that is via Routine Activitiy in a sequence job.
But that means a separate set of jobs , where as i need it in the same job, I have tried out with @INROWNUM and @OUTROWNUM to get the valus coming in the transformer and going out of the transformer and then using an Aggr stage to get the Max value of both, but the problem is Transformer needs to be run in the sequential mode.

Thanks
gateleys
Premium Member
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Post by gateleys »

pravin1581 wrote:but the problem is Transformer needs to be run in the sequential mode.
That's because your design doesn't allow for a parallel execution. And you don't want to do it any other way.

Here's the analogy: Guy wants to shoot a deer, such that it pierces through its heart. Oh yes, and he wants it alive.

Sorry about the graphic explanation. Just thought its OK on a Halloween Eve.:twisted:
gateleys
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

gateleys wrote:
pravin1581 wrote:but the problem is Transformer needs to be run in the sequential mode.
That's because your design doesn't allow for a parallel execution. And you don't want to do it any other way.

Here's the analogy: Guy wants to shoot a deer, such that it pierces through its heart. Oh yes, and he wants it alive.

Sorry about the graphic explanation. Just thought its OK on a Halloween Eve.:twisted:
Thanks for the analogy, so I need to write a Before/After Subroutine and call via a routine activity in the job sequence after the job runs.

Thanks.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

pravin1581 wrote:
gateleys wrote:
pravin1581 wrote:but the problem is Transformer needs to be run in the sequential mode.
That's because your design doesn't allow for a parallel execution. And you don't want to do it any other way.

Here's the analogy: Guy wants to shoot a deer, such that it pierces through its heart. Oh yes, and he wants it alive.

Sorry about the graphic explanation. Just thought its OK on a Halloween Eve.:twisted:
Thanks for the analogy, so I need to write a Before/After Subroutine and call via a routine activity in the job sequence after the job runs.

Thanks.
But I guess there is a problem in that, I need the informations in the file, No. of rejected recorsd needs to be a calculated field , In - Out and how will I pass the arguments such as Job Name, Link name, Stage Name in the job.
Post Reply