Page 1 of 1

Matching Row Count

Posted: Thu Oct 25, 2007 2:23 pm
by pravin1581
Hi All,

I need to develop a job that will give me the number of rows pulled from the source tables and the number of rows loaded into the final table and whether they are equal or not.

Posted: Thu Oct 25, 2007 2:57 pm
by ray.wurlod
Excellent. Noting your use of the first person personal pronoun, I say "go for it".

Posted: Thu Oct 25, 2007 3:14 pm
by pravin1581
ray.wurlod wrote:Excellent. Noting your use of the first person personal pronoun, I say "go for it".
Any sort of help will be appreciated.

Posted: Thu Oct 25, 2007 9:54 pm
by ray.wurlod
Create an after-job subroutine that uses DSGetLinkInfo() to retrieve the row counts, then compares them. What you need to do with that number appears not to be in your specification.

Re: Matching Row Count

Posted: Fri Oct 26, 2007 12:19 am
by sanath1234
pravin1581 wrote:Hi All,

I need to develop a job that will give me the number of rows pulled from the source tables and the number of rows loaded into the final table and whether they are equal or not.
use peek stage or an environment variable called APT_DUMP_SCORE which will return the number of records u collected with this u can solve the problem...
say cheers always................

Posted: Fri Oct 26, 2007 12:56 am
by bikan
You might probably be interested in getting the count from the command line utility, if you want to use the counts for file balancing.

Use dsjob -linkinfo <project> <job> <stage> <link> to get the link count

Posted: Fri Oct 26, 2007 12:57 am
by bikan
You might probably be interested in getting the count from the command line utility, if you want to use the counts for file balancing.

Use dsjob -linkinfo <project> <job> <stage> <link> to get the link count

Posted: Mon Oct 29, 2007 8:06 am
by pravin1581
bikan wrote:You might probably be interested in getting the count from the command line utility, if you want to use the counts for file balancing.

Use dsjob -linkinfo <project> <job> <stage> <link> to get the link count
But using dsjob gives the record count from a particular link.

Posted: Mon Oct 29, 2007 10:48 pm
by ray.wurlod
So? Using a consistent stage and link naming convention in your jobs can ease that difficulty immensely. For example contemplate suffixes such as _INP, _OUT, _REJ, _LKP (or prefixes if you prefer) on your links that connect to the "outside world".

Posted: Tue Oct 30, 2007 5:43 am
by pravin1581
ray.wurlod wrote:So? Using a consistent stage and link naming convention in your jobs can ease that difficulty immensely. For example contemplate suffixes such as _INP, _OUT, _REJ, _LKP (or prefixes if you prefer) on your links that connect to the "outside world".
I think, I misinterpreted the problem. The situation is I need to verify the summation of columns from 2 tables. I have approached the problem in this way : We have taken the reqd. column from the table and by an aggregator we have found out the sum same for the coulmn also and now we need to compare the carry that information in the file, values of both the columns and their difference in summation.

Posted: Tue Oct 30, 2007 7:32 am
by ray.wurlod
Ah, that's rather a different problem. It is solved using a "fork join" technique - split your data into two streams using a Copy stage, run one stream through the Aggregator stage, and re-unite them using a Join stage. The join key and the grouping key are probably the same.

Code: Select all

------> Copy stage ---------------> Join Stage ------>
          |                               ^
          |                               |
          +------> Aggregator Stage ------+

Posted: Tue Oct 30, 2007 8:41 am
by pravin1581
ray.wurlod wrote:Ah, that's rather a different problem. It is solved using a "fork join" technique - split your data into two streams using a Copy stage, run one stream through the Aggregator stage, and re-unite them using a Join stage. The join key and the grouping key are probably the same.

Code: Select all

------> Copy stage ---------------> Join Stage ------>
          |                               ^
          |                               |
          +------> Aggregator Stage ------+
The data is coming in two different streams via two different tables and summing through the aggregator , how can I use the Join stage because then I will not get the values when they differ. I have joined them using Funnel but the problem is they are becoming one col and I am unable to calculate the difference between them or either split the columns. The design that we have implemented is as follows:

Code: Select all


 Table 1------Aggregator


 Table 2------Aggregator

[code]

 I have not depicted the flow after aggregator as I am not sure that this is a correct way. As mentioned above we have implemented Funnel after aggregator and then TRANSFORMER.

 Any sort of help is solicited.

Thanks

Posted: Tue Oct 30, 2007 10:30 am
by pravin1581
pravin1581 wrote:
ray.wurlod wrote:Ah, that's rather a different problem. It is solved using a "fork join" technique - split your data into two streams using a Copy stage, run one stream through the Aggregator stage, and re-unite them using a Join stage. The join key and the grouping key are probably the same.

Code: Select all

------> Copy stage ---------------> Join Stage ------>
          |                               ^
          |                               |
          +------> Aggregator Stage ------+
The data is coming in two different streams via two different tables and summing through the aggregator , how can I use the Join stage because then I will not get the values when they differ. I have joined them using Funnel but the problem is they are becoming one col and I am unable to calculate the difference between them or either split the columns. The design that we have implemented is as follows:

Code: Select all


 Table 1------Aggregator


 Table 2------Aggregator

[code]

 I have not depicted the flow after aggregator as I am not sure that this is a correct way. As mentioned above we have implemented Funnel after aggregator and then TRANSFORMER.

 Any sort of help is solicited.

Thanks

I have managed to achieve the sumed values and their difference in diff columns in a file but the design is becoming very cumbersome.

Code: Select all


 Table 1-------> Aggr-------->
                                           |
                                          Join ------->Transformer----File
                                           |    
 Table 2-------->Aggr-------->

Transformer is used for calculating the difference between the 2 sums.Another problem is we will not get any data in the file when the join keys (date) dont match. Hence Join is not doing the job always.