Matching Row Count

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Matching Row Count

Post by pravin1581 »

Hi All,

I need to develop a job that will give me the number of rows pulled from the source tables and the number of rows loaded into the final table and whether they are equal or not.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Excellent. Noting your use of the first person personal pronoun, I say "go for it".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:Excellent. Noting your use of the first person personal pronoun, I say "go for it".
Any sort of help will be appreciated.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Create an after-job subroutine that uses DSGetLinkInfo() to retrieve the row counts, then compares them. What you need to do with that number appears not to be in your specification.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sanath1234
Participant
Posts: 24
Joined: Wed Oct 17, 2007 1:28 am
Location: hyderabad

Re: Matching Row Count

Post by sanath1234 »

pravin1581 wrote:Hi All,

I need to develop a job that will give me the number of rows pulled from the source tables and the number of rows loaded into the final table and whether they are equal or not.
Use a Peek stage, or the environment variable APT_DUMP_SCORE, which will report the number of records collected; with this you can solve the problem.
say cheers always................
bikan
Premium Member
Posts: 128
Joined: Thu Jun 08, 2006 5:27 am

Post by bikan »

You might be interested in getting the counts from the command-line utility, if you want to use them for file balancing.

Use dsjob -linkinfo <project> <job> <stage> <link> to get the link count
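Building on the dsjob command above, the comparison itself can be scripted in a small shell wrapper. This is a minimal sketch only: the project, job, stage, and link names (MyProject, MyJob, and so on) are placeholders, and the parsing inside get_link_count is an assumption to adapt to the exact output layout of your dsjob version.

```shell
#!/bin/sh
# Minimal sketch. The dsjob invocations at the bottom are commented out
# because MyProject, MyJob, and the stage/link names are placeholders.

get_link_count() {
    # Assumption: the row count appears on an output line containing "count";
    # adapt the grep/tr pipeline to your dsjob version's actual layout.
    dsjob -linkinfo "$1" "$2" "$3" "$4" | grep -i 'count' | tr -dc '0-9'
}

compare_counts() {
    # Compare two integer row counts and report the result.
    if [ "$1" -eq "$2" ]; then
        echo "MATCH: $1 rows on both links"
    else
        echo "MISMATCH: source=$1 target=$2 (difference $(( $1 - $2 )))"
    fi
}

# src=$(get_link_count MyProject MyJob SourceStage SrcLink_OUT)
# tgt=$(get_link_count MyProject MyJob TargetStage TgtLink_INP)
# compare_counts "$src" "$tgt"
```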
pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

bikan wrote:You might probably be interested in getting the count from the command line utility, if you want to use the counts for file balancing.

Use dsjob -linkinfo <project> <job> <stage> <link> to get the link count
But dsjob gives the record count only from a particular link.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

So? Using a consistent stage and link naming convention in your jobs can ease that difficulty immensely. For example contemplate suffixes such as _INP, _OUT, _REJ, _LKP (or prefixes if you prefer) on your links that connect to the "outside world".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
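With such a naming convention in place, a wrapper script can pick out the boundary links mechanically. A sketch assuming the suffixes suggested above (the project, job, and stage names are placeholders; `dsjob -llinks` lists a stage's links):

```shell
#!/bin/sh
# Sketch assuming the _INP/_OUT/_REJ/_LKP suffix convention from this thread.
# MyProject, MyJob, and SomeStage are placeholders.

is_boundary_link() {
    # True when the link name carries one of the agreed boundary suffixes.
    case "$1" in
        *_INP|*_OUT|*_REJ|*_LKP) return 0 ;;
        *)                       return 1 ;;
    esac
}

# Report counts only for links that connect to the "outside world":
# for link in $(dsjob -llinks MyProject MyJob SomeStage); do
#     if is_boundary_link "$link"; then
#         dsjob -linkinfo MyProject MyJob SomeStage "$link"
#     fi
# done
```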
pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:So? Using a consistent stage and link naming convention in your jobs can ease that difficulty immensely. For example contemplate suffixes such as _INP, _OUT, _REJ, _LKP (or prefixes if you prefer) on your links that connect to the "outside world".
I think I misinterpreted the problem. The situation is that I need to verify the summation of a column across two tables. I have approached it this way: we take the required column from each table and use an Aggregator to compute its sum; now we need to carry that information into a file and compare the two sums, along with the difference between them.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Ah, that's rather a different problem. It is solved using a "fork join" technique - split your data into two streams using a Copy stage, run one stream through the Aggregator stage, and re-unite them using a Join stage. The join key and the grouping key are probably the same.

Code: Select all

------> Copy stage ---------------> Join Stage ------>
          |                               ^
          |                               |
          +------> Aggregator Stage ------+
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:Ah, that's rather a different problem. It is solved using a "fork join" technique - split your data into two streams using a Copy stage, run one stream through the Aggregator stage, and re-unite them using a Join stage. The join key and the grouping key are probably the same.

Code: Select all

------> Copy stage ---------------> Join Stage ------>
          |                               ^
          |                               |
          +------> Aggregator Stage ------+
The data comes in two different streams from two different tables, each summed through an Aggregator. How can I use the Join stage, given that I will not get the values when they differ? I have combined the streams using a Funnel, but then the sums end up in one column and I am unable to calculate the difference between them or split the columns apart. The design we have implemented is as follows:

Code: Select all


 Table 1------Aggregator


 Table 2------Aggregator


 I have not depicted the flow after the Aggregators, as I am not sure this is the correct way. As mentioned above, we have implemented a Funnel after the Aggregators and then a Transformer.

 Any sort of help is solicited.

Thanks
pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

I have managed to get the summed values and their difference into separate columns in a file, but the design is becoming very cumbersome.

Code: Select all


 Table 1 ------> Aggregator ------>+
                                   |
                                   Join ------> Transformer ------> File
                                   |
 Table 2 ------> Aggregator ------>+

The Transformer is used to calculate the difference between the two sums. Another problem is that we will not get any data in the file when the join keys (dates) don't match, so the Join is not doing the job in every case.
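For what it's worth, the reconciliation arithmetic itself is small enough to sanity-check outside DataStage. A sketch, assuming the two table extracts have been landed as comma-delimited files (the file names, sample rows, and column position are hypothetical); because each file is totalled independently, no join key is involved and mismatched dates cannot drop rows:

```shell
#!/bin/sh
# Hypothetical sample extracts standing in for the two table outputs.
printf '2008-01-01,100\n2008-01-02,250\n' > table1_extract.csv
printf '2008-01-01,100\n2008-01-03,240\n' > table2_extract.csv

# Sum the second (amount) column of each file independently.
sum1=$(awk -F, '{ s += $2 } END { print s }' table1_extract.csv)
sum2=$(awk -F, '{ s += $2 } END { print s }' table2_extract.csv)

echo "sum1=$sum1 sum2=$sum2 difference=$(( sum1 - sum2 ))"
# prints: sum1=350 sum2=340 difference=10
```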