get row count

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

bobbysridhar
Premium Member
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

get row count

Post by bobbysridhar »

How to get the row count when the link has 0 input records,
I am using @outrownum to get the row count but when there is no records I am not able to get anything. I want to have a row which says the number of records for that particular link is zero. I need to populate it in audit table at the end, whether there is records or not.
if number of records are 0 then I need to get a row with totalrecords=0 and start time,Jobname etc.
k.v.sreedhar
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You can get all link rows counts after the job completes to populate your audit table and that would include any zero counts as well. Trying to do all that "in job" isn't really possible if you have no input.
-craig

"You can never have too many knives" -- Logan Nine Fingers
bobbysridhar
Premium Member
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

In tfm for total records I am using @outrownum fn. then sorting total records in desc and then taking the row @inrownum=1 from output of sort stage.this is working fine when data is there.when there is no rows coming from tfm output then I an not able get total records as 0. I tried @outrownun>=0 as constrain
In tfm but still not getting total records are 0.
k.v.sreedhar
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I hate any attempts to track row counts inside a job while it is running. I think this approach is fundamentally flawed as it does not work for links with 0 rows, jobs that abort or are stopped and it can end up with the wrong result if the developer doesn't understand parallel partitioning.

Under the 8.7 release the best way to get link row counts is to turn on the Operations Console job logging and use the JobRunLink table to get row counts for links or the DataLocator table for writes to files and databases.

Under 8.1 and 8.5 you can get job row counts from the job reports that can be generated after each job run. The ETLStats package from Kim Duke is an example of how to generate job reports and read the results. This is trickier than using the Operations Console database because it means reading the results from XML files rather than a database table.

Job audit and job row counting presents a very strong business case for upgrading to 8.7.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

bobbysridhar wrote:In tfm but still not getting total records are 0.
... and you won't. Zero records in = zero records out. That's why you can't do this in job but should (as I already noted) do it as a separate process after job.
Last edited by chulett on Mon Jul 16, 2012 10:42 pm, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kkmreddy
Participant
Posts: 11
Joined: Thu Jun 28, 2012 3:20 pm

Post by kkmreddy »

I Tried same once with following approach , My gola was to get the count of rows during tht particular run and get this count value in one of the column . So i did as follow ,
1.From Transformer defined a variable COUNTER with values as '1'
2.Aggregate grouping on this COUNTER and using Count function. Since all the rows have counter=1, count function resulted no.of rows processed .

I hope this will help you ...
bobbysridhar
Premium Member
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

So,etl stats from Kim duke is the only option left?
Please let me know
k.v.sreedhar
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

You can use ETLStats or write your own. Under 8.1 and 8.5 you should be able to turn on job reports at the project level - this will create operational metadata in XML files to a reporting directory on your server. You can then use a DataStage job to get the row counts out of those reports. ETLStats provides an example of how to do that.
bobbysridhar
Premium Member
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

I am using dsjobrepot after-job-subroutine and getting all job reports to specific directory in txt format. But creating jobname_timestamp.
But not able to club individual job report files into one file,because every time they are getting created with different timestamps in unix.as I am not an expert in unix, i am not able to get the files specific to that sequence and club them to one file.there I got struck.
Your help is really appreciated with regards to this.
k.v.sreedhar
ntr
Participant
Posts: 110
Joined: Fri Dec 18, 2009 10:30 am
Location: pune

Post by ntr »

kkmreddy wrote:I Tried same once with following approach , My gola was to get the count of rows during tht particular run and get this count value in one of the column . So i did as follow ,
1.From Transformer defined a variable COUNTER with values as '1'
2.Aggregate grouping on this COUNTER and using Count function. Since all the rows have counter=1, count function resulted no.of rows processed .
can you please explain more?
Last edited by ntr on Tue Jul 17, 2012 3:33 am, edited 1 time in total.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

bobbysridhar wrote:i am not able to get the files specific to that sequence and club them to one file.there I got struck.
I don't see the need to "club" them together at all. Can you not simply load them all using a File Pattern?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bobbysridhar
Premium Member
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

can you tell me how to grep them, becuase I am not expert in UNIX and the files are generated with filename_currenttimestamp.
Please help me in grep those files for each sequence run.
I have morethan 10 jobs in the sequence
k.v.sreedhar
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry but why do you think you need to 'grep' them? For what purpose? Perhaps if you explained in detail what exactly you are trying to accomplish, we could do a better job of helping you get there.
-craig

"You can never have too many knives" -- Logan Nine Fingers
bobbysridhar
Premium Member
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

I need to populate audit table.
The columns I need to populate are
start time
end time
elapsed time
reject records
insert record
update records
elapsed time.
For this I am doing the above said.
k.v.sreedhar
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Post the full contents of one of the job reports.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply