How to count rows extracted from MS Sql table
Moderators: chulett, rschirm, roy
How to count rows extracted from MS Sql table
I'm doing a full extract of all rows from a MS SQL table and then building a flat file from it. I want to delete all of the rows from the table when I'm confident that the number of rows in the flat file are equal to the number of rows in the table.
What would I do to get a row count of the file I created?
Regards,
CB
What would I do to get a row count of the file I created?
Regards,
CB
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
DataStage keeps a count of the rows sent to the file, which you can subsequently interrogate using DSGetLinkInfo() or DSGetLogEntry() functions. Or you can use DSExecute to invoke the UNIX command wc -l filename
If you prefer to use a job sequence, you can use a Routine Activity for the former and a Command Activity for the latter.
Or you could use a shell script that counts the rows in the flat file, compares that to the result from SELECT COUNT(*) FROM tablename; and proceeds on that basis. The shell script could be invoked from DataStage if that's what you'd like to do.
There are other solutions. These will get you started.
If you prefer to use a job sequence, you can use a Routine Activity for the former and a Command Activity for the latter.
Or you could use a shell script that counts the rows in the flat file, compares that to the result from SELECT COUNT(*) FROM tablename; and proceeds on that basis. The shell script could be invoked from DataStage if that's what you'd like to do.
There are other solutions. These will get you started.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks for your reply.
Would I put the DSGetLinkInfo() in the transformer step that's connected to my MS SQL (dynamic RDBMS) stage?
Thanks,
CB
Would I put the DSGetLinkInfo() in the transformer step that's connected to my MS SQL (dynamic RDBMS) stage?
Thanks,
CB
ray.wurlod wrote:DataStage keeps a count of the rows sent to the file, which you can subsequently interrogate using DSGetLinkInfo() or DSGetLogEntry() functions. Or you can use DSExecute to invoke the UNIX command wc -l filename
If you prefer to use a job sequence, you can use a Routine Activity for the former and a Command Activity for the latter.
Or you could use a shell script that counts the rows in the flat file, compares that to the result from SELECT COUNT(*) FROM tablename; and proceeds on that basis. The shell script could be invoked from DataStage if that's what you'd like to do.
There are other solutions. These will get you started.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The word "subsequently" in my earlier post was intentional. You can't determine the final number of rows (lines in a file) until it's completely loaded.
This means that you would not use DSGetLinkInfo() in a Transformer stage. Two reasons; the current count increments for every row processed, and it's available more cheaply using the system variable @OUTROWNUM.
What you probably need to do is capture the number of rows (lines) from both places in a job sequence, and make the decision about whether to proceed in that job sequence. Conditionally start the job activity that loads the table.
This means that you would not use DSGetLinkInfo() in a Transformer stage. Two reasons; the current count increments for every row processed, and it's available more cheaply using the system variable @OUTROWNUM.
What you probably need to do is capture the number of rows (lines) from both places in a job sequence, and make the decision about whether to proceed in that job sequence. Conditionally start the job activity that loads the table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Just to add on to what Ray had said, you can use post job subroutine in which you can count the number of records in the file. If it is equal to the number of rows got from the table (can be drawn from link info) you can issue a truncate command to the table either from the routine itself (for which you need licensed ODBC database driver) or you can design a job that truncates the table which can be called conditionally from the post job subroutine itself. Ray, did this make sense ??
Vignesh.
Vignesh.
Re: How to count rows extracted from MS Sql table
count is not a <b>reliable</b> way of checking if the operation iscbres00 wrote: I want to delete all of the rows from the table when I'm confident that the number of rows in the flat file are equal to the number of rows in the table.
Regards,
CB
successful . you have to have something like this to be safe :D
1.set Job Warning to 0
2.NOT (SeqFileLink.REJECTED) AND RaiseDSTransformError_subroutine
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
Post your code if you want it debugged. Check the spelling of your link name (case sensitive).
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
It is a good practise to Do a select coun(*) from loaded_table where "Give a condition which meet the last load".... And then compare it with the unix file.
Regards
Siva
Listening to the Learned
"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
Siva
Listening to the Learned
"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
As Vince noted, we'd need to see your code. Everything would need to be spelled and 'cased' correctly for it to work. Get any of the parameters wrong and you'll get one of a variety of negative return codes.shaimil wrote:I'm calling the command from a sequencer after the job in question has run.
I have a generic routine that a sequencer can call to get a link row count, it needs to know everything the function call needs in order to do this - job name, stage name and link name to check. You need to attach to the job in question and then issue the call to DSGetLinkInfo. Don't forget to detach from the job when you are done. Pass back the row count as the Answer from the routine. I'm assuming you are trying to do something similar.
I disagree with the notion that it's a 'good practice' to play the 'select count(*)' game. Perhaps if you are trying to get a count of what you should have extracted (as opposed to what you actually did extract) - but you can't always repeat the same sql and get the same answer. If you can then fine, but if the goal is to find out what you really extracted during a given process, then the method discussed here is the better practice. IMHO.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Look in JOBCONTROL.H (in the dsinclude directory) to learn that -1 means "invalid job handle".
You have not properly specified the first argument (the job handle) for DSGetLinkInfo.
Or you have used DSJ.ME as the first argument for DSGetLinkInfo in a routine but not loaded the contents of the JOBCONTROL.H header file.
You do this with the declaration $INCLUDE DSINCLUDE JOBCONTROL.H, but better (to avoid double declaration of constants) is:
So, where - exactly - are you trying to use DSGetLinkInfo ?
You have not properly specified the first argument (the job handle) for DSGetLinkInfo.
Or you have used DSJ.ME as the first argument for DSGetLinkInfo in a routine but not loaded the contents of the JOBCONTROL.H header file.
You do this with the declaration $INCLUDE DSINCLUDE JOBCONTROL.H, but better (to avoid double declaration of constants) is:
Code: Select all
$IFNDEF JOBCONTROL.H
$INCLUDE DSINCLUDE JOBCONTROL.H
$ENDIF
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.