To get loaded count in DB2 table

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
satishm
Participant
Posts: 22
Joined: Wed Aug 20, 2008 6:37 am

To get loaded count in DB2 table

Post by satishm »

Hi Friends,
can some one help me out to write a routine or any other method to know whether all the records in the load dataset are properly loaded into DB2 table or not?
For eg: Suppose there are around 100 records in the load dataset and only 95 got loaded and job is successfully completed.
Is there any method to trace these 5 missing records.Even job log donot give such information.
Write method on table is Load append / Insert

Note: Full scan of the table again should be avoided to achieve this.

Regards,
Satish.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Not really. If there are errors in the load then the job will show those errors, if they don't show up there, then I would look at duplicate keys overwriting existing records as a possible reason.

You could use a reject link and insert/update method to check.
satishm
Participant
Posts: 22
Joined: Wed Aug 20, 2008 6:37 am

Post by satishm »

ArndW wrote:Not really. If there are errors in the load then the job will show those errors, if they don't show up there, then I would look at duplicate keys overwriting existing records as a possible reason.

You could use a reject link and insert/update method to check.
Thanks for your quick reply,

As far as I know,reject link is possible only for update /insert.But job on which we are going to perform this check are exicying job running in production and no modification to them is acceptable :-(
If it would have been possible, link count of that reject link would have solved the problem

So was searching for an alternative>Appreciate your help.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Not being able to test it with reject links makes it tough; can you duplicate the production data on another DataStage machine in order to test it? What about duplicate primary keys?
satishm
Participant
Posts: 22
Joined: Wed Aug 20, 2008 6:37 am

Post by satishm »

ArndW wrote:Not being able to test it with reject links makes it tough; can you duplicate the production data on another DataStage machine in order to test it? What about duplicate primary keys? ...
No Not really !!
But yes reject link test will solve it.But want a logic or code written, which will be used as an automated process after every load job in future if successful.
Reject link testing will be manual or can say adhoc testing, but cannot be implemented in production coz all load jobs are insert or append where pulling reject link is not allowed in datastage.

Is there any after job routine, which can perform this ?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DataStage only reports how many records it sends to the database. If you can't use a reject link (because of Load method, for example) then you need some other mechanism, such as reviewing the log of the bulk loader. You can, of course, automate this if you know the format of that log.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Worst case, if you've got some mechanism to identify records per load (timestamp, batch/load number, etc) then you can compare the counts from the source and the target as an 'after' process.
-craig

"You can never have too many knives" -- Logan Nine Fingers
andrewn
Premium Member
Premium Member
Posts: 14
Joined: Tue Jul 10, 2007 3:19 am
Location: UK

Post by andrewn »

You didn't say what version of DB2 you're using but in DB2 LUW you should be able to specify an EXCEPTION table to capture rows that are not loaded (because they are duplicates for example).


(Or in DB2 z/OS you can specify a DISCARD file which captures rows which are not loaded)

You would need another DataStage job to check the exception table to see if it had any rows, indicating that the previous job had failed to load all the rows.
Sreenivasulu
Premium Member
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Simple soluton - may not fit
Do a count(*) with an alias and get the number of records loaded

Regards
Sreeni
satishm
Participant
Posts: 22
Joined: Wed Aug 20, 2008 6:37 am

Post by satishm »

Sreenivasulu wrote:Simple soluton - may not fit
Do a count(*) with an alias and get the number of records loaded

Regards
Sreeni
No Sreeni as mentioned , we want to avoid scan of entire table , as its a daily job and loads the table with millions of records and may degrade the system performance.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Dump the records' key values into a temporary table as well as into the real table. Then do a count query on the join of the two (based on primary key and therefore resolved rapidly in the indexes) and see whether it matches the count reported by DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply