Page 1 of 1

To get loaded count in DB2 table

Posted: Sun Nov 01, 2009 12:00 pm
by satishm
Hi Friends,
can some one help me out to write a routine or any other method to know whether all the records in the load dataset are properly loaded into DB2 table or not?
For eg: Suppose there are around 100 records in the load dataset and only 95 got loaded and job is successfully completed.
Is there any method to trace these 5 missing records.Even job log donot give such information.
Write method on table is Load append / Insert

Note: Full scan of the table again should be avoided to achieve this.

Regards,
Satish.

Posted: Sun Nov 01, 2009 12:02 pm
by ArndW
Not really. If there are errors in the load then the job will show those errors, if they don't show up there, then I would look at duplicate keys overwriting existing records as a possible reason.

You could use a reject link and insert/update method to check.

Posted: Sun Nov 01, 2009 12:06 pm
by satishm
ArndW wrote:Not really. If there are errors in the load then the job will show those errors, if they don't show up there, then I would look at duplicate keys overwriting existing records as a possible reason.

You could use a reject link and insert/update method to check.
Thanks for your quick reply,

As far as I know,reject link is possible only for update /insert.But job on which we are going to perform this check are exicying job running in production and no modification to them is acceptable :-(
If it would have been possible, link count of that reject link would have solved the problem

So was searching for an alternative>Appreciate your help.

Posted: Sun Nov 01, 2009 12:18 pm
by ArndW
Not being able to test it with reject links makes it tough; can you duplicate the production data on another DataStage machine in order to test it? What about duplicate primary keys?

Posted: Sun Nov 01, 2009 12:24 pm
by satishm
ArndW wrote:Not being able to test it with reject links makes it tough; can you duplicate the production data on another DataStage machine in order to test it? What about duplicate primary keys? ...
No Not really !!
But yes reject link test will solve it.But want a logic or code written, which will be used as an automated process after every load job in future if successful.
Reject link testing will be manual or can say adhoc testing, but cannot be implemented in production coz all load jobs are insert or append where pulling reject link is not allowed in datastage.

Is there any after job routine, which can perform this ?

Posted: Sun Nov 01, 2009 2:05 pm
by ray.wurlod
DataStage only reports how many records it sends to the database. If you can't use a reject link (because of Load method, for example) then you need some other mechanism, such as reviewing the log of the bulk loader. You can, of course, automate this if you know the format of that log.

Posted: Sun Nov 01, 2009 6:17 pm
by chulett
Worst case, if you've got some mechanism to identify records per load (timestamp, batch/load number, etc) then you can compare the counts from the source and the target as an 'after' process.

Posted: Tue Nov 03, 2009 3:51 am
by andrewn
You didn't say what version of DB2 you're using but in DB2 LUW you should be able to specify an EXCEPTION table to capture rows that are not loaded (because they are duplicates for example).


(Or in DB2 z/OS you can specify a DISCARD file which captures rows which are not loaded)

You would need another DataStage job to check the exception table to see if it had any rows, indicating that the previous job had failed to load all the rows.

Posted: Tue Nov 03, 2009 12:26 pm
by Sreenivasulu
Simple soluton - may not fit
Do a count(*) with an alias and get the number of records loaded

Regards
Sreeni

Posted: Tue Nov 03, 2009 12:43 pm
by satishm
Sreenivasulu wrote:Simple soluton - may not fit
Do a count(*) with an alias and get the number of records loaded

Regards
Sreeni
No Sreeni as mentioned , we want to avoid scan of entire table , as its a daily job and loads the table with millions of records and may degrade the system performance.

Posted: Tue Nov 03, 2009 2:14 pm
by ray.wurlod
Dump the records' key values into a temporary table as well as into the real table. Then do a count query on the join of the two (based on primary key and therefore resolved rapidly in the indexes) and see whether it matches the count reported by DataStage.