Teradata table record count mismatch performance statistics

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Nagin
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Teradata table record count mismatch performance statistics

Post by Nagin »

I am loading a sequential file with about 2 million rows into a Teradata table, and I am creating the table through the ETL job. The job runs fine apart from a couple of warnings about stages running sequentially, and it finishes saying that all the rows have been imported successfully and 0 records were rejected.

In the job, the performance statistics show me a number (the number I would like to see) which does not match the total number of records in the table. When I run select count(*) on the table, it shows me a lower record count than what the performance statistics in the job are displaying. Have any of you run into this type of problem?
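For reference, this is roughly the check I am doing; the table name below is just a placeholder for my real target:

    /* the job's performance statistics on the output link report roughly 2 million rows */
    /* the table itself comes back with fewer when I count it directly                   */
    SELECT COUNT(*) FROM mydb.stg_customer;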

Please share your thoughts on this.

Thanks for the help.
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

To insert 2 million records you probably used one of the database stages that call a Teradata load utility such as MultiLoad or FastLoad. DataStage writes all of the records into a named pipe or file, and the load into the Teradata table is done by that utility. Records rejected during the load are not reported back to DataStage by the utility, and in DataStage 7.x or lower versions this reject count is not captured or reported. The count you saw is the number of records DataStage trustfully passed to the named pipe or file.
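As a rough illustration (database, table, column and file names below are made up), the script that a FastLoad-based load generates behind the scenes looks something like this sketch; note the ERRORFILES clause, which is where rejected rows end up on the Teradata side instead of coming back to DataStage:

    LOGON tdpid/loaduser,password;
    BEGIN LOADING mydb.stg_customer
          ERRORFILES mydb.stg_customer_ET, mydb.stg_customer_UV;  /* rejects land in these tables */
    DEFINE cust_id   (VARCHAR(10)),
           cust_name (VARCHAR(50))
    FILE = /tmp/ds_teraload.pipe;                                 /* the named pipe DataStage writes to */
    INSERT INTO mydb.stg_customer VALUES (:cust_id, :cust_name);
    END LOADING;
    LOGOFF;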
Joshy George
Nagin
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Post by Nagin »

JoshGeorge wrote: To insert 2 million records you probably used one of the database stages that call a Teradata load utility such as MultiLoad or FastLoad. DataStage writes all of the records into a named pipe or file, and the load into the Teradata table is done by that utility. Records rejected during the load are not reported back to DataStage by the utility, and in DataStage 7.x or lower versions this reject count is not captured or reported. The count you saw is the number of records DataStage trustfully passed to the named pipe or file.
Thanks for the reply Josh.
As you said, the count reported by the job is the number of records successfully passed to the load utility.
I was using the Teradata Enterprise stage, and the performance statistics showed a count that differs from what actually got inserted into the table (the table ended up with fewer rows). No row rejects were reported in the log. And the really bad part is that the log has an info event saying that all the records have been written successfully, which leaves you even more confused, because you are not seeing all the rows in the table and there are no warnings in the log.

Then I tried the Teradata MultiLoad stage for my target, and it reported all the rows that were getting rejected as warnings. It finally turned out to be bad data that was causing all these problems.
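For anyone who hits the same thing, the rejected rows sit in the load utility's error tables on the Teradata side, so you can reconcile the counts there. A rough sketch, assuming a hypothetical target mydb.stg_customer with default-style error table names:

    /* acquisition-phase rejects (bad data, conversion or constraint errors) */
    SELECT COUNT(*) FROM mydb.stg_customer_ET;

    /* application/uniqueness-phase rejects (e.g. duplicate primary index values) */
    SELECT COUNT(*) FROM mydb.stg_customer_UV;

    /* peek at a few of the rejected rows to track down the bad data */
    SELECT * FROM mydb.stg_customer_ET SAMPLE 10;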

Another thing I observed is that the Teradata Enterprise stage does not have an insert/update option in its properties. It has create, replace, append and truncate options, but when I am creating a table and loading a flat file which also contains updates, how is this handled automatically?

Thanks again.
Ramani
Participant
Posts: 58
Joined: Mon Oct 08, 2007 1:51 am

Post by Ramani »

Based on your comments, I get two things.
1- The stage automatically does an insert or update based on the keys defined on the table.
2- Try rebuilding the indexes on the table; sometimes the index rows may not have been updated properly, and the select count may take its counts from the index rows for faster counting (a rough sketch of what I mean is below).
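Something along these lines is what I mean; purely hypothetical database, table, index and column names, and only a sketch rather than the exact statements for your table:

    /* drop and recreate a secondary index so its subtable is rebuilt (hypothetical names) */
    DROP INDEX (cust_name) ON mydb.stg_customer;
    CREATE INDEX (cust_name) ON mydb.stg_customer;

    /* refresh statistics after the big load */
    COLLECT STATISTICS ON mydb.stg_customer COLUMN (cust_id);

    /* then re-check the count directly against the table */
    SELECT COUNT(*) FROM mydb.stg_customer;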
Thanks