Inserts more records than the source has

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Abraham
Participant
Posts: 4
Joined: Wed Mar 26, 2014 2:22 pm
Location: México


Post by Abraham »

Hi,

I hope someone can help me.

I have a parallel job that extracts from an Oracle table and inserts into a Netezza table.
If I put the query

Select count(*) from (
select field1,field2...fieldn from schema.table
)

in the Oracle Connector stage and click "View Data", the result is over 11 million.
But if I execute the job using

select field1, field2... fieldn from schema.table

it inserts over 21 million rows into the Netezza table.

I looked at the Director log and it doesn't have any warnings.
It looks as if the table had more records:
"Number of rows fetched on the current node: 21138554. "
"Number of rows inserted: 21,138,554"

I asked the DBA, and schema.table has over 11 million rows, just as the count shows.

I've tried both the Oracle Connector and the Oracle Enterprise stage, with the same results.

Help me, please!!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

What is the row count reported from the Oracle Connector stage (in the job log)?
Are you really selecting from the correct Oracle schema and table in your DataStage job? Double check, then triple check.

How many nodes are there in your parallel execution configuration file?
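
For reference, the file in question is the one pointed to by the $APT_CONFIG_FILE environment variable. A minimal two-node configuration looks something like this (the hostname and resource paths below are illustrative, not your actual values):

Code: Select all

{
	node "node1"
	{
		fastname "etlserver"
		pools ""
		resource disk "/data/datasets" {pools ""}
		resource scratchdisk "/data/scratch" {pools ""}
	}
	node "node2"
	{
		fastname "etlserver"
		pools ""
		resource disk "/data/datasets" {pools ""}
		resource scratchdisk "/data/scratch" {pools ""}
	}
}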
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Abraham
Participant
Posts: 4
Joined: Wed Mar 26, 2014 2:22 pm
Location: México

Post by Abraham »

Hi Ray!
Thanks for your answer,

Yes... it's the same DB, schema, and table.
The log reports 21138554 records, and the same number is inserted into the Netezza target.

Here are some of the log lines:
The connector was configured to run in parallel mode on 2 nodes, but the partitioned reads were not enabled. The connector will run in sequential mode.
Source: Number of rows fetched on the current node: 21138554.
Target: Number of rows inserted: 21,138,554


Could the reason be that it is configured to run in parallel on 2 nodes?
Where can I configure that?

Regards
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

Abraham wrote:The connector was configured to run in parallel mode on 2 nodes, but the partitioned reads were not enabled. The connector will run in sequential mode.
Source: Number of rows fetched on the current node: 21138554.
Target: Number of rows inserted: 21,138,554
Hi,
Your source connector stage is running in sequential mode, so running the job on 2 nodes is not the issue.
Please make sure that you are writing the correct select query. Can you post your select query and the query you are using to count the number of rows?
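
You could also check whether the extra rows in the target are exact duplicates. Something along these lines against the Netezza table would show that (the table and column names are illustrative):

Code: Select all

select field1, field2, count(*) as cnt
from schema.target_table
group by field1, field2
having count(*) > 1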
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

You can create a test job with an Oracle Connector stage as the source and a Peek stage as the target.

Write the select query below in the connector stage:

Code: Select all

select count(*) as ROW_CNT from ( 
select field1,field2...fieldn from schema.table 
)


In the Columns tab, define ROW_CNT as BigInt and check the result.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Are you sure that you are connecting to the same database from the SQL client and from DataStage?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What count do you get running Test #3 or Test #5 through a non-DataStage client, such as TOAD?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Post by thompsonp »

Can you also ask the DBA to check what SQL is actually being run by the database, to ensure your query is not being rewritten by the optimiser, a trigger, a policy, etc.?
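
If the DBA (or you, given the privileges) wants to see the statements Oracle actually parsed, something along these lines against V$SQL would show them (the schema name and filter text are illustrative):

Code: Select all

select sql_id, executions, sql_fulltext
from v$sql
where parsing_schema_name = 'SCHEMA'
and lower(sql_text) like 'select field1%'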
Abraham
Participant
Posts: 4
Joined: Wed Mar 26, 2014 2:22 pm
Location: México

Post by Abraham »

Hi, we finally got the solution... it was a bug in the database.
Now my job loads the correct number of records.

Thanks for your suggestions