duplicate record for key column in ODBC stage

san_deep · Post by **san_deep** » Fri Jan 12, 2007 1:33 am

Hi All,

I am facing a problem while using the ODBC stage in my job.
I am getting duplicate records for a primary key column in the database while retrieving data through the ODBC stage.
In the database the column is the key column, and I have checked the Key option in the ODBC stage.
I get the same value of the column again after 14 records, and the issue is the same when I change the order of the columns in the ODBC stage.

I am waiting for some useful input, as I have to get back to the client with the probable problem and a solution for it, i.e. whether it is a stage problem or a schema problem.

With regards,
Sandeep

ray.wurlod · Post by **ray.wurlod** » Fri Jan 12, 2007 2:31 am

IMPORT the table definition and use that. Chances are that you've missed one key column, or more than one.

san_deep · Post by **san_deep** » Fri Jan 12, 2007 4:58 am

I tried importing the table definition and it is still giving the same problem with duplicates.
My table has only one primary key.

ray.wurlod · Post by **ray.wurlod** » Fri Jan 12, 2007 5:03 am

By definition duplicates are impossible in a primary key. Execute the following query in the source table:

SELECT Key, COUNT(*) FROM tablename GROUP BY Key HAVING COUNT(*) > 1;

If this returns any rows, castigate your DBA for failing to create a UNIQUE index on the primary key.
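
For anyone who wants to try the check before running it against a real source, here is a minimal sketch using Python's built-in sqlite3 module. The table names `claimant` and `claimant_nokey` and the sample rows are made up for illustration; only the CLMNT_ID column name comes from this thread. It runs the same GROUP BY ... HAVING check against a table with a declared primary key and against one without.

```python
import sqlite3

# Same shape as the query above, with CLMNT_ID as the key column.
DUP_CHECK = (
    "SELECT CLMNT_ID, COUNT(*) FROM {table} "
    "GROUP BY CLMNT_ID HAVING COUNT(*) > 1"
)

def find_duplicate_keys(conn, table):
    """Return (key, count) rows for every key value that occurs more than once."""
    # `table` is a trusted, hard-coded name here; never format user input into SQL.
    return conn.execute(DUP_CHECK.format(table=table)).fetchall()

conn = sqlite3.connect(":memory:")

# Hypothetical table with a declared primary key: duplicates cannot be inserted.
conn.execute("CREATE TABLE claimant (CLMNT_ID INTEGER PRIMARY KEY, CLMNT_LST_NM TEXT)")
conn.executemany("INSERT INTO claimant VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])

# Hypothetical table with no key constraint: a duplicate key slips in.
conn.execute("CREATE TABLE claimant_nokey (CLMNT_ID INTEGER, CLMNT_LST_NM TEXT)")
conn.executemany("INSERT INTO claimant_nokey VALUES (?, ?)", [(1, "A"), (2, "B"), (2, "B2")])

keyed_dups = find_duplicate_keys(conn, "claimant")
unkeyed_dups = find_duplicate_keys(conn, "claimant_nokey")
print(keyed_dups)
print(unkeyed_dups)
```

An empty result for the keyed table means the data itself is clean, so any duplicates seen downstream would have to be introduced while reading or processing the rows.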

san_deep · Post by **san_deep** » Fri Jan 12, 2007 6:15 am

Ray, thank you for your response. When I execute the query in the database I do not get any rows, meaning there are no duplicates in the table. But the ODBC stage still gives me the same value for the PK after 40 rows, while the other column values change.

I don't know why this is happening. Any thoughts?

chulett · Post by **chulett** » Fri Jan 12, 2007 7:14 am

Odd. Are you selecting from more than one table by chance? Can you post your SQL?

san_deep · Post by **san_deep** » Fri Jan 12, 2007 7:56 am

SELECT
CLMNT_ID,
SBSCR_ID,
CLMNT_FST_NM,
CLMNT_LST_NM,
DEPN_NBR,
TO_CHAR(CLMNT_BTH_DT,'YYYYMMDD') CLMNT_BTH_DT,
GDR_CD,
CLMNT_REL_CD,
DATA_SOURCE_CD,
MIDL_NM,
INDV_ID,
MBR_ID
FROM
Tablename

The above is the SQL query I am supplying through user-defined SQL in the ODBC stage.
CLMNT_ID is the key column in the database, but the ODBC stage gives me duplicate records after around 40 distinct records.

chulett · Post by **chulett** » Fri Jan 12, 2007 8:12 am

As noted, I don't see how in the heck that is possible. What other stages are you using in the job?

DSguru2B · Post by **DSguru2B** » Fri Jan 12, 2007 9:31 am

What happens when you use generated SQL?
Also try this: define two columns, key and count, and execute Ray's query in the ODBC stage. What do you see?

ray.wurlod · Post by **ray.wurlod** » Fri Jan 12, 2007 3:51 pm

Are you using Entire partitioning, by any chance?

Even if you're not, the ODBC Enterprise stage emulates parallelism by creating as many connections to the data source as there are partitions for it to run in. This explains the duplicates; you are running the same query on each connection.

Change the execution mode of the ODBC Enterprise stage to sequential and/or the partitioning algorithm (on the input of the downstream parallel stage) to Hash on the key.
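
The effect described in this post can be illustrated with a toy simulation. This is not DataStage code and uses no DataStage API: the function names, the five sample rows, and the four-partition setup are assumptions made purely for illustration. It models a source read in which every partition runs the same unqualified query, and compares it with a single sequential read.

```python
from collections import Counter

# Hypothetical source table: five rows with unique CLMNT_ID values.
SOURCE_ROWS = [(clmnt_id, f"name_{clmnt_id}") for clmnt_id in range(1, 6)]

def run_query():
    """Stand-in for the user-defined SELECT: returns every row of the table."""
    return list(SOURCE_ROWS)

def read_source(partitions):
    """Model each partition opening its own connection and running the same query."""
    rows = []
    for _ in range(partitions):
        rows.extend(run_query())
    return rows

def duplicate_keys(rows):
    """Return {key: occurrences} for keys that appear more than once."""
    counts = Counter(key for key, _ in rows)
    return {key: n for key, n in counts.items() if n > 1}

parallel_rows = read_source(partitions=4)    # same query run once per partition
sequential_rows = read_source(partitions=1)  # a single sequential read

print(len(parallel_rows), duplicate_keys(parallel_rows))
print(len(sequential_rows), duplicate_keys(sequential_rows))
```

In the sequential case each key appears exactly once, which is the outcome that switching the stage's execution mode to sequential is meant to achieve.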

san_deep · Post by **san_deep** » Thu Feb 15, 2007 7:33 am

Thanks for all the inputs you have given.
I was able to resolve this issue by converting that to char with TO_CHAR in the user-defined SQL.
I don't know why it was behaving like that, but I am able to get the required result.
Thanks a lot to all of you.

ray.wurlod · Post by **ray.wurlod** » Thu Feb 15, 2007 2:00 pm

Were you perchance using Entire partitioning? That will definitely cause duplicates.