Merge Statement SQL Server ODBC Connector

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
nvalia
Premium Member
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Merge Statement SQL Server ODBC Connector

Post by nvalia »

Hi,
DS 8.7 on Windows and DB is SQL Server 2008 (both Src and Tgt)

We have Stage table and Final table. We Truncate and load the Stage table from a Flat file and then Update or Insert the Final table, based on the Primary Key matching between the 2 (SCD Type 1)

Volume..the Stage Table gets daily about 500K records while the final table has 15 Million records..So we end up updating about 400K and Inserting 100K..but this is taking about 4 mins through Datastage (if executed by a SP via SSIS it only take 45 Secs) as we need to run it on Single Node to avoid DeadLock...Currently using Update then Insert option in the ODBC Connecter stage

Can we use the SQL server Merge statement instead (manually writting the query as the ODBC Connector does not seem to have this option)..
and would it be better from a performance perspective as well?

Thanks,
NV
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Hi nvalia,

what you can do, of course, is use a SQL-Server stored-procedure running your Merge-Statement and call that from DataStage. You can also run any valid SQL-Statement as a Before- or After-SQL from the ODBC-Connector-Stage. A stored-procedure is probably a better solution, because that gives you some options for error-handling - just in case your stageing-table is empty for some reason, or you encounter some other condition that you can't monitor from DataStage when using this approach.

By the way: Comparing DataStage with SSIS when using the same SQL-Server-instance as source and target is quite unfair. SSIS is practically plugged into that database, while DataStage first has to import the data to write it back to the database afterwards. If SSIS would not excel in this, the tool would be completely useless.
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

1. First deadlocking the database is more complicated than just running somethign on one node. At a mininmum you can decrease the array and transaction size to decrease the number of records effected by each node performing an upsert. The SQl Server table itself can be tuned by your DBA.

2. If you don't have the balanced optimizer it sounds like you are a canidate for that product. When moving from a stage to a target table within the same database server it will move the data without pulling it back to the DataStage Server. This is often referred to as ELT or Push Down processing. I would suggest looking at the product, but if you want to see how it works ... create thsi simple job:

Row Generator --> ODBC Connector

In the row generator create one column, Dummy, and create one row of data. In the ODBC stage put a custom sql statement and paste your merge statement into the ODBC Connector. Compile the job and run.
nvalia
Premium Member
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

Thanks All..

As suggested, I used a Row Generator to create 1 record with 1field (Appt Num) and in the ODBC connector stage I am writing a User Defined Update SQL using this as key for the update along with 1=2 in the where clause to NOT update anything. Running on 1 Node Configuration

The Merge statement is defined in the Before Sql (Node) Statements with a couple of other Before SQL commands..This Merge executes as expected but I get a couple of warnings as "Unable to determine association between statement parameters and table columns. The connector will not be able to obtain external schema and only limited schema reconciliation will be performed"

How can I remove this warning?

Thanks,
NV
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

The kind of operation performed in the SQL-section of your ODBC-connector is pretty meaningless. So why do you use an update-statement when you can just as well use a select-statement and write the result to a copy-stage with no output-link or to a peek?

An update-statement usually needs at least a key-column and an update-column. Both columns need to be actual columns of your target-table. Even if you do not intend to change anything, your statement needs to be acurate in relationship to the metadata of the target-table.
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
nvalia
Premium Member
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

I was providing valid columns existing in the actual table in the update query

On trying the Select option in a Source ODBC connector and writting the o/p to a Copy stage, the Merge in the Before SQL works fine (no warnings) but for some reason the After SQL (I need to do something) is not executing even though log says executed successfully..

I also have the setting for the property to Yes to Fail on Error when for AFTER SQL statements..Not sure if this something to do with using ODBC connector as Source..I know hat should be not be the case..

What could be the reason?
nvalia
Premium Member
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

Using the RowGenerator for 1 Dummy source row with 1 column and using a Delete statement (deletes nothing) in the Target ODBC Conn stage worked..

Thanks for your help
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Hi nvalia,

for the Select-option to work with an After-SQL, You have to make sure that a row is actually retrieved. Otherwise the Stage assumes there is nothing to do and skips the After-Sql.
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
nvalia
Premium Member
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

Hi BI-RMA,

The select statement does retrieve 1 row..In fact I also tried to return 100 rows for testing but the After Sql still did not execute!
Post Reply