DRS stage and user defined SQL (update stmt)

Xanadu · Post by **Xanadu** » Wed Aug 25, 2004 7:12 am

Hello all,
I am using a DRS stage and writing a update stmt as a "before" SQL stmt...
this is the statement I am writing :

update TableA
set Col17=:17
where Col1=( select max(Col1) from Table A where Col2 =:2)

It generates an error :
Incorrect syntax near ':'.

Isn;t this how values are to be passed in a DRS stage, I mean using the ':' ? :-S
Am I missing something ?

Thanks
-Xan

Xanadu · Post by **Xanadu** » Wed Aug 25, 2004 8:00 am

Just an update:

I actually moved that update stmt from "before" level and wrote an user-defined SQL stmt to write to the database.
and now I get this error :
*********
SQL data type '%1' defined on column %2 is not supported.
*********

the user-defined SQL is :

update TableA
set Col17=:17
where Col1=( select max(Col1) from Table A where Col2 =:2)

insert into TableA
(col1,col2.....col18)
values (:1,:2..........:17)

This error occurs only with the insert stmt (Now the update stmt works fine. And this insert stmt is EXACTLY same as the stmt generated by Datastage if I dont use the user-defined SQL.
TO add that update stmt I change the update action to user-defined SQL and this error comes up...

any inputs ?
Thanks
-Xan

sumitgulati · Post by **sumitgulati** » Thu Aug 26, 2004 4:32 pm

Remove the update statement and run the job with DataStage generated insert statement. If it works fine then revert back the changes and just put a semicolon (";") after your update statement. Then try running the job.

I am not sure if it will work but just give it a shot.

Regards,
Sumit

Xanadu · Post by **Xanadu** » Thu Aug 26, 2004 5:51 pm

Thanks for ur response Sumit.
but no it doesn't work. My database is unable to recognise the ":" symbol when I use the user defined SQL...I guess when i say user defined SQL, DS "passes the the query directly to the database...."
As I said the same query that is "generated" by Datastage doesn't work if I put it in "user defined SQL"..

thanks
-Xan

vmcburney · Post by **vmcburney** » Thu Aug 26, 2004 7:27 pm

You are on the right track. Your first attempt to use the Before SQL tab wont work because the SQL here is run once per execution of the job and not once per row. You cannot bind row columns to before SQL because it isn't executed for each row.

I would split your database writes into two output streams, one doing the inserts and the other doing the updates. Set the link ordering so the update happens first and set the array size to 1 so it commits after each row.

Now with your update stream you can add an extra transformer and do a lookup to retrieve the Col1 values. Consider extracting via a group by statement all Col2, max(Col1) values into a hash file, this will be at least 100 times faster then trying to do an aggregated lookup for each individual row.

Do a hash file lookup against Col2 to retrieve the Col1 value and make this the key field of your output. Send Col1 and Col17 values to an update stage, with Col1 as the key field it will automatically create your update statement for you with Col1 as the lookup and Col17 as the update.

This is a more efficient and more maintainable approach. Custom SQL for database output stages are notoriously hard to support in the long term as the developers who follow you struggle to work out the syntax.

chulett · Post by **chulett** » Thu Aug 26, 2004 9:00 pm

The other classic problem you are running into is trying to use different sets of columns in the same link, which is only an issue when you switch to User Defined SQL.

If you have 18 columns defined in your link, you must have parameter markers for all 18 columns in the SQL statement. If you need to insert 18 fields into a record but only use 2 in the update statement, they must be on seperate links and use the appropriate number of columns (and key fields) for each one.

Split your stream as Vince recommends. Ideally, you should be checking to know whether you need to do inserts or updates, rather than letting the database cover your butt if you get it wrong. BTW, you've never mentioned what database you are accessing via the DRS stage - Oracle?

vmcburney wrote:This is a more efficient and more maintainable approach. Custom SQL for database output stages are notoriously hard to support in the long term as the developers who follow you struggle to work out the syntax.

I completely agree with Vince. The people I mentor get taken to task when they use User Defined SQL anywhere and it isn't absolutely necessary to support the query needed. Custom SQL is a total pain in the patootie to maintain or to make changes to (eclipsed only by the Fully Generated option, yurk) and it's too prone to error as well. Stick with Column Generated, you'll be much happier in the long run - and so will those who come after you.