
Load data into SQL Server

Posted: Thu Dec 20, 2007 12:53 am
by sonia jacob
Hi,

On a daily basis, 20 million rows need to be updated/inserted into a table in a SQL Server 2005 database. These tables have foreign key references too.

I do a delete and then an insert. I have set APT_RDBMS_COMMIT_ROWS to 10,000 and I use a user-defined SQL statement, but it takes 5 hours to complete the load.

The DBA, on viewing the performance at the database side, mentioned that the records are being committed one row at a time.

I have gone through the forum searching related posts, and saw details on reading as below. This was for DB2, and I am not sure whether it is applicable to SQL Server:

select *
from myschema.mybigtable
where nodenumber(account_id) = current node

I have split the processing and insert jobs, and I see that the processing completes in 30 minutes while the INSERTs take 5 hours :(

Any suggestions would be greatly appreciated. Thanks in advance.

Posted: Thu Dec 20, 2007 1:45 am
by ray.wurlod
What stage type are you using? If an Enterprise stage, what write mode are you using? Have you considered writing to a text file (or an FTP stage) and loading the SQL Server table using the bulk loading utility?
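For example, if the job writes a delimited text file, a BULK INSERT along these lines could load it (the table name, file path and delimiters below are only placeholders):

-- placeholders: adjust the path and terminators to match the exported file
BULK INSERT myschema.mybigtable
FROM 'D:\staging\mybigtable.txt'
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', BATCHSIZE = 10000, TABLOCK)

The bcp command-line utility is another option.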

Posted: Thu Dec 20, 2007 6:58 am
by sonia jacob
ray.wurlod wrote:What stage type are you using? If an Enterprise stage, what write mode are you using? Have you considered writing to a text file (or an FTP stage) and loading the SQL Server table using the bulk loa ...
ODBC Enterprise
Write Method = Upsert
Upsert Mode = Insert only

I did think about a bulk load, but did not experiment with it. I was hoping to resolve the issue with the "COMMIT LEVEL" / number of records committed at a time, and that the problem would go away.

As indicated in my original post, I have two jobs: 1. one that does all the lookups and validations and creates a Dataset; 2. one that uses this Dataset to load the table. In some cases where the record count of the source file went beyond 20 million, the DataStage environment would not even allow me to create the Dataset (it runs out of space).

To add to all the issues I am currently facing, we received a new file with 91 million records, and the job failed at the MERGE stage (input partitioned as HASH on a single key column) / LOOKUP stage.

Posted: Thu Dec 20, 2007 3:10 pm
by ray.wurlod
I don't have access to enterprise edition at the moment. Does the ODBC Enterprise stage have another write mode? Something like "Write" or "Direct"?

Please start a separate thread for the merge failure. This will assist future searchers. Don't forget to supply as much information as possible, in particular job design and error/warning messages.

Posted: Fri Dec 21, 2007 12:11 pm
by sonia jacob
ray.wurlod wrote:I don't have access to enterprise edition at the moment. Does the ODBC Enterprise stage have another write mode? Something like "Write" or "Direct"?

Please start a separate thread for the merge f ...
Yes, there is a Write method - "Write" - with Write Modes Append, Create, Replace and Truncate.

The reason for using Upsert is to have user-defined SQL statements, wherein I convert a varchar to an integer.
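The insert SQL looks roughly like this (table and column names are only examples; the ORCHESTRATE.column references are how the stage binds the input link columns):

-- illustrative only: real table and column names differ
INSERT INTO myschema.mybigtable (account_id, col2, col3)
VALUES (CAST(ORCHESTRATE.account_id AS INT), ORCHESTRATE.col2, ORCHESTRATE.col3)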

I will start a new thread on MERGE.

Thanks

Posted: Fri Dec 21, 2007 1:26 pm
by ray.wurlod
If you convert the data type with a Modify stage, you don't need SQL and can use the Write write mode.
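The Modify stage specification would be a single line along these lines (the column name is only an example, and the exact conversion depends on your source data type - for a compatible source the default conversion is applied):

account_id:int32 = account_id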

Posted: Tue Jan 22, 2008 12:30 pm
by sonia jacob
ray.wurlod wrote:If you convert the data type with a Modify stage you don't need SQL and can use Write write mode. ...
Thanks Ray. Here is what we tried:

a. The DBA created a new QA environment with more space, independent from other apps
b. Tables created without indexes (foreign and primary key constraints)
c. ETL jobs modified to use the Write method as suggested by you


After all these changes, we loaded around 2 million records in 40 minutes. This is better than the 90 minutes we saw a while back.

I am leaving this post open, as we are running a couple of tests now.

Thanks again.

Posted: Tue Jan 22, 2008 9:26 pm
by sonia jacob
sonia jacob wrote:Thanks Ray. Here is what we tried ...
As indicated in the previous post, we tried a couple of test runs:
1. 2 million records with the Write method took 40 minutes.
2. 4 million records with the Upsert method took 90 minutes.

But with monitoring turned on, the DBA reported that they see a similar commit level as before. :cry:
Here is a sample statement they see over and over again when the ETL inserts records into the database:
exec sp_execute 1,'value1','value2','value3'
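If I understand it correctly, that sp_execute call is the per-row execution of a prepared single-row insert, so the full pattern would look roughly like this (reconstructed for illustration; the table, column names and parameter types are placeholders, not the actual trace):

-- illustration only: a prepared single-row INSERT executed once per row
declare @p1 int
exec sp_prepare @p1 output,
     N'@P1 varchar(50), @P2 varchar(50), @P3 varchar(50)',
     N'INSERT INTO myschema.mybigtable (col1, col2, col3) VALUES (@P1, @P2, @P3)'
exec sp_execute @p1, 'value1', 'value2', 'value3'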

Any idea as to this behaviour of SQL Server?