odbc connector running too slow

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

odbc connector running too slow

Post by AmeyJoshi14 »

DS Version 8.7

Hi Experts,


We are using the ODBC Connector to fetch records from an AS/400 table (through driver VMdb225.so) and insert them into SQL Server, again using an ODBC Connector (through driver VMsqls25.so).
It takes approximately 24 minutes to load 1.2 million records. :x

I have tried changing the record count from 0 to 20000 and the array size from 2000 to 200000, but the average execution time is still approximately 24 minutes for the same table.
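For context, the array size mainly controls how many rows travel per fetch, i.e. how many client/server round trips the job makes. A quick back-of-the-envelope calculation (pure arithmetic, nothing DataStage-specific) shows why raising it further stops helping:

```python
import math

def fetch_round_trips(total_rows, array_size):
    """Number of fetch calls needed to move total_rows at array_size rows each."""
    return math.ceil(total_rows / array_size)

# 1.2 million rows at the two array sizes tried above:
# fetch_round_trips(1_200_000, 2_000)   -> 600
# fetch_round_trips(1_200_000, 200_000) -> 6
```

Past a point the round-trip savings flatten out, which matches the observation that raising the array size alone did not move the 24-minute runtime.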

I also ran the job on 2 nodes as well as on 4 nodes, but there is no significant change in performance (it still takes around 24 minutes). :shock:

Is there any other parameter we have to change in order to increase the performance?

Appreciate all your help!!
Thanks in Advance!!
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Are you doing a partitioned read from your source?

Have you parallelized your stage for inserting into SQL server?
BI-RMA
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Are there any active indexes on the target-table while loading?
"It is not the lucky ones who are grateful.
It is the grateful ones who are happy." Francis Bacon
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Isolate your problem first. Is it the read or the write that is causing the bottleneck? The easiest way to isolate the issue is to break the job into two different jobs: one that reads the table into a data set, and another that writes the data set to SQL Server.

Then work on optimizing the read and/or the write, depending upon which presents itself as the problem.
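A quick way to check the read side in isolation, outside DataStage entirely, is to drain the cursor and time it. This is a sketch, not from the thread: it assumes pyodbc and a made-up DSN name, and the helper works against anything exposing `fetchmany()`:

```python
import time

def drain(cursor_like, batch_size=20000):
    """Fetch every row in batches and return (row_count, elapsed_seconds).

    Accepts any object exposing fetchmany(), so it can be timed against a
    real pyodbc cursor or exercised with a stub.
    """
    total = 0
    start = time.time()
    while True:
        batch = cursor_like.fetchmany(batch_size)
        if not batch:
            break
        total += len(batch)
    return total, time.time() - start

# Hypothetical usage against the AS/400 source (DSN, table names are made up):
# import pyodbc
# cnx = pyodbc.connect("DSN=AS400SRC;UID=user;PWD=secret")
# rows, secs = drain(cnx.cursor().execute("SELECT * FROM MYLIB.MYTABLE"))
# print(f"{rows} rows in {secs:.1f}s")
```

If the raw ODBC fetch is already slow here, the bottleneck is the driver/network/source side rather than the DataStage job design.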

To really help with tuning past that point, you will need to provide some details that explain more thoroughly what you are doing. Is your read a select without a where clause? Have you tried the parallelization options in the ODBC Connector for the read? For your load, are you truncating the table and then loading it, or are you performing updates? For straight inserts, the biggest gain I have found is setting the odbc.ini entries to perform bulk inserts.

Based upon your driver, the bulk load options will work. I spell it out in an earlier post:

http://dsxchange.com/viewtopic.php?t=14 ... light=odbc

That is just one change and may not even be related to the issue you are having, which is why it is important that you start out by isolating the issue first.
AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Post by AmeyJoshi14 »

BI-RMA wrote:Are there any active indexes on the target-table while loading?
Thanks for the help!

There are no indexes on the target table. It is a simple truncate-and-load process....
AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Post by AmeyJoshi14 »

kwwilliams wrote:Then work on the issue optimizing the read and/or write depending upon which presents itself as the problem.

To really help with tuning past that point, you will need to provide some details that explain more thoroughly what you are doing. Is your read a select without a where clause? Have you tried the parallelization options in the ODBC Connector for the read? For your load, are you truncating the table and then loading it, or are you performing updates? For straight inserts, the biggest gain I have found is setting the odbc.ini entries to perform bulk inserts.
Thanks for the help!

Our process is truncate and load. There are no updates or indexes on the table. The process pulls all the records (no where clause) from AS/400 and populates them into SQL Server.
I will isolate the read and write processes and check which one is taking the most time.
Thanks again for all the help.
http://findingjobsindatastage.blogspot.com/
Theory is when you know everything and nothing works. Practice is when everything works and nobody knows why. In this case we have put together theory and practice: nothing works and nobody knows why! (Albert Einstein)
AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Post by AmeyJoshi14 »

Thanks Keith for the suggestions...

Below are the modifications I made to the job:
1. Changed the DSN to enable bulk load

Code:

EnableBulkLoad=1 
BulkBinaryThreshold=32 
BulkCharacterThreshold=-1 
BulkLoadBatchSize=1024 
BulkLoadOptions=0 
2. Reading in parallel from the source by setting "Enable partitioned reads" = Yes and "Partitioned reads method" = Modulus

3. Configuration file = 12 nodes
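The modulus partitioned read in step 2 conceptually gives each node its own slice of the table via a MOD predicate on a numeric key column. A rough sketch of the idea (the column name and node count here are illustrative, not what the connector literally generates):

```python
def modulus_predicates(key_col, num_nodes):
    """One WHERE clause per logical node, roughly what a modulus
    partitioned read appends to each node's query."""
    return [f"MOD({key_col}, {num_nodes}) = {n}" for n in range(num_nodes)]

def partition_keys(keys, num_nodes):
    """Show that the predicates split the key space into disjoint slices."""
    parts = [[] for _ in range(num_nodes)]
    for k in keys:
        parts[k % num_nodes].append(k)
    return parts

# modulus_predicates("ORDER_ID", 4)
# -> ['MOD(ORDER_ID, 4) = 0', 'MOD(ORDER_ID, 4) = 1',
#     'MOD(ORDER_ID, 4) = 2', 'MOD(ORDER_ID, 4) = 3']
```

Every key lands in exactly one slice, so the nodes can read in parallel without overlapping rows, which is why this change scales with the node count.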

By making these modifications the job now runs in 4 minutes. :D
I also separated the job into read and write to find the bottleneck, and it is the AS/400 read that takes the longest (3 min 42 sec). :roll:
Is there any other way we can decrease the read time from AS/400 (through the ODBC Connector)? :?:

Appreciate all your help on this!!!

Thanks!
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Optimize the SQL.

Only extract the columns that you need.

Talk to your DBA to see if the table can be reorged or have runstats run on it. (I'm not a DBA, but I know they can work some magic here and there to improve performance as well.)

===========

You should have made one change at a time, to see which one made the difference.