Hello all,
We have a multi-instance DataSet-to-Oracle job that runs slowly when I run it against all database instances in parallel. However, it runs quite fast when I run it against a single database instance using bulk load.
My job design is pretty straightforward.
DataSet ----> Transformer ----> Oracle Connector (Bulk Load)
I am not using a Copy stage to load multiple databases, since I am doing multi-instance runs.
Do you have any ideas on how to get the same throughput in a parallel multi-instance run as when running against a single database instance?
Thanks & Regards
Koti
Multi-instance dataset to Oracle job - slow
Re: Multi-instance dataset to Oracle job - slow
Hi,
As a first step, you can monitor (or ask your admins to monitor) the resources on the DataStage server and the Oracle server to find the bottleneck, and then take the necessary actions.
Cheers,
Samyam
Yes, that's right Craig. I am loading the same table in separate databases.
Hi,
As you said, the connections go from one host to multiple target hosts, so there might be a bottleneck on the DataStage server side.
It may be hitting the bandwidth limit of the DataStage server host. (I'm pretty sure the bandwidth usage won't reach 100 MiB/s if you are on a gigabit network; the actual maximum throughput is a function of the network latency and the TCP receive/send buffers on both sides.) If you have reached about a third of the ethernet bandwidth (roughly 30 MiB/s on gigabit ethernet), I suggest you contact your network admin to tune the network performance.
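As a sanity check on those numbers, here is a quick back-of-envelope calculation (just a sketch; 1 Gb/s is the nominal link speed, and real throughput will be somewhat lower):

```python
# Theoretical ceiling of a gigabit link, converted to MiB/s.
bits_per_sec = 1_000_000_000            # nominal 1 Gb/s ethernet link
mib_per_sec = bits_per_sec / 8 / 2**20  # bits/s -> bytes/s -> MiB/s
print(f"theoretical max: {mib_per_sec:.0f} MiB/s")  # ~119 MiB/s
```

So the "won't reach 100 MiB/s" figure above is close to the theoretical ceiling of the link.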
Another thing I noticed is that when using bulk load (with DB2), VARCHAR fields are transferred over the network at their FULL declared length.
For example, suppose I have a table to load with bulk load via the DB2 connector.
The table structure is like:
num int,
desc varchar(255)
The source is also a DB2 connection from another database, and the maximum length of the values in desc is 10.
On the DataStage host, I can see the outbound network usage is 10+ times the inbound.
That means if you have a big VARCHAR field, you need to either tune the VARCHAR maximum length, tune the network bandwidth, or avoid using bulk load.
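To see why the full-length transfer matters, here is a rough estimate using the column sizes from the example above (a sketch; the row count is made up purely for illustration):

```python
# Bytes on the wire for one column when a VARCHAR(255) is sent at its
# full declared length vs. at the actual data length (max 10, as above).
declared_len = 255      # desc varchar(255)
actual_len = 10         # longest real value observed in the data
rows = 1_000_000        # hypothetical row count, for illustration only

full_length = rows * declared_len   # padded transfer, as with bulk load
trimmed = rows * actual_len         # what a right-sized column would send
print(f"overhead factor: {full_length / trimmed:.1f}x")  # 25.5x
```

That factor lines up with the 10+ times outbound-vs-inbound traffic observed above.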
On Linux/Unix, network bandwidth usage can be seen with:
sar -n DEV 3    # live samples, every 3 seconds
sar -n DEV      # today's history