Improve performance with OraBulk

Post questions here related to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

RAI ROUSES
Participant
Posts: 176
Joined: Wed Nov 15, 2006 5:48 am

Post by RAI ROUSES »

Hello,
I have a DataStage 7.5 job that joins three tables through an OCI stage and then loads 50 million records into an Oracle table.

Both source and target are Oracle 11g, and the source is read over a dblink.

Given the poor performance of the insert and the large data volume, we replaced the OCI stage with OraBulk, but without success: the performance is identical to the insert.

Does anyone have any suggestions to optimize the performance of the job?
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Have you tried the same load from a Unix script using sqlldr?
If so, how long does it take?
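For reference, running the same load outside DataStage with sqlldr can be sketched like this; the file, table, and column names below are hypothetical, not from this thread:

```shell
# Hypothetical control file; direct=true requests a direct-path load,
# which bypasses conventional INSERT processing.
cat > load_target.ctl <<'EOF'
LOAD DATA
INFILE 'extract.dat'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(col1, col2, col3)
EOF

# Time the load so it can be compared against the DataStage job.
time sqlldr userid=$ORA_USER/$ORA_PASS control=load_target.ctl \
     log=load_target.log direct=true
```

This needs an Oracle client and a reachable database, so treat it as a template rather than something to run as-is.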
pandeeswaran
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Try landing the data to disk in the final stage of the first job with the joins. Then try a second job reading the data from local disk to do bulk load. Can you tell any difference in load time?
Choose a job you love, and you will never have to work a day in your life. - Confucius
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

"dblink" also increases the time.
Can you redesign your job without dblink?
Apart from that ,there are lot of parameters, database server location..etc..
pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

He meant "factors" rather than literally "parameters".

Sqlldr isn't a magic bullet that makes any load faster; you need to determine whether it is applicable given the nature of the load being performed. And how to improve any bulk load would be a good conversation to have with your DBA, who should be aware of all the relevant factors at play in your specific case.
-craig

"You can never have too many knives" -- Logan Nine Fingers
RAI ROUSES
Participant
Posts: 176
Joined: Wed Nov 15, 2006 5:48 am

Post by RAI ROUSES »

From the tests we made, we verified that the poor performance of the job is due to two factors:
- The source query uses a dblink, which dominates the performance of the job
- The record is very large: it has more than 200 columns, and a large part of them are varchar2(255)

There is no way to reduce the size of the fields loaded into the table, so the only way to improve the performance of the job is to optimize the source query, so that OraBulk receives a larger volume of data to load.
Do the gurus agree?
v2kmadhav
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

Rai, sorry if it's too late in the day to comment on this...

Did you consider redoing the join in DataStage, making the best use of parallelism, etc., perhaps even adding a few parallel hints to your source query?

Also, you could write the data into multiple files in your first job and then have a clever script to load them into the tables with sqlldr in parallel.
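A sketch of that split-and-load approach, assuming the first job has already landed a delimited extract to disk; all file, control-file, and credential names are illustrative, not from this thread:

```shell
#!/bin/sh
# Split the landed extract into fixed-size chunks, then launch one
# direct-path sqlldr per chunk in the background and wait for all of them.
split -l 5000000 extract.dat chunk_

for f in chunk_*; do
  sqlldr userid=$ORA_USER/$ORA_PASS control=load.ctl data="$f" \
         log="$f.log" direct=true parallel=true &
done
wait   # block until every background loader has finished
```

With direct=true, parallel=true permits concurrent direct-path loads into the same table; index maintenance is deferred during such loads, so the index state afterwards is something to review with your DBA.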
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Did you try creating a separate job to do the Oracle load?

What is poor performance?

How long do the extract/join steps take?

How long does the load take?
Choose a job you love, and you will never have to work a day in your life. - Confucius