Hello,
I have a performance problem when processing a flat file into DB2 (roughly 100 rows per second).
My job reads the file (1,300,000 rows), performs one or two lookups, and then writes to the table (update or insert).
I have to use update-or-insert, because some rows must have a date inserted according to a criterion, so I cannot use the loader.
Does anybody have any information?
Thank you very much
Muriel
Performance problem with DB2
-
- Participant
- Posts: 43
- Joined: Wed Feb 19, 2003 7:17 am
- Location: France - Paris
- Contact:
Re: Performance problem with DB2
100 rows per second seems about right when you use a couple of lookups. As a test (on a test database), remove the lookups and check the speed; depending on your system you should be able to get between 1,000 and 2,000 rows per second.
The trick we usually use is to load the data into a staging table in DB2 without any lookups, and then, in a second job, do the lookups in one big SQL query: a select on the staged table with two left outer joins to bring in the lookup data (so one stage with a big query). This way you can achieve speeds between 200 and 800 rows per second.
Hint: you probably can't now because of the updates, but never, ever use the DataStage DB2 bulk loader. That stage has some very big disadvantages.
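The shape of that staged-load-plus-joins query can be sketched as below. This is only an illustration: SQLite stands in for DB2, and the table and column names are made up, since the thread does not give the real schema.

```python
import sqlite3

# In-memory SQLite stands in for DB2; all names here are hypothetical.
con = sqlite3.connect(":memory:")
cur = con.cursor()

# Stage 1: straight load into a staging table, no lookups.
cur.execute("CREATE TABLE staging (id INTEGER, amount REAL, cust_code TEXT, prod_code TEXT)")
cur.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)",
                [(1, 10.0, "C1", "P1"), (2, 20.0, "C2", "P9")])

# The two lookup tables that would otherwise be probed row by row.
cur.execute("CREATE TABLE customers (code TEXT, name TEXT)")
cur.execute("INSERT INTO customers VALUES ('C1', 'Acme'), ('C2', 'Globex')")
cur.execute("CREATE TABLE products (code TEXT, label TEXT)")
cur.execute("INSERT INTO products VALUES ('P1', 'Widget')")

# Stage 2: one big query with two LEFT OUTER JOINs replaces both lookups;
# rows with no match keep NULL in the joined columns instead of being dropped.
rows = cur.execute("""
    SELECT s.id, s.amount, c.name, p.label
    FROM staging s
    LEFT OUTER JOIN customers c ON c.code = s.cust_code
    LEFT OUTER JOIN products  p ON p.code = s.prod_code
    ORDER BY s.id
""").fetchall()
print(rows)  # row 2 has no product match, so its label is None
```

The left outer joins matter: an inner join would silently drop rows whose lookup key has no match, which a DataStage lookup with a NULL result would not.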
Ogmios
Hi,
Two lookups dropping from over 1,000 rows per second to 100 seems curious to me, unless you don't pre-load the hash files into memory.
IMHO, you can do one of the following:
1. Divide and conquer: try to split the process into several parallel ones.
2. Try to figure out why your performance is so poor and improve it.
For the latter we need more information to give you ideas on what might be wrong.
IHTH,
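The divide-and-conquer suggestion can be sketched as follows: split the input rows into N chunks so that N copies of the job can each process one chunk in parallel. The chunk count of 4 is just an example.

```python
# Split input rows into N near-equal chunks for parallel processing.
def split_file(lines, n_chunks):
    """Return n_chunks lists of lines of near-equal size."""
    chunks = [[] for _ in range(n_chunks)]
    for i, line in enumerate(lines):
        chunks[i % n_chunks].append(line)  # round-robin keeps sizes balanced
    return chunks

# With the 1,300,000 input rows from the thread and 4 chunks,
# each parallel job gets exactly 325,000 rows.
lines = [f"row {i}" for i in range(1_300_000)]
chunks = split_file(lines, 4)
print([len(c) for c in chunks])  # -> [325000, 325000, 325000, 325000]
```

In practice the split would be done on the flat file itself (e.g. with an OS-level split utility) before starting the parallel job instances.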
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Thank you all.
I would like to know the very big disadvantages of the DB2 bulk loader.
I ran a test like this (update or insert, commit every 1,000 rows):
Sequential File --------> Transformer ----------> DB2 plug-in
The performance is the same (200 rows per second).
There are no complex transformations in the Transformer.
Is there a particular setting to tune on DB2 (as a reminder, the whole environment, DataStage and DB2, is on NT)?
Thanks
Muriel
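The update-or-insert logic with a commit size of 1,000 that this test describes follows the classic DB-API pattern below: try the UPDATE first, and only INSERT when no row was touched. SQLite again stands in for DB2, and the table is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")

COMMIT_EVERY = 1000  # matches the "commit = 1000" setting in the test above

def upsert(rows):
    """Try UPDATE first; if no row was touched, INSERT. Commit in batches."""
    for n, (key, val) in enumerate(rows, start=1):
        cur.execute("UPDATE target SET val = ? WHERE id = ?", (val, key))
        if cur.rowcount == 0:          # key not present yet
            cur.execute("INSERT INTO target (id, val) VALUES (?, ?)", (key, val))
        if n % COMMIT_EVERY == 0:      # batch commits instead of one per row
            con.commit()
    con.commit()

upsert([(i, "first") for i in range(2500)])   # all rows are inserts
upsert([(i, "second") for i in range(1000)])  # all rows are updates
count = cur.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)  # -> 2500
```

Note that every row costs at least one statement, and misses cost two, which is part of why update-or-insert is so much slower than a straight load.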
Disadvantages of the DB2 bulk loader stage:
* It creates a control file that shows the passwords in clear text.
* It has problems with non-local DB2 systems (it tries to write to directory structures that it fetches from the local system, which fails when the local and remote systems do not have the same layout).
* It does not expose all possible switches of db2load, which can really affect performance.
* It has the unusual habit of not aborting the job when the DB2 load fails, just issuing a warning. I have no idea why they built it this way.
Ogmios
For the performance issues, this is where the black magic starts.
What I "usually" do: load a file with a small row width (few columns) and do a straight load into DB2 (without transformations or constraints). This will give you a good idea of the maximum throughput possible.
If it does not differ much from your 200 rows per second, you will have to search elsewhere, possibly involving your DBAs.
And be careful with transformations: even seemingly harmless little transformations can be a big bottleneck. Wide rows can also cause performance to degrade drastically (the more data DataStage has to process, the slower it is).
For the lookups, are you using ODBC lookups or hash files?
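The baseline throughput measurement described above, i.e. timing a straight load of narrow rows to find the ceiling before adding lookups or transformations, can be sketched like this (SQLite stands in for DB2; the row count and schema are illustrative):

```python
import sqlite3
import time

# Hypothetical baseline: time a straight load of narrow rows to estimate
# the maximum throughput, before adding any lookups or transformations.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE narrow (id INTEGER, code TEXT)")

rows = [(i, "X") for i in range(100_000)]
start = time.perf_counter()
cur.executemany("INSERT INTO narrow VALUES (?, ?)", rows)
con.commit()
elapsed = time.perf_counter() - start

rate = len(rows) / elapsed
print(f"{rate:,.0f} rows/second")  # compare this ceiling with the job's 200 rows/s
```

If even this bare load runs near 200 rows per second, the bottleneck is the database side, not the transformations, and the DBAs should get involved.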
Ogmios
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
ogmios wrote: * Has the unusual habit of not aborting the job when the db2 load fails, just a warning. I have no idea why they built it this way.
It is one of my personal standards that nothing ever aborts. Nothing even generates a warning unless something needs to be warned about, in which case it is picked up in the controlling process and handled in an appropriate fashion - which might include not doing any further processing, but will include notifying someone that there has been a problem.
No need ever to reset jobs, unless DataStage has detected a condition that even it can't handle.
In the case of bulk loaders, the controlling job checks the bulk loader's log and/or error file, even if there have been no warnings. It's usually enough to grep for a keyword like ERROR.
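That controlling-job check can be sketched as below: scan the bulk loader's log for a keyword like ERROR, even when the load itself reported no warnings. The log path and the message text are made up for illustration.

```python
import os
import tempfile

def load_succeeded(log_path, keyword="ERROR"):
    """Return False if the keyword appears anywhere in the loader log."""
    with open(log_path) as log:
        return not any(keyword in line for line in log)

# Demo with a throwaway log file standing in for the real loader log.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
    f.write("1000 rows read\nERROR during load of table TARGET\n")
    path = f.name

ok = load_succeeded(path)
print(ok)  # -> False: the controlling job should now notify someone
os.remove(path)
```

The point is that the check lives in the controlling process, which then decides whether to stop further processing and who to notify, rather than letting the job abort or, worse, continue silently.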
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.