
Data Transfer rate is too low

Posted: Fri Jan 21, 2005 2:20 am
by Nripendra Chand
Hi all,

in one of my jobs, data is going from a sequential file to a hashed file. It is a simple one-to-one mapping. I have specified the following in the Hashed File stage:
Create Hash File: Checked
Clear before creation: Checked
The rest of the options are left at their defaults.
But the data transfer rate between these two stages is only 300 to 400 rows per second, which is too low. There are 100,000 records in the sequential file.
I'm unable to find the root cause.
I need some suggestions to improve performance in this case.

Regards,
Nripendra Chand

Posted: Fri Jan 21, 2005 2:33 am
by Luk
Maybe some warnings are being generated in your log. I have noticed that generating many warnings decreases performance.

LUK

Posted: Fri Jan 21, 2005 2:46 am
by Nripendra Chand
No, only 2 warnings are coming, and I have enabled the auto-purge log file option for this job. Even then the data transfer rate is too low.

Posted: Fri Jan 21, 2005 3:01 am
by ArndW
When you create the file, use a larger minimum modulus - some prime number like 997 should be better than the default of 1. If performance is very important in this case, you can go a step further and choose a specific hashing algorithm and file type; there are other threads here that describe that in better detail than I can.
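For illustration only (the file name and the figure 997 are examples, not taken from this job), the TCL equivalent of creating a dynamic hashed file with a larger minimum modulus would look something like this; in a job you would normally set the same option through the Hashed File stage's create-file options instead:

    CREATE.FILE MYHASH DYNAMIC MINIMUM.MODULUS 997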

Posted: Fri Jan 21, 2005 3:55 am
by ray.wurlod
First, what rate do you get if you add a constraint of @FALSE into the Transformer stage? This will prevent any rows from being loaded into the hashed file, but will demonstrate an upper limit on the rate at which DataStage can read a sequential file on your platform.

Second, remove the constraint, and enable write cache in the Hashed File stage. This eliminates the need to write to random locations on disk; all writes are to memory (provided the hashed file is not too large, and at only 1 lakh records it shouldn't be) then the groups (pages) of the hashed file are flushed to disk in sequential order.

Finally, as Arnd suggested, calculate the desired final size of the hashed file and specify that size in the creation options. You can use the Hashed File Calculator on your DataStage CD (unsupported options) to help you to calculate the size, and the tuning parameters. You might also consider using a static hashed file, depending on the pattern of key values in your data.
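As a rough, purely illustrative sizing calculation (the average record size of 100 bytes is an assumption, not something stated in this thread): 100,000 records x ~100 bytes is about 10 MB of data; with the default group size of 1 (2,048-byte groups) and aiming for roughly 80% loading, that works out to a minimum modulus in the region of 6,000 rather than the default of 1. The Hashed File Calculator will give a more precise figure once the real record and key sizes are plugged in.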

Contrary to what Arnd suggested, a power of 2 is actually the best choice for MINIMUM.MODULUS with dynamic hashed files. A prime number is generally thought to be best for static hashed files, though it is arguable that a power or multiple of 10 works best for Type 2 (and for dynamic hashed files with SEQ.NUM hashing algorithm). All of that assumes that the record sizes are reasonably similar.
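To make that concrete (file names and figures are illustrative only): for a dynamic file you might ask for a power of 2 such as MINIMUM.MODULUS 1024, whereas for a static file the traditional advice is a prime modulus, for example

    CREATE.FILE MYSTATIC 18 1009 4

which requests a type 18 file with modulo 1009 and separation 4.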