Is there a way to force a record by record flush for sequential files?
In C, I'd use fflush(seqfile);
I would like a better idea of the output rate - is it steady? bursty?
My job is buffering about 300 records between flushes,
and at 3 to 5 minutes per flush, that seems like a coarse
average.
I suppose I could get a better rate with a Peek stage and pulling time stamps
from the job log. *shrug* I'm just looking for ideas
("more than 1 way to do it" and all that).
John G.
sequential files - flush option?
Buffering is always a tradeoff - by increasing the buffer sizes you increase throughput but, in the case of failure, you will lose more data. Since DataStage jobs are generally all-or-nothing, there is no benefit in force-writing data more frequently. But if you do require more frequent flushing to disk, you can play with the APT settings for:
APT_BUFFER_MAXIMUM_MEMORY, APT_BUFFER_MAXIMUM_TIMEOUT, APT_BUFFER_DISK_WRITE_INCREMENT, APT_BUFFERING_POLICY
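A rough sketch of setting those variables in the job's environment. The values below are illustrative guesses, not recommendations - check the Parallel Job Advanced Developer's Guide for the defaults and units before tuning:

```shell
# Illustrative values only -- tune against your own job and volumes.

# Write buffered data to disk in smaller increments (bytes):
export APT_BUFFER_DISK_WRITE_INCREMENT=262144

# Cap the per-link in-memory buffer (bytes):
export APT_BUFFER_MAXIMUM_MEMORY=1048576

# How long a buffer may sit before being pushed along (seconds):
export APT_BUFFER_MAXIMUM_TIMEOUT=1

# Policy: AUTOMATIC_BUFFERING, FORCE_BUFFERING or NO_BUFFERING
export APT_BUFFERING_POLICY=AUTOMATIC_BUFFERING
```

In Designer you would normally set these as job or project environment variables rather than exporting them by hand.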
Ahhh. That helps my perspective.
Thank you!
I was focusing on designer seqfile options, like this:
- Output->Properties->Source->File=xyz.dat
- Output->Properties->Source->ReadMethod=...
ArndW wrote: APT_BUFFER_MAXIMUM_MEMORY,
APT_BUFFER_MAXIMUM_TIMEOUT,
APT_BUFFER_DISK_WRITE_INCREMENT,
APT_BUFFERING_POLICY