Problem in getting data from Oracle to DS : Too Slow

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Probably worth starting a new thread on this one.
Quick answer: hashed files work best when the average record size is small (less than 10% of the group size) and varies as little as possible, and when the key values show maximum variability (ideally, randomness).
Counting the distinct values in each column (field) assumes that every record has the same structure, which hashed files do not require; indeed, the repository structure depends on records being free to differ in layout. It is sufficient to gather statistics on physical record sizes.
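As a rough illustration of gathering those statistics, here is a minimal Python sketch. It assumes you have already exported the physical record sizes (in bytes) by some means of your own. The function name `record_size_stats`, the default group size of 2048 bytes, and the sample sizes are all hypothetical choices for the example, not anything DataStage provides; substitute the actual group size of your file.

```python
from statistics import mean, pstdev


def record_size_stats(record_sizes, group_size_bytes=2048, threshold=0.10):
    """Summarise physical record sizes against a hashed-file group size.

    group_size_bytes and threshold are assumptions for this sketch:
    pass the real group size of your file, and treat the 10% figure
    as the rule of thumb quoted above, not a hard limit.
    """
    if not record_sizes:
        raise ValueError("need at least one record size")
    avg = mean(record_sizes)
    spread = pstdev(record_sizes)  # population standard deviation
    ratio = avg / group_size_bytes
    return {
        "count": len(record_sizes),
        "mean_bytes": avg,
        "stdev_bytes": spread,
        "mean_to_group_ratio": ratio,
        "within_guideline": ratio < threshold,
    }


# Hypothetical sample of record sizes in bytes.
sizes = [100, 120, 80, 100]
print(record_size_stats(sizes))
```

A small mean relative to the group size, together with a small standard deviation, is what the guideline above is looking for. Nothing here inspects column contents, so it works even when records differ in structure.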
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.