system cached hash files for realtime EDW solution

michaeld · Post by **michaeld** » Wed Apr 29, 2009 1:42 pm

Can anybody comment on how effective system cached hash files are in production? Are they reliable for large lookups (10 million records)? We're thinking of using them for a real-time EDW solution instead of sparse lookups.

Any idea why they are not supported in parallel jobs?

ray.wurlod · Post by **ray.wurlod** » Wed Apr 29, 2009 5:22 pm

They are not supported in parallel jobs because the hashing algorithms are buried deep within the DataStage server engine, which is not necessarily accessible from every node on which a parallel job may be executing.

System caching won't make any difference if only one job is accessing the hashed file. It also won't make any difference if the hashed file is too large to load into your read cache, in which case the cache won't be used at all.
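
As a rough illustration of those two conditions, here is a minimal Python sketch. It is not a DataStage API: the function name, its parameters, and the simple size comparison are all hypothetical, and passing both checks only means that system caching might help, not that it will.

```python
def system_cache_may_help(concurrent_jobs: int,
                          hashed_file_mb: float,
                          read_cache_mb: float) -> bool:
    """Hypothetical check mirroring the two conditions described above.

    All names and the plain size comparison are assumptions made for
    illustration; they do not correspond to any DataStage setting or call.
    """
    # Only one job reading the hashed file: sharing a cache gains nothing.
    if concurrent_jobs <= 1:
        return False
    # File too large for the read cache: the cache is not used at all.
    if hashed_file_mb > read_cache_mb:
        return False
    # Both conditions pass, so system caching is at least worth testing.
    return True


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    print(system_cache_may_help(concurrent_jobs=4,
                                hashed_file_mb=100,
                                read_cache_mb=512))
```

For a 10 million row lookup, the second check is the one to look at first: estimate the hashed file's size and compare it with the read cache configured for your project.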