system cached hash files for realtime EDW solution

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

michaeld
Premium Member
Posts: 88
Joined: Tue Apr 04, 2006 8:42 am
Location: Toronto, Canada


Post by michaeld »

Can anybody comment on how effective system cached hash files are in production? Are they reliable for large lookups (10 million records)? We're thinking of using them for a realtime EDW solution instead of sparse lookups.

Any idea why they are not supported in parallel jobs?
Mike
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

They are not supported in parallel jobs because the hashing algorithms are buried deep within the DataStage server engine, which is not necessarily accessible from every node on which a parallel job may be executing.

System caching won't make any difference if only one job is accessing the hashed file, and it won't make any difference if the hashed file is too large to load into your read cache (in which case the cache won't be used at all).
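
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name, its parameters and the sizes used are hypothetical and chosen only for illustration; none of them are DataStage settings or APIs.

```python
def system_cache_helps(jobs_reading: int, file_size_mb: float, read_cache_mb: float) -> bool:
    """Illustrative check only: mirrors the two conditions described in the post.

    jobs_reading  -- number of jobs reading the same hashed file (assumed input)
    file_size_mb  -- size of the hashed file in MB (assumed input)
    read_cache_mb -- configured read cache size in MB (assumed input)
    """
    # If the file does not fit in the read cache, the cache is not used at all.
    fits_in_cache = file_size_mb <= read_cache_mb
    # With a single reader there is nothing to share, so system caching adds no benefit.
    shared_by_several_jobs = jobs_reading > 1
    return fits_in_cache and shared_by_several_jobs


if __name__ == "__main__":
    # Hypothetical figures, purely to exercise the function.
    for jobs, size_mb, cache_mb in [(1, 200, 512), (3, 200, 512), (3, 2048, 512)]:
        print(jobs, size_mb, cache_mb, system_cache_helps(jobs, size_mb, cache_mb))
```

For a lookup of around 10 million records, the check that matters first is whether the hashed file fits in the read cache at all.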
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.