Hashed File

Posted: Mon Apr 14, 2008 8:45 pm
by Pk39565
If I am doing lookups against a hashed file in multiple jobs, will performance be badly affected?

Of course, if we run activities in parallel, performance may drop, but the question here is: will the impact be severe?

Posted: Mon Apr 14, 2008 8:57 pm
by ray.wurlod
Compared to what?

What other strategy do you propose for doing the checking of keys/returning of looked-up values? What "impact on performance" would your alternative strategy have? Indeed, what do you mean by "performance" in an ETL context?

Posted: Mon Apr 14, 2008 9:07 pm
by Pk39565
Suppose

we are doing a lookup on hashed file A in multiple jobs, all running in parallel

Or
do we have to run the jobs that do a lookup on hashed file A sequentially?

Which one of the above is best?

Posted: Mon Apr 14, 2008 9:58 pm
by ray.wurlod
Best according to what criteria?

If your criterion is total execution time being as short as possible, and you have the power in the machine, then as many jobs as you like can run at the same time and read from the same hashed file. There is no problem, or overhead, with doing this.
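
As a rough illustration of why concurrent read-only lookups are safe, here is a minimal Python sketch. It is not DataStage code: the dictionary HASHED_FILE_A, the run_job function, and the three key ranges are all hypothetical stand-ins for a hashed file and the jobs that read it. It only shows that parallel readers get the same answers as sequential ones; it says nothing about timing on a real server.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for hashed file A: a read-only key -> value mapping.
HASHED_FILE_A = {f"K{i}": f"value-{i}" for i in range(1000)}


def run_job(keys):
    """Simulate one job doing lookups; a missing key returns None."""
    return [HASHED_FILE_A.get(k) for k in keys]


# Three hypothetical jobs whose key ranges overlap, so they hit the same records.
job_inputs = [
    [f"K{i}" for i in range(0, 400)],
    [f"K{i}" for i in range(300, 700)],
    [f"K{i}" for i in range(600, 1100)],  # K1000..K1099 are lookup misses
]

# Option 1: run the jobs one after another.
sequential = [run_job(keys) for keys in job_inputs]

# Option 2: run all jobs at the same time against the same structure.
with ThreadPoolExecutor(max_workers=len(job_inputs)) as pool:
    parallel = list(pool.map(run_job, job_inputs))

# Nobody writes to the mapping, so both options return identical results.
print(parallel == sequential)
```

Because none of the readers modifies the shared structure, no locking or ordering between the jobs is needed, which is the situation described above for jobs that only read from the hashed file.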

If you are enabling read cache, the question then arises as to whether or not you've set up shared caching capability. Read dsdskche.pdf (one of your installed DataStage manuals) for more information on this concept.

Posted: Mon Apr 14, 2008 10:01 pm
by Pk39565
Thanks ray..