
hashed file

Posted: Mon May 29, 2006 4:30 am
by suresh_dsx
Hi guys,

I have a job like this:



seq file (10 records) ------> Transformer ------> Transformer ------> target
                                  ||                  ||
                        hashed file (same records as in the source)


The lookup uses two keys.


I am getting output, but I have a question: is it necessary to preload the data into memory in this situation?

I know about the ratio (total number of distinct rows / total number of rows in the table)...
If I use that option it occupies a lot of memory, which affects performance.


Can anyone help?

Posted: Mon May 29, 2006 7:21 am
by kcbland
Preload is only beneficial for performance if the same rows are going to be referenced multiple times. If each row in the reference data is looked up only once, preloading is not beneficial.

The more times a row is referenced, the more time you have "saved" by using the memory reference instead of the mechanical reference.
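
To illustrate the trade-off, here is a minimal Python sketch (not DataStage code). It assumes a deliberately simple cost model: every direct lookup costs one disk read, and preloading costs one disk read per reference row, after which lookups are served from memory. The `CountingStore` class and the function names are hypothetical and exist only for this illustration; real behaviour also depends on operating system caching and file size.

```python
class CountingStore:
    """Hypothetical stand-in for a hashed file on disk; counts every read."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.reads = 0

    def read(self, key):
        # One simulated disk read per direct lookup.
        self.reads += 1
        return self._rows.get(key)

    def read_all(self):
        # Preload: one simulated disk read per reference row.
        self.reads += len(self._rows)
        return dict(self._rows)


def run_direct(store, keys):
    """Look up every key straight from the store (no preload)."""
    return [store.read(k) for k in keys]


def run_preloaded(store, keys):
    """Load the whole store into memory once, then look up from memory."""
    cache = store.read_all()
    return [cache.get(k) for k in keys]


def compare(rows, keys):
    """Return (reads without preload, reads with preload) for the same lookups."""
    direct = CountingStore(rows)
    run_direct(direct, keys)
    preloaded = CountingStore(rows)
    run_preloaded(preloaded, keys)
    return direct.reads, preloaded.reads


if __name__ == "__main__":
    # Ten reference rows with a two-part key, as in the job described above.
    rows = {(i, "A"): f"row{i}" for i in range(10)}
    print(compare(rows, list(rows)))      # each row referenced once
    print(compare(rows, list(rows) * 5))  # each row referenced five times
```

Under this model, when each row is referenced once the two approaches cost the same number of reads, so preloading gains nothing; the gap only opens up as the same rows are referenced repeatedly.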

Posted: Mon May 29, 2006 2:52 pm
by ray.wurlod
For only 10 rows to be processed you may as well perform lookups directly against the target table, unless you are constructing a design that is staging data (which is not apparent from your design). Or, since you know these 10 rows, just preload their key values into the hashed file. For so few rows memory cache would not make any difference; the hashed file would only require a single page or so, and thus would tend to be resident in memory when in use.

Hint
When posting job designs, surround in Code tags and use Preview to get the layout right.