Page 1 of 1

key expression in transformer

Posted: Wed Jul 30, 2003 11:18 pm
by spracht
Say, you had a hash lookup with 3 key columns. You would want to gradually loosen the matching criteria from all the 3 columns to 1 column, but still using the same hashfile. How would you have to specify the key expression in a transformer, if you wanted to unconditionally evaluate the matching against this column to 'true' or 'found'. Is this possible at all?

Stephan

Posted: Thu Jul 31, 2003 1:00 am
by ray.wurlod
You need three separate lookups, one based on three key columns, one based on two key columns and one based on one key column. Or perhaps six.
The technique is to lie to DataStage about what the key columns are.
If you specify that there are three key columns, you will need three reference key expressions.
If you specify that there are two key columns (even though there are really three key columns in the hashed file), you will need two reference key expressions.
If you specify that there is one key column (even though there are really three key columns in the hashed file), you will need one reference key expression.
Note, however, that you can not use the read cache unless every column of the hashed file is referred to. You will be limited to disk access speeds.

Worse than that, unless you index the individual columns (and, perhaps, the two-column combinations that you want to use but in this case you will have had to have created the hashed file as a UV table) you will suffer table-scan access speeds.

But, yes, what you want to do is possible.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

Posted: Thu Jul 31, 2003 2:14 pm
by spracht
Ray

I assume, that it is sufficient to uncheck the 'Preload file to memory option' to not use the read cache you are talking about?

Nevertheless, DataStage keeps refusing to do as you suggested.


Stephan

Posted: Thu Jul 31, 2003 5:19 pm
by ray.wurlod
Try using a UV stage rather than a hashed file stage at least for the partial key lookups. Data Source Name is "localuv". A hashed file stage insists on the entire key (I should have mentioned this; it's been discussed on this forum in the past).

And, yes, clearing the pre-load to memory check box disables read caching. This is reported in the job log.

Posted: Tue Aug 05, 2003 6:57 am
by Paul Preston
Hi

when creating and using uv tables for use in lookups you specify which columns form the key. Please be aware that this does not mean that an index is created on those columns and if there are many rows in the uv table then performance will suffer. It is possible to create an index on the uv table using the "create index" syntax at the command level interface. Creating indexes on the columns you want to use for lookup might improve your performance alot if you have anything other than small amounts of data in the uv table.

Posted: Tue Aug 05, 2003 7:06 am
by spracht
I'm a bit reluctant to use uv-stages. On our system (HP-UX 11), they tend to cause even more problems than hashfile stages, especially regarding 'Abnormal termination'.

Stephan

Posted: Tue Aug 05, 2003 3:59 pm
by ray.wurlod
That's very odd, because they refer to the same thing. A hashed file is how a UniVerse table is implemented. The only real difference is that, if you create/access it via a UV stage, it has entries in the system tables, and it enforces data types.
If you really must do this, you don't really have much choice; the hashed file stage accesses via the entire primary key.

Posted: Tue Aug 05, 2003 4:28 pm
by tonystark622
Stephan,

You could create one new field - it would contain the contents of the two columns (separated by something like a tilde). This could be done when you create the hash file. Then build two new indexes. One on the two column key field and one on the single column that you need to loosen to.

You should be able to use the hash each of the key sets your require.

Hope this helps,
Tony