key expression in transformer

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
spracht
Participant
Posts: 105
Joined: Tue Apr 15, 2003 11:30 pm
Location: Germany

key expression in transformer

Post by spracht »

Say, you had a hash lookup with 3 key columns. You would want to gradually loosen the matching criteria from all the 3 columns to 1 column, but still using the same hashfile. How would you have to specify the key expression in a transformer, if you wanted to unconditionally evaluate the matching against this column to 'true' or 'found'. Is this possible at all?

Stephan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You need three separate lookups, one based on three key columns, one based on two key columns and one based on one key column. Or perhaps six.
The technique is to lie to DataStage about what the key columns are.
If you specify that there are three key columns, you will need three reference key expressions.
If you specify that there are two key columns (even though there are really three key columns in the hashed file), you will need two reference key expressions.
If you specify that there is one key column (even though there are really three key columns in the hashed file), you will need one reference key expression.
Note, however, that you can not use the read cache unless every column of the hashed file is referred to. You will be limited to disk access speeds.

Worse than that, unless you index the individual columns (and, perhaps, the two-column combinations that you want to use but in this case you will have had to have created the hashed file as a UV table) you will suffer table-scan access speeds.

But, yes, what you want to do is possible.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
spracht
Participant
Posts: 105
Joined: Tue Apr 15, 2003 11:30 pm
Location: Germany

Post by spracht »

Ray

I assume, that it is sufficient to uncheck the 'Preload file to memory option' to not use the read cache you are talking about?

Nevertheless, DataStage keeps refusing to do as you suggested.


Stephan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Try using a UV stage rather than a hashed file stage at least for the partial key lookups. Data Source Name is "localuv". A hashed file stage insists on the entire key (I should have mentioned this; it's been discussed on this forum in the past).

And, yes, clearing the pre-load to memory check box disables read caching. This is reported in the job log.
Paul Preston
Participant
Posts: 24
Joined: Wed Apr 02, 2003 7:09 am
Location: United Kingdom

Post by Paul Preston »

Hi

when creating and using uv tables for use in lookups you specify which columns form the key. Please be aware that this does not mean that an index is created on those columns and if there are many rows in the uv table then performance will suffer. It is possible to create an index on the uv table using the "create index" syntax at the command level interface. Creating indexes on the columns you want to use for lookup might improve your performance alot if you have anything other than small amounts of data in the uv table.
spracht
Participant
Posts: 105
Joined: Tue Apr 15, 2003 11:30 pm
Location: Germany

Post by spracht »

I'm a bit reluctant to use uv-stages. On our system (HP-UX 11), they tend to cause even more problems than hashfile stages, especially regarding 'Abnormal termination'.

Stephan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's very odd, because they refer to the same thing. A hashed file is how a UniVerse table is implemented. The only real difference is that, if you create/access it via a UV stage, it has entries in the system tables, and it enforces data types.
If you really must do this, you don't really have much choice; the hashed file stage accesses via the entire primary key.
tonystark622
Premium Member
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Stephan,

You could create one new field - it would contain the contents of the two columns (separated by something like a tilde). This could be done when you create the hash file. Then build two new indexes. One on the two column key field and one on the single column that you need to loosen to.

You should be able to use the hash each of the key sets your require.

Hope this helps,
Tony
Post Reply