Lookup 1 Key in a 2 Key Hash

endyl13 · Post by **endyl13** » Thu Jul 17, 2003 7:11 am

Hi,

My Hash File has 2 Keys (Hash.A and Hash.B), I like to do a lookup from a data file with the only 1 key (Data.A).
How can I get all value from the hash where the Hash.A = Data.A ?
What value should I put in the transformer when looking up to Hash.B ?
I tried to put @TRUE or Not(@NULL) or Not(0) without any success...

Thanks...

Regards

ND

inter5566 · Post by **inter5566** » Thu Jul 17, 2003 8:56 am

ND,
I don't think you will be able to do what your trying with a hash file. I may stand corrected after some of the more enlightened members chime in on this. To resolve your situation you may want to switch to feeding your transformer stage from a UV stage. With the UV stage, you can load it using one set of key columns and then reference it by a different set of keys. You can also obtain multiple key matches to increase your stream output (multirow lookup).

Hope this helps,

Steve

chulett · Post by **chulett** » Thu Jul 17, 2003 9:12 am

Depending on exactly how the Hash was created, you may be laboring under a misconception.

If you simply specified two columns as 'Key' when the hash was created, you do not have two separate keys in the hash - you have a hash with a single composite primary key. This should mean that the combination of A & B will be enforced to be unique and this can be easily tested by pushing some data into the hash and seeing what you end up with.

Sounds like what you may need is a Primary Key and an Alternate Index on the hash, and this is something that needs to be manually created, as far as I know. As Steve posted, there are other stages that allow multiple row returns, which is more than likely what you'd get when doing a lookup on a non-unique index. I don't have the syntax off the top of my head but you can search the forum and archive, or wait for more enlightened folks to chime in. [:)]

-craig

tonystark622 · Post by **tonystark622** » Thu Jul 17, 2003 9:15 am

ND,

If you store the hash file in the account, rather than in a directory, you can access the hash file using a Universe stage and putting the hash file name as the table name. You can specify just the one field as a key field in the meta data in the Universe stage and do the lookup that way.

Good Luck,
Tony

inter5566 · Post by **inter5566** » Thu Jul 17, 2003 9:40 am

Tony,

Thanks for the tip. It would be nice if we could do this from a hash stored in a directory path. We use some fairly large hash tables and prefer to clean them up from the UNIX side using a general cleaning script.

Thanks, it helped

Steve

chulett · Post by **chulett** » Thu Jul 17, 2003 9:47 am

I'm pretty sure you can, you just have to add a VOC entry for the Hash manually. Something like that, anyway. [:I] Ray has detailed this process in the past.

-craig

tonystark622 · Post by **tonystark622** » Thu Jul 17, 2003 10:12 am

Steve,

As Craig said, you can do this with the hash file in a directory other than account, but you have to create a VOC entry for it. Ray explained how to do this one time, but I don't have it handy. And I'm still learning Universe (still learning DataStage for that matter ), so I don't have this off the top of my head either...

Try searching the archives for hash and VOC maybe???

Also, as someone already stated, this lookup method can be VERY slow if you can't convince it to use existing indexes and you may have to create secondary indexes. Curiously enough, Universe requires you to CREATE.INDEX then BUILD.INDEX before you can use an index.

Good Luck,
Tony

kduke · Post by **kduke** » Thu Jul 17, 2003 11:27 am

ND

The VOC is like SYS.ALL_TABLES in Oracle. It controls which hash files are accessable directly by a project. You can create VOC records in the editor called ED or use SETFILE. If you created a hash file in the project for Customer then the VOC record would look like this at TCL:

ED VOC Customer
F
Customer
D_Customer

Below this project you would see these files. There would be a Customer directory with 3 files in it .Type30, DATA.30 and OVER.30. This is where the hash file data is stored. This is because the default hash file is DYNAMIC or type 30. The D_Customer file stores the column names for this hash file. You can view these columns names by typing in "LIST DICT Customer".

To access hash files that were created in a specific directory like "/u2" then the VOC record looks like this:

F
/u2/Customer
/u2/D_Customer

You can get help on ED by typing in "HELP ED" or "?" once you are in ED. The same is true for "HELP SETFILE".

Kim.

ray.wurlod · Post by **ray.wurlod** » Thu Jul 17, 2003 6:44 pm

There is a single command you can use, called SETFILE. It will prompt for any argument not given. Syntax is:

SETFILE pathname localname [OVERWRITING]

pathname is the pathname of the hashed file (data portion)
localname is the unqualified name that goes in the VOC file

Example

SETFILE /data03/hashedfiles/hshPerson hshPerson OVERWRITING

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

ray.wurlod · Post by **ray.wurlod** » Thu Jul 17, 2003 6:50 pm

To answer the original question, all you have to do is lie. Don't declare Hash.B as a key column (in this job's Transformer stage).

However, this introduces the possibility that there is more than one row in the hashed file that meets your criterion (Data.A = Hash.A). A hashed file stage will only return the first matching row.

If you want all rows, use a UV stage rather than a hashed file stage, and check the multi-row result set check box. This will generate one row of output for each found row.

The other issue is performance. Because you're not accessing the hashed file with its entire key, it will not use hashing but will use a table scan instead (slow). You may need to create an index on the Hash.A column.

If the hashed file was created from a UV stage, a standard SQL statement is used, for example:
CREATE INDEX HASHA_IDX ON hashfile(Hash.A);

If the hashed file was created from a Hashed File stage, a slightly different method is used, involving two non-SQL commands:
CREATE.INDEX hashfile Hash.A
BUILD.INDEX hashfile Hash.A

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

endyl13 · Post by **endyl13** » Thu Jul 17, 2003 8:51 pm

Thank you all for the detail explanations...
I have never tried the UV stage before, but I am going to try that approach.
Thanks again...

[;)]

Regards

ND