Lookup 1 Key in a 2 Key Hash
Moderators: chulett, rschirm, roy
Lookup 1 Key in a 2 Key Hash
Hi,
My Hash File has 2 Keys (Hash.A and Hash.B), I like to do a lookup from a data file with the only 1 key (Data.A).
How can I get all value from the hash where the Hash.A = Data.A ?
What value should I put in the transformer when looking up to Hash.B ?
I tried to put @TRUE or Not(@NULL) or Not(0) without any success...
Thanks...
Regards
ND
My Hash File has 2 Keys (Hash.A and Hash.B), I like to do a lookup from a data file with the only 1 key (Data.A).
How can I get all value from the hash where the Hash.A = Data.A ?
What value should I put in the transformer when looking up to Hash.B ?
I tried to put @TRUE or Not(@NULL) or Not(0) without any success...
Thanks...
Regards
ND
ND,
I don't think you will be able to do what your trying with a hash file. I may stand corrected after some of the more enlightened members chime in on this. To resolve your situation you may want to switch to feeding your transformer stage from a UV stage. With the UV stage, you can load it using one set of key columns and then reference it by a different set of keys. You can also obtain multiple key matches to increase your stream output (multirow lookup).
Hope this helps,
Steve
I don't think you will be able to do what your trying with a hash file. I may stand corrected after some of the more enlightened members chime in on this. To resolve your situation you may want to switch to feeding your transformer stage from a UV stage. With the UV stage, you can load it using one set of key columns and then reference it by a different set of keys. You can also obtain multiple key matches to increase your stream output (multirow lookup).
Hope this helps,
Steve
Depending on exactly how the Hash was created, you may be laboring under a misconception.
If you simply specified two columns as 'Key' when the hash was created, you do not have two separate keys in the hash - you have a hash with a single composite primary key. This should mean that the combination of A & B will be enforced to be unique and this can be easily tested by pushing some data into the hash and seeing what you end up with.
Sounds like what you may need is a Primary Key and an Alternate Index on the hash, and this is something that needs to be manually created, as far as I know. As Steve posted, there are other stages that allow multiple row returns, which is more than likely what you'd get when doing a lookup on a non-unique index. I don't have the syntax off the top of my head but you can search the forum and archive, or wait for more enlightened folks to chime in. [:)]
-craig
If you simply specified two columns as 'Key' when the hash was created, you do not have two separate keys in the hash - you have a hash with a single composite primary key. This should mean that the combination of A & B will be enforced to be unique and this can be easily tested by pushing some data into the hash and seeing what you end up with.
Sounds like what you may need is a Primary Key and an Alternate Index on the hash, and this is something that needs to be manually created, as far as I know. As Steve posted, there are other stages that allow multiple row returns, which is more than likely what you'd get when doing a lookup on a non-unique index. I don't have the syntax off the top of my head but you can search the forum and archive, or wait for more enlightened folks to chime in. [:)]
-craig
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
ND,
If you store the hash file in the account, rather than in a directory, you can access the hash file using a Universe stage and putting the hash file name as the table name. You can specify just the one field as a key field in the meta data in the Universe stage and do the lookup that way.
Good Luck,
Tony
If you store the hash file in the account, rather than in a directory, you can access the hash file using a Universe stage and putting the hash file name as the table name. You can specify just the one field as a key field in the meta data in the Universe stage and do the lookup that way.
Good Luck,
Tony
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
Steve,
As Craig said, you can do this with the hash file in a directory other than account, but you have to create a VOC entry for it. Ray explained how to do this one time, but I don't have it handy. And I'm still learning Universe (still learning DataStage for that matter ), so I don't have this off the top of my head either...
Try searching the archives for hash and VOC maybe???
Also, as someone already stated, this lookup method can be VERY slow if you can't convince it to use existing indexes and you may have to create secondary indexes. Curiously enough, Universe requires you to CREATE.INDEX then BUILD.INDEX before you can use an index.
Good Luck,
Tony
As Craig said, you can do this with the hash file in a directory other than account, but you have to create a VOC entry for it. Ray explained how to do this one time, but I don't have it handy. And I'm still learning Universe (still learning DataStage for that matter ), so I don't have this off the top of my head either...
Try searching the archives for hash and VOC maybe???
Also, as someone already stated, this lookup method can be VERY slow if you can't convince it to use existing indexes and you may have to create secondary indexes. Curiously enough, Universe requires you to CREATE.INDEX then BUILD.INDEX before you can use an index.
Good Luck,
Tony
ND
The VOC is like SYS.ALL_TABLES in Oracle. It controls which hash files are accessable directly by a project. You can create VOC records in the editor called ED or use SETFILE. If you created a hash file in the project for Customer then the VOC record would look like this at TCL:
ED VOC Customer
F
Customer
D_Customer
Below this project you would see these files. There would be a Customer directory with 3 files in it .Type30, DATA.30 and OVER.30. This is where the hash file data is stored. This is because the default hash file is DYNAMIC or type 30. The D_Customer file stores the column names for this hash file. You can view these columns names by typing in "LIST DICT Customer".
To access hash files that were created in a specific directory like "/u2" then the VOC record looks like this:
F
/u2/Customer
/u2/D_Customer
You can get help on ED by typing in "HELP ED" or "?" once you are in ED. The same is true for "HELP SETFILE".
Kim.
The VOC is like SYS.ALL_TABLES in Oracle. It controls which hash files are accessable directly by a project. You can create VOC records in the editor called ED or use SETFILE. If you created a hash file in the project for Customer then the VOC record would look like this at TCL:
ED VOC Customer
F
Customer
D_Customer
Below this project you would see these files. There would be a Customer directory with 3 files in it .Type30, DATA.30 and OVER.30. This is where the hash file data is stored. This is because the default hash file is DYNAMIC or type 30. The D_Customer file stores the column names for this hash file. You can view these columns names by typing in "LIST DICT Customer".
To access hash files that were created in a specific directory like "/u2" then the VOC record looks like this:
F
/u2/Customer
/u2/D_Customer
You can get help on ED by typing in "HELP ED" or "?" once you are in ED. The same is true for "HELP SETFILE".
Kim.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There is a single command you can use, called SETFILE. It will prompt for any argument not given. Syntax is:
SETFILE pathname localname [OVERWRITING]
pathname is the pathname of the hashed file (data portion)
localname is the unqualified name that goes in the VOC file
Example
SETFILE /data03/hashedfiles/hshPerson hshPerson OVERWRITING
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
SETFILE pathname localname [OVERWRITING]
pathname is the pathname of the hashed file (data portion)
localname is the unqualified name that goes in the VOC file
Example
SETFILE /data03/hashedfiles/hshPerson hshPerson OVERWRITING
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
To answer the original question, all you have to do is lie. Don't declare Hash.B as a key column (in this job's Transformer stage).
However, this introduces the possibility that there is more than one row in the hashed file that meets your criterion (Data.A = Hash.A). A hashed file stage will only return the first matching row.
If you want all rows, use a UV stage rather than a hashed file stage, and check the multi-row result set check box. This will generate one row of output for each found row.
The other issue is performance. Because you're not accessing the hashed file with its entire key, it will not use hashing but will use a table scan instead (slow). You may need to create an index on the Hash.A column.
If the hashed file was created from a UV stage, a standard SQL statement is used, for example:
CREATE INDEX HASHA_IDX ON hashfile(Hash.A);
If the hashed file was created from a Hashed File stage, a slightly different method is used, involving two non-SQL commands:
CREATE.INDEX hashfile Hash.A
BUILD.INDEX hashfile Hash.A
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
However, this introduces the possibility that there is more than one row in the hashed file that meets your criterion (Data.A = Hash.A). A hashed file stage will only return the first matching row.
If you want all rows, use a UV stage rather than a hashed file stage, and check the multi-row result set check box. This will generate one row of output for each found row.
The other issue is performance. Because you're not accessing the hashed file with its entire key, it will not use hashing but will use a table scan instead (slow). You may need to create an index on the Hash.A column.
If the hashed file was created from a UV stage, a standard SQL statement is used, for example:
CREATE INDEX HASHA_IDX ON hashfile(Hash.A);
If the hashed file was created from a Hashed File stage, a slightly different method is used, involving two non-SQL commands:
CREATE.INDEX hashfile Hash.A
BUILD.INDEX hashfile Hash.A
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518